Contest: Validator Failover Solution - Validator Redundancy and Network Availability
[Dates TBD - Depending on Rust validator release date]
According to Meetup #23, the Rust validator may be ready soon. I think it would be a good idea to wait for the Rust validator before starting this contest.
Contest dates
- Warm-up / commenting / contest improvement period: TBD
- Submission period: TBD
Short description
Develop an architectural solution, scripts, and instructions for deploying a secure validator failover system.
Motivation
A secure failover solution for validators will provide two important functions:
- Prevent slashing - if main server fails, a backup server will continue validating
- Improve network availability - significantly reduce the probability of > 1/3 of mainnet validators going offline
Professional validators need redundancy to prevent slashing in case of a catastrophic server failure, such as a hardware failure or a local internet outage. Also, if a large percentage of validators use a redundant/failover solution, this will improve the network’s availability. For example, if > 1/3 of servers are in the same geographic area and there is a regional internet outage, the backup servers (in a different geographic location) will prevent the network from stalling.
General requirements
- After each election, the main server will securely send the backup server all keys, files, and information required for the backup server to continue validating if the main server fails / goes offline.
- Remote monitoring / triggering solution:
  - Monitor the status of the main and backup servers
  - Detect when the main server fails to validate, and trigger the backup server to start validating
- Alerting system for the status of the main/backup servers and for failure events
- A method for preventing conflict between the main server and the backup server (if the main server comes back online after a failure)
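To make the monitoring, triggering, and conflict-prevention requirements above more concrete, here is a minimal sketch of one possible approach. It is not a reference design. It assumes a remote monitor that counts consecutive failed health checks and hands out an increasing "epoch" number (a fencing token) on every role change, so a main server that comes back online with an old epoch is refused. The names `FailoverMonitor`, `Role`, `failure_threshold`, and `may_validate` are hypothetical and do not come from any validator software.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    MAIN = "main"
    BACKUP = "backup"


@dataclass
class FailoverMonitor:
    """Hypothetical remote monitor that decides which server may validate."""
    failure_threshold: int = 3   # consecutive failed checks before failover (assumed value)
    active: Role = Role.MAIN
    epoch: int = 1               # fencing token, incremented on every role change
    missed: int = 0
    alerts: list = field(default_factory=list)

    def record_check(self, main_is_healthy: bool) -> None:
        """Feed in the result of one health check of the main server."""
        if self.active is not Role.MAIN:
            # Main is fenced off; its recovery alone must not switch roles back.
            if main_is_healthy:
                self.alerts.append("main is back online but fenced; manual failback required")
            return
        if main_is_healthy:
            self.missed = 0
            return
        self.missed += 1
        self.alerts.append(f"main failed check {self.missed}/{self.failure_threshold}")
        if self.missed >= self.failure_threshold:
            self._switch_to(Role.BACKUP)

    def failback(self) -> None:
        """Operator-initiated return to the main server."""
        self._switch_to(Role.MAIN)

    def may_validate(self, role: Role, known_epoch: int) -> bool:
        """A server may validate only if it is active and holds the current epoch."""
        return role is self.active and known_epoch == self.epoch

    def _switch_to(self, role: Role) -> None:
        self.active = role
        self.epoch += 1
        self.missed = 0
        self.alerts.append(f"epoch {self.epoch}: {role.value} is now the active validator")


monitor = FailoverMonitor(failure_threshold=3)
stale_epoch = monitor.epoch                          # epoch the main server last saw
for healthy in (True, False, False, False):          # three consecutive failures
    monitor.record_check(healthy)
print(monitor.active.value)                          # backup
print(monitor.may_validate(Role.MAIN, stale_epoch))  # False
```

In this sketch, failback is deliberately manual: a recovered main server only raises an alert, which avoids two servers validating with the same keys. A real submission would also need to secure the monitor itself and the channel that carries the epoch.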
Terms
- System must be secure
Evaluation criteria and winning conditions
Hard criteria
- The final document should be presented in the form of a white paper, including an abstract as a preliminary overall description of the system
- Link to documentation on GitHub/GitLab or another open repository, with the obligatory backlink to your submission in the repository’s README
- Detailed structured documentation for each part of the system
- Easy-to-follow instructions for setting up and operating the failover system
- System must be tested and proven to operate reliably
Soft criteria
- Simple, elegant solution
- Maximize network robustness / availability
- Minimize the resources/cost required for maintaining backup server(s) and monitoring. The backup server only needs to operate until the next elections/validation round (minimum requirement), or until the main server comes back online. Assume validation rounds can be one day to one month long.
- Bonus: Obscure/hide the location of the backup server, for additional network security. For example, external network analysis (or a synced node) cannot determine the location or IP of the main server’s backup server. This could use a mix network (for sending election keys/files) with other validators using the same failover solution, onion routing, or a different solution.
- Bonus: Provide an optional shared solution for validators that do not want to maintain a full, personal backup server. For example, several validators could share one dedicated server with multiple containers and IPs to back up multiple main servers. (This must be in addition to the personal backup solution.)
Artifacts
- Google Doc with the white paper, open for commenting and containing the backlink to the submission.
- Preferably, use block diagrams, schematics, etc.
Rewards
1st place: 100,000 TONs
2nd place: 75,000 TONs
3rd place: 50,000 TONs
4th place: 25,000 TONs
Bonuses:
- +50% of the main reward amount for each bonus (above) achieved
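As a worked example of the bonus rule above, the short sketch below computes a total payout. It assumes that "main reward amount" means the base prize for the place won, that bonuses add up rather than compound, and that at most the two bonuses listed under the soft criteria can be earned. These are readings of the text, not confirmed rules, and `total_reward` is a hypothetical helper.

```python
BASE_REWARDS = {1: 100_000, 2: 75_000, 3: 50_000, 4: 25_000}  # TONs, from the Rewards list
MAX_BONUSES = 2  # the two "Bonus:" items under Soft criteria


def total_reward(place: int, bonuses_achieved: int = 0) -> int:
    """Base prize plus 50% of the base prize for each bonus achieved (assumed additive)."""
    if not 0 <= bonuses_achieved <= MAX_BONUSES:
        raise ValueError("bonuses_achieved must be 0, 1, or 2")
    base = BASE_REWARDS[place]
    # Integer arithmetic avoids float rounding; every base prize is divisible by 2.
    return base + base * bonuses_achieved // 2


print(total_reward(1, 2))  # 200000
print(total_reward(3, 1))  # 75000
```

Under these assumptions, a first-place submission that achieves both bonuses would receive 200,000 TONs.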
If no participant demonstrates a reliable working system, an additional stage of the contest may be announced later.
Voting
- Jury members who vote in this contest must have a solid understanding of the technology. Jurors who do not should either refrain from voting or choose “Abstain.”
- Jurors whose team(s) intend to participate in this contest by providing submissions lose their right to vote in this contest.
- Each juror will vote by rating each submission on a scale of 1 to 10 or can choose to reject it if it does not meet requirements or choose to abstain from voting if they feel unqualified to judge.
- Jurors will provide feedback on your submissions.
- The Jury will reject duplicate, sub-par, incomplete, or inappropriate submissions.
Jury rewards
An amount equal to 5% of the prize fund will be divided equitably between all jurors who vote and provide feedback based on their votes’ quantity and quality. Both voting and feedback are mandatory to collect this reward.
Procedural reminders to all contestants
- Accessibility. All submissions must be accessible for the Jury to open and view, so please double-check your submission. If the submission is inaccessible or does not fit the criteria described, jurors may reject the submission.
- Timing. Contestants must submit their work before the submission period closes. Submissions that are not made on time will not count.
- Contact information. All submissions must contain the contestant’s contact information, preferably a Telegram username by which jurors can verify that the submission belongs to the individual who submitted it. If not, jurors may reject your submission.
- Content. The content published in the forum and the provided PDF file should not differ, except for formatting. Otherwise, jurors may reject the submission.
- Well-formed links. If your submission has links to the work performed, the content of those links must have the contestant’s contact details, preferably a Telegram username, so jurors can match it and verify to whom the work belongs. If not, jurors may reject your submission.
- Multiple submissions.
  - Each contestant has the right to provide several submissions if they contain different approaches to solving the contest problem. However, if works are not unique enough or differ only in insignificant details, jurors may reject such repeating submissions.
  - If a contestant wants to make an additional submission that overrides one previously published, they must inform the Jury of this and indicate the correct revision to assess. In this case, only the indicated work will count. If the contestant has not indicated the updated submission as the correct one, only the first one will count, and the Jury will reject all the others.
Disclaimer
Anyone can participate, but Free TON cannot distribute TONs to US citizens or US entities.
Feedback on This Contest is Strongly Encouraged
Please consider this proposal a draft. Feedback to improve this contest’s specifications and design is strongly encouraged.