ABSTRACT
Distributed consensus is fundamental to achieving fault tolerance in distributed systems. The Paxos algorithm has long dominated this domain, although it has recently been challenged by algorithms such as Raft and Viewstamped Replication Revisited. These algorithms rely on Paxos's original assumptions; unfortunately, these assumptions are now at odds with the reality of the modern internet. Our insight is that current consensus algorithms have significant availability issues when deployed outside the well-defined context of the datacenter. To illustrate this problem, we developed Coracle, a tool for evaluating distributed consensus algorithms in settings that more accurately represent realistic deployments. We have used Coracle to test two examples of network configurations that contradict the liveness claims of the Raft algorithm. By exercising these algorithms under more realistic assumptions, we demonstrate wider availability issues faced by consensus algorithms when deployed on real-world networks.
- H. Howard, M. Schwarzkopf, A. Madhavapeddy, and J. Crowcroft. Raft refloated: Do we have consensus? ACM SIGOPS Operating Systems Review, 49(1):12--21, 2015.
- L. Lamport. The part-time parliament. ACM Transactions on Computer Systems (TOCS), 16(2):133--169, 1998.
- B. Liskov and J. Cowling. Viewstamped replication revisited. 2012.
- D. Ongaro and J. Ousterhout. In search of an understandable consensus algorithm. In USENIX Annual Technical Conference.
- F. B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys (CSUR), 22(4):299--319, 1990.
Coracle: Evaluating Consensus at the Internet Edge
SIGCOMM '15