ABSTRACT
Unreliable failure detectors were introduced by Chandra and Toueg [2] as a mechanism that provides (possibly incorrect) information about process failures. They showed how unreliable failure detectors can be used to solve the Consensus problem in asynchronous systems. They also showed in [1] that one of the classes of failure detectors they defined, namely Eventually Strong (⋄S), is the weakest class that allows Consensus to be solved.
This brief announcement presents a new algorithm implementing ⋄S. Due to space limitations, the reader is referred to [4] for an in-depth presentation of the algorithm (system model, correctness proof, and performance analysis). Here, we present the general idea of the algorithm and compare it with other algorithms implementing unreliable failure detectors.
The algorithm works as follows. We have n processes, p1, …, pn. Initially, process p1 starts sending messages periodically to the rest of the processes. The other processes initially trust p1 and wait for its messages. If a process does not receive a message from its trusted process within some timeout period, then it suspects its trusted process and takes the next process as its new trusted process. If a process trusts itself, then it starts sending messages periodically to its successors. Otherwise, it just waits for periodic messages from its trusted process.
If, at some point, a process receives a message from a process pi such that pi precedes its trusted process, then it will trust pi again, increasing the value of its timeout period with respect to pi.
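The trust rules described above can be sketched as a small per-process state machine. This is only an illustration under stated assumptions, not the algorithm as specified in [4]: the class and method names are hypothetical, processes are indexed from 0 (so p1 is index 0), the "successors" of a process are taken to be the processes with a higher index, and the amount by which a timeout grows is an arbitrary choice.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Local state of one process (hypothetical sketch, 0-based indices)."""
    pid: int
    n: int
    base_timeout: float = 1.0
    trusted: int = 0  # every process initially trusts the first process
    timeouts: dict = field(default_factory=dict)

    def __post_init__(self):
        # One timeout per process; all start at the same base value.
        self.timeouts = {q: self.base_timeout for q in range(self.n)}

    def is_leader(self):
        """A process acts as leader when it trusts itself."""
        return self.trusted == self.pid

    def recipients(self):
        """Processes to send periodic messages to (assumed: higher indices)."""
        if not self.is_leader():
            return []
        return list(range(self.pid + 1, self.n))

    def on_timeout(self):
        """No message from the trusted process in time: trust the next one."""
        if not self.is_leader():
            self.trusted += 1

    def on_message(self, sender):
        """A message from a process preceding the trusted one restores trust
        in it and enlarges its timeout (increment size is an assumption)."""
        if sender < self.trusted:
            self.trusted = sender
            self.timeouts[sender] += self.base_timeout
```

In this sketch a process never skips past itself: once its trusted index reaches its own index it becomes a sender, and only a message from a preceding process moves its trust back.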
With this algorithm, eventually all the correct processes will permanently trust the same correct process. This provides the eventual weak accuracy property required by ⋄S. By simply suspecting all the other processes, each process also obtains the strong completeness property required by ⋄S.
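As a minimal sketch of how a suspect list could be derived from the trusted process, the hypothetical helper below returns every process index except the trusted one. The function name and the 0-based indexing are assumptions made for illustration, and treating the process's own index like any other is a simplification of this sketch.

```python
def suspected(trusted, n):
    """Return the indices of all processes other than the trusted one."""
    return set(range(n)) - {trusted}
```

Once every correct process permanently trusts the same correct process, that process appears in no suspect list (eventual weak accuracy), while every crashed process appears in every one of them (strong completeness).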
Our algorithm compares favorably with the algorithms proposed in [2] and [3] in terms of the number and size of the messages periodically sent and the total amount of information periodically exchanged. Since algorithms implementing failure detectors need not necessarily be periodic, we propose a new and (we believe) more adequate performance measure, which we call eventual monitoring degree. Informally, this measure counts the number of pairs of correct processes that will infinitely often communicate. We show that the proposed algorithm is optimal with respect to this measure. Table 1 summarizes the comparison, where C denotes the number of correct processes and LFA denotes the proposed algorithm.
- 1. T. D. Chandra, V. Hadzilacos, and S. Toueg. The weakest failure detector for solving consensus. Journal of the ACM, 43(4):685-722, July 1996.
- 2. T. D. Chandra and S. Toueg. Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2):225-267, March 1996.
- 3. M. Larrea, S. Arévalo, and A. Fernández. Efficient algorithms to implement unreliable failure detectors in partially synchronous systems. In Proceedings of the 13th International Symposium on Distributed Computing (DISC'99, formerly WDAG), pages 34-48, September 1999.
- 4. M. Larrea, A. Fernández, and S. Arévalo. Optimal implementation of the weakest failure detector for solving consensus. Technical Report, Universidad Pública de Navarra, Departamento de Matemática e Informática, January 2000. http://www.gsd.unavarra.es/pres/miembros/mikel/.