Abstract
This paper presents an algorithm by which a process in a distributed system determines a global state of the system during a computation. Many problems in distributed systems can be cast in terms of the problem of detecting global states. For instance, the global state detection algorithm helps to solve an important class of problems: stable property detection. A stable property is one that persists: once a stable property becomes true it remains true thereafter. Examples of stable properties are “computation has terminated,” “the system is deadlocked,” and “all tokens in a token ring have disappeared.” The stable property detection problem is that of devising algorithms to detect a given stable property. Global state detection can also be used for checkpointing.
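The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the marker-based recording rule as it is commonly described, assuming reliable FIFO channels and a toy system whose only application message is a token. The names `System`, `Process`, `record`, `deliver`, `snapshot_tokens`, and `demo` are hypothetical and chosen for illustration; they do not come from the paper.

```python
from collections import deque

MARKER = "MARKER"  # control message, distinct from application messages
TOKEN = "TOKEN"    # the only application message in this toy system


class Process:
    def __init__(self, tokens):
        self.tokens = tokens    # live local state
        self.recorded = None    # local state captured for the snapshot
        self.recording = set()  # incoming channels still being recorded
        self.in_transit = {}    # sender -> messages recorded on that channel


class System:
    def __init__(self, tokens):
        self.procs = {name: Process(n) for name, n in tokens.items()}
        # Assumption: one FIFO channel per ordered pair of distinct processes.
        self.chan = {(s, d): deque() for s in tokens for d in tokens if s != d}

    def send_token(self, src, dst):
        assert self.procs[src].tokens > 0, "sender holds no token"
        self.procs[src].tokens -= 1
        self.chan[(src, dst)].append(TOKEN)

    def record(self, name):
        """Record local state, then send a marker on every outgoing channel."""
        p = self.procs[name]
        p.recorded = p.tokens
        p.recording = {s for (s, d) in self.chan if d == name}
        p.in_transit = {s: [] for s in p.recording}
        for (s, d), queue in self.chan.items():
            if s == name:
                queue.append(MARKER)

    def deliver(self, src, dst):
        """Deliver the oldest message on channel src -> dst."""
        msg = self.chan[(src, dst)].popleft()
        p = self.procs[dst]
        if msg == MARKER:
            if p.recorded is None:
                self.record(dst)      # first marker seen: take the local snapshot
            p.recording.discard(src)  # this channel's recorded state is now final
        else:
            p.tokens += 1
            if src in p.recording:
                # Sent before the sender recorded, received after the receiver did.
                p.in_transit[src].append(msg)

    def snapshot_tokens(self):
        """Tokens in the recorded global state: local states plus channel contents."""
        return sum(p.recorded + sum(len(v) for v in p.in_transit.values())
                   for p in self.procs.values())


def demo():
    system = System({"p": 1, "q": 1})
    system.record("p")           # p initiates the snapshot
    system.send_token("q", "p")  # q sends before it has seen the marker
    system.deliver("p", "q")     # q receives the marker and records its state
    system.deliver("q", "p")     # p receives the token, recorded as in transit
    system.deliver("q", "p")     # p receives q's marker; recording is complete
    return system
```

In `demo()`, the recorded global state is p holding one token, q holding none, and one token in transit from q to p, so `snapshot_tokens()` accounts for both tokens even though no process ever observed the whole system at a single instant. A stable property such as “all tokens in a token ring have disappeared” could then be evaluated on the recorded state rather than on the live system.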
Index Terms
Distributed snapshots: determining global states of distributed systems