The process group approach to reliable distributed computing

Author:
Kenneth P. Birman

Communications of the ACM, Volume 36, Issue 12, Dec. 1993, pp. 37–53. https://doi.org/10.1145/163298.163303

Published: 01 December 1993

Published in
Communications of the ACM, Volume 36, Issue 12 (Dec. 1993), 100 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/163298
Editor: Jacques Cohen
Copyright © 1993 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
fault-tolerant process groups
message ordering
multicast communication