research-article

Fast crash recovery in RAMCloud

Authors:
Diego Ongaro, Stanford University
Stephen M. Rumble, Stanford University
Ryan Stutsman, Stanford University
John Ousterhout, Stanford University
Mendel Rosenblum, Stanford University

SOSP '11: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, October 2011, Pages 29–41
https://doi.org/10.1145/2043556.2043560

Published: 23 October 2011

ABSTRACT

RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. RAMCloud scatters backup data across hundreds or thousands of disks, and it harnesses hundreds of servers in parallel to reconstruct lost data. The system uses a log-structured approach for all its data, in DRAM as well as on disk: this provides high performance both during normal operation and during recovery. RAMCloud employs randomized techniques to manage the system in a scalable and decentralized fashion. In a 60-node cluster, RAMCloud recovers 35 GB of data from a failed server in 1.6 seconds. Our measurements suggest that the approach will scale to recover larger memory sizes (64 GB or more) in less time with larger clusters.
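
As a rough, back-of-the-envelope reading of the figures above (a sketch, not a result reported in the abstract), recovering 35 GB in 1.6 seconds implies an aggregate recovery rate of about 21.9 GB/s. The per-node figure below assumes, purely for illustration, that all 60 nodes share the work evenly; the function names are hypothetical.

```python
def aggregate_rate_gb_per_s(data_gb: float, seconds: float) -> float:
    """Aggregate recovery rate: total data recovered divided by elapsed time."""
    return data_gb / seconds


def per_node_rate_gb_per_s(data_gb: float, seconds: float, nodes: int) -> float:
    """Per-node share of the aggregate rate.

    Assumption (not stated in the abstract): every node contributes equally.
    """
    return aggregate_rate_gb_per_s(data_gb, seconds) / nodes


# Figures taken from the abstract: 35 GB, 1.6 seconds, 60-node cluster.
total = aggregate_rate_gb_per_s(35, 1.6)
share = per_node_rate_gb_per_s(35, 1.6, 60)

print(f"aggregate: {total:.1f} GB/s")  # roughly 21.9 GB/s
print(f"per node:  {share:.2f} GB/s")  # roughly 0.36 GB/s under the even-split assumption
```

The even-split figure is only an average: it says nothing about how the work is actually divided between reading backup data and reconstructing it.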

Index Terms

Fast crash recovery in RAMCloud
1. General and reference
  1. Cross-computing tools and techniques
    1. Measurement
2. Software and its engineering
  1. Software organization and properties

Published in
SOSP '11: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
October 2011
417 pages
ISBN:9781450309776
DOI:10.1145/2043556
General Chair: Ted Wobber, MSR Silicon Valley
Program Chair: Peter Druschel, MPI-SWS
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
crash recovery
main memory databases
scalability
storage systems
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate: 131 of 716 submissions, 18%