DOI: 10.1145/2043556.2043560
research-article

Fast crash recovery in RAMCloud

Published: 23 October 2011

ABSTRACT

RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. RAMCloud scatters backup data across hundreds or thousands of disks, and it harnesses hundreds of servers in parallel to reconstruct lost data. The system uses a log-structured approach for all its data, in DRAM as well as on disk: this provides high performance both during normal operation and during recovery. RAMCloud employs randomized techniques to manage the system in a scalable and decentralized fashion. In a 60-node cluster, RAMCloud recovers 35 GB of data from a failed server in 1.6 seconds. Our measurements suggest that the approach will scale to recover larger memory sizes (64 GB or more) in less time with larger clusters.
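
The recovery figures quoted in the abstract imply an aggregate recovery rate, which the short calculation below works out. It is only a back-of-the-envelope sketch: the function names are hypothetical, and the per-node figure assumes that recovery work is spread evenly over all 60 nodes, which the abstract does not state.

```python
# Figures quoted in the abstract.
DATA_RECOVERED_GB = 35.0   # data recovered from the failed server
RECOVERY_TIME_S = 1.6      # measured recovery time
CLUSTER_NODES = 60         # size of the cluster in the experiment


def aggregate_throughput_gb_per_s(data_gb: float, seconds: float) -> float:
    """Total data recovered per second across the whole cluster."""
    return data_gb / seconds


def per_node_throughput_gb_per_s(data_gb: float, seconds: float, nodes: int) -> float:
    """Average per-node rate.

    Assumption (not from the abstract): every node takes an equal share
    of the recovery work.
    """
    return aggregate_throughput_gb_per_s(data_gb, seconds) / nodes


aggregate = aggregate_throughput_gb_per_s(DATA_RECOVERED_GB, RECOVERY_TIME_S)
per_node = per_node_throughput_gb_per_s(DATA_RECOVERED_GB, RECOVERY_TIME_S, CLUSTER_NODES)

print(f"aggregate: {aggregate:.1f} GB/s")
print(f"per node:  {per_node:.2f} GB/s")
```

Under these assumptions the cluster as a whole recovers roughly 22 GB of data per second, or about 0.36 GB per second per node. This illustrates why spreading recovery over many servers and disks matters: the aggregate rate is a sum over all participating machines rather than the rate of any single one.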

        Published in

          SOSP '11: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
          October 2011
          417 pages
          ISBN: 9781450309776
          DOI: 10.1145/2043556

          Copyright © 2011 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 131 of 716 submissions, 18%
