DOI: 10.1145/3183713.3196912
research-article

Carousel: Low-Latency Transaction Processing for Globally-Distributed Data

Published: 27 May 2018

ABSTRACT

The trend towards global applications and services has created an increasing demand for transaction processing on globally-distributed data. Many database systems, such as Spanner and CockroachDB, support distributed transactions but require a large number of wide-area network round trips to commit each transaction and ensure the transaction's state is durably replicated across multiple datacenters. This can significantly increase transaction completion time, which leads developers to replace database-level transactions with their own error-prone application-level solutions.

This paper introduces Carousel, a distributed database system that provides low-latency transaction processing for multi-partition globally-distributed transactions. Carousel shortens transaction processing time by reducing the number of sequential wide-area network round trips required to commit a transaction and replicate its results while maintaining serializability. This is made possible in part by using information about a transaction's potential write set to enable transaction processing, including any necessary remote read operations, to overlap with 2PC and state replication. Carousel further reduces transaction completion time by introducing a consensus protocol that can perform state replication in parallel with 2PC. For a multi-partition 2-round Fixed-set Interactive (2FI) transaction, Carousel requires at most two wide-area network round trips to commit the transaction when there are no failures, and only one round trip in the common case if local replicas are available.
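The round-trip counts above translate directly into a rough latency estimate. The following is a minimal sketch, not part of Carousel itself: it assumes commit latency is dominated by sequential wide-area round trips and ignores local processing and intra-datacenter hops. The names `CommitPath`, `wan_latency_ms`, and `compare` are hypothetical, and the baseline round-trip count is a caller-supplied assumption, since the abstract gives no number for other systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitPath:
    """A commit path described only by its sequential WAN round trips."""

    name: str
    sequential_wan_round_trips: int

    def wan_latency_ms(self, wan_rtt_ms: float) -> float:
        # Only sequential round trips add up; steps that overlap
        # (for example 2PC and replication) share a round trip.
        return self.sequential_wan_round_trips * wan_rtt_ms


# Counts stated in the abstract for a multi-partition 2FI transaction
# when there are no failures.
CAROUSEL_NO_LOCAL_REPLICAS = CommitPath("carousel, no local replicas", 2)
CAROUSEL_LOCAL_REPLICAS = CommitPath("carousel, local replicas", 1)


def compare(wan_rtt_ms: float, assumed_baseline_round_trips: int) -> dict:
    """Estimate WAN latency per path for a given round-trip time.

    assumed_baseline_round_trips is an illustrative assumption chosen by
    the caller; it is not a figure taken from the paper.
    """
    baseline = CommitPath(
        "assumed sequential baseline", assumed_baseline_round_trips
    )
    paths = (baseline, CAROUSEL_NO_LOCAL_REPLICAS, CAROUSEL_LOCAL_REPLICAS)
    return {path.name: path.wan_latency_ms(wan_rtt_ms) for path in paths}


if __name__ == "__main__":
    for name, latency in compare(100.0, assumed_baseline_round_trips=4).items():
        print(f"{name}: {latency:.0f} ms")
```

Under this model the benefit scales linearly with the wide-area round-trip time, which is why removing even one sequential round trip matters for datacenters that are far apart.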


Published in

SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data
May 2018
1874 pages
ISBN: 9781450347037
DOI: 10.1145/3183713

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Acceptance Rates

SIGMOD '18 Paper Acceptance Rate: 90 of 461 submissions, 20%
Overall Acceptance Rate: 785 of 4,003 submissions, 20%
