ABSTRACT
Web applications have been shifting their storage systems from SQL to NoSQL systems. NoSQL systems scale well but drop many convenient SQL features, such as joins, secondary indexes, and/or transactions. We design, develop, and evaluate Yesquel, a system that provides performance and scalability comparable to NoSQL with all the features of a SQL relational system. Yesquel has a new architecture and a new distributed data structure, called YDBT, which Yesquel uses for storage, and which performs well under contention by many concurrent clients. We evaluate Yesquel and find that Yesquel performs almost as well as Redis, a popular NoSQL system, and much better than MySQL Cluster, while handling SQL queries at scale.
Index Terms
- Yesquel: scalable sql storage for web applications