ABSTRACT
Database Cracking is an appealing approach to adaptive indexing: on every range-selection query, the data is partitioned using the supplied predicates as pivots. The core of database cracking is, thus, pivoted partitioning. While pivoted partitioning, like scanning, requires only a single pass through the data, it tends to have much higher costs due to lower CPU efficiency. In this paper, we conduct an in-depth study of the reasons for the low CPU efficiency of pivoted partitioning. Based on the findings, we develop an optimized version with significantly higher (single-threaded) CPU efficiency. We also develop a number of multi-threaded implementations that are effectively bound by memory bandwidth. Combining all of these optimizations, we achieve an implementation that has costs close to or better than those of an ordinary scan on a variety of systems, ranging from low-end (cheaper than $300) desktop machines to high-end (above $60,000) servers.
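To make the notion of pivoted partitioning concrete, the following is a minimal sketch of the textbook, Hoare-style form of the operation, in which a range query cracks a column on its two predicates. It is an illustration only and not the optimized implementation developed in the paper; the function names `crack_in_two` and `range_select`, the half-open range `[low, high)`, and the use of a plain Python list as the column are all assumptions made for this example.

```python
def crack_in_two(column, pivot, lo=0, hi=None):
    """Partition column[lo:hi] in place so values < pivot precede the rest.

    Returns the crack position: the index of the first element >= pivot.
    (Hypothetical helper; a basic two-cursor partition, one pass over the data.)
    """
    if hi is None:
        hi = len(column)
    left, right = lo, hi - 1
    while left <= right:
        if column[left] < pivot:
            left += 1            # already on the correct (low) side
        elif column[right] >= pivot:
            right -= 1           # already on the correct (high) side
        else:
            # Both cursors point at misplaced values: swap them.
            column[left], column[right] = column[right], column[left]
            left += 1
            right -= 1
    return left


def range_select(column, low, high):
    """Answer low <= v < high by cracking on both predicates (illustrative)."""
    start = crack_in_two(column, low)            # everything before start is < low
    end = crack_in_two(column, high, lo=start)   # only the upper piece is touched
    return column[start:end]


column = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3]
result = range_select(column, 5, 13)
print(sorted(result))
```

After the call, the column is physically reorganized into three pieces (values below 5, values in the requested range, values of 13 and above), so a later query with nearby predicates only needs to partition a smaller piece. The data-dependent branches in the inner loop are the kind of per-element work that separates this operation from a plain scan.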
Index Terms
- Database cracking: fancy scan, not poor man's sort!