Database cracking: fancy scan, not poor man's sort!

Authors:
Holger Pirk (CWI, Amsterdam), Eleni Petraki (CWI, Amsterdam), Stratos Idreos (Harvard University), Stefan Manegold (CWI, Amsterdam), Martin Kersten (CWI, Amsterdam)

DaMoN '14: Proceedings of the Tenth International Workshop on Data Management on New Hardware, June 2014, Article No.: 4, Pages 1–8, https://doi.org/10.1145/2619228.2619232

Published: 23 June 2014

ABSTRACT

Database Cracking is an appealing approach to adaptive indexing: on every range-selection query, the data is partitioned using the supplied predicates as pivots. The core of database cracking is, thus, pivoted partitioning. While pivoted partitioning, like scanning, requires a single pass through the data, it tends to have much higher costs due to lower CPU efficiency. In this paper, we conduct an in-depth study of the reasons for the low CPU efficiency of pivoted partitioning. Based on the findings, we develop an optimized version with significantly higher (single-threaded) CPU efficiency. We also develop a number of multi-threaded implementations that are effectively bound by memory bandwidth. Combining all of these optimizations, we achieve an implementation that has costs close to or better than an ordinary scan on a variety of systems, ranging from low-end (cheaper than $300) desktop machines to high-end (above $60,000) servers.
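To make the central operation concrete, the following is a minimal sketch of pivoted partitioning as the abstract describes it: a range query's bounds are used as pivots, and the column is reorganized in place in a single pass per pivot. It is a plain textbook-style partitioning loop, not the optimized or multi-threaded implementation developed in the paper. The names `crack_in_two` and `crack_range`, the half-open range convention, and the sample data are assumptions made for illustration.

```python
def crack_in_two(column, pivot, lo=0, hi=None):
    """Partition column[lo:hi] in place around pivot (illustrative helper).

    Afterwards every value < pivot precedes every value >= pivot.
    Returns the index of the first value >= pivot (the crack position).
    """
    if hi is None:
        hi = len(column)
    left, right = lo, hi - 1
    while left <= right:
        if column[left] < pivot:
            left += 1            # already on the correct (low) side
        elif column[right] >= pivot:
            right -= 1           # already on the correct (high) side
        else:
            # column[left] >= pivot and column[right] < pivot: exchange them
            column[left], column[right] = column[right], column[left]
            left += 1
            right -= 1
    return left


def crack_range(column, low, high):
    """Answer the range selection low <= v < high by cracking on both bounds.

    Returns (start, end) such that column[start:end] holds the qualifying values.
    """
    start = crack_in_two(column, low)
    # Everything from start onwards is >= low, so only that part needs
    # to be partitioned on the upper bound.
    end = crack_in_two(column, high, lo=start)
    return start, end


# Hypothetical sample column; the query selects values in [9, 30).
column = [13, 4, 55, 9, 27, 1, 42, 18, 7, 30]
start, end = crack_range(column, 9, 30)
result = column[start:end]
```

After the call, `column` consists of three contiguous pieces (values below 9, values in [9, 30), values of 30 and above), so a later query whose bounds fall inside one piece only needs to partition that piece. The data-dependent branches in the inner loop are one plausible source of the lower CPU efficiency, relative to a plain scan, that the abstract refers to.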

Published in
DaMoN '14: Proceedings of the Tenth International Workshop on Data Management on New Hardware
June 2014
71 pages
ISBN: 9781450329712
DOI: 10.1145/2619228
Editors: Alfons Kemper (Technische Universität München), Ippokratis Pandis (Cloudera)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher: Association for Computing Machinery, New York, NY, United States
Acceptance Rates
Overall Acceptance Rate: 80 of 102 submissions, 78%