DOI: 10.1145/3299869.3320212
research-article

DuckDB: an Embeddable Analytical Database

Published: 25 June 2019

ABSTRACT

The immense popularity of SQLite shows that there is a need for unobtrusive in-process data management solutions. However, there is no such system yet geared towards analytical workloads. We demonstrate DuckDB, a novel data management system designed to execute analytical SQL queries while embedded in another process. In our demonstration, we pit DuckDB against other data management solutions to showcase its performance in the embedded analytics scenario. DuckDB is available as Open Source software under a permissive license.
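To make "embedded in another process" concrete, the following is a minimal sketch of the in-process model the abstract describes. It deliberately uses SQLite through Python's standard-library `sqlite3` module, not DuckDB, so it runs without extra dependencies; the `sales` table, its rows, and the function name `run_embedded_aggregate` are hypothetical and chosen only for illustration.

```python
import sqlite3


def run_embedded_aggregate():
    """Run a small aggregation query inside the host process.

    Assumption: SQLite stands in for an embedded engine here; this is
    not DuckDB's API. The table and data are made up for illustration.
    """
    # No server and no network connection: the database engine is a
    # library loaded into this Python process.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    con.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10.0), ("south", 5.0), ("north", 2.5)],
    )
    # An analytical-style query: group rows and aggregate a column.
    rows = con.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(run_embedded_aggregate())
```

The query results arrive as ordinary Python objects in the calling program, with no client protocol in between. That is the deployment model the abstract has in mind, applied there to analytical rather than transactional workloads.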

Published in

SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data
June 2019
2106 pages
ISBN: 9781450356435
DOI: 10.1145/3299869

Copyright © 2019 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SIGMOD '19 paper acceptance rate: 88 of 430 submissions, 20%
Overall acceptance rate: 785 of 4,003 submissions, 20%
