DOI: 10.1145/2901318.2901344
research-article

Data tiering in heterogeneous memory systems

Published: 18 April 2016

ABSTRACT

Memory-based data center applications require increasingly large memory capacities, but they are constrained by the difficulty of scaling DRAM and by its cost. Future systems are attempting to address these demands with heterogeneous memory architectures that couple DRAM with high-capacity, low-cost, but lower-performance non-volatile memories (NVM) such as PCM and RRAM. A key usage model intended for NVM is as cheaper, high-capacity volatile memory. Data center operators are bound to ask whether using NVM to replace the majority of DRAM leads to a large slowdown in their applications. It is crucial to answer this question because a large performance impact would impede the adoption of such systems.

This paper presents a thorough study of representative applications -- including a key-value store (MemC3), an in-memory database (VoltDB), and a graph analytics framework (GraphMat) -- on a platform that is capable of emulating a mix of memory technologies. We conclude that it is indeed possible to use a mix of a small amount of fast DRAM and large amounts of slower NVM without a proportional impact on an application's performance. The caveat is that this result can only be achieved through careful placement of data structures. The contribution of this paper is the design and implementation of a set of libraries and automatic tools that enable programmers to achieve optimal data placement with minimal effort on their part.
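To make the idea of data-structure placement concrete, here is a minimal sketch of one possible heuristic: rank structures by profiled accesses per byte and fill the DRAM budget greedily. This is an illustration only, not the paper's libraries or tools; the `Structure` type, the `place` function, and the access counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """A data structure as seen by a profiler (all fields are hypothetical inputs)."""
    name: str
    size_bytes: int   # must be > 0
    accesses: int     # profiled access count


def place(structures, dram_budget_bytes):
    """Greedy placement: the hottest structures per byte go to DRAM until the budget is spent.

    Returns a dict mapping structure name to "DRAM" or "NVM".
    This is an assumed heuristic, not the method described in the paper.
    """
    placement = {}
    remaining = dram_budget_bytes
    # Highest access density (accesses per byte) first.
    ranked = sorted(structures, key=lambda s: s.accesses / s.size_bytes, reverse=True)
    for s in ranked:
        if s.size_bytes <= remaining:
            placement[s.name] = "DRAM"
            remaining -= s.size_bytes
        else:
            placement[s.name] = "NVM"
    return placement


if __name__ == "__main__":
    demo = [
        Structure("index", size_bytes=10, accesses=1000),
        Structure("values", size_bytes=90, accesses=500),
        Structure("log", size_bytes=5, accesses=10),
    ]
    print(place(demo, dram_budget_bytes=25))
```

In the demo, the small, frequently accessed index fits in DRAM while the large value store falls through to NVM, which is the kind of split the abstract's "careful placement" refers to.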

With such guided placement and with DRAM constituting only 6% of the total memory footprint for GraphMat and 25% for VoltDB and MemC3 (the remaining memory is NVM with 4x higher latency and 8x lower bandwidth than DRAM), we show that our target applications demonstrate only a 13% to 40% slowdown. Without guided placement, these applications see, in the worst case, 1.5x to 5.9x slowdown on the same configuration. Based on a realistic assumption that NVM will be 5x cheaper (per bit) than DRAM, this hybrid solution also results in 2x to 2.8x better performance/$ than a DRAM-only system.
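The cost side of this argument can be sketched with simple arithmetic. The snippet below is a back-of-the-envelope model, not the paper's cost analysis. It assumes that only memory cost counts, and that a slowdown of s% means run time grows by a factor of (1 + s/100). The abstract does not say which slowdown belongs to which application, so the results need not match the reported 2x to 2.8x. The function names are hypothetical.

```python
def relative_memory_cost(dram_fraction, nvm_price_ratio=1 / 5):
    """Memory cost of a hybrid system relative to an all-DRAM system of equal capacity.

    dram_fraction: share of the footprint held in DRAM (0.06 or 0.25 in the abstract).
    nvm_price_ratio: NVM price per bit relative to DRAM (1/5 per the abstract's assumption).
    """
    return dram_fraction + (1.0 - dram_fraction) * nvm_price_ratio


def perf_per_dollar_gain(dram_fraction, slowdown, nvm_price_ratio=1 / 5):
    """Performance per unit of memory cost, relative to an all-DRAM system.

    slowdown: fractional extra run time, e.g. 0.13 for a 13% slowdown (assumed meaning).
    """
    relative_perf = 1.0 / (1.0 + slowdown)
    return relative_perf / relative_memory_cost(dram_fraction, nvm_price_ratio)


if __name__ == "__main__":
    for frac in (0.06, 0.25):
        print(f"DRAM share {frac:.0%}: memory cost = {relative_memory_cost(frac):.3f} of all-DRAM")
        for slow in (0.13, 0.40):
            gain = perf_per_dollar_gain(frac, slow)
            print(f"  slowdown {slow:.0%}: performance per memory dollar = {gain:.2f}x")
```

Under these assumptions, a 25% DRAM share costs 0.4 of an all-DRAM memory system and a 6% share costs about 0.25, so even a 40% slowdown leaves the hybrid ahead on performance per memory dollar.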


  • Published in

    EuroSys '16: Proceedings of the Eleventh European Conference on Computer Systems
    April 2016
    605 pages
    ISBN: 9781450342407
    DOI: 10.1145/2901318

    Copyright © 2016 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    EuroSys '16 paper acceptance rate: 38 of 180 submissions, 21%. Overall acceptance rate: 241 of 1,308 submissions, 18%.
