Page migration support for disaggregated non-volatile memories

Authors:
Vamsee Reddy Kommareddy (University of Central Florida)
Simon David Hammond (Sandia National Labs)
Clayton Hughes (Sandia National Labs)
Ahmad Samih (Intel Corporation)
Amro Awad (University of Central Florida)
MEMSYS '19: Proceedings of the International Symposium on Memory Systems, September 2019, Pages 417–427
https://doi.org/10.1145/3357526.3357543

Published: 30 September 2019

ABSTRACT

As demand from memory-intensive applications continues to grow, the memory capacity of each computing node is expected to grow at a similar pace. In high-performance computing (HPC) systems, the memory capacity per compute node is determined by the most demanding application likely to run on the system, so the average capacity per node in future HPC systems is expected to grow significantly. However, because HPC systems run many applications with different capacity demands, and because each node's memory modules are effectively private to that node, a large percentage of the overall memory capacity will likely be underutilized. Thus, as HPC systems move towards the exascale era, better memory utilization is strongly desired. Moreover, upgrading a memory system requires significant effort. Fortunately, disaggregated memory systems promise better utilization by defining regions of global memory, typically referred to as memory blades, that can be accessed by all computing nodes in the system.

Disaggregated memory systems are expected to be built using dense, power-efficient memory technologies. Thus, emerging non-volatile memories (NVMs) are positioning themselves as the main building blocks for such systems. However, NVMs are slower than DRAM. Therefore, each computing node is expected to have a small local memory based on either HBM or DRAM, whereas a large shared NVM would be accessible by all nodes. Managing such a system with global and local memory requires a novel hardware/software co-design to initiate page migration between global and local memory, maximizing performance while enabling access to a huge shared memory. In this paper, we provide support for page migration and investigate these memory management aspects along with the major system-level aspects that can affect design decisions in disaggregated NVM systems.
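To make the local/global arrangement described above concrete, the following is a minimal illustrative sketch of one possible migration policy. It is not the mechanism proposed in the paper. The class name `TwoTierMemory`, the access-count threshold, and the LRU eviction rule are all assumptions chosen for illustration: a page served from global NVM is promoted to local memory once it has been accessed `hot_threshold` times, and the least recently used local page is sent back to global memory when local memory is full.

```python
from collections import Counter, OrderedDict


class TwoTierMemory:
    """Toy model (hypothetical): small local memory in front of a large shared NVM."""

    def __init__(self, local_capacity, hot_threshold):
        if local_capacity < 1 or hot_threshold < 1:
            raise ValueError("local_capacity and hot_threshold must be >= 1")
        self.local_capacity = local_capacity  # pages that fit in local DRAM/HBM
        self.hot_threshold = hot_threshold    # global accesses before promotion
        self.local = OrderedDict()            # resident pages, oldest access first
        self.global_hits = Counter()          # access counts for pages in global NVM
        self.migrations = 0                   # number of promotions performed

    def access(self, page):
        """Record one access and return the tier that served it."""
        if page in self.local:
            self.local.move_to_end(page)      # mark as most recently used
            return "local"
        self.global_hits[page] += 1
        if self.global_hits[page] >= self.hot_threshold:
            self._promote(page)               # later accesses will hit local memory
        return "global"

    def _promote(self, page):
        """Move a hot page into local memory, evicting the LRU page if needed."""
        if len(self.local) >= self.local_capacity:
            victim, _ = self.local.popitem(last=False)  # least recently used page
            self.global_hits[victim] = 0      # victim restarts as a cold global page
        self.local[page] = None
        del self.global_hits[page]
        self.migrations += 1
```

Calling `access()` repeatedly on the same page returns "global" until the threshold is reached and "local" afterwards. A real system would also have to account for costs this sketch ignores, such as the data copy itself and keeping address translations consistent across nodes.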

Index Terms

Computer systems organization → Architectures → Other architectures → Heterogeneous (hybrid) systems

Published in

MEMSYS '19: Proceedings of the International Symposium on Memory Systems
September 2019
517 pages
ISBN:9781450372060
DOI:10.1145/3357526

Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article