ABSTRACT
Recent architectural trends (cheap, fast solid-state storage, inexpensive DRAM, and multi-core CPUs) provide an opportunity to rethink the interface between applications and persistent storage. To leverage these advances, we propose a new system architecture called Hathi that provides an in-memory transactional heap made persistent using high-speed flash drives. With Hathi, programmers can make consistent concurrent updates to in-memory data structures that survive system failures.
Hathi focuses on three major design goals: ACID semantics, a simple programming interface, and fine-grained programmer control. Hathi relies on software transactional memory to provide a simple concurrent interface to in-memory data structures, and extends it with persistent logs and checkpoints to add durability.
To reduce the cost of durability, Hathi uses two main techniques. First, it provides split-phase and partitioned commit interfaces that allow programmers to overlap commit I/O with computation and to avoid unnecessary synchronization. Second, it uses partitioned logging, which reduces contention on in-memory log buffers and exploits internal SSD parallelism. We find that our implementation of Hathi can achieve 1.25 million transactions per second with a single SSD.
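The two techniques can be illustrated with a minimal sketch. It is not Hathi's actual interface: the class `PartitionedLog` and the methods `commit_begin` and `commit_wait` are hypothetical names chosen for illustration, Python threads stand in for the real I/O path, and the transactional memory layer is omitted entirely. The sketch only shows the general shape of a split-phase commit (start the log write, keep computing, wait later) over a log divided into independently locked partitions.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


class PartitionedLog:
    """Hypothetical append-only log split into independent partitions."""

    def __init__(self, directory, num_partitions=4):
        # One file and one lock per partition, so writers to different
        # partitions never contend with each other.
        self._files = [
            open(os.path.join(directory, f"log.{i}"), "ab")
            for i in range(num_partitions)
        ]
        self._locks = [threading.Lock() for _ in range(num_partitions)]
        self._pool = ThreadPoolExecutor(max_workers=num_partitions)

    def _flush(self, part, record):
        # Append the record and force it to stable storage.
        with self._locks[part]:
            f = self._files[part]
            f.write(record + b"\n")
            f.flush()
            os.fsync(f.fileno())
        return part

    def commit_begin(self, part, record):
        """Phase 1: start the log write and return a ticket immediately."""
        return self._pool.submit(self._flush, part, record)

    @staticmethod
    def commit_wait(ticket):
        """Phase 2: block until the record behind the ticket is durable."""
        return ticket.result()

    def close(self):
        self._pool.shutdown(wait=True)
        for f in self._files:
            f.close()


def demo():
    with tempfile.TemporaryDirectory() as d:
        log = PartitionedLog(d, num_partitions=2)
        # Two commits on different partitions proceed independently.
        t0 = log.commit_begin(0, b"set x=1")
        t1 = log.commit_begin(1, b"set y=2")
        sum(range(1000))  # application work overlapped with commit I/O
        parts = [log.commit_wait(t0), log.commit_wait(t1)]
        log.close()
        with open(os.path.join(d, "log.0"), "rb") as f:
            r0 = f.read()
        with open(os.path.join(d, "log.1"), "rb") as f:
            r1 = f.read()
        return parts, r0, r1
```

In this sketch the caller decides when to pay for durability: work placed between `commit_begin` and `commit_wait` runs while the log write is in flight, and commits routed to different partitions do not serialize on a shared lock.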