Abstract
A key-value store (KVS), such as memcached or Redis, is widely used as a caching layer to augment slower persistent backend storage in data centers. A DRAM-based KVS provides fast key-value access, but its scalability is limited by the cost, power, and space needed by the machine cluster to support a large amount of DRAM. This paper offers a 10X to 100X cheaper solution based on flash storage and hardware accelerators. In BlueCache, key-value pairs are stored in flash storage, and all KVS operations, including the flash controller, are implemented directly in hardware. Furthermore, BlueCache includes a fast interconnect between flash controllers to provide a scalable solution. We show that BlueCache has 4.18X higher throughput and consumes 25X less power than a flash-backed KVS software implementation on x86 servers. We further show that BlueCache can outperform a DRAM-based KVS when the latter has more than 7.4% misses for a read-intensive application. BlueCache is an attractive solution for both rack-level appliances and data-center-scale key-value caches.
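The miss-rate threshold mentioned above can be read as a break-even point: a smaller DRAM cache answers hits quickly but pays a backend penalty on each miss, while a much larger flash cache is slower per hit but misses far less often. The sketch below illustrates that reasoning with a simple average-latency model. It is not the paper's evaluation method, it assumes for simplicity that the flash cache never misses, and every latency value and function name in it is a hypothetical placeholder, so it is not expected to reproduce the 7.4% figure.

```python
def effective_latency_us(hit_latency_us, miss_rate, backend_penalty_us):
    """Average access latency of a cache whose misses fall through to the backend."""
    return hit_latency_us + miss_rate * backend_penalty_us


def break_even_miss_rate(dram_hit_us, flash_hit_us, backend_penalty_us):
    """DRAM-cache miss rate above which an always-hitting flash cache is faster on average.

    Derived by solving dram_hit + m * penalty = flash_hit for m.
    """
    return (flash_hit_us - dram_hit_us) / backend_penalty_us


# Hypothetical placeholder latencies in microseconds (NOT taken from the paper).
DRAM_HIT_US = 100.0
FLASH_HIT_US = 200.0
BACKEND_PENALTY_US = 2000.0

threshold = break_even_miss_rate(DRAM_HIT_US, FLASH_HIT_US, BACKEND_PENALTY_US)
dram_at_threshold = effective_latency_us(DRAM_HIT_US, threshold, BACKEND_PENALTY_US)

if __name__ == "__main__":
    print(f"break-even miss rate: {threshold:.1%}")
    print(f"DRAM cache average latency at break-even: {dram_at_threshold:.1f} us")
```

Under this model the break-even point depends only on the gap between the two hit latencies and the cost of a backend access, which is why a slower but far larger cache can win once misses become frequent enough.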