Abstract
Solid-state drives (SSDs) have changed the landscape of storage systems and present a promising storage solution for data-intensive applications due to their low latency, high bandwidth, and low power consumption compared to traditional hard disk drives. SSDs achieve these desirable characteristics using internal parallelism (parallel access to multiple internal flash memory chips) and a Flash Translation Layer (FTL) that determines where data are stored on those chips so that they do not wear out prematurely. However, current state-of-the-art cache-based FTLs such as the Demand-based Flash Translation Layer (DFTL) do not allow IO schedulers to take full advantage of internal parallelism, because they impose a tight coupling between logical-to-physical address translation and data access. To address this limitation, we introduce a new FTL design called Parallel-DFTL that works with DFTL to decouple address translation operations from data accesses. Parallel-DFTL separates address translation and data access operations into different queues, allowing the SSD to use concurrent flash accesses for both types of operations. We also present a Parallel-LRU cache replacement algorithm to improve the concurrency of address translation operations. To compare Parallel-DFTL against existing FTL approaches, we present a Parallel-DFTL performance model and compare its predictions against those for DFTL and an ideal page-mapping approach. We also implemented the Parallel-DFTL approach in an SSD simulator using real device parameters, and used trace-driven simulation to evaluate Parallel-DFTL's efficacy. Our evaluation results show that Parallel-DFTL improved overall performance by up to 32% for the real IO workloads we tested, and by up to two orders of magnitude for synthetic test workloads. We also found that Parallel-DFTL achieves reasonable performance with a very small cache size and that it provides the greatest benefit for workloads with large request sizes or high write ratios.
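The decoupling idea described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's implementation or its performance model: all names (`MappingCache`, `split_into_queues`, `coupled_rounds`, `decoupled_rounds`) are hypothetical, the cache uses plain LRU as a stand-in rather than the Parallel-LRU algorithm, the coupled case is crudely assumed to issue one flash operation at a time, and each "round" is assumed to serve up to `n_chips` flash operations concurrently.

```python
from collections import OrderedDict, deque


class MappingCache:
    """Toy LRU cache of logical-to-physical page mappings (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, lpn):
        """Return True on a hit (and refresh recency), False on a miss."""
        if lpn in self.entries:
            self.entries.move_to_end(lpn)
            return True
        return False

    def insert(self, lpn, ppn):
        """Add a mapping, evicting the least recently used entry if full."""
        self.entries[lpn] = ppn
        self.entries.move_to_end(lpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)


def split_into_queues(lpns, cache):
    """Place translation operations and data accesses in separate queues."""
    translation_queue = deque()
    data_queue = deque()
    for lpn in lpns:
        if not cache.lookup(lpn):
            translation_queue.append(lpn)  # mapping must be read from flash
        data_queue.append(lpn)             # every request needs a data access
    return translation_queue, data_queue


def rounds_needed(n_ops, n_chips):
    """Rounds to serve n_ops when n_chips operations can run per round."""
    return -(-n_ops // n_chips)  # ceiling division


def coupled_rounds(lpns, cache):
    """Assumed coupled case: translation, then data, one operation per round."""
    total = 0
    for lpn in lpns:
        if not cache.lookup(lpn):
            total += 1                 # fetch the mapping from flash
            cache.insert(lpn, lpn)     # identity mapping, for illustration only
        total += 1                     # access the data page
    return total


def decoupled_rounds(lpns, cache, n_chips):
    """Assumed decoupled case: drain each queue using all chips in parallel."""
    translation_queue, data_queue = split_into_queues(lpns, cache)
    total = rounds_needed(len(translation_queue), n_chips)
    for lpn in translation_queue:
        cache.insert(lpn, lpn)
    total += rounds_needed(len(data_queue), n_chips)
    return total


if __name__ == "__main__":
    batch = list(range(8))  # eight distinct logical pages, cold cache
    print(coupled_rounds(batch, MappingCache(4)))       # 16
    print(decoupled_rounds(batch, MappingCache(4), 4))  # 4
```

With a cold cache, eight distinct pages, and four chips, the toy model needs 16 rounds in the coupled case and 4 in the decoupled case. These figures only illustrate why separating the two kinds of operations exposes more concurrency; they say nothing about the performance results reported in the abstract.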