
Exploiting Internal Parallelism for Address Translation in Solid-State Drives

Published: 15 December 2018

Abstract

Solid-state Drives (SSDs) have changed the landscape of storage systems and present a promising storage solution for data-intensive applications due to their low latency, high bandwidth, and low power consumption compared to traditional hard disk drives. SSDs achieve these desirable characteristics using internal parallelism—parallel access to multiple internal flash memory chips—and a Flash Translation Layer (FTL) that determines where data are stored on those chips so that they do not wear out prematurely. However, current state-of-the-art cache-based FTLs like the Demand-based Flash Translation Layer (DFTL) do not allow IO schedulers to take full advantage of internal parallelism, because they impose a tight coupling between the logical-to-physical address translation and the data access. To address this limitation, we introduce a new FTL design called Parallel-DFTL that works with the DFTL to decouple address translation operations from data accesses. Parallel-DFTL separates address translation and data access operations into different queues, allowing the SSD to use concurrent flash accesses for both types of operations. We also present a Parallel-LRU cache replacement algorithm to improve the concurrency of address translation operations. To compare Parallel-DFTL against existing FTL approaches, we present a Parallel-DFTL performance model and compare its predictions against those for DFTL and an ideal page-mapping approach. We also implemented the Parallel-DFTL approach in an SSD simulator using real device parameters, and used trace-driven simulation to evaluate Parallel-DFTL’s efficacy. Our evaluation results show that Parallel-DFTL improved the overall performance by up to 32% for the real IO workloads we tested, and by up to two orders of magnitude with synthetic test workloads. 
We also found that Parallel-DFTL is able to achieve reasonable performance with a very small cache size and that it provides the best benefit for those workloads with large request size or with high write ratio.
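
The decoupling described in the abstract can be illustrated with a small toy model. The sketch below is not the authors' implementation: the class and function names, the entries-per-translation-page constant, and the plain LRU policy (standing in for Parallel-LRU, whose details the abstract does not give) are all assumptions made for illustration. It shows only the basic idea of splitting one host request into a queue of translation-page operations and a separate queue of data-page operations, so that a scheduler could dispatch each queue across flash chips independently.

```python
from collections import OrderedDict

# Assumed toy value: mapping entries stored per flash translation page.
ENTRIES_PER_TRANSLATION_PAGE = 4


class MappingCache:
    """Toy LRU cache of logical-to-physical mapping entries (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # logical page number -> dirty flag

    def access(self, lpn, is_write):
        """Return the translation-page operations that this access triggers."""
        ops = []
        if lpn in self.entries:
            # Cache hit: refresh recency; a write marks the entry dirty.
            self.entries.move_to_end(lpn)
            self.entries[lpn] = self.entries[lpn] or is_write
            return ops
        if len(self.entries) >= self.capacity:
            # Cache full: evict the least recently used entry.
            victim, dirty = self.entries.popitem(last=False)
            if dirty:
                ops.append(("write_back", victim // ENTRIES_PER_TRANSLATION_PAGE))
        # Cache miss: the mapping entry must be loaded from flash.
        ops.append(("load", lpn // ENTRIES_PER_TRANSLATION_PAGE))
        self.entries[lpn] = is_write
        return ops


def split_request(cache, first_lpn, num_pages, is_write):
    """Split one host request into a translation queue and a data queue."""
    translation_queue = []
    data_queue = []
    for lpn in range(first_lpn, first_lpn + num_pages):
        translation_queue.extend(cache.access(lpn, is_write))
        data_queue.append(("write" if is_write else "read", lpn))
    return translation_queue, data_queue
```

In this toy model, a three-page write against a two-entry cache yields four translation operations and three data operations. The point of keeping them in separate queues is that neither set has to be issued strictly one after the other per page, which is the coupling the abstract attributes to DFTL.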


    • Published in

      ACM Transactions on Storage, Volume 14, Issue 4
      Special Section on Systor 2017 and Regular Papers
      November 2018
      175 pages
      ISSN: 1553-3077
      EISSN: 1553-3093
      DOI: 10.1145/3297750
      Editor: Sam H. Noh

      Copyright © 2018 ACM

      Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 15 December 2018
      • Accepted: 1 July 2018
      • Revised: 1 April 2018
      • Received: 1 February 2017

      Qualifiers

      • research-article
      • Research
      • Refereed
