Exploiting Data Similarity to Improve SSD Read Performance
Abstract
1. Introduction
- We conduct preliminary experiments that reveal low and skewed utilization across channels, indicating that bus bandwidth within the SSD is underused. We also analyze the data redundancy ratio of various workloads, which highlights an opportunity to exploit redundancy in data placement. These findings motivate us to leverage data duplication to improve performance.
- We reveal the limitations of the traditional static page allocation scheme and dynamic page allocation scheme. To address these limitations, we propose a new approach called Data Similarity aware Flash Translation Layer (DS-FTL). DS-FTL consists of a content-aware page allocation scheme and a multi-path read scheme. This approach maximizes channel-level and chip-level parallelism while preventing read stalls caused by bus-sharing mechanisms.
- We conduct a series of experiments and validate the effectiveness of our scheme. The results demonstrate its superiority compared to state-of-the-art approaches. On average, DS-FTL improves the channel utilization ratio and reduces read latency by 35.3%.
2. Background and Research Motivation
2.1. SSD Architecture
2.2. Page Allocation Scheme
2.3. Motivation
2.3.1. Uneven Utilization of Channels and Chips
2.3.2. Data Similarity in the Workload
2.3.3. Key Idea of the Proposed Solution
3. Related Work
3.1. Page Allocation Schemes
Problem | Literature |
---|---|
Adopting diverse page allocation strategies to enhance the parallelism of SSD read and write operations | Shin et al. [12], Hu et al. [13], Jung et al. [11], Wu et al. [20], Chang et al. [17], Reddy et al. [18], Paik et al. [19], Sun et al. [21] |
Implementing data deduplication in SSD to reduce the amount of data being written | Chen et al. [4], Gupta et al. [15], Kim et al. [5], Wu et al. [22], Chen et al. [23], Ni et al. [6] |
Employing path conflict resolution in SSD to improve the parallelism of read and write operations | Schuetz et al. [24], Gillingham et al. [25], Kim et al. [3], Tavakkol et al. [26], Kim et al. [14], Nadig et al. [27] |
3.2. Data Deduplication in SSD
3.3. Path Conflict Resolution in SSD
4. Design
4.1. Design Overview
- How to integrate content similarity comparison into the existing write procedure inside the SSD with little or negligible overhead.
- How to exploit physical pages that hold the same content in different parallel units to improve read performance; that is, how to redesign mapping management and the I/O scheduler inside the SSD to support multi-path reads.
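The second question can be made concrete with a minimal sketch. It is not the paper's implementation: the `MultiPathMap` class, the `(channel, chip, page)` address tuple, and the per-channel queue depth used as a measure of bus busyness are all assumptions introduced here for illustration. The idea shown is only that one logical page may map to several physical pages with identical content, and a read is steered to the copy on the least-loaded channel.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical physical address: (channel, chip, page). Not the paper's layout.
PhysAddr = Tuple[int, int, int]


@dataclass
class MultiPathMap:
    """Toy mapping table: one LPN may point at several identical physical pages."""
    num_channels: int
    # LPN -> every physical page known to hold the same content
    replicas: Dict[int, List[PhysAddr]] = field(default_factory=dict)
    # Pending requests per channel, an assumed stand-in for bus busyness
    queue_depth: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.queue_depth:
            self.queue_depth = [0] * self.num_channels

    def add_copy(self, lpn: int, addr: PhysAddr) -> None:
        """Record that `addr` holds the content of logical page `lpn`."""
        self.replicas.setdefault(lpn, []).append(addr)

    def pick_read_path(self, lpn: int) -> PhysAddr:
        """Choose the copy whose channel currently has the shortest queue."""
        candidates = self.replicas[lpn]
        best = min(candidates, key=lambda a: self.queue_depth[a[0]])
        self.queue_depth[best[0]] += 1  # the read now occupies that channel
        return best


# Usage: LPN 7 has identical content on channels 0 and 2; channel 0 is busy.
m = MultiPathMap(num_channels=4)
m.add_copy(7, (0, 0, 10))
m.add_copy(7, (2, 1, 33))
m.queue_depth[0] = 3
chosen = m.pick_read_path(7)
print(chosen)
```

With a single copy per logical page the selection degenerates to an ordinary one-to-one lookup, so the extra path only matters when duplicate content exists on more than one channel.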
4.2. Content-Aware Page Allocation Scheme
4.2.1. Write Operation Procedure
4.2.2. Data Content Comparison
4.3. Multi-Path Read Scheme
4.3.1. Address Mapping Management
4.3.2. I/O Scheduling Optimization
4.4. Analysis and Limitation Discussions
5. Experiment
5.1. Experimental Environment Setup
5.1.1. SSD Simulator
5.1.2. SSD Configurations
5.1.3. Workload Characteristics
5.1.4. Comparison with Other Schemes
- Static allocation [12]. A simple, effective, and widely used page allocation scheme. It derives the target parallel unit from the LPN, which spreads user data evenly across the parallel units.
- Dynamic allocation [11]. This scheme improves write performance by allocating physical pages to write requests in a round-robin manner, at the cost of read parallelism.
- HIPA [21]. This scheme uses a load prediction mechanism to monitor chip-level load in real time. It then selects the page allocation scheme according to the current chip-level load, improving SSD read and write parallelism.
- DS-FTL. We implement DS-FTL in the SSD simulator to optimize read performance by reducing access conflicts.
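To illustrate the difference between the first two baselines, the following sketch contrasts an LPN-derived placement with a round-robin one. It is a simplification under stated assumptions: the channel-first striping order, the function `static_allocate`, and the class `DynamicAllocator` are hypothetical and are not taken from the cited works; only the channel and chip counts come from the configuration table.

```python
CHANNELS = 8           # from the SSD configuration table
CHIPS_PER_CHANNEL = 2  # from the SSD configuration table


def static_allocate(lpn: int) -> tuple:
    """Static placement: (channel, chip) is a pure function of the LPN.

    Channel-first striping is an assumption made for this sketch.
    """
    channel = lpn % CHANNELS
    chip = (lpn // CHANNELS) % CHIPS_PER_CHANNEL
    return channel, chip


class DynamicAllocator:
    """Dynamic placement: units are handed out round-robin in write order."""

    def __init__(self) -> None:
        self._next = 0

    def allocate(self, lpn: int) -> tuple:
        # The LPN is ignored on purpose: placement depends only on arrival order.
        unit = self._next % (CHANNELS * CHIPS_PER_CHANNEL)
        self._next += 1
        return unit % CHANNELS, unit // CHANNELS


# Four writes whose LPNs are a multiple of the channel count apart.
lpns = [0, 8, 16, 24]
static_channels = [static_allocate(l)[0] for l in lpns]
dyn = DynamicAllocator()
dynamic_channels = [dyn.allocate(l)[0] for l in lpns]
print(static_channels)   # all four writes collide on one channel
print(dynamic_channels)  # the same writes are spread over four channels
```

The static mapping lets a later read locate its channel from the address alone, whereas the round-robin mapping balances writes but leaves read placement dependent on the order in which the data happened to be written.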
Trace | Total I/Os | Write I/Os | Write Ratio | Average Request Size (KB) |
---|---|---|---|---|
home1 | 8,973,201 | 8,882,831 | 0.990 | 16.06 |
home2 | 5,390,703 | 4,901,091 | 0.909 | 23.47 |
home3 | 918,010 | 908,863 | 0.990 | 34.16 |
home4 | 2,491,157 | 2,354,060 | 0.945 | 7.99 |
online | 5,700,499 | 4,211,806 | 0.739 | 7.99 |
webmail | 7,795,815 | 6,381,985 | 0.819 | 7.99 |
webresearch | 2,414,031 | 2,414,008 | 0.999 | 24.57 |
webusers | 5,697,272 | 5,127,101 | 0.900 | 12.31 |
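As a quick reading aid for the table above, the sketch below recomputes the read count and write ratio from the Total I/Os and Write I/Os columns for three of the traces. The helper name `summarize` is introduced here for illustration; the numbers are copied from the table.

```python
# (total I/Os, write I/Os) copied from the workload table
traces = {
    "home1": (8_973_201, 8_882_831),
    "online": (5_700_499, 4_211_806),
    "webresearch": (2_414_031, 2_414_008),
}


def summarize(total: int, writes: int) -> tuple:
    """Return (read I/Os, write ratio) for one trace."""
    return total - writes, writes / total


for name, (total, writes) in traces.items():
    reads, ratio = summarize(total, writes)
    print(f"{name}: reads={reads}, write ratio={ratio:.3f}")
```

The read counts derived this way show how few read requests some traces contain, which is worth keeping in mind when interpreting read-latency results per trace.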
5.2. Results and Analysis
5.2.1. Read and Write Latency Comparison
5.2.2. Distribution Degree of Redundancy Data
5.2.3. Proportion of Read and Write Requests That Are Redirected
5.2.4. Comparison of Channel Utilization
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Gao, C.; Shi, L.; Zhao, M.; Xue, C.J.; Wu, K.; Sha, E.H.M. Exploiting parallelism in I/O scheduling for access conflict minimization in flash-based solid state drives. In Proceedings of the 30th Symposium on Mass Storage Systems and Technologies (MSST), Santa Clara, CA, USA, 2–6 June 2014; pp. 1–11.
2. Wu, S.; Du, C.; Zhang, W.; Mao, B.; Jiang, H. Deduphr: Exploiting content locality to alleviate read/write interference in deduplication-based flash storage. IEEE Trans. Comput. 2021, 71, 1332–1343.
3. Kim, J.; Kang, S.; Park, Y.; Kim, J. Networked SSD: Flash Memory Interconnection Network for High-Bandwidth SSD. In Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), Chicago, IL, USA, 1–5 October 2022; pp. 388–403.
4. Chen, F.; Luo, T.; Zhang, X. CAFTL: A Content-Aware Flash Translation Layer Enhancing the Lifespan of Flash Memory based Solid State Drives. In Proceedings of the 9th USENIX Conference on File and Storage Technologies, San Jose, CA, USA, 15–17 February 2011.
5. Kim, J.; Lee, C.; Lee, S.; Son, I.; Choi, J.; Yoon, S.; Lee, H.-u.; Kang, S.; Won, Y.; Cha, J. Deduplication in SSDs: Model and quantitative analysis. In Proceedings of the IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), Pacific Grove, CA, USA, 16–20 April 2012; pp. 1–12.
6. Ni, F.; Wu, X.; Li, W.; Wang, L.; Jiang, S. WOJ: Enabling Write-Once Full-data Journaling in SSDs by using weak-hashing-based deduplication. Perform. Eval. 2018, 127–128, 56–69.
7. Tavakkol, A.; Sadrosadati, M.; Ghose, S.; Kim, J.; Mutlu, O. FLIN: Enabling Fairness and Enhancing Performance in Modern NVMe Solid State Drives. In Proceedings of the ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, USA, 1–6 June 2018.
8. Tavakkol, A.; Gómez-Luna, J.; Sadrosadati, M.; Ghose, S.; Mutlu, O. MQSim: A framework for enabling realistic studies of modern multi-queue SSD devices. In Proceedings of the FAST, Oakland, CA, USA, 12–15 February 2018.
9. Tavakkol, A.; Arjomand, M.; Sarbazi-Azad, H. Unleashing the potentials of dynamism for page allocation strategies in SSDs. In Proceedings of the SIGMETRICS’14: The 2014 ACM International Conference on Measurement and Modeling of Computer Systems, Austin, TX, USA, 16–20 June 2014; pp. 551–552.
10. Tavakkol, A.; Mehrvarzy, P.; Arjomand, M.; Sarbazi-Azad, H. Performance evaluation of dynamic page allocation strategies in SSDs. ACM Trans. Model. Perform. Eval. Comput. Syst. 2016, 1, 1–33.
11. Jung, M.; Kandemir, M. An evaluation of different page allocation strategies on high-speed SSDs. In Proceedings of the HotStorage’12: Proceedings of the 4th USENIX Conference on Hot Topics in Storage and File Systems, Berkeley, CA, USA, 13–14 June 2012.
12. Shin, J.Y.; Xia, Z.L.; Xu, N.Y.; Gao, R.; Cai, X.F.; Maeng, S.; Hsu, F.H. FTL design exploration in reconfigurable high-performance SSD for server applications. In Proceedings of the ICS ’09: Proceedings of the 23rd International Conference on Supercomputing, Yorktown Heights, NY, USA, 8–12 June 2009.
13. Hu, Y.; Jiang, H.; Feng, D.; Tian, L.; Zhang, S.P. Performance impact and interplay of SSD parallelism through advanced commands, allocation strategy and data granularity. In Proceedings of the ICS ’11: Proceedings of the International Conference on Supercomputing, Tucson, AZ, USA, 31 May–4 June 2011.
14. Kim, J.; Jung, M.; Kim, J. Decoupled SSD: Reducing Data Movement on NAND-Based Flash SSD. IEEE Comput. Archit. Lett. 2021, 20, 150–153.
15. Gupta, A.; Pisolkar, R.; Urgaonkar, B.; Sivasubramaniam, A. Leveraging Value Locality in Optimizing NAND Flash-based SSDs. In Proceedings of the 9th USENIX Conference on File and Storage Technologies, San Jose, CA, USA, 15–17 February 2011.
16. Hu, Y.; Jiang, H.; Feng, D.; Tian, L.; Luo, H.; Ren, C. Exploring and Exploiting the Multilevel Parallelism Inside SSDs for Improved Performance and Endurance. IEEE Trans. Comput. 2013, 62, 1141–1155.
17. Chang, L.P.; Kuo, T.W. An adaptive striping architecture for flash memory storage systems of embedded systems. In Proceedings of the 8th IEEE Real-Time and Embedded Technology and Applications Symposium, San Jose, CA, USA, 27 September 2002.
18. Reddy, B.R.; Narendra, C.; Prakash, T.; Kavirayani, V.R.C.; Singh, P.N.P. Method and an Apparatus for Analyzing Data to Facilitate Data Allocation in a Storage Device. U.S. Patent 9,582,199, 28 February 2017.
19. Paik, J.Y.; Chung, T.S.; Cho, E.S. Dynamic Allocation Mechanism to Reduce Read Latency in Collaboration With a Device Queue in Multichannel Solid-State Devices. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2017, 36, 600–613.
20. Wu, F.; Lu, Z.; Zhou, Y.; He, X.; Tan, Z.; Xie, C. OSPADA: One-shot programming aware data allocation policy to improve 3D NAND flash read performance. In Proceedings of the IEEE 36th International Conference on Computer Design (ICCD), Orlando, FL, USA, 7–10 October 2018; pp. 51–58.
21. Sun, H.; Cheng, X.; Zhang, C.; Yue, Y.; Qin, X. HIPA: A hybrid load balancing method in SSDs for improved parallelism performance. J. Syst. Archit. 2022, 131, 102705.
22. Wu, G.; He, X. Delta-FTL: Improving SSD lifetime via exploiting content locality. In Proceedings of the EuroSys ’12: Proceedings of the 7th ACM European Conference on Computer Systems, Bern, Switzerland, 10–13 April 2012.
23. Chen, Z.; Chen, Z.; Nong, X.; Fang, L. NF-Dedupe: A novel no-fingerprint deduplication scheme for flash-based SSDs. In Proceedings of the IEEE Symposium on Computers and Communication (ISCC), Larnaca, Cyprus, 6–9 July 2015.
24. Schuetz, R.; Oh, H.J.; Kim, J.K.; Pyeon, H.B.; Gillingham, P. HyperLink NAND Flash Architecture for Mass Storage Applications. In Proceedings of the IEEE Non-Volatile Semiconductor Memory Workshop, Monterey, CA, USA, 26–30 August 2007.
25. Gillingham, P.; Chinn, D.; Choi, E.; Kim, J.K.; Macdonald, D.; Oh, H.; Pyeon, H.B.; Schuetz, R. 800 MB/s DDR NAND Flash Memory Multi-Chip Package With Source-Synchronous Interface for Point-to-Point Ring Topology. IEEE Access 2013, 1, 811–816.
26. Tavakkol, A.; Arjomand, M.; Sarbazi-Azad, H. Network-on-SSD: A Scalable and High-Performance Communication Design Paradigm for SSDs. IEEE Comput. Archit. Lett. 2013, 12, 5–8.
27. Nadig, R.; Sadrosadati, M.; Mao, H.; Ghiasi, N.M.; Tavakkol, A.; Park, J.; Sarbazi-Azad, H.; Luna, J.G.; Mutlu, O. Venice: Improving Solid-State Drive Parallelism at Low Cost via Conflict-Free Accesses. arXiv 2023, arXiv:2305.07768.
28. Bloom, B.H. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 1970, 13, 422–426.
29. Wu, S.; Du, C.; Zhu, W.; Zhou, J.; Jiang, H.; Mao, B.; Zeng, L. EaD: ECC-Assisted Deduplication With High Performance and Low Memory Overhead for Ultra-Low Latency Flash Storage. IEEE Trans. Comput. 2022, 72, 208–221.
30. Cha, J.; Kang, S. Data randomization scheme for endurance enhancement and interference mitigation of multilevel flash memory devices. ETRI J. 2013, 35, 166–169.
31. Zhou, Y.; Wu, Q.; Wu, F.; Jiang, H.; Zhou, J.; Xie, C. Remap-SSD: Safely and Efficiently Exploiting SSD Address Remapping to Eliminate Duplicate Writes. In Proceedings of the FAST, Santa Clara, CA, USA, 23–25 February 2021; pp. 187–202.
32. Traces from SyLab Energy Proportional Storage Systems. Available online: https://drive.google.com/drive/folders/1Ajt0UImAx-ielUKoN0QssPFAhgqTZ-Ef (accessed on 23 October 2023).
33. Koller, R.; Rangaswami, R. I/O deduplication: Utilizing content similarity to improve I/O performance. ACM Trans. Storage (TOS) 2010, 6, 1–26.
Parameter | Value |
---|---|
Number of SSD channels | 8 |
Chips per Channel | 2 |
Dies per Chip | 2 |
Planes per Die | 2 |
Blocks per Plane | 2048 |
Pages per Block | 64 |
Flash Page Size | 4 KB |
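The raw capacity implied by this configuration follows from multiplying the parameters together. The short calculation below is a sketch that assumes 1 KB = 1024 bytes and ignores over-provisioning and any reserved blocks, neither of which is specified in the table; the variable names are introduced here for illustration.

```python
# Parameters copied from the configuration table
channels = 8
chips_per_channel = 2
dies_per_chip = 2
planes_per_die = 2
blocks_per_plane = 2048
pages_per_block = 64
page_size_kb = 4  # assumed to mean 4 * 1024 bytes

total_planes = channels * chips_per_channel * dies_per_chip * planes_per_die
total_pages = total_planes * blocks_per_plane * pages_per_block
capacity_gib = total_pages * page_size_kb / (1024 * 1024)  # KB -> GiB

print(total_planes)   # number of planes that can operate in parallel
print(total_pages)    # number of physical flash pages
print(capacity_gib)   # raw capacity in GiB
```

Under these assumptions the simulated device has 64 planes, 8,388,608 physical pages, and a raw capacity of 32 GiB.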
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Nie, S.; Niu, J.; Zhang, Z.; Hu, Y.; Shi, C.; Wu, W. Exploiting Data Similarity to Improve SSD Read Performance. Appl. Sci. 2023, 13, 13017. https://doi.org/10.3390/app132413017