Transformer: An OS-Supported Reconfigurable Hybrid Memory Architecture
Abstract
:1. Introduction
2. Related Work
3. Design and Implementation
3.1. Page Hotness Monitoring
3.1.1. Page Access Counting
3.1.2. A Level Array for Storing Page Access Counts
3.2. Architecture Dynamic Conversion
Algorithm 1: Architecture dynamic conversion. |
3.3. Page Migration
3.3.1. Page Migration in the Parallel Architecture
3.3.2. Page Migration in the Hierarchical Architecture
4. Evaluation
4.1. Experimental Setup
4.2. Experimental Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Malladi, K.T.; Shaeffer, I.; Gopalakrishnan, L.; Lo, D.; Lee, B.C.; Horowitz, M. Rethinking DRAM power modes for energy proportionality. In Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, Vancouver, BC, Canada, 1–5 December 2012; pp. 131–142. [Google Scholar]
- Hao, Y.; Xiang, S.; Han, G.; Zhang, J.; Ma, X.; Zhu, Z.; Guo, X.; Zhang, Y.; Han, Y.; Song, Z. Recent progress of integrated circuits and optoelectronic chips. Sci. China Inf. Sci. 2021, 64, 201401. [Google Scholar] [CrossRef]
- Deng, Q.; Ramos, L.; Bianchini, R.; Meisner, D.; Wenisch, T. Active low-power modes for main memory with memscale. IEEE Micro 2012, 32, 60–69. [Google Scholar] [CrossRef]
- Lee, B.C.; Ipek, E.; Mutlu, O.; Burger, D. Architecting phase change memory as a scalable DRAM alternative. In Proceedings of the 2009 36th Annual International Symposium on Computer Architecture (ISCA), Austin, TX, USA, 20–24 June 2009; pp. 2–13. [Google Scholar]
- Xu, C.; Niu, D.; Muralimanohar, N.; Jouppi, N.P.; Xie, Y. Understanding the trade-offs in multi-level cell ReRAM memory design. In Proceedings of the 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC), Austin, TX, USA, 29 May–7 June 2013; pp. 1–6. [Google Scholar]
- Kültürsay, E.; Kandemir, M.; Sivasubramaniam, A.; Mutlu, O. Evaluating STT-RAM as an energy-efficient main memory alternative. In Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Austin, TX, USA, 21–23 April 2013; pp. 256–267. [Google Scholar]
- Cai, M.; Huang, H. A survey of operating system support for persistent memory. Frontiers Comput. Sci. 2021, 15, 154207. [Google Scholar] [CrossRef]
- Cheng, C.; Tiw, P.; Cai, Y.; Yan, X.; Yang, Y.; Huang, R. In-memory computing with emerging nonvolatile memory devices. Sci. China Inf. Sci. 2021, 64, 221402. [Google Scholar] [CrossRef]
- Xue, C.J.; Zhang, Y.; Chen, Y.; Sun, G.; Yang, J.J.; Li, H. Emerging non-volatile memories: Opportunities and challenges. In Proceedings of the 2011 9th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), Taipei, China, 9–14 October 2011; pp. 325–334. [Google Scholar]
- García, A.A.; Jong, R.D.; Wang, W.; Diestelhorst, S. Composing lifetime enhancing techniques for non-volatile main memories. In Proceedings of the 2017 International Symposium on Memory Systems (MEMSYS), Alexandria, VA, USA, 2–5 October 2017; pp. 363–373. [Google Scholar]
- Liu, H.; Chen, Y.; Liao, X.; Jin, H.; He, B.; Zheng, L.; Guo, R. Hardware/software cooperative caching for hybrid DRAM/NVM memory architectures. In Proceedings of the 2017 International Conference on Supercomputing (ICS), Chicago, IL, USA, 14–16 June 2017; pp. 1–10. [Google Scholar]
- Chen, T.; Liu, H.; Liao, X.; Jin, H. Resource abstraction and data placement for distributed hybrid memory pool. Front. Comput. Sci. 2021, 15, 153103. [Google Scholar] [CrossRef]
- Jain, S.; Sapatnekar, S.; Wang, J.; Roy, K.; Raghunathan, A. Computing-in-memory with spintronics. In Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition (DATE), Dresden, Germany, 19–23 March 2018; pp. 1640–1645. [Google Scholar]
- Meza, J.; Chang, J.; Yoon, H.; Mutlu, O.; Ranganathan, P. Enabling efficient and scalable hybrid memories using fine-granularity DRAM cache management. IEEE Comput. Archit. Lett. 2012, 11, 61–64. [Google Scholar] [CrossRef] [Green Version]
- Zhang, W.; Li, T. Exploring phase change memory and 3D die-stacking for power/thermal friendly, fast and durable memory architectures. In Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques (PACT), Raleigh, NC, USA, 12–16 September 2009; pp. 101–112. [Google Scholar]
- Loh, G.; Hill, M. Efficiently enabling conventional block sizes for very large die-stacked DRAM caches. In Proceedings of the 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Porto Alegre, Brazil, 4–5 December 2011; pp. 454–464. [Google Scholar]
- Qureshi, M.K.; Srinivasan, V.; Rivers, J.A. Scalable high performance main memory system using phase-change memory technology. In Proceedings of the 2009 36th Annual International Symposium on Computer Architecture (ISCA), Austin, TX, USA, 20–24 June 2009; pp. 24–33. [Google Scholar]
- Vasilakis, E.; Papaefstathiou, V.; Trancoso, P.; Sourdis, I. Hybrid2: Combining caching and migration in hybrid memory systems. In Proceedings of the 2020 26th IEEE International Symposium on High Performance Computer Architecture (HPCA), San Diego, CA, USA, 22–26 February 2020; pp. 649–662. [Google Scholar]
- Yoon, H.; Meza, J.; Ausavarungnirun, R.; Harding, R.A.; Mutlu, O. Row buffer locality aware caching policies for hybrid memories. In Proceedings of the 2012 30th International Conference on Computer Design (ICCD), Montreal, QC, Canada, 30 September–3 October 2012; pp. 337–344. [Google Scholar]
- Dhiman, G.; Ayoub, R.; Rosing, T. PDRAM: A hybrid PRAM and DRAM main memory system. In Proceedings of the 2009 46th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 26–31 July 2009; pp. 664–669. [Google Scholar]
- Lee, S.; Bahn, H.; Noh, S.H. CLOCK-DWF: A write-history-aware page replacement algorithm for hybrid PCM and DRAM memory architecture. IEEE Trans. Comput. 2014, 63, 2187–2200. [Google Scholar] [CrossRef]
- Peng, B.; Dong, Y.; Yao, J.; Wu, F.; Guan, H. FlexHM: A Practical System for Heterogeneous Memory with Flexible and Efficient Performance Optimizations. ACM Trans. Archit. Code Optim. 2022; Accepted. [Google Scholar] [CrossRef]
- Hirofuchi, T.; Takano, R. Raminate: Hypervisor-based virtualization for hybrid main memory systems. In Proceedings of the 2016 7th ACM Symposium on Cloud Computing (SoCC), Santa Clara, CA, USA, 5–7 October 2016; pp. 112–125. [Google Scholar]
- Agarwal, N.; Wenisch, T.F. Thermostat: Application-transparent page management for two-tiered main memory. In Proceedings of the 2017 22nd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Xi’an, China, 8 April 2017; pp. 631–644. [Google Scholar]
- Spradling, C.D. SPEC CPU2006 benchmark tools. ACM SIGARCH Comput. Archit. News 2007, 35, 130–134. [Google Scholar] [CrossRef]
- Bienia, C.; Kumar, S.; Singh, J.P.; Li, K. The Parsec benchmark suite: Characterization and architectural implications. In Proceedings of the 2008 17th International Conference on Parallel Architectures and Compilation Techniques (PACT), Toronto, ON, Canada, 25–29 October 2008; pp. 72–81. [Google Scholar]
- Murphy, R.C.; Wheeler, K.B.; Barrett, B.W.; Ang, J.A. Introducing the graph 500. Cray Users Group 2010, 19, 45–74. [Google Scholar]
Total_Pages | The Total Number of Memory Pages Accessed during the Program’s Execution |
hot_pages | The number of pages whose access counts exceed the hotness threshold |
The threshold for converting the memory architecture | |
type | Current memory architecture: 0 and 1 denote parallel and hierarchical, respectively |
nvmPage_Free() | API for releasing NVM pages |
mapLink_Create() | API for creating a mapping between NVM and DRAM pages |
mapLink_Free() | API for releasing the mapping between NVM and DRAM pages |
mapTable_Create() | API for creating the address mapping table |
mapTable_Free() | API for releasing the address mapping table |
tag_Modify() | API for updating tags in the mapping table |
CPU | Intel Xeon Gold 6230 2.10 GHz |
---|---|
DRAM | 128 GigaBytes |
NVM | 512 GigaBytes |
Disk | 320 GigaBytes |
Operating System | CentOS 7.7 |
Kernel | Linux 5.1.1 |
SPEC2006 | mcf, wrf, sjeng, libquantum, aster, lbm |
PARSEC | streamcluster, blackscholes |
Graph500 | BFS |
MIX1 | libquantum+aster+lbm |
MIX2 | libquantum+mcf+sjeng |
MIX2 | streamcluster+blackscholes |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Chi, Y.; Liu, H.; Peng, G.; Liao, X.; Jin, H. Transformer: An OS-Supported Reconfigurable Hybrid Memory Architecture. Appl. Sci. 2022, 12, 12995. https://doi.org/10.3390/app122412995
Chi Y, Liu H, Peng G, Liao X, Jin H. Transformer: An OS-Supported Reconfigurable Hybrid Memory Architecture. Applied Sciences. 2022; 12(24):12995. https://doi.org/10.3390/app122412995
Chicago/Turabian StyleChi, Ye, Haikun Liu, Ganwei Peng, Xiaofei Liao, and Hai Jin. 2022. "Transformer: An OS-Supported Reconfigurable Hybrid Memory Architecture" Applied Sciences 12, no. 24: 12995. https://doi.org/10.3390/app122412995
APA StyleChi, Y., Liu, H., Peng, G., Liao, X., & Jin, H. (2022). Transformer: An OS-Supported Reconfigurable Hybrid Memory Architecture. Applied Sciences, 12(24), 12995. https://doi.org/10.3390/app122412995