TA-CLOCK: Tendency-Aware Page Replacement Policy for Hybrid Main Memory in High-Performance Embedded Systems
Abstract
1. Introduction
2. Background and Related Works
2.1. Phase Change Memory
2.2. Hybrid Main Memory
3. Design of TA-CLOCK
- TA-CLOCK mitigates the limited write endurance of PCM by keeping write-intensive pages in the DRAM CLOCK, which it identifies by classifying each page's access tendency.
- TA-CLOCK reduces the energy consumption of the hybrid main memory in an embedded system by enhancing the write hit ratio of the DRAM CLOCK, thus reducing unnecessary page migrations between the DRAM CLOCK and PCM CLOCK.
3.1. Structure of TA-CLOCK
3.2. TA-CLOCK Algorithm
Algorithm 1 TA-CLOCK (page p, operation op).
1: if p in DRAM CLOCK then
2:   if op = write then
3:     p_dirty ← set;
4:     p_write_count ← p_write_count + 1;
5:   else
6:     p_ref ← set;
7:     p_read_count ← p_read_count + 1;
8:   end if
9: else if p in PCM CLOCK then
10:   if op = write then
11:     // Decide whether the page migrates to DRAM CLOCK.
12:     DRAMPageReplacement() ← p;
13:   else
14:     p_ref ← set;
15:     p_read_count ← p_read_count + 1;
16:   end if
17: else // Page fault.
18:   // No free page in DRAM CLOCK.
19:   if DRAM CLOCK is full then
20:     // Secure a free page and insert p into secured free page.
21:     DRAMPageReplacement() ← p;
22:   else
23:     Insert p into free page in DRAM CLOCK.
24:     if op = write then
25:       p_dirty ← set;
26:       p_write_count ← 1; // Initialize write counts of p.
27:     else
28:       p_ref ← set;
29:       p_read_count ← 1; // Initialize read counts of p.
30:     end if
31:   end if
32: end if
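The dispatch logic of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `Page` class, the `touch` and `access` helpers, the dictionary-based DRAM and PCM sets, and the `replace` callback (standing in for DRAMPageReplacement) are all hypothetical names introduced here. The sketch also assumes that a write sets the dirty bit and a read sets the reference bit.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Page:
    ref: bool = False      # reference bit (assumed to be set on reads)
    dirty: bool = False    # dirty bit (assumed to be set on writes)
    read_count: int = 0
    write_count: int = 0


def touch(page: Page, op: str) -> None:
    """Record one access in the page's metadata."""
    if op == "write":
        page.dirty = True
        page.write_count += 1
    else:
        page.ref = True
        page.read_count += 1


def access(pid: int, op: str, dram: Dict[int, Page], pcm: Dict[int, Page],
           dram_capacity: int, replace: Callable[[int], None]) -> str:
    """Dispatch one access and return the name of the branch that handled it."""
    if pid in dram:
        touch(dram[pid], op)
        return "dram_hit"
    if pid in pcm:
        if op == "write":
            replace(pid)          # replacement routine decides on migration
            return "pcm_write"
        touch(pcm[pid], op)
        return "pcm_read"
    # Page fault.
    if len(dram) >= dram_capacity:
        replace(pid)              # secure a free DRAM page for the new page
        return "fault_replace"
    dram[pid] = Page()
    touch(dram[pid], op)          # counters start at 1 on the first access
    return "fault_insert"
```

Passing the replacement routine as a callback keeps the sketch independent of how a victim page is chosen, which is the job of Algorithm 2.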
Algorithm 2 DRAMPageReplacement (void).
1: p ← GetCurrentClockPointer();
2: while page p is not empty do
3:   if p_ref is set then
4:     p_ref ← reset;
5:     p ← GetNextClockPointer();
6:   else
7:     if p_dirty is set then // page p is a dirty page.
8:       Calculate Write_Threshold_DRAM by Equation (1);
9:       Classify the page access tendency of page p;
10:       if p_tendency is not SW then
11:         Calculate Read_Threshold_i of page p by Equation (2);
12:         Classify the page access tendency of page p;
13:         if p_tendency is WW then
14:           p ← GetNextClockPointer();
15:         else if p_tendency is WR then
16:           Evict page p to storage;
17:         else if p_tendency is SR then
18:           Migrate page p to PCM CLOCK;
19:         end if
20:       end if
21:       p ← GetNextClockPointer();
22:     else // page p is a clean page.
23:       Discard page p;
24:     end if
25:   end if
26: end while
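The per-page decision that Algorithm 2 makes at the clock pointer can be summarized as a small lookup. The sketch below is illustrative only: the `Action` enum and the `victim_action` function are hypothetical names, the tendency is assumed to have been classified already, and the clock-pointer movement itself is left out.

```python
from enum import Enum


class Action(Enum):
    SECOND_CHANCE = "clear reference bit and advance"
    KEEP = "keep in DRAM CLOCK and advance"
    EVICT = "evict to storage"
    MIGRATE = "migrate to PCM CLOCK"
    DISCARD = "discard clean page"


def victim_action(ref: bool, dirty: bool, tendency: str) -> Action:
    """Return what happens to the page under the clock pointer.

    tendency is one of "SW", "WW", "WR", "SR"; it only matters for
    dirty pages whose reference bit is already clear.
    """
    if ref:
        return Action.SECOND_CHANCE   # recently used: give a second chance
    if not dirty:
        return Action.DISCARD         # clean page: drop without write-back
    if tendency in ("SW", "WW"):
        return Action.KEEP            # write-leaning pages stay in DRAM
    if tendency == "WR":
        return Action.EVICT
    if tendency == "SR":
        return Action.MIGRATE         # read-leaning dirty page moves to PCM
    raise ValueError(f"unknown tendency: {tendency!r}")
```

For example, `victim_action(False, True, "SR")` returns `Action.MIGRATE`, while the same page with its reference bit set returns `Action.SECOND_CHANCE` regardless of tendency.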
3.3. Scenarios of TA-CLOCK
4. Performance Evaluation
4.1. Experimental Setup
4.2. Experimental Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Parameter | DRAM | PCM |
---|---|---|
Cell size (F²) | 6 | 4∼30 |
Data retention | ∼64 ms | >10 years |
Read latency (ns) | ∼10 | <10 |
Write latency (ns) | ∼10 | <50 |
Voltage (V) | <1 | <3 |
Write energy (J/bit) | ∼10 fJ | ∼10 pJ |
Write endurance | 10^15 | 10^8∼10^9 |
Parameter | Description |
---|---|
i | The index of a DRAM CLOCK page. |
Read_Threshold_i | The read threshold of page i for the classification of the page access tendency. |
Write_Threshold_DRAM | The write threshold of the DRAM CLOCK for the classification of the page access tendency. |
page_count | The number of pages in the DRAM CLOCK. |
read_count_i | The read count of page i. |
write_count_i | The write count of page i. |
weight_write | The weight of write operations (constant). |
weight_read | The weight of read operations (constant). |
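Tendency Classification | Condition |
---|---|
SW (Strong Write) | write_count_i ≥ Write_Threshold_DRAM |
WW (Weak Write) | write_count_i < Write_Threshold_DRAM & Read_Threshold_i ≥ 0.5 |
WR (Weak Read) | write_count_i < Write_Threshold_DRAM & Read_Threshold_i < 0.5 & Read_Threshold_i ≥ 0.25 |
SR (Strong Read) | write_count_i < Write_Threshold_DRAM & Read_Threshold_i < 0.25 |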
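The four-way classification can be written as a short cascade of comparisons. The sketch below rests on an assumption: the first comparison in each condition is taken to be the page's write count against Write_Threshold_DRAM, and the quantity compared with 0.5 and 0.25 is taken to be Read_Threshold_i, as the parameter list suggests. Both thresholds are passed in as plain numbers because Equations (1) and (2) are not reproduced here; the function name `classify` is hypothetical.

```python
def classify(write_count: int, write_threshold_dram: float,
             read_threshold: float) -> str:
    """Return "SW", "WW", "WR" or "SR" for one DRAM CLOCK page.

    write_threshold_dram and read_threshold are assumed to have been
    computed beforehand (Equations (1) and (2) in the paper).
    """
    if write_count >= write_threshold_dram:
        return "SW"                 # strong write
    if read_threshold >= 0.5:
        return "WW"                 # weak write
    if read_threshold >= 0.25:
        return "WR"                 # weak read
    return "SR"                     # strong read
```

Because the checks run from top to bottom, each later branch implicitly carries the negation of the earlier ones, which mirrors the chained conditions in the table.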
Parameter | Description |
---|---|
p_ref | The reference bit of page p. |
p_dirty | The dirty bit of page p. |
p_read_count | The read count of page p. |
p_write_count | The write count of page p. |
p_tendency | The page access tendency of page p. |
Parameter | Device | Value |
---|---|---|
Read/Write latency (ns) | DRAM | 50/50 |
Read/Write latency (ns) | PCM | 50/350 |
Read/Write energy (nJ/bit) | DRAM | 0.1/0.1 |
Read/Write energy (nJ/bit) | PCM | 0.2/1.0 |
Static power (W/GB) | DRAM | 1 |
Static power (W/GB) | PCM | 0.1 |
Access latency (ms) | Storage | 5 |
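To see why keeping writes off the PCM matters under these parameters, the latency and dynamic energy of a batch of accesses can be estimated with a simple linear model. The sketch below is a back-of-the-envelope aid, not the simulator used in the evaluation: the `access_cost` function, the `PARAMS` dictionary, and the default of 64 bits per access are assumptions introduced here, and static power and storage latency are ignored.

```python
# Read/write latency (ns) and energy (nJ/bit), copied from the parameter table.
PARAMS = {
    "DRAM": {"read_ns": 50, "write_ns": 50, "read_nj_bit": 0.1, "write_nj_bit": 0.1},
    "PCM": {"read_ns": 50, "write_ns": 350, "read_nj_bit": 0.2, "write_nj_bit": 1.0},
}


def access_cost(device: str, reads: int, writes: int,
                bits_per_access: int = 64) -> tuple:
    """Return (total latency in ns, total dynamic energy in nJ).

    bits_per_access is a hypothetical transfer width, not a value
    taken from the paper.
    """
    p = PARAMS[device]
    latency_ns = reads * p["read_ns"] + writes * p["write_ns"]
    energy_nj = bits_per_access * (
        reads * p["read_nj_bit"] + writes * p["write_nj_bit"]
    )
    return latency_ns, energy_nj
```

Under this model a write served by PCM takes 7 times as long and costs 10 times the energy of the same write served by DRAM, independent of the assumed transfer width.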
Workload | Memory Footprint (KB) | Read/Write Ratio (%) |
---|---|---|
basicmath | 26,140 | 71/29 |
blowfish | 17,261 | 63/37 |
dijkstra | 19,112 | 72/28 |
patricia | 19,281 | 76/24 |
qsort | 25,735 | 58/42 |
stringsearch | 16,716 | 72/28 |
Page Replacement Policy | The Average Write Counts for Each Page on the PCM | The Standard Deviation of Write Counts on the PCM |
---|---|---|
CLOCK | 18.54 | 52.77 |
CLOCK-PRO | 4.69 | 13.18 |
M-CLOCK | 1.57 | 1.06 |
CLOCK-DWF | 1.48 | 0.5 |
TA-CLOCK | 0.85 | 0.35 |
Page Replacement Policy | DRAM | PCM | Memory Space Overhead on Average |
---|---|---|---|
CLOCK | 2 bits | 2 bits | 0.0061% |
CLOCK-PRO | 2 bits | 2 bits | 0.0061% |
M-CLOCK | 2 bits | 3 bits | 0.0079% |
CLOCK-DWF | 7 bits | 2 bits | 0.0122% |
TA-CLOCK | 8 bits | 1 bit | 0.0115% |
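The overhead percentages follow from dividing the per-page metadata bits by the page size in bits. The sketch below assumes 4 KiB pages, which is not stated in this excerpt; the function name `metadata_overhead_percent` is hypothetical.

```python
def metadata_overhead_percent(bits_per_page: float, page_bytes: int = 4096) -> float:
    """Metadata size as a percentage of the page it describes.

    page_bytes defaults to 4 KiB, an assumption rather than a value
    given in the table.
    """
    return 100.0 * bits_per_page / (page_bytes * 8)
```

With 4 KiB pages, 2 bits per page gives about 0.0061%, which agrees with the CLOCK and CLOCK-PRO rows. The averaged figures for policies that use different bit counts in DRAM and PCM also depend on the DRAM-to-PCM page ratio, which the table does not list.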
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Choi, J.H.; Kim, K.M.; Kwak, J.W. TA-CLOCK: Tendency-Aware Page Replacement Policy for Hybrid Main Memory in High-Performance Embedded Systems. Electronics 2021, 10, 1111. https://doi.org/10.3390/electronics10091111