Efficient Integrity-Tree Structure for Convolutional Neural Networks through Frequent Counter Overflow Prevention in Secure Memories
Abstract
1. Introduction
- (1) CNN feature analysis for secure memory: We analyze the impact of CNN workloads on secure memory and the inefficiency of previously proposed techniques.
- (2) Countermark-tree: We propose an integrity-tree design for CNN workloads that reduces counter overflows by assigning counters of different sizes according to the write intensity of each memory block.
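The intuition behind the second contribution can be sketched with a toy model of one counter group. This is an illustration only, not the paper's design: the group size, the bit widths (3 and 7), and the write trace are all hypothetical values chosen for the example. The model assumes the usual split-counter behavior, in which a minor-counter overflow increments the shared major counter and resets every minor counter in the group.

```python
def count_overflows(trace, minor_bits):
    """Count minor-counter overflows for one counter group.

    trace      : sequence of block indices, one entry per write
    minor_bits : per-block minor-counter width in bits (hypothetical values)
    """
    minors = [0] * len(minor_bits)
    overflows = 0
    for block in trace:
        limit = (1 << minor_bits[block]) - 1
        if minors[block] == limit:
            # Overflow: the shared major counter would be incremented here,
            # all minor counters reset, and the whole group re-encrypted.
            minors = [0] * len(minor_bits)
            overflows += 1
        else:
            minors[block] += 1
    return overflows


# Hypothetical trace: block 0 is a write hotspot, blocks 1-3 are rarely written.
trace = [0] * 1000 + [1] * 5 + [2] * 5 + [3] * 5

uniform = count_overflows(trace, [3, 3, 3, 3])    # same width everywhere
weighted = count_overflows(trace, [7, 3, 3, 3])   # wider counter for the hot block

print(uniform, weighted)  # 125 7
```

In this toy setting, widening only the counter of the write-intensive block cuts the overflow count from 125 to 7, while the rarely written blocks keep their narrow counters.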
2. Related Works
3. Background and Motivation
3.1. Security Primitives
3.1.1. Trusted Computing Base
- Confidentiality: To prevent an attacker from reading the memory content, the data transmitted outside of the secure processor are encrypted. Every encryption operation is performed using a unique key.
- Integrity: Even when confidentiality holds, an attacker can violate integrity by injecting arbitrary data at a requested address (spoofing), substituting the data at one address with data from another address (splicing), or restoring stale data that was previously stored at the same address (replay).
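The three integrity attacks listed above can be made concrete with a small sketch. It assumes a common construction in which each block carries a MAC computed over its address, a per-block write counter, and its data; this is a generic illustration, not the scheme evaluated in this paper. The key, the addresses, and the helper names (`mac`, `write`, `read_is_valid`, `run_attacks`) are hypothetical, encryption is omitted for brevity, and the counters are simply assumed to be trusted (in a real secure memory, an integrity tree is what protects counters that live off-chip).

```python
import hashlib
import hmac

KEY = b"demo-key-for-illustration-only"  # hypothetical on-chip secret

off_chip = {}  # addr -> (data, tag); attacker-controlled memory
on_chip = {}   # addr -> write counter; assumed trusted in this sketch


def mac(addr: int, counter: int, data: bytes) -> bytes:
    """Bind the data to its address and to the current write counter."""
    msg = addr.to_bytes(8, "little") + counter.to_bytes(8, "little") + data
    return hmac.new(KEY, msg, hashlib.sha256).digest()


def write(addr: int, data: bytes) -> None:
    on_chip[addr] = on_chip.get(addr, 0) + 1
    off_chip[addr] = (data, mac(addr, on_chip[addr], data))


def read_is_valid(addr: int) -> bool:
    data, tag = off_chip[addr]
    return hmac.compare_digest(tag, mac(addr, on_chip[addr], data))


def run_attacks() -> dict:
    off_chip.clear()
    on_chip.clear()
    results = {}
    write(0x40, b"weights-v1")
    write(0x80, b"activations")
    results["untampered"] = read_is_valid(0x40)

    genuine = off_chip[0x40]

    # Spoofing: arbitrary data injected at the requested address.
    off_chip[0x40] = (b"attacker!!", genuine[1])
    results["spoofing"] = read_is_valid(0x40)
    off_chip[0x40] = genuine

    # Splicing: a valid (data, tag) pair copied from another address.
    off_chip[0x40] = off_chip[0x80]
    results["splicing"] = read_is_valid(0x40)
    off_chip[0x40] = genuine

    # Replay: a stale but once-valid pair restored after a newer write.
    write(0x40, b"weights-v2")
    off_chip[0x40] = genuine
    results["replay"] = read_is_valid(0x40)
    return results


print(run_attacks())
```

The untampered read verifies, and all three tampered reads fail: spoofing breaks the data binding, splicing breaks the address binding, and replay fails only because the trusted counter has advanced.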
3.1.2. Counters for Data Encryption
3.1.3. Split Counter Scheme
3.1.4. Replay-Attack Protection with Integrity-Tree
3.1.5. VAULT
3.2. Neural Network
3.2.1. CNN Structure and Inference
3.2.2. Forwarding Operation and Memory Access
3.2.3. Problematic Overwrites on Narrow Memory Space
3.3. Motivation: Memory Access Behavior of CNNs
4. Countermark-Tree
4.1. Inefficiency of Previously Proposed Schemes
4.2. Designing Countermark-Tree
4.3. Implementation Methodology
5. Evaluation
5.1. Experimental Methodology
5.1.1. Simulation Configuration
5.1.2. CNN Workloads
5.2. Impact on Storage Overhead
5.3. Impact on Performance
5.4. Analyzing Extra Memory Traffic
5.5. Impact on Power and Energy Consumption
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Workloads | Normal | Hotspot |
---|---|---|
AlexNet | 0.9864 | 0.0136 |
DarkNet19 | 0.9357 | 0.0643 |
DarkNet19_448 | 0.9946 | 0.0054 |
DarkNet53 | 0.9683 | 0.0317 |
DarkNet53_448 | 0.9137 | 0.0863 |
Extraction | 0.9712 | 0.0288 |
ResNet152 | 0.9908 | 0.0092 |
ResNet18 | 0.9405 | 0.0595 |
ResNet50 | 0.9650 | 0.0350 |
ResNext152 | 0.9699 | 0.0301 |
ResNext50 | 0.9596 | 0.0404 |
VGG-16 | 0.5921 | 0.4079 |
Avg. | 0.9323 | 0.0677 |
Workloads | Split Counter-64 | VAULT-64 | Split Counter-128 | VAULT-128 |
---|---|---|---|---|
AlexNet | 4.02 | 1.72 | 21.11 | 20.40 |
DarkNet19 | 6.41 | 2.86 | 33.07 | 31.65 |
DarkNet19_448 | 3.14 | 1.29 | 18.56 | 15.27 |
DarkNet53 | 7.30 | 3.41 | 38.60 | 37.46 |
DarkNet53_448 | 2.80 | 1.18 | 15.85 | 13.22 |
Extraction | 5.61 | 2.71 | 29.77 | 29.58 |
ResNet152 | 6.11 | 2.90 | 32.81 | 32.64 |
ResNet18 | 6.82 | 3.12 | 34.22 | 33.88 |
ResNet50 | 5.69 | 2.52 | 30.40 | 30.11 |
ResNext152 | 4.52 | 2.18 | 24.78 | 24.55 |
ResNext50 | 2.24 | 1.01 | 13.79 | 13.60 |
VGG-16 | 34.21 | 0.36 | 317.52 | 304.18 |
Avg. | 7.40 | 2.10 | 50.87 | 48.88 |
Workloads | Alloc. Hotspot | Write Intensity | # of Chunks |
---|---|---|---|
AlexNet | 6.7 MB | 94.39% | 1 |
DarkNet19 | 18 MB | 96.37% | 1 |
DarkNet19_448 | 55.1 MB | 81.71% | 1 |
DarkNet53 | 18 MB | 96.54% | 1 |
DarkNet53_448 | 93.4 MB | 80.11% | 7 |
Extraction | 7 MB | 97.85% | 1 |
ResNet152 | 9.2 MB | 83.74% | 1 |
ResNet18 | 9.2 MB | 98.17% | 1 |
ResNet50 | 15.2 MB | 81.02% | 3 |
ResNext152 | 41.8 MB | 80.18% | 38 |
ResNext50 | 20 MB | 80.02% | 10 |
VGG-16 | 480 MB | 80.95% | 4 |
Configuration | Value |
---|---|
Number of cores | 4 |
Processor clock speed | 3.2 GHz |
Processor ROB size | 192 |
Processor fetch/retire width | 4 |
Last level cache (Shared) | 1 MB *, 8-way, 64 B lines |
Metadata cache (Shared) | 128 KB, 8-way, 64 B lines |
Memory size | 16 GB |
Memory bus speed | 800 MHz |
Bank, Rank, Channels | 8, 2, 2 |
Rows per bank | 64 K |
Cache lines per row | 128 |
Page allocation policy | Random |
Encryption latency | 40 ns |
Workloads | Read-PKI | Write-PKI | Footprint (GB) |
---|---|---|---|
AlexNet | 17.9 | 16.7 | 0.3 |
DarkNet19 | 33.7 | 24.2 | 0.2 |
DarkNet19_448 | 37.4 | 11.8 | 0.4 |
DarkNet53 | 34.9 | 28.5 | 0.4 |
DarkNet53_448 | 39.5 | 10.2 | 0.8 |
Extraction | 35.6 | 22.6 | 0.2 |
ResNet152 | 27.0 | 24.8 | 0.7 |
ResNet18 | 39.3 | 26.2 | 0.1 |
ResNet50 | 28.8 | 22.9 | 0.3 |
ResNext50 | 16.4 | 9.6 | 0.4 |
ResNext152 | 22.1 | 19.5 | 1.0 |
VGG-16 | 37.3 | 7.8 | 1.1 |
Configuration | Tree Depth | Enc. Cnts | Integrity-Tree |
---|---|---|---|
Split Counter-64 | 3 | 256 MB | 4.06 MB |
VAULT-64 | 6 | 256 MB | 8.59 MB |
Split Counter-128 | 2 | 128 MB | 1.01 MB |
VAULT-128 | 3 | 128 MB | 2.06 MB |
CM-tree (VGG-16) | 2 | 128.1 MB | 1.11 MB |
CM-tree (Avg.) | 2 | 128 MB | 1.01 MB |
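The two split-counter rows of the storage table above can be reproduced with back-of-the-envelope arithmetic. The sketch below is a reconstruction under assumptions that the table does not state explicitly: the 16 GB memory size and 64 B lines are taken from the simulation configuration, each 64 B counter block is assumed to cover `arity` children at every level, the level that collapses to a single node is assumed to be an on-chip root that costs no memory, and MB is read as 2^20 bytes. The function name `split_counter_storage` is hypothetical. The VAULT and CM-tree rows use different arities per level and are not modeled here.

```python
import math

BLOCK = 64  # bytes per memory block and per counter block (assumed)


def split_counter_storage(mem_bytes: int, arity: int):
    """Return (tree depth, encryption-counter bytes, integrity-tree bytes)."""
    data_blocks = mem_bytes // BLOCK
    nodes = math.ceil(data_blocks / arity)  # encryption-counter blocks
    enc_bytes = nodes * BLOCK
    tree_bytes = 0
    depth = 0
    while True:
        nodes = math.ceil(nodes / arity)
        if nodes <= 1:
            break  # single node: treated as the on-chip root
        tree_bytes += nodes * BLOCK
        depth += 1
    return depth, enc_bytes, tree_bytes


MIB = 1 << 20
for arity in (64, 128):
    depth, enc, tree = split_counter_storage(16 << 30, arity)
    print(arity, depth, enc / MIB, round(tree / MIB, 2))
# 64 3 256.0 4.06
# 128 2 128.0 1.01
```

Under these assumptions the 64-ary configuration needs three off-chip tree levels (4 MB + 64 KB + 1 KB) and the 128-ary configuration needs two (1 MB + 8 KB), which agrees with the depth and storage figures reported for the two split-counter configurations.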
Workloads | VAULT-64 | VAULT-128 | CM-tree |
---|---|---|---|
AlexNet | 1.72 | 20.40 | 2.00 |
DarkNet19 | 2.86 | 31.65 | 3.14 |
DarkNet19_448 | 1.29 | 15.27 | 1.69 |
DarkNet53 | 3.41 | 37.46 | 3.56 |
DarkNet53_448 | 1.18 | 13.22 | 1.47 |
Extraction | 2.71 | 29.58 | 2.82 |
ResNet152 | 2.90 | 32.64 | 3.07 |
ResNet18 | 3.12 | 33.88 | 3.20 |
ResNet50 | 2.52 | 30.11 | 2.84 |
ResNext152 | 2.18 | 24.55 | 2.24 |
ResNext50 | 1.01 | 13.60 | 1.16 |
VGG-16 | 0.36 | 304.18 | 1.68 |
Avg. | 2.10 | 48.88 | 2.41 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kim, J.; Lee, W.; Hong, J.; Kim, S. Efficient Integrity-Tree Structure for Convolutional Neural Networks through Frequent Counter Overflow Prevention in Secure Memories. Sensors 2022, 22, 8762. https://doi.org/10.3390/s22228762