ParaSM2: Enhancing SM2 Cryptographic Performance via Parallel Restructuring of KDF and HASH
Abstract
1. Introduction
- We show mathematically that first-block reuse holds universally in KDF computation, and we reuse the initial compression output globally to eliminate redundant operations.
- We design a dual-path acceleration scheme: SIMD-based multi-message compression parallelism, and a pre-computed sparse-matrix representation of the second-block message expansion, reducing storage overhead by 39.7%.
- We identify a natural 2:1 computational matching pattern between KDF and HASH and propose a dynamic task-bundling scheduler that groups two KDF blocks with one HASH block into atomic units, enabling cross-component parallel execution via SIMD and significantly improving resource utilization.
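The first-block reuse named in the contributions above can be illustrated with a minimal pure-Python sketch. It assumes the KDF input Z is exactly 64 bytes (one full SM3 block, as for x2 || y2 with 256-bit coordinates), that klen is a multiple of 8 bits, and that the counter occupies the first word of the second block. The names `compress`, `kdf_naive`, and `kdf_reuse` are hypothetical, and this is not the authors' implementation; it only shows that compressing the first block once and reusing that state gives the same output as hashing Z || ct from scratch each time.

```python
import struct

MASK = 0xFFFFFFFF
IV = (0x7380166F, 0x4914B2B9, 0x172442D7, 0xDA8A0600,
      0xA96F30BC, 0x163138AA, 0xE38DEE4D, 0xB0FB0E4E)

def rotl(x, n):
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK

def p0(x):
    return x ^ rotl(x, 9) ^ rotl(x, 17)

def p1(x):
    return x ^ rotl(x, 15) ^ rotl(x, 23)

def compress(v, block):
    """One SM3 compression step: 8-word state v, 64-byte block -> new state."""
    w = list(struct.unpack(">16I", block))
    for j in range(16, 68):  # message expansion W16..W67
        w.append(p1(w[j - 16] ^ w[j - 9] ^ rotl(w[j - 3], 15))
                 ^ rotl(w[j - 13], 7) ^ w[j - 6])
    a, b, c, d, e, f, g, h = v
    for j in range(64):
        t = 0x79CC4519 if j < 16 else 0x7A879D8A
        ss1 = rotl((rotl(a, 12) + e + rotl(t, j)) & MASK, 7)
        ss2 = ss1 ^ rotl(a, 12)
        if j < 16:
            ff, gg = a ^ b ^ c, e ^ f ^ g
        else:
            ff = (a & b) | (a & c) | (b & c)
            gg = (e & f) | (~e & g)
        tt1 = (ff + d + ss2 + (w[j] ^ w[j + 4])) & MASK
        tt2 = (gg + h + ss1 + w[j]) & MASK
        a, b, c, d = tt1, a, rotl(b, 9), c
        e, f, g, h = p0(tt2), e, rotl(f, 19), g
    return tuple(x ^ y for x, y in zip((a, b, c, d, e, f, g, h), v))

def sm3(msg):
    """Plain SM3 over a byte string, with standard padding."""
    pad = b"\x80" + b"\x00" * ((55 - len(msg)) % 64)
    padded = msg + pad + struct.pack(">Q", len(msg) * 8)
    v = IV
    for i in range(0, len(padded), 64):
        v = compress(v, padded[i:i + 64])
    return struct.pack(">8I", *v)

def kdf_naive(z, klen_bits):
    """Baseline: hash Z || ct from scratch for every counter value."""
    n = -(-klen_bits // 256)  # ceil(klen / 256)
    out = b"".join(sm3(z + struct.pack(">I", ct)) for ct in range(1, n + 1))
    return out[:klen_bits // 8]

def kdf_reuse(z, klen_bits):
    """Compress the first block (Z) once, then only the second block per counter."""
    assert len(z) == 64, "sketch assumes Z fills exactly one SM3 block"
    mid = compress(IV, z)  # shared by all iterations
    # Second block: ct (4 bytes) || 0x80 || 51 zero bytes || 64-bit length (544 bits)
    tail = b"\x80" + b"\x00" * 51 + struct.pack(">Q", 68 * 8)
    n = -(-klen_bits // 256)
    out = b"".join(struct.pack(">8I", *compress(mid, struct.pack(">I", ct) + tail))
                   for ct in range(1, n + 1))
    return out[:klen_bits // 8]

z = bytes(range(64))  # placeholder input, not a real curve point
print(kdf_naive(z, 1024) == kdf_reuse(z, 1024))
```

In this sketch the baseline performs two compressions per counter value while the reuse variant performs one plus a single shared compression, which is where the saving comes from.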
2. Background
3. Method
3.1. Encryption Optimization of ParaSM2 (A5&A7: Component-Level Parallel Scheduling of KDF and HASH)
3.2. Decryption Optimization of ParaSM2 (B4: Hierarchical Parallel Optimization of KDF)
3.2.1. KDF Computational Bottleneck and Optimization Principle
3.2.2. Optimizing the KDF Iteration over the First Message Block
3.2.3. Optimizing the KDF Iteration over the Second Message Block
- Step 1. Initialize j = 16. Because W0 depends on the value of i, mark W0 in bold wherever it appears in the following steps.
- Step 2. For the current j, list the words that Wj depends on according to the message expansion formula. For example, W16 depends on W0, W3, W7, W10, and W13. Then check whether any of these dependencies is marked in bold. If so, mark Wj in bold as well. For instance, W16 depends on the bold W0, so W16 also depends on the value of i and is marked in bold.
- Step 3. If j has reached 67, stop. Otherwise, increment j by 1 and return to Step 2.
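The marking procedure in Steps 1 to 3 can be sketched as a simple dependency propagation. The dependency offsets (j − 16, j − 13, j − 9, j − 6, j − 3) are read off the W16 example above and match the standard SM3 message expansion. The function name `marked_words` and its parameters are hypothetical, chosen only for this illustration.

```python
def marked_words(seeds=(0,), last=67):
    """Return the indices j whose word Wj depends on any seed word.

    seeds: indices of words that depend on i (W0 in the steps above).
    last:  final expansion index (67 for SM3).
    """
    marked = set(seeds)                      # Step 1: W0 is bold
    for j in range(16, last + 1):            # Step 3: loop j = 16..67
        deps = (j - 16, j - 13, j - 9, j - 6, j - 3)
        if any(d in marked for d in deps):   # Step 2: inherit the bold mark
            marked.add(j)
    return marked

marked = marked_words()
independent = [j for j in range(16, 68) if j not in marked]
print("expansion words that do not depend on i:", independent)
```

Words that stay unmarked do not depend on i, so in a setting where the remaining words of the block are fixed they are candidates for precomputation.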
3.3. Decryption Optimization of ParaSM2 (B6: Vectorized Reconstruction of Hash Message Extension)
4. Experiments
4.1. Types of Graphics
4.2. Theoretical Analysis
5. Experimental Evaluation
5.1. Efficiency Analysis
5.2. Experimental Results and Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest







| r | ⌈klen/256⌉ | ⌈(klen + 65)/512⌉ |
|---|---|---|
| r = 0 | 2t | t + 1 |
| 1 ≤ r ≤ 255 | 2t + 1 | t + 1 |
| 256 ≤ r ≤ 447 | 2t + 2 | t + 1 |
| 448 ≤ r ≤ 511 | 2t + 2 | t + 1 |
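
The two ceiling expressions in the header of the table above can be evaluated directly for a given klen. The sketch below assumes klen is measured in bits; the function name `block_counts` is hypothetical. The first expression is the number of KDF hash invocations defined by the SM2 standard for a 256-bit hash output; the 65 in the second presumably accounts for SM3 padding (one marker bit plus a 64-bit length field).

```python
def block_counts(klen_bits):
    """Evaluate ceil(klen/256) and ceil((klen + 65)/512) for klen in bits."""
    kdf_blocks = -(-klen_bits // 256)            # ceil(klen / 256)
    hash_blocks = -(-(klen_bits + 65) // 512)    # ceil((klen + 65) / 512)
    return kdf_blocks, hash_blocks

# Data sizes in bytes, converted to bits before evaluation.
for size_bytes in (128, 64 * 1024, 1024 * 1024):
    kdf, hsh = block_counts(size_bytes * 8)
    print(f"{size_bytes} bytes: {kdf} KDF blocks, {hsh} HASH blocks, ratio {kdf / hsh:.3f}")
```

For large inputs the ratio approaches 2, which is consistent with the 2:1 matching between KDF and HASH blocks described in the contributions.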

| Wj | Words required to compute Wj | | | | |
|---|---|---|---|---|---|
| W16 | W0 | W3 | W7 | W10 | W13 |
| W17 | W1 | W4 | W8 | W11 | W14 |
| W18 | W2 | W5 | W9 | W12 | W15 |
| W19 | W3 | W6 | W10 | W13 | W16 |
| W20 | W4 | W7 | W11 | W14 | W17 |
| W21 | W5 | W8 | W12 | W15 | W18 |
| W22 | W6 | W9 | W13 | W16 | W19 |
| W23 | W7 | W10 | W14 | W17 | W20 |
| W24 | W8 | W11 | W15 | W18 | W21 |
| W25 | W9 | W12 | W16 | W19 | W22 |
| W26 | W10 | W13 | W17 | W20 | W23 |
| W27 | W11 | W14 | W18 | W21 | W24 |
| W28 | W12 | W15 | W19 | W22 | W25 |
| W29 | W13 | W16 | W20 | W23 | W26 |
| W30 | W14 | W17 | W21 | W24 | W27 |
| W31 | W15 | W18 | W22 | W25 | W28 |
| W32 | W16 | W19 | W23 | W26 | W29 |
| W33 | W17 | W20 | W24 | W27 | W30 |
| W34 | W18 | W21 | W25 | W28 | W31 |
| W35 | W19 | W22 | W26 | W29 | W32 |
| W36 | W20 | W23 | W27 | W30 | W33 |
| W37 | W21 | W24 | W28 | W31 | W34 |
| W38 | W22 | W25 | W29 | W32 | W35 |
| W39 | W23 | W26 | W30 | W33 | W36 |
| W40 | W24 | W27 | W31 | W34 | W37 |
| W41 | W25 | W28 | W32 | W35 | W38 |
| W42 | W26 | W29 | W33 | W36 | W39 |
| W43 | W27 | W30 | W34 | W37 | W40 |
| W44 | W28 | W31 | W35 | W38 | W41 |
| W45 | W29 | W32 | W36 | W39 | W42 |
| W46 | W30 | W33 | W37 | W40 | W43 |
| W47 | W31 | W34 | W38 | W41 | W44 |
| W48 | W32 | W35 | W39 | W42 | W45 |
| W49 | W33 | W36 | W40 | W43 | W46 |
| W50 | W34 | W37 | W41 | W44 | W47 |
| W51 | W35 | W38 | W42 | W45 | W48 |
| W52 | W36 | W39 | W43 | W46 | W49 |
| W53 | W37 | W40 | W44 | W47 | W50 |
| W54 | W38 | W41 | W45 | W48 | W51 |
| W55 | W39 | W42 | W46 | W49 | W52 |
| W56 | W40 | W43 | W47 | W50 | W53 |
| W57 | W41 | W44 | W48 | W51 | W54 |
| W58 | W42 | W45 | W49 | W52 | W55 |
| W59 | W43 | W46 | W50 | W53 | W56 |
| W60 | W44 | W47 | W51 | W54 | W57 |
| W61 | W45 | W48 | W52 | W55 | W58 |
| W62 | W46 | W49 | W53 | W56 | W59 |
| W63 | W47 | W50 | W54 | W57 | W60 |
| W64 | W48 | W51 | W55 | W58 | W61 |
| W65 | W49 | W52 | W56 | W59 | W62 |
| W66 | W50 | W53 | W57 | W60 | W63 |
| W67 | W51 | W54 | W58 | W61 | W64 |

| Scheme (end-to-end delay in ms, by data size) | 64 KB | 256 KB | 1 MB | 4 MB | 16 MB | 64 MB |
|---|---|---|---|---|---|---|
| SM2 Digital Envelope@x86 | 1.26 | 2.923 | 9.136 | 33.649 | 131.882 | 518.54 |
| ParaSM2@x86 | 0.946 | 1.844 | 5.403 | 19.779 | 77.049 | 301.14 |
| SM2 Digital Envelope@ARM | 1.141 | 2.476 | 8.857 | 31.623 | 123.959 | 482.85 |
| ParaSM2@ARM | 0.869 | 1.631 | 4.892 | 17.596 | 69.551 | 271.08 |

| Scheme (end-to-end delay in ms, by data size) | 64 KB | 256 KB | 1 MB | 4 MB | 16 MB | 64 MB |
|---|---|---|---|---|---|---|
| SM2 Digital Envelope@x86 | 0.962 | 2.715 | 8.023 | 30.776 | 124.829 | 511.93 |
| ParaSM2@x86 | 0.676 | 1.689 | 5.87 | 22.55 | 88.42 | 353.68 |
| SM2 Digital Envelope@ARM | 0.916 | 2.501 | 7.365 | 28.621 | 118.019 | 472.09 |
| ParaSM2@ARM | 0.695 | 1.818 | 5.898 | 22.532 | 88.953 | 351.21 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Kang, H.; Guo, B.; Sun, Y.; Zhao, M.; Chen, X.; Ye, K. ParaSM2: Enhancing SM2 Cryptographic Performance via Parallel Restructuring of KDF and HASH. Cryptography 2026, 10, 42. https://doi.org/10.3390/cryptography10030042

