Accelerating Post-Quantum Cryptography: A High-Efficiency NTT for ML-KEM on RISC-V
Abstract
1. Introduction
- Performance: The accelerator provides a speedup of up to 14.51× for NTT and 16.75× for inverse NTT operations compared to other RISC-V platforms.
- Efficiency: We achieved speedup efficiencies of 56.5%, 50.9%, and 45.4% for security levels I, III, and V, respectively. This performance comes with a minimal area overhead of 8.7%.
- ASIC Fabrication: The complete chip was fabricated in 180 nm CMOS technology with a total area of 297 k gate equivalents (GE). It consumes as little as 5.913 μW at an operating frequency of 10 kHz and a VDD of 0.9 V, and the SoC reaches a maximum frequency of 118 MHz at a supply voltage of 2.0 V.
2. Background
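
As a point of reference for the NTT and inverse NTT operations that the accelerator targets, the following is a minimal software sketch of the transform pair defined in FIPS 203 [1] (q = 3329, ζ = 17, seven butterfly layers over 256 coefficients). It is a plain Python model for checking results, not the hardware architecture of this work; the names `ntt`, `intt`, `bitrev7`, and `ZETAS` are illustrative assumptions. The forward transform uses Cooley-Tukey butterflies [14] and the inverse uses Gentleman-Sande butterflies [15].

```python
Q = 3329          # ML-KEM modulus
ZETA = 17         # primitive 256th root of unity modulo Q
INV_128 = 3303    # 128^-1 mod Q, the final scaling factor of the inverse NTT


def bitrev7(i):
    """Reverse the 7-bit binary representation of i."""
    return int(f"{i:07b}"[::-1], 2)


# Twiddle factors in bit-reversed order, as indexed by the FIPS 203 loops.
ZETAS = [pow(ZETA, bitrev7(i), Q) for i in range(128)]


def ntt(f):
    """Forward NTT of 256 coefficients in [0, Q) (Cooley-Tukey butterflies)."""
    f = list(f)
    i = 1
    length = 128
    while length >= 2:
        for start in range(0, 256, 2 * length):
            zeta = ZETAS[i]
            i += 1
            for j in range(start, start + length):
                t = zeta * f[j + length] % Q
                f[j + length] = (f[j] - t) % Q
                f[j] = (f[j] + t) % Q
        length //= 2
    return f


def intt(f_hat):
    """Inverse NTT (Gentleman-Sande butterflies), including the 1/128 scaling."""
    f = list(f_hat)
    i = 127
    length = 2
    while length <= 128:
        for start in range(0, 256, 2 * length):
            zeta = ZETAS[i]
            i -= 1
            for j in range(start, start + length):
                t = f[j]
                f[j] = (t + f[j + length]) % Q
                f[j + length] = zeta * (f[j + length] - t) % Q
        length *= 2
    return [x * INV_128 % Q for x in f]
```

A round trip such as `intt(ntt(f)) == f` for any list `f` of 256 values in [0, 3329) is a convenient sanity check when validating accelerator output against software.
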
3. Proposed Architecture
3.1. Rocket RISC-V System-on-Chip
3.2. NTT Accelerator Architecture
4. Implementation Results and Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. National Institute of Standards and Technology. Module-Lattice-Based Key-Encapsulation Mechanism Standard; Federal Information Processing Standards Publication NIST FIPS 203; Department of Commerce: Washington, DC, USA, 2024.
2. National Institute of Standards and Technology. Module-Lattice-Based Digital Signature Standard; Federal Information Processing Standards Publication NIST FIPS 204; Department of Commerce: Washington, DC, USA, 2024.
3. Fouque, P.-A.; Hoffstein, J.; Kirchner, P.; Lyubashevsky, V.; Pornin, T.; Prest, T.; Ricosset, T.; Seiler, G.; Whyte, W.; Zhang, Z. Falcon: Fast-Fourier Lattice-Based Compact Signatures Over NTRU (v1.2). NIST PQC Round 2020, 1–67.
4. Aumasson, J.-P.; Bernstein, D.J.; Beullens, W.; Dobraunig, C.; Eichlseder, M.; Fluhrer, S.; Gazdag, S.-L.; Hülsing, A.; Kampanakis, P.; Kölbl, S.; et al. SPHINCS+: Submission to the 3rd Round of the NIST Post-Quantum Project, v3.1. NIST PQC Round 2022, 1–63.
5. Nguyen, T.-H.; Dam, D.-T.; Duong, P.-P.; Kieu-Do-Nguyen, B.; Pham, C.-K.; Hoang, T.-T. Efficient Hardware Implementation of the Lightweight CRYSTALS-Kyber. IEEE Trans. Circ. Syst. I Regul. Pap. 2025, 72, 610–622.
6. Cui, Y.; Chen, J.; Ni, Z.; Zhang, Z.; Wang, C.; Liu, W. Instruction-Based High-Performance Hardware Controller of CRYSTALS-Kyber With Balanced Resource Utilization. IEEE Trans. Circ. Syst. I Regul. Pap. 2025, 72, 2394–2407.
7. Kim, H.; Jung, H.; Satriawan, A.; Lee, H. A Configurable ML-KEM/Kyber Key-Encapsulation Hardware Accelerator Architecture. IEEE Trans. Circ. Syst. II Express Briefs 2024, 71, 4678–4682.
8. Nguyen, T.-H.; Dang, T.-K.; Dam, D.-T.; Nguyen, K.-D.; Duong, P.-P.; Pham, C.-K.; Hoang, T.-T. An Area-Time Efficient Hardware Architecture for ML-KEM Post-Quantum Cryptography Standard. IEEE Access 2025, 13, 103834–103847.
9. Gewehr, C.; Luza, L.; Moraes, F.G. Hardware Acceleration of Crystals-Kyber in Low-Complexity Embedded Systems With RISC-V Instruction Set Extensions. IEEE Access 2024, 12, 94477–94495.
10. Wang, T.; Zhang, C.; Zhang, X.; Gu, D.; Cao, P. Optimized Hardware-Software Co-Design for Kyber and Dilithium on RISC-V SoC FPGA. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2024, 2024, 99–135.
11. Ye, Z.; Song, R.; Zhang, H.; Chen, D.; Cheung, R.C.-C.; Huang, K. A Highly-efficient Lattice-based Post-Quantum Cryptography Processor for IoT Applications. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2024, 2024, 130–153.
12. Dam, D.-T.; Nguyen, T.-H.; Kieu-Do-Nguyen, B.; Hoang, T.-T.; Pham, C.-K. RISC-V SoC with NTT-Blackbox for CRYSTALS-Kyber Post-Quantum Cryptography. In Proceedings of the 9th International Conference on Integrated Circuits, Design, and Verification, Hanoi, Vietnam, 6–8 June 2024; pp. 49–54.
13. Dam, D.-T.; Tran, T.-H.; Hoang, V.-P.; Pham, C.-K.; Hoang, T.-T. A Survey of Post-Quantum Cryptography: Start of a New Race. Cryptography 2023, 7, 40.
14. Cooley, J.W.; Tukey, J.W. An Algorithm for the Machine Calculation of Complex Fourier Series. Math. Comput. 1965, 19, 297–301.
15. Gentleman, W.M.; Sande, G. Fast Fourier Transforms: For Fun and Profit. In Proceedings of the Fall Joint Computer Conference (AFIPS), San Francisco, CA, USA, 7–10 November 1966; pp. 563–578.
16. Chipyard. Rocketchip–Version: Stable. 2024. Available online: https://chipyard.readthedocs.io/en/stable/Generators/Rocket-Chip.html (accessed on 1 December 2024).
17. Barrett, P. Implementing the Rivest Shamir and Adleman Public Key Encryption Algorithm on a Standard Digital Signal Processor. In Proceedings of the Annual International Cryptology Conference (CRYPTO), Santa Barbara, CA, USA, 17–21 August 1986; pp. 311–323.
18. Di Matteo, S.; Sarno, I.; Saponara, S. CRYPHTOR: A Memory-Unified NTT-Based Hardware Accelerator for Post-Quantum CRYSTALS Algorithms. IEEE Access 2024, 12, 25501–25511.
19. Sun, J.; Bai, X. A High-Speed Hardware Architecture of NTT Accelerator for CRYSTALS-Kyber. Integr. Circuits Syst. 2024, 1, 92–102.
20. Liu, S.-H.; Kuo, C.-Y.; Mo, Y.-N.; Su, T. An Area-Efficient, Conflict-Free, and Configurable Architecture for Accelerating NTT/INTT. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2023, 32, 519–529.
21. Miteloudi, K.; Bos, J.; Bronchain, O.; Fay, B.; Renes, J. PQ.V.ALU.E: Post-quantum RISC-V Custom ALU Extensions on Dilithium and Kyber. In Proceedings of the International Conference on Smart Card Research and Advanced Applications, Amsterdam, The Netherlands, 14–16 November 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 190–209.
22. Huang, J.; Zhao, H.; Zhang, J.; Dai, W.; Zhou, L.; Cheung, R.C.C.; Koç, Ç.K.; Chen, D. Yet another Improvement of Plantard Arithmetic for Faster Kyber on Low-end 32-bit IoT Devices. IEEE Trans. Inf. Forensics Secur. 2024, 19, 3800–3813.
23. Fritzmann, T.; Sigl, G.; Sepúlveda, J. RISQ-V: Tightly Coupled RISC-V Accelerators for Post-quantum Cryptography. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2020, 239–280.
24. Dam, D.-T.; Nguyen, T.-H.; Tran, T.-H.; Le, D.-H.; Hoang, T.-T.; Pham, C.-K. High-Efficiency Multi-Standard Polynomial Multiplication Accelerator on RISC-V SoC for Post-Quantum Cryptography. IEEE Access 2024, 12, 195015–195031.
25. Dolmeta, A.; Valpreda, E.; Martina, M.; Masera, G. Implementation and integration of NTT/INTT accelerator on RISC-V for CRYSTALS-Kyber. In Proceedings of the ACM International Conference on Computing Frontiers: Workshops and Special Sessions, Ischia, Italy, 7–9 May 2024; pp. 59–62.
26. Alkim, E.; Evkan, H.; Lahr, N.; Niederhagen, R.; Petri, R. ISA Extensions for Finite Field Arithmetic Accelerating Kyber and NewHope on RISC-V. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2020, 2020, 219–242.
27. Ji, X.; Dong, J.; Huang, J.; Yuan, Z.; Dai, W.; Xiao, F.; Lin, J. ECO-CRYSTALS: Efficient Cryptography CRYSTALS on Standard RISC-V ISA. IEEE Trans. Comput. 2025, 74, 401–413.
28. Li, L.; Qin, G.; Yu, Y.; Wang, W. Compact Instruction Set Extensions for Kyber. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2024, 43, 756–760.
29. PQ-Crystals. Kyber: Post-Quantum Key-Encapsulation Library. 2025. Available online: https://github.com/pq-crystals/kyber (accessed on 8 December 2025).
30. Nannipieri, P.; Di Matteo, S.; Zulberti, L.; Albicocchi, F.; Saponara, S.; Fanucci, L. A RISC-V Post Quantum Cryptography Instruction Set Extension for Number Theoretic Transform to Speed-up CRYSTALS Algorithms. IEEE Access 2021, 9, 150798–150808.
31. Stillmaker, A.; Baas, B. Scaling equations for the accurate prediction of CMOS device performance from 180 nm to 7 nm. Integration 2017, 58, 74–81.

| Component | LUTs | FFs | Slices | BRAMs | DSPs |
|---|---|---|---|---|---|
| SoC | 17,010 | 8944 | 5315 | 10 | 12 |
| Bus, Peripheral | 9800 | 5429 | 2531 | 0 | 0 |
| ROM | 321 | 0 | 116 | 0 | 0 |
| RAM | 96 | 149 | 59 | 4 | 0 |
| Rocket CPU | | | | | |
| - Core | 3800 | 1697 | 1382 | 0 | 10 |
| - NTT | 1314 | 1012 | 548 | 1 | 2 |
| - D-Cache | 1562 | 559 | 632 | 4 | 0 |
| - I-Cache | 117 | 98 | 47 | 1 | 0 |

| Works | Platform | NTT (CCs) | NTT Latency | INTT (CCs) | INTT Latency | Area Overhead |
|---|---|---|---|---|---|---|
| Ours | Rocket | 1514 | 1× | 1413 | 1× | 8.7% |
| [21] | RI5CY | 2577 | ↑ 1.70× | 3851 | ↑ 2.72× | 16% |
| [22] | M3 | 8026 | ↑ 5.30× | 8594 | ↑ 6.08× | - |
| | E310 | 15,888 | ↑ 10.49× | 15,719 | ↑ 11.12× | - |
| | PQRISCV | 21,975 | ↑ 14.51× | 23,666 | ↑ 16.75× | - |
| [23] | RISQ-V | 1935 | ↑ 1.28× | 1930 | ↑ 1.37× | 60% |
| [24] | Rocket | 4156 | ↑ 2.75× | 4172 | ↑ 2.95× | 12.93% |
| [25] | X-HEEP | 1531 | ↑ 1.01× | 1531 | ↑ 1.08× | 32.64% |
| [26] | Vex-Riscv | 6868 | ↑ 4.54× | 6367 | ↑ 4.51× | 6% |
| [27] | SiFive U74 | 8845 | ↑ 5.84× | 10,262 | ↑ 7.26× | - |
| | | 5700 | ↑ 3.76× | 5618 | ↑ 3.98× | - |
| [28] | E203 | 4302 | ↑ 2.84× | 3426 | ↑ 2.42× | 4.3% |
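
The latency ratios in the comparison above can be reproduced from the cycle counts alone: each ratio is the other design's clock cycles divided by this work's (1514 for NTT, 1413 for INTT). The short sketch below recomputes a few rows; the dictionary layout and function names are illustrative assumptions, and the comparison covers cycle counts only, not clock frequency.

```python
# Cycle counts (CCs) copied from the comparison table, as (NTT, INTT).
OURS = (1514, 1413)
OTHERS = {
    "RI5CY [21]": (2577, 3851),
    "PQRISCV [22]": (21975, 23666),
    "RISQ-V [23]": (1935, 1930),
    "X-HEEP [25]": (1531, 1531),
}


def latency_ratio(other_cycles, our_cycles):
    """Number of times more clock cycles the other design needs."""
    return other_cycles / our_cycles


def ratios():
    """Return {design: (NTT ratio, INTT ratio)}, rounded to two decimals."""
    return {
        name: (
            round(latency_ratio(ntt_cc, OURS[0]), 2),
            round(latency_ratio(intt_cc, OURS[1]), 2),
        )
        for name, (ntt_cc, intt_cc) in OTHERS.items()
    }


if __name__ == "__main__":
    for name, (r_ntt, r_intt) in ratios().items():
        print(f"{name}: NTT {r_ntt}x, INTT {r_intt}x")
```

The PQRISCV row yields the largest ratios, which correspond to the "up to" speedups quoted for NTT and inverse NTT.
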
| Works | Platform | Process | Power (mW) | Energy (nJ/cycle) | VDD (V) | Frequency (MHz) | Area (mm²) |
|---|---|---|---|---|---|---|---|
| This work | ASIC | 180 nm | 307.9 | 2.609 | 2 | 118 | 3.8025 |
| | | 65 nm * | 4.471 | 0.175 | 1.2 | 546 | 0.3169 |
| | | 32 nm * | 0.461 | 0.038 | 0.9 | 1150 | 0.0951 |
| RISQ-V [23] | FPGA | 65 nm | 2.57 | 0.257 | 1.2 | 10 | 0.1432 |
| X-HEEP [25] | FPGA | 65 nm | - | - | - | - | 0.5067 |
| E203 [28] | FPGA | 55 nm | 0.309 | 0.009 | - | 32.9 | - |
| RI5CY [21] | FPGA | 28 nm | - | - | - | 100 | 0.0158 |
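
For the rows that report both a power and a clock frequency without a scaling mark, the energy-per-cycle column matches power divided by frequency, since mW / MHz = nJ per cycle. The sketch below checks this for three such rows; the function name and the row labels are illustrative assumptions, and the rows marked * are left out because their derivation is not given in this table.

```python
def energy_per_cycle_nj(power_mw, freq_mhz):
    """Energy per clock cycle in nJ.

    mW / MHz = (1e-3 J/s) / (1e6 cycles/s) = 1e-9 J/cycle = nJ/cycle.
    """
    return power_mw / freq_mhz


# (power in mW, frequency in MHz) copied from the table.
ROWS = {
    "This work, 180 nm": (307.9, 118.0),
    "RISQ-V": (2.57, 10.0),
    "E203": (0.309, 32.9),
}

if __name__ == "__main__":
    for name, (p_mw, f_mhz) in ROWS.items():
        print(f"{name}: {energy_per_cycle_nj(p_mw, f_mhz):.3f} nJ/cycle")
```
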
| Metric | Works | KeyGen (Level I) | Encaps (Level I) | Decaps (Level I) | KeyGen (Level III) | Encaps (Level III) | Decaps (Level III) | KeyGen (Level V) | Encaps (Level V) | Decaps (Level V) |
|---|---|---|---|---|---|---|---|---|---|---|
| Energy (μJ) | This work | 1894.13 | 2841.20 | 3253.42 | 3699.56 | 4717.07 | 5494.55 | 6394.66 | 7860.92 | 8672.32 |
| | This work * | 127.05 | 190.58 | 218.23 | 248.15 | 316.40 | 368.55 | 428.93 | 527.28 | 581.70 |
| | RISQ-V [23] | 494.98 | 685.93 | 526.34 | 797.73 | 968.38 | 877.14 | 1256.73 | 1465.93 | 1350.28 |
| | E203 [28] | 5.60 | 7.07 | 6.42 | 8.89 | 11.13 | 10.20 | 13.89 | 16.66 | 15.47 |
| Throughput (Op/s) | This work | 163 | 108 | 95 | 83 | 65 | 56 | 48 | 39 | 35 |
| | This work * | 752 | 501 | 438 | 385 | 302 | 259 | 223 | 181 | 164 |
| | RISQ-V [23] | 5 | 4 | 5 | 3 | 3 | 3 | 2 | 2 | 2 |
| | E203 [28] | 53 | 42 | 46 | 33 | 27 | 29 | 21 | 18 | 19 |
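
The "This work" energy figures appear consistent with dividing the 180 nm chip power (307.9 mW) by the reported throughput. This relation is inferred from the numbers rather than stated here, so the sketch below is a plausibility check under that assumption, with a hypothetical function name; small deviations are expected because the throughput is listed in whole operations per second.

```python
def energy_per_op_uj(power_mw, ops_per_s):
    """Energy per operation in μJ: mW / (op/s) gives mJ/op, then scale by 1000."""
    return power_mw / ops_per_s * 1000.0


POWER_MW = 307.9  # 180 nm power from the ASIC comparison table

# First three columns of the "This work" rows: (throughput in Op/s, energy in μJ).
REPORTED = {
    "KeyGen": (163, 1894.13),
    "Encaps": (108, 2841.20),
    "Decaps": (95, 3253.42),
}

if __name__ == "__main__":
    for op, (tput, reported_uj) in REPORTED.items():
        est = energy_per_op_uj(POWER_MW, tput)
        print(f"{op}: estimated {est:.1f} μJ, reported {reported_uj:.2f} μJ")
```
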
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Dam, D.-T.; Nguyen, K.-D.; Le, D.-H.; Pham, C.-K. Accelerating Post-Quantum Cryptography: A High-Efficiency NTT for ML-KEM on RISC-V. Electronics 2026, 15, 100. https://doi.org/10.3390/electronics15010100

