Secure Implementation of RISC-V’s Scalar Cryptography Extension Set
Abstract
1. Introduction
- The proposal of an optimized unsecure AES module through shared Sbox logic (i.e., logic optimized both for encryption and decryption) for RISC-V’s Zkne/Zknd extensions.
- The proposal of a DOM-protected AES module against SCAs for RISC-V’s Zkne/Zknd extensions with minimal area overhead.
- Assembly-level optimizations for partial-round operations and key scheduling to realize zero execution overhead.
- Empirical security validation using evaluation-style (i.e., CPA and TPA) and conformance-style (i.e., Test Vector Leakage Assessment (TVLA) and Signal-to-Noise Ratio (SNR)) testing to ensure compliance with NIST’s SCA resilience guidelines [12].
- Comprehensive evaluation of area and power overhead and comparisons with the state of the art.
2. Related Work
3. Background
3.1. AES Algorithm
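- Key expansion: Let $N_r$ be the number of rounds (10, 12, or 14 for 128/192/256-bit keys) and let $N_k$ denote the key size in 32-bit words (i.e., $N_k$ = 4, 6, or 8). The Key expansion module generates 32-bit partial round keys from the original cipher key. An example is shown in Figure 2 for a 128-bit key. The expanded key array $W$ is constructed as follows:
- Initialization: Initialize the first $N_k$ words with the cipher key: $W[i] = K[i]$ for $0 \le i < N_k$, where $K[i]$ is the $i$-th 32-bit word of the cipher key.
- Iterative Expansion: For words $N_k \le i < 4(N_r + 1)$, compute $W[i] = W[i - N_k] \oplus T_i$, where $T_i = \mathrm{Sbox}(\mathrm{SLR}(W[i-1])) \oplus rc[i/N_k]$ if $i \bmod N_k = 0$, $T_i = \mathrm{Sbox}(W[i-1])$ if $N_k = 8$ and $i \bmod N_k = 4$, and $T_i = W[i-1]$ otherwise. Here, the round constant $rc[i/N_k]$ is XORed into the first byte of the word, and:
  - Sbox(x) applies the AES Sbox substitution to byte x; on a 32-bit word it is applied to each of the four bytes.
  - SLR(x) performs a byte circular left shift on a 32-bit word. For example, SLR($[b_0, b_1, b_2, b_3]$) = $[b_1, b_2, b_3, b_0]$.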
- AddRoundKey: This module performs a bitwise XOR between the state array and the round key obtained from the Key expansion module.
- SubBytes: This module (also referred to as Sbox) is the only non-linear module in AES and plays a crucial role in thwarting differential and linear cryptanalysis [11]. Each byte in the state array is substituted with another byte. The substitution is based on a multiplicative inversion in the Galois Field GF($2^8$) combined with an affine transformation [22].
- ShiftRows (SR): In this module, the state matrix is updated by cyclically shifting the second, third, and fourth rows by one, two, and three bytes to the left, respectively.
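- MixColumns (MC): This module performs a modular polynomial multiplication in the Galois Field GF($2^8$) on each column of the state array, as given in Equation (2). Given a column $s = (s_0, s_1, s_2, s_3)^T$ of the state matrix, MC computes the following matrix-vector product:
  $$\begin{bmatrix} s'_0 \\ s'_1 \\ s'_2 \\ s'_3 \end{bmatrix} = \begin{bmatrix} 2 & 3 & 1 & 1 \\ 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 3 \\ 3 & 1 & 1 & 2 \end{bmatrix} \begin{bmatrix} s_0 \\ s_1 \\ s_2 \\ s_3 \end{bmatrix} \quad (2)$$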
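The key expansion steps listed above can be illustrated with a minimal Python sketch. It follows the standard AES key schedule with big-endian 32-bit words; the function names (`expand_key`, `sub_word`, `slr`) and the on-the-fly Sbox computation are illustrative assumptions and are not taken from the proposed hardware.

```python
def xtime2(x):
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    x <<= 1
    return (x ^ 0x11B) if x & 0x100 else x


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime2(a)
        b >>= 1
    return result


def sbox(x):
    """AES Sbox: multiplicative inverse in GF(2^8), then the affine map."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
            inv = gf_mul(inv, x)
    out = inv
    for shift in range(1, 5):
        out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return out ^ 0x63


# Round constants rc[1..10]
RC = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def sub_word(w):
    """Apply the Sbox to each byte of a 32-bit word."""
    return int.from_bytes(bytes(sbox(b) for b in w.to_bytes(4, "big")), "big")


def slr(w):
    """Byte circular left shift of a 32-bit word."""
    return ((w << 8) | (w >> 24)) & 0xFFFFFFFF


def expand_key(key):
    """Return the expanded key as a list of 4 * (Nr + 1) words."""
    nk = len(key) // 4          # key size in 32-bit words: 4, 6, or 8
    nr = nk + 6                 # number of rounds: 10, 12, or 14
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(nk)]
    for i in range(nk, 4 * (nr + 1)):
        t = w[i - 1]
        if i % nk == 0:
            t = sub_word(slr(t)) ^ (RC[i // nk - 1] << 24)
        elif nk == 8 and i % nk == 4:
            t = sub_word(t)
        w.append(w[i - nk] ^ t)
    return w


words = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(f"{len(words)} words, W[4] = 0x{words[4]:08x}")
```

The example key is the 128-bit test key from the AES standard; computing the Sbox algebraically keeps the sketch short, whereas a table lookup would be the usual software choice.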
3.2. AES RISC-V Zkne/Zknd Instruction Set
- In the first instruction, where bs = 0, the hardware selects byte 0x6c from a4, applies the Sbox and the partial MixColumns, and then XORs the result with t0 (which contains the partial round key) to yield 0xe7aeaa00.
- In the second instruction, where bs = 1, the state input 0x1b, stored in the second byte of a5, passes through the Sbox and MC. Because the four columns of the MC matrix are byte rotations of one another, the same MC computation as for bs = 0 is reused; its output is rotated by one byte and then XORed with the partial result of the first instruction held in t0. This works because the second column of the MC matrix equals the first column rotated by one byte, as can be seen in Equation (2). The same observation applies to the remaining MC matrix columns.
- In the third instruction, where bs = 2, the state input 0xe8, which is the third byte of a6, is processed in the same way, and the MC result is rotated by two bytes.
- Finally, when bs = 3, byte 0x45 is processed, and the MC result is rotated by three bytes.
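The four steps above can be traced with a small behavioural model of aes32esmi. This is a sketch of the instruction semantics as described in this section, not the RTL of the proposed design. The initial t0 value 0x17fefaa0 is taken from the worked example, and the register contents assume that all bytes other than the selected one are zero, which is an assumption made only for illustration.

```python
def xtime2(x):
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    x <<= 1
    return (x ^ 0x11B) if x & 0x100 else x


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime2(a)
        b >>= 1
    return result


def sbox(x):
    """AES Sbox: multiplicative inverse in GF(2^8), then the affine map."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
            inv = gf_mul(inv, x)
    out = inv
    for shift in range(1, 5):
        out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return out ^ 0x63


def rol32(x, n):
    """Rotate a 32-bit word left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def aes32esmi(rs1, rs2, bs):
    """Model of one aes32esmi step: Sbox, partial MixColumns, rotate, XOR."""
    s = sbox((rs2 >> (8 * bs)) & 0xFF)
    # First column of the MC matrix (2, 1, 1, 3), packed with byte 0 lowest.
    mixed = (gf_mul(s, 3) << 24) | (s << 16) | (s << 8) | gf_mul(s, 2)
    return rs1 ^ rol32(mixed, 8 * bs)


# Assumed register contents: only the selected byte is non-zero.
a4, a5, a6, a7 = 0x0000006C, 0x00001B00, 0x00E80000, 0x45000000

t0 = 0x17FEFAA0  # partial round key from the worked example
for bs, reg in enumerate((a4, a5, a6, a7)):
    t0 = aes32esmi(t0, reg, bs)
    print(f"bs={bs}: t0 = 0x{t0:08x}")
```

The model reuses a single partial MixColumns computation for every bs value and only changes the rotation amount, which mirrors the column-rotation argument made above.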
3.3. Power Side-Channel Attacks (SCAs)
4. Secure Scalar Cryptography Extension
4.1. Motivation
4.2. Design and Implementation of Proposed AES Optimization
- Shared Forward and Inverse Sbox: The original design [21] instantiates the forward and inverse Sboxes separately and uses a multiplexer to select between their outputs (see Figure 6). In contrast, our optimized AES Sbox (Figure 7) uses linear top and bottom layers to apply either the forward or the inverse affine transform while sharing the non-linear middle layer between encryption and decryption [25]. The shared non-linear middle layer in Equation (9) computes the multiplicative inverse in GF($2^8$) modulo the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$. Overall, the shared Sbox output is obtained by applying the selected bottom layer to the output of this shared middle layer.
- MixColumns Optimization: The reduced GF multiplication logic is shared within the MixColumns/InvMixColumns block, as shown in Figure 7, which reduces the hardware needed for both operations. The MixColumns operation in AES, which operates on the state after ShiftRows, requires finite-field multiplications by fixed constants (e.g., 2, 3, 9, 11). The original design in [21] computes these products dynamically using nested calls to the xtime2 function [26], which multiplies a byte by 2 in GF($2^8$): $\mathrm{xtime2}(x) = (x \ll 1) \oplus \mathrm{0x1B}$ if the most significant bit of $x$ is set and $\mathrm{xtime2}(x) = x \ll 1$ otherwise, with the result truncated to 8 bits. For some of these constants, the original code evaluates the same nested xtime2 term more than once. The optimized version computes the shared intermediate term once and reuses it, which saves one xtime2() operation.
- Byte Rotation Optimization: The original rotation logic uses explicit bitwise operations for each byte_sel value, which are synthesized into a 4:1 multiplexer. The optimized version uses a barrel-shifter-like structure controlled by bs, the index of the selected byte.
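The two datapath optimizations above can be sketched in software. The example below assumes the constant 11 to illustrate the reuse of an intermediate xtime2 term; which constant the original equations use is not stated here, so this choice, like the function names, is an assumption. The rotation functions contrast an enumerated 4:1 selection with a single shift-based rotation by 8·bs bits.

```python
def xtime2(x):
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    x <<= 1
    return (x ^ 0x11B) if x & 0x100 else x


def mul11_nested(x):
    """11*x = 8*x ^ 2*x ^ x, re-deriving 2*x inside 8*x (4 xtime2 calls)."""
    return xtime2(xtime2(xtime2(x))) ^ xtime2(x) ^ x


def mul11_shared(x):
    """Same product, computing 2*x once and reusing it (3 xtime2 calls)."""
    x2 = xtime2(x)
    return xtime2(xtime2(x2)) ^ x2 ^ x


def rotate_mux(word, bs):
    """Rotation written as four explicit cases, like a 4:1 multiplexer."""
    cases = (
        word,
        ((word << 8) | (word >> 24)) & 0xFFFFFFFF,
        ((word << 16) | (word >> 16)) & 0xFFFFFFFF,
        ((word << 24) | (word >> 8)) & 0xFFFFFFFF,
    )
    return cases[bs]


def rotate_barrel(word, bs):
    """Rotation written as one left rotate by 8 * bs bits."""
    shift = 8 * bs
    return ((word << shift) | (word >> (32 - shift))) & 0xFFFFFFFF


print(hex(mul11_shared(0x57)), hex(rotate_barrel(0x11223344, 1)))
```

Both variants of each function are functionally identical; the difference only matters in hardware, where the shared form needs fewer xtime2 instances.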
4.3. Design and Implementation of Proposed DOM Countermeasure
4.4. Integrating Zkne/Zknd Scalar Cryptography into the CV32E40S Pipeline
5. Experimental Results
5.1. Experimental Setup
5.2. Security Evaluation
5.3. Performance Evaluation
5.4. AES Assembly Optimization
Listing 1. Conventional MR.
Listing 2. Optimized MR.
Listing 3. Key Expansion logic.
Listing 4. Complete AES Round with Key Scheduling.
6. Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. IHS Markit. The Internet of Things: A Movement, Not a Market. 2017. Available online: https://cdn.ihs.com/www/pdf/IoT_ebook.pdf (accessed on 1 June 2025).
2. National Institute of Standards and Technology. The NIST Cybersecurity Framework (CSF) 2.0; NIST Cybersecurity White Paper (CSWP) NIST CSWP 29; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2024.
3. National Institute of Standards and Technology. Recommendation for Key Management—Part 1: General; Technical Report NIST SP 800-57 Part 1 Revision 5, NIST; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2020.
4. Kocher, P.C.; Jaffe, J.; Jun, B. Differential Power Analysis. In Proceedings of the Advances in Cryptology—CRYPTO ’99, 19th Annual International Cryptology Conference, Santa Barbara, CA, USA, 15–19 August 1999; Wiener, M.J., Ed.; Proceedings; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 1999; Volume 1666, pp. 388–397.
5. Brier, E.; Clavier, C.; Olivier, F. Correlation Power Analysis with a Leakage Model. In Proceedings of the Cryptographic Hardware and Embedded Systems—CHES 2004: 6th International Workshop, Cambridge, MA, USA, 11–13 August 2004; Joye, M., Quisquater, J., Eds.; Proceedings; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2004; Volume 3156, pp. 16–29.
6. Chari, S.; Rao, J.R.; Rohatgi, P. Template Attacks. In Proceedings of the Cryptographic Hardware and Embedded Systems—CHES 2002, 4th International Workshop, Redwood Shores, CA, USA, 13–15 August 2002; Kaliski, B.S., Koç, Ç.K., Paar, C., Eds.; Revised Papers; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2002; Volume 2523, pp. 13–28.
7. F, M.A.K.; Ganesan, V.; Bodduna, R.; Rebeiro, C. PARAM: A Microprocessor Hardened for Power Side-Channel Attack Resistance. In Proceedings of the 2020 IEEE International Symposium on Hardware Oriented Security and Trust, HOST 2020, San Jose, CA, USA, 7–11 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 23–34.
8. Shaout, A.; Ahmad, O.; Al-Dulaimi, Y. AES-RV: A Low-Latency and Energy-Efficient AES Accelerator with Instruction Extension for RISC-V SoC. arXiv 2024, arXiv:2505.11880.
9. Cui, S.; Balasch, J. Efficient Software Masking of AES through Instruction Set Extensions. In Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, DATE 2023, Antwerp, Belgium, 17–19 April 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6.
10. RISC-V Cryptography Extension Task Group. RISC-V Cryptography Extensions Volume I: Scalar & Entropy Source Instructions; Version 0.9.3-DRAFT; RISC-V: Zurich, Switzerland, 2023.
11. Groß, H.; Mangard, S.; Korak, T. Domain-Oriented Masking: Compact Masked Hardware Implementations with Arbitrary Protection Order. In Proceedings of the ACM Workshop on Theory of Implementation Security, TIS@CCS 2016, Vienna, Austria, 24 October 2016; Bilgin, B., Nikova, S., Rijmen, V., Eds.; ACM: New York, NY, USA, 2016; p. 3.
12. National Institute of Standards and Technology. Security Requirements for Cryptographic Modules. In Federal Information Processing Standards Publication; FIPS 140-3; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2019.
13. Lu, M.; Fan, A.; Xu, J.; Shan, W. A Compact, Lightweight and Low-Cost 8-Bit Datapath AES Circuit for IoT Applications in 28nm CMOS. In Proceedings of the 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/12th IEEE International Conference On Big Data Science And Engineering, TrustCom/BigDataSE 2018, New York, NY, USA, 1–3 August 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1464–1469.
14. Dhanuskodi, S.N.; Allen, S.; Holcomb, D.E. Efficient Register Renaming Architectures for 8-bit AES Datapath at 0.55 pJ/bit in 16-nm FinFET. IEEE Trans. Very Large Scale Integr. Syst. 2020, 28, 1807–1820.
15. Wamser, M.S.; Sigl, G. Pushing the limits further: Sub-atomic AES. In Proceedings of the 2017 IFIP/IEEE International Conference on Very Large Scale Integration, VLSI-SoC 2017, Abu Dhabi, United Arab Emirates, 23–25 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
16. Banik, S.; Bogdanov, A.; Regazzoni, F. Atomic-AES: A Compact Implementation of the AES Encryption/Decryption Core. In Proceedings of the Progress in Cryptology—INDOCRYPT 2016—17th International Conference on Cryptology in India, Kolkata, India, 11–14 December 2016; Dunkelman, O., Sanadhya, S.K., Eds.; Proceedings; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2016; Volume 10095, pp. 173–190.
17. Moradi, A.; Poschmann, A.; Ling, S.; Paar, C.; Wang, H. Pushing the Limits: A Very Compact and a Threshold Implementation of AES. In Proceedings of the Advances in Cryptology—EUROCRYPT 2011, Tallinn, Estonia, 15–19 May 2011; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2011; Volume 6632, pp. 69–88.
18. Yu, J.; Aagaard, M. Benchmarking and Optimizing AES for Lightweight Cryptography on ASICs. In Proceedings of the Lightweight Cryptography Workshop, Gaithersburg, MD, USA, 4–6 November 2019.
19. Dao, M.H.; Hoang, V.P.; Dao, V.L.; Tran, X.T. An Energy Efficient AES Encryption Core for Hardware Security Implementation in IoT Systems. In Proceedings of the 2018 International Conference on ATC, Ho Chi Minh City, Vietnam, 18–20 October 2018; pp. 301–304.
20. Tran, K. Integration of the AES Cryptography Extension into a RISC-V Architecture. Master’s Thesis, Oklahoma State University, Stillwater, OK, USA, 2025.
21. Marshall, B.; Newell, G.R.; Page, D.; Saarinen, M.O.; Wolf, C. The design of scalar AES Instruction Set Extensions for RISC-V. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2021, 2021, 109–136.
22. Zhang, X.; Parhi, K.K. High-speed VLSI architectures for the AES algorithm. IEEE Trans. Very Large Scale Integr. Syst. 2004, 12, 957–967.
23. Waterman, A.; Asanović, K. The RISC-V Instruction Set Manual; RISC-V International: Zurich, Switzerland, 2019.
24. Hojati, Z.; Jahanpeima, Z.; Rajabalipanah, M.; Ta’ati, H.; Rabiei, A.; Navabi, Z. Sharing AES Engine for RISC-V Custom Instructions Performing Encryption and Decryption. In Proceedings of the IEEE East-West Design & Test Symposium, EWDTS 2024, Yerevan, Armenia, 13–17 November 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–6.
25. Boyar, J.; Peralta, R. A small depth-16 circuit for the AES S-box. In Proceedings of the SEC 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 287–298.
26. Daemen, J.; Rijmen, V. The Design of Rijndael: AES—The Advanced Encryption Standard; Springer: Berlin/Heidelberg, Germany, 2002.
27. Clermont, J.; Heuser, A.; Rioul, O.; Standaert, F.X. Vertical Attack Correlation: Exploiting Data Compression in Side-Channel Analysis. In Proceedings of the IACR Transactions on Cryptographic Hardware and Embedded Systems; Ruhr-Universität Bochum: Bochum, Germany, 2021; Volume 2021, pp. 1–27.
28. NewAE Technology Inc. CW305 Artix FPGA Target Board. 2023. Available online: https://rtfm.newae.com/Targets/CW305%20Artix%20FPGA/ (accessed on 15 April 2023).
29. Cadence Design Systems, Inc. Cadence Genus Synthesis Solution. 2021. Available online: https://www.cadence.com/en_US/home/tools/digital-design-and-signoff/synthesis/genus-synthesis-solution.html (accessed on 8 May 2021).
30. Becker, G.; Cooper, J. Test Vector Leakage Assessment (TVLA) Methodology in Practice. Available online: https://www.semanticscholar.org/paper/Test-Vector-Leakage-Assessment-(-TVLA-)-methodology-Becker-Cooper/60b993cb11fff28c9ea657b0e2882867b8f810e1 (accessed on 9 November 2023).
31. Mangard, S. Hardware Countermeasures against DPA—A Statistical Analysis of Their Effectiveness. In Proceedings of the Topics in Cryptology—CT-RSA 2004, San Francisco, CA, USA, 23–27 February 2004; Okamoto, T., Ed.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2004; Volume 2964, pp. 222–235.
32. Mangard, S.; Oswald, E.; Popp, T. Power Analysis Attacks—Revealing the Secrets of Smart Cards; Springer: Berlin/Heidelberg, Germany, 2007.
33. Schneider, T.; Moradi, A. Leakage Assessment Methodology. In Proceedings of the Cryptographic Hardware and Embedded Systems (CHES), Saint-Malo, France, 13–16 September 2015; pp. 495–513.
34. Trichina, E.; Seta, D.D.; Germani, L. Simplified Adaptive Multiplicative Masking for AES. In Proceedings of the Cryptographic Hardware and Embedded Systems—CHES 2002, 4th International Workshop, Redwood Shores, CA, USA, 13–15 August 2002; Kaliski, B.S., Koç, Ç.K., Paar, C., Eds.; Revised Papers; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2002; Volume 2523, pp. 187–197.
35. Moradi, A.; Mischke, O.; Eisenbarth, T. Correlation-Enhanced Power Analysis Collision Attack. In Proceedings of the Cryptographic Hardware and Embedded Systems, CHES 2010, 12th International Workshop, Santa Barbara, CA, USA, 17–20 August 2010; Mangard, S., Standaert, F., Eds.; Proceedings; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2010; Volume 6225, pp. 125–139.
36. Prouff, E.; Rivain, M.; Bevan, R. Study of Second-Order Side-Channel Attacks on AES Masked Implementations. IEEE Trans. Inf. Forensics Secur. 2009, 4, 636–645.
37. Maghrebi, H. Deep Learning based Side Channel Attacks in Practice. IACR Cryptol. ePrint Arch. 2019, 2019, 578.

| i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| rc[i] | 0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80 | 0x1B | 0x36 |
| Instruction | Formula |
|---|---|
| aes32esmi | |
| aes32esi | |
| aes32dsmi | |
| aes32dsi | |
| Instruction | State Value | Initial t0 (rs1) | Final t0 (rd) |
|---|---|---|---|
| aes32esmi t0, t0, a4, 0 | 0x6c | 17fefaa0 | e7aeaa00 |
| aes32esmi t0, t0, a5, 1 | 0x1b | e7aeaa00 | 4801efea |
| aes32esmi t0, t0, a6, 2 | 0xe8 | 4801efea | d32c5971 |
| aes32esmi t0, t0, a7, 3 | 0x45 | d32c5971 | 0f9e371f |
| Attack Point | Leakage Model |
|---|---|
| Leak1_Round0 | HW[pt ^ key] |
| Leak2_Round1 | HW[Sbox[pt ^ key]] |
| Leak3_Round1 | HW[(Sbox[pt ^ key] * 2) & 0xFF] |
| Leak4_Round1 | HW[mc_out ^ Round_key] |
| Leak5_Round10 | HW[Sbox[ct ^ key]] |
| Leak6_Round10 | HW[ct ^ key] |
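The Hamming-weight (HW) leakage models in the table above can be written down directly. The following Python sketch implements the first three models for a single byte and uses them in a noise-free correlation ranking of key guesses. The function names, the simulated samples, and the secret byte 0x2b are illustrative assumptions; real traces from the evaluation setup are noisy and contain many sample points.

```python
import math


def xtime2(x):
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    x <<= 1
    return (x ^ 0x11B) if x & 0x100 else x


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime2(a)
        b >>= 1
    return result


def sbox(x):
    """AES Sbox: multiplicative inverse in GF(2^8), then the affine map."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
            inv = gf_mul(inv, x)
    out = inv
    for shift in range(1, 5):
        out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return out ^ 0x63


SBOX = [sbox(x) for x in range(256)]  # precomputed lookup table


def hw(x):
    """Hamming weight: number of set bits."""
    return bin(x).count("1")


def leak1_round0(pt, key):
    return hw(pt ^ key)


def leak2_round1(pt, key):
    return hw(SBOX[pt ^ key])


def leak3_round1(pt, key):
    return hw((SBOX[pt ^ key] * 2) & 0xFF)


def pearson(xs, ys):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def cpa_best_key(plaintexts, samples, model=leak2_round1):
    """Return the key-byte guess whose hypotheses correlate best."""
    scores = [pearson([model(p, k) for p in plaintexts], samples)
              for k in range(256)]
    return max(range(256), key=scores.__getitem__)


# Simulated, noise-free leakage for an assumed secret key byte.
secret = 0x2B
plaintexts = list(range(256))
samples = [leak2_round1(p, secret) for p in plaintexts]
recovered = cpa_best_key(plaintexts, samples)
print(f"recovered key byte: 0x{recovered:02x}")
```

With measured traces, the same ranking is repeated per sample point and per key byte, and the number of traces needed for the correct guess to stand out is what the trace-count ranges in the evaluation tables report.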
| Attack Point | DPA Traces Min-Max | CPA Traces Min-Max | Template Traces Min-Max |
|---|---|---|---|
| Leak1_Round0 | 50–200 | 300–500 | 10–50 |
| Leak2_Round1 | 100–300 | 500–4000 | 20–50 |
| Leak3_Round1 | 300–500 | 500–5000 | 30–100 |
| Leak4a_Round1 | 300–500 | 500–4000 | 20–50 |
| Leak4b_Round1 | 500–1000 | 1000–10,000 | 50–200 |
| Leak5_Round10 | 50–200 | 500–4000 | 10–50 |
| Leak6_Round10 | 50–200 | 300–500 | 10–50 |
| Design | Freq. (MHz) | Area (μm2) | Area Ratio | Power (μW) |
|---|---|---|---|---|
| Unsecure 1-Sbox unoptimized AES [21] | 100–200 | 966.024 | 1 | 18.404 |
| Unsecure 1-Sbox optimized AES | 100–200 | 530.098 | 0.55 | 8.864 |
| 1-Sbox Secure DOM | 100–200 | 841.666 | 0.87 | 6.407 |
| Resource | DOM-Protected Util. | DOM-Protected Avail. | DOM-Protected Util.% | Unsecure 1-Sbox AES [21] Util. | Unsecure 1-Sbox AES [21] Avail. | Unsecure 1-Sbox AES [21] Util.% |
|---|---|---|---|---|---|---|
| LUTs | 7863 | 63,400 | 12.40% | 8304 | 63,400 | 13.10% |
| FFs | 4985 | 126,800 | 3.93% | 4947 | 126,800 | 3.90% |
| BRAMs | 29 | 135 | 21.48% | 29 | 135 | 21.48% |
| DSPs | 7 | 240 | 2.92% | 7 | 240 | 2.92% |
| IOs | 44 | 170 | 25.88% | 44 | 170 | 25.88% |
| BUFG | 2 | 32 | 6.25% | 2 | 32 | 6.25% |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Kassimi, A.; Aljuffri, A.; Larmann, C.; Hamdioui, S.; Taouil, M. Secure Implementation of RISC-V’s Scalar Cryptography Extension Set. Cryptography 2026, 10, 6. https://doi.org/10.3390/cryptography10010006