Implementation Efficiency of Falcon Digital Signature Scheme on Arty-7 XC7A35T Board
Abstract
1. Introduction
2. Related Work
2.1. Mathematical Foundations in Post-Quantum Digital Signature Schemes
- Step 1: Compute $c_0$ such that $c_0 A^{T} = H(m)$.
- Step 2: Use the private basis B to compute a vector $v \in \Lambda_q^{\perp}$ (the orthogonal lattice), such that $v$ is close to $c_0$.
- Step 3: Verify the shortness of $s = c_0 - v$. If $c_0$ and $v$ are sufficiently close, then $s$ is a short vector, confirming the validity of the signature.
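
The reason the vector from Step 3 is accepted can be written out in one line. This is a sketch in the usual GPV notation, which is assumed here: $A$ is the public matrix, $H(m)$ is the hash of the message, $c_0$ is the preimage from Step 1, $v$ is the vector of the orthogonal lattice $\Lambda_q^{\perp}$ from Step 2 (so $v A^{T} = 0 \bmod q$), and $s = c_0 - v$.

```latex
\[
\begin{aligned}
s A^{T} &= (c_0 - v)\, A^{T} \\
        &= c_0 A^{T} - v A^{T} \\
        &= H(m) - 0 \;=\; H(m) \pmod{q},
\end{aligned}
\qquad
\|s\| = \|c_0 - v\| \ \text{is small when } v \text{ is close to } c_0 .
\]
```

A verifier therefore only has to check two things: that $s$ maps to $H(m)$, and that $s$ is short.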
2.2. Selection of Software and Hardware Components for the Post-Quantum Falcon Digital Signature Device
- For the Falcon software: (1) key_gen function for generating the public and private keys. (2) expand_key function to transform the private key into the falcon_tree structure. (3) sign_tree function to generate the signature from the expanded private key, the message to be signed, and an input nonce.
- For the Falcon hardware device: (1) A random number generator module to generate seeds for key generation and nonce seeds for signing. (2) A modular adder (mq_add) for public key generation. (3) The verify_raw function, implemented in hardware, for digital signature verification.

- Private key generation: Using the Falcon pseudo-random number generator, two polynomials f and g with small coefficients are created. These polynomials must satisfy specific conditions to ensure modular invertibility.
- Public key computation: The public key h is calculated from the generated polynomials: $h = g \cdot f^{-1} \bmod q$.
- Falcon Sampler Tree generation: This is the most critical and complex step, which uses the Fast Fourier Transform (FFT) on floating-point polynomials. This sampler tree is later used in the signing process to find a short vector.
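
The public key computation step can be illustrated on a toy instance. The sketch below assumes the Falcon modulus q = 12289 and the ring Z_q[x]/(x^n + 1), and it solves f·h = g by plain Gauss-Jordan elimination rather than by the NTT-based division a real implementation would use. The helper names `negacyclic_mul` and `solve_public_key`, the degree n = 4, and the polynomials f and g are hypothetical and chosen only for illustration.

```python
Q = 12289  # Falcon modulus


def negacyclic_mul(a, b, q=Q):
    """Multiply a and b in Z_q[x]/(x^n + 1) (schoolbook, O(n^2))."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % q
            else:  # x^n = -1, so wrapped terms change sign
                out[k - n] = (out[k - n] - ai * bj) % q
    return out


def solve_public_key(f, g, q=Q):
    """Return h with f*h = g in Z_q[x]/(x^n + 1), i.e. h = g * f^{-1} mod q."""
    n = len(f)
    # Augmented matrix [M | g]; column j of M holds the coefficients of f * x^j.
    rows = [[0] * n + [g[i] % q] for i in range(n)]
    for j in range(n):
        xj = [0] * n
        xj[j] = 1
        col = negacyclic_mul(f, xj, q)
        for i in range(n):
            rows[i][j] = col[i]
    # Gauss-Jordan elimination modulo the prime q.
    for c in range(n):
        pivot = next((r for r in range(c, n) if rows[r][c] != 0), None)
        if pivot is None:
            raise ValueError("f is not invertible modulo q")
        rows[c], rows[pivot] = rows[pivot], rows[c]
        inv = pow(rows[c][c], -1, q)  # modular inverse (Python 3.8+)
        rows[c] = [(v * inv) % q for v in rows[c]]
        for r in range(n):
            if r != c and rows[r][c]:
                factor = rows[r][c]
                rows[r] = [(vr - factor * vc) % q
                           for vr, vc in zip(rows[r], rows[c])]
    return [rows[i][n] for i in range(n)]


f = [1, 2, 0, 0]    # toy polynomial 1 + 2x (illustrative only)
g = [3, -1, 0, 1]   # toy polynomial 3 - x + x^3 (illustrative only)
h = solve_public_key(f, g)
```

Multiplying back, `negacyclic_mul(f, h)` reproduces g modulo q, which is the defining property of the public key. If f has no inverse the function raises, mirroring the requirement that key generation reject such polynomials.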
| Algorithm 1: (signature authentication) verify_raw($c_0$, $s_2$, h) |
|---|
| 1. Input: $c_0$, $s_2$, h, each of size n. 2. Output: returns “1” for a valid signature; returns “0” for an invalid signature. 3. Require: Step 1: Normalize $s_2$ to the domain [0, q − 1]: |
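
The verification flow can be sketched in software as follows. This is a minimal model under stated assumptions, not the hardware design itself: the argument names `c0` (hashed message point) and `s2` (signature polynomial) follow the Falcon reference implementation, the squared-norm bound is passed in as the hypothetical parameter `bound_sq`, and a schoolbook negacyclic product stands in for the NTT that an optimized implementation would use.

```python
Q = 12289  # Falcon modulus


def negacyclic_mul(a, b, q=Q):
    """Multiply a and b in Z_q[x]/(x^n + 1) (schoolbook, O(n^2))."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % q
            else:  # x^n = -1
                out[k - n] = (out[k - n] - ai * bj) % q
    return out


def center(x, q=Q):
    """Map x in [0, q-1] to its signed representative around zero."""
    return x - q if x > q // 2 else x


def verify_raw(c0, s2, h, bound_sq, q=Q):
    """Return 1 if (s1, s2) is short, where s1 = c0 - s2*h mod q, else 0."""
    s2_mod = [x % q for x in s2]              # Step 1: normalize s2 to [0, q-1]
    prod = negacyclic_mul(s2_mod, h, q)       # s2 * h mod (x^n + 1, q)
    s1 = [center((c - p) % q, q) for c, p in zip(c0, prod)]
    norm_sq = sum(x * x for x in s1) + sum(x * x for x in s2)
    return 1 if norm_sq <= bound_sq else 0


# Toy example with n = 4 (all values are illustrative only).
h = [1, 2, 3, 4]
s2 = [1, -1, 0, 2]
s1_expected = [0, 1, -1, 0]
c0 = [(a + b) % Q
      for a, b in zip(s1_expected, negacyclic_mul([x % Q for x in s2], h))]

valid = verify_raw(c0, s2, h, bound_sq=8)      # squared norm is exactly 8
tampered = verify_raw([(c0[0] + 5000) % Q] + c0[1:], s2, h, bound_sq=8)
```

Changing a single coefficient of `c0` makes the recovered s1 large, so the norm check fails and the function returns 0.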
3. Design and Development of the Falcon Digital Signature Device Based on FPGA Platform
- In the hardware design of the Falcon digital signature device, the UART interface block (1) handles UART communication between the hardware design and the CP2102 chip (USB to UART). Blocks (2), (3), and (4) implement the core functions of signature verification, modulo-q integer addition, and pseudo-random number generation, respectively. This arrangement gives the Falcon device a stable communication path to the host computer during operation, which the digital signature services rely on.
- Furthermore, the design incorporates block (5), which consists of physical switches on the device. These switches are used to configure the operational mode of the device, allowing users to identify whether it is currently set to key generation, digital signing, or signature verification. Lastly, block (6) is integrated into the Falcon signature device design to serve as the central processing unit, managing data flow between the blocks and controlling the overall operation of the device.

3.1. Design and Implementation of the Trng_8bit Block
- A 3-bit counter register;
- An 8-bit accumulation register;
- Bit Merge Logic (RTL_BMERGE) to insert the random_bit into the correct position.
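
How the three components cooperate can be modelled in a few lines. The sketch below is a behavioural model only: the class name `ByteAssembler`, the method `push`, and the LSB-first bit order are assumptions made for illustration, and the source of `random_bit` (the entropy circuit) is outside its scope.

```python
class ByteAssembler:
    """Collects single random bits into bytes, mirroring a 3-bit counter
    and an 8-bit accumulation register."""

    def __init__(self):
        self.count = 0   # 3-bit counter: position of the next bit, 0..7
        self.acc = 0     # 8-bit accumulation register

    def push(self, random_bit):
        """Insert one bit; return the finished byte on every 8th call, else None."""
        # Bit merge: place the new bit at the position given by the counter.
        self.acc |= (random_bit & 1) << self.count
        # The counter wraps from 7 back to 0, as a 3-bit register would.
        self.count = (self.count + 1) & 0b111
        if self.count == 0:
            byte, self.acc = self.acc, 0
            return byte
        return None


assembler = ByteAssembler()
outputs = [assembler.push(b) for b in [1, 0, 1, 1, 0, 0, 0, 1]]
```

With LSB-first insertion, the eight bits above assemble into 0b10001101 (0x8D), and only the eighth call returns a value.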

3.2. Design and Implementation of a Modular Addition Unit for Two Integers (mq_Add Block)
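
As a point of reference, addition modulo the Falcon modulus q = 12289 needs only one addition and one conditional subtraction when both operands are already reduced. The sketch below is a behavioural model under that assumption. The function name `mq_add` matches the block name, but the compare-and-subtract structure is an illustrative choice, not a description of the synthesized circuit.

```python
Q = 12289  # Falcon modulus


def mq_add(x, y, q=Q):
    """Add two values in [0, q-1] modulo q with one conditional subtraction."""
    if not (0 <= x < q and 0 <= y < q):
        raise ValueError("operands must already be reduced modulo q")
    d = x + y              # at most 2q - 2, so a single subtraction suffices
    return d - q if d >= q else d


examples = [mq_add(5, 7), mq_add(12288, 1), mq_add(12288, 12288)]
```

Because the sum never exceeds 2q − 2, no division or general remainder operation is required.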
3.3. Design and Implementation of the Verify_Raw Block
3.4. Design and Implementation of the UART Peripheral Interface Block
- (1) Baud Rate Generator: This block generates the baud_tick signal to synchronize the UART transmission and reception process based on the system clock and the configured baud rate. The baud_tick signal is issued after a specific number of clock cycles (calculated as CLOCK_FREQ/BAUD_RATE), ensuring correct transmission speed.
- (2) UART Receiver: This is the UART data receiving unit, which receives data from the transmission line (rx), then samples and processes the data through multiple states: IDLE, START_BIT, DATA_BITS, and STOP_BIT. Once 8 bits are received, it raises the rx_done = 1 signal and outputs the data.
- (3) UART Transmitter: This is the UART data transmitting unit, which also operates through the following states: IDLE, START_BIT, DATA_BITS, and STOP_BIT. When it receives the tx_start signal, it begins transmission. Each data bit is sent out on every baud_tick cycle, and after transmitting all 8 bits, it asserts the tx_done = 1 signal.
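
The divider calculation and the frame sequence can be sketched as follows. The 100 MHz clock matches the frequency reported for this design. The baud rate of 115,200, the LSB-first bit order, the absence of a parity bit, and the function names `uart_transmit` and `uart_receive` are assumptions made for illustration.

```python
CLOCK_FREQ = 100_000_000   # Hz, system clock
BAUD_RATE = 115_200        # assumed value; the configured rate is not stated

# Number of system clock cycles between two baud_tick pulses.
CLOCKS_PER_BIT = CLOCK_FREQ // BAUD_RATE


def uart_transmit(byte):
    """Yield (state, tx_level) for each baud_tick of one frame after tx_start."""
    yield ("START_BIT", 0)                      # line pulled low
    for i in range(8):
        yield ("DATA_BITS", (byte >> i) & 1)    # 8 data bits, LSB first
    yield ("STOP_BIT", 1)                       # line released high


def uart_receive(levels):
    """Decode one frame of per-tick line levels; return None on a framing error."""
    if len(levels) != 10 or levels[0] != 0 or levels[9] != 1:
        return None
    return sum(bit << i for i, bit in enumerate(levels[1:9]))


frame = list(uart_transmit(0x41))
levels = [level for _, level in frame]
received = uart_receive(levels)
```

One frame occupies ten baud_tick periods, so at the assumed rate a byte takes about 10 × CLOCKS_PER_BIT system clock cycles on the wire.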

4. Results and Discussion
- Verification Performance: The verification time of the proposed design is 2.238 ms, equivalent to approximately 223,800 cycles at 100 MHz, which is slower than highly optimized implementations.
- Key Generation Time: The slow key generation time (1516.2 ms) is primarily due to UART communication overhead (a low-speed serial interface) rather than computational logic.
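
The conversion between the reported latencies and cycle counts is simple arithmetic and can be checked directly. The sketch below assumes the 100 MHz clock stated above and uses the measured times reported for this work. The helper name `ms_to_cycles` is hypothetical.

```python
CLOCK_MHZ = 100  # system clock frequency in MHz


def ms_to_cycles(latency_ms, clock_mhz=CLOCK_MHZ):
    """Clock cycles elapsed in latency_ms (1 ms at 1 MHz equals 1000 cycles)."""
    return round(latency_ms * clock_mhz * 1000)


timings_ms = {"key generation": 1516.2, "signing": 5.0, "verification": 2.238}
cycles = {name: ms_to_cycles(ms) for name, ms in timings_ms.items()}
```

These counts include time spent waiting on the serial link, so they measure end-to-end latency rather than pure computation.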
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| AES | Advanced Encryption Standard |
| DSA | Digital Signature Algorithm |
| DSP | Digital Signal Processing block |
| ECDSA | Elliptic Curve Digital Signature Algorithm |
| FFT | Fast Fourier Transform |
| FPGA | Field-Programmable Gate Array |
| GenAI | Generative Artificial Intelligence |
| GPV | Gentry–Peikert–Vaikuntanathan |
| HLS | High-Level Synthesis |
| NIST | National Institute of Standards and Technology |
| NTRU | N-th degree Truncated polynomial Ring Units |
| PQC | Post-Quantum Cryptography |
| SIS | Short Integer Solution |
| TRNG | True Random Number Generator |
References
1. Bernstein, D.J. Introduction to post-quantum cryptography. In Post-Quantum Cryptography; Springer: Berlin/Heidelberg, Germany, 2025; pp. 1–14.
2. Sarah, D.; Peter, C. On the practical cost of Grover for AES key recovery. In Proceedings of the 5th NIST PQC Standardization Conference, Rockville, MD, USA, 10–12 April 2024; pp. 1–22.
3. Gidney, C.; Ekerå, M. How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits. Quantum 2021, 5, 433.
4. Webber, M.; Elfving, V.; Weidt, S.; Hensinger, W.K. The impact of hardware specifications on reaching quantum advantage in the fault tolerant regime. AVS Quantum Sci. 2022, 4, 13801.
5. Alagic, G.; Bros, M.; Ciadoux, P.; Cooper, D.; Dang, Q.; Dang, T.; Kelsey, J.; Lichtinger, J.; Liu, Y.-K.; Miller, C.; et al. Status Report on the Fourth Round of the NIST Post-Quantum Cryptography Standardization Process; US Department of Commerce, National Institute of Standards and Technology: Gaithersburg, MD, USA, 2025.
6. Karabulut, E.; Aysu, A. A Hardware-Software Co-Design for the Discrete Gaussian Sampling of FALCON Digital Signature. In Proceedings of the 2024 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), Tysons Corner, VA, USA, 6–9 May 2024; pp. 90–100.
7. Schmid, M.; Amiet, D.; Wendler, J.; Zbinden, P.; Wei, T. Falcon Takes Off-A Hardware Implementation of the Falcon Signature Scheme. Cryptol. Eprint Arch. 2023, 1–17. Available online: https://eprint.iacr.org/2023/1885 (accessed on 14 November 2025).
8. Lee, Y.; Youn, J.; Nam, K.; Jung, H.H.; Cho, M.; Na, J.; Park, J.-Y.; Jeon, S.; Kang, B.G.; Oh, H.; et al. An Efficient Hardware/Software Co-Design for FALCON on Low-End Embedded Systems. IEEE Access 2024, 12, 57947–57958.
9. Bai, S.; Jangir, H.; Lin, H.; Ngo, T.; Wen, W.; Zheng, J. Compact Encryption Based on Module-NTRU Problems. In Post-Quantum Cryptography; Springer: Cham, Switzerland, 2024; Volume 14771, pp. 371–405.
10. NTRU: A Submission to the NIST Post-Quantum Standardization Effort. 2025. Available online: https://ntru.org/ (accessed on 12 May 2025).
11. Basu, K.; Soni, D.; Nabeel, M.; Karri, R. NIST Post-Quantum Cryptography—A Hardware Evaluation Study. IACR Cryptol. ePrint Arch. 2019, 2019, 1–16.
12. Dione, D.; Seck, B.; Diop, I.; Cayrel, P.-L.; Faye, D.; Gueye, I. Hardware Security for IoT in the Quantum Era: Survey and Challenges. J. Inf. Secur. 2023, 14, 227–249.
13. NIST.IR.8547; Transition to Post-Quantum Cryptography Standards. National Institute of Standards and Technology: Gaithersburg, MD, USA, 2024.
14. Luc, N.Q.; Nguyen, T.T.; Quach, D.H.; Dao, T.T.; Pham, N.T. Building Applications and Developing Digital Signature Devices based on the Falcon Post-Quantum Digital Signature Scheme. Eng. Technol. Appl. Sci. Res. 2023, 13, 10401–10406.
15. Castelvero, L.; Grande, I.H.L.; Pruneri, V. High-Performance Time-to-Digital Conversion on a 16-nm Ultrascale+ FPGA. IEEE Access 2024, 12, 149569–149579.
16. Boutin, C. NIST Announces First Four Quantum-Resistant Cryptographic Algorithms. 2025. Available online: https://www.nist.gov/news-events/news/2022/07/nist-announces-first-four-quantum-resistant-cryptographic-algorithms (accessed on 9 June 2025).
17. NIST.FIPS.203; Module-Lattice-Based Key-Encapsulation Mechanism Standard. National Institute of Standards and Technology (NIST): Gaithersburg, MD, USA, 2024.
18. NIST.FIPS.204; Module-Lattice-Based Digital Signature Standard. National Institute of Standards and Technology (NIST): Gaithersburg, MD, USA, 2024.
19. Shannon, C. The lattice theory of information. Trans. IRE Prof. Gr. Inf. Theory 1953, 1, 105–107.
20. Fouque, P.-A.; Hoffstein, J.; Kirchner, P.; Lyubashevsky, V.; Pornin, T.; Prest, T.; Ricosset, T.; Seiler, G.; Whyte, W.; Zhang, Z. Falcon: Fast-Fourier Lattice-based Compact Signatures over NTRU Specifications v1.2. 2020, pp. 1–65. Available online: https://falcon-sign.info/ (accessed on 15 June 2025).
21. Xilinx Inc. 7 Series FPGAs Datasheet; Xilinx Technology Document: San Jose, CA, USA, 2020; Volume 180, pp. 1–19.
22. NIST.SP.800-90B; Recommendation for the Entropy Sources Used for Random Bit Generation. National Institute of Standards and Technology: Gaithersburg, MD, USA, 2018.
23. Zode, P.; Zode, P.; Deshmukh, R. FPGA Based Novel True Random Number Generator using LFSR with Dynamic Seed. In Proceedings of the 2019 IEEE 16th India Council International Conference (INDICON), Gujarat, India, 13–15 December 2019; pp. 1–3.
24. Tsoi, K.H.; Leung, K.H.; Leong, P.H.W. Compact FPGA-based true and pseudo random number generators. In Proceedings of the FCCM 2003 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, Napa, CA, USA, 9–11 April 2003; FCCM 2003. pp. 51–61.
25. Ye, Z.; Huang, J.; Huang, T.; Bai, Y.; Li, J.; Zhang, H.; Li, G.; Chen, D.; Cheung, R.C.C.; Huang, K. PQNTRU: Acceleration of NTRU-Based Schemes via Customized Post-Quantum Processor. IEEE Trans. Comput. 2025, 74, 1649–1662.
26. Pendyala, S.; Magesh, R.; Kavun, E.B.; Aysu, A. Outrunning the Millennium FALCON: Speed Records for FALCON on Xilinx FPGAs. Cryptol. Eprint Arch. 2025, 1–23.
27. Berthet, Q.; Upegui, A.; Gantel, L.; Duc, A.; Traverso, G. An Area-Efficient SPHINCS + Post-Quantum Signature Coprocessor. In Proceedings of the 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Portland, OR, USA, 17–21 June 2021; pp. 180–187.
28. Wang, T.; Member, G.S.; Zhang, C.; Cao, P.; Gu, D. Efficient Implementation of Dilithium Signature Scheme on FPGA SoC Platform. IEEE Trans. Very Large Scale Integr. Syst. 2022, 30, 1158–1171.

| Function | Resource | Utilization | Available | Utilization (%) |
|---|---|---|---|---|
| trng_8bit | LUT | 11 | 20,800 | 0.05 |
| trng_8bit | FF | 13 | 41,600 | 0.03 |
| trng_8bit | IO | 11 | 250 | 4.40 |
| mq_add | LUT | 80 | 20,800 | 0.38 |
| mq_add | FF | 32 | 41,600 | 0.08 |
| mq_add | IO | 98 | 250 | 39.20 |

| Function | Verify_Raw |
|---|---|
| Degree | 1024 |
| BRAM | 1 |
| DSP | 15 |
| FF | 2984 |
| LUT | 3690 |
| Clock Cycles | 70 |
| Latency (ms) | 0.0007 |
| Clock (MHz) | 100 |

| Name | BRAM_18K | DSP | FF | LUT | URAM |
|---|---|---|---|---|---|
| DSP | _ | 1 | _ | _ | _ |
| Expression | _ | _ | 0 | 1131 | _ |
| FIFO | _ | _ | _ | _ | _ |
| Instance | _ | 14 | 1330 | 2274 | _ |
| Memory | 1 | _ | 27 | 5 | _ |
| Multiplexer | _ | _ | _ | 280 | _ |
| Register | _ | _ | 1627 | _ | _ |
| Total | 1 | 15 | 2984 | 3690 | 0 |
| Available | 50 | 90 | 41,600 | 20,800 | 0 |
| Utilization (%) | 2 | 16.7 | 7.2 | 17.7 | 0 |

| Criteria | USB Hardware Signed (FPGA + SW) | Fully Software-Based Digital Signature [20] |
|---|---|---|
| Version | 1024 | 1024 |
| Key generation time (ms) | 1516.2 | 27.45 |
| Signing time (ms) | 5.0 | 2.913 |
| Authentication time (ms) | 2.2381 | 7.326 |

| Falcon-1024 | Key Generation (ms) | Signing (ms) | Verification (ms) |
|---|---|---|---|
| [20] | 80 | 2.22 | 0.298 |
| [7] | 320.3 | 8.7 | 1.258 |
| This work | 1516.2 | 5.0 | 2.238 |

| Criterion | Target Context | Clock Frequency | Key Gen | Sign Time | Verify Time | LUT Utilization | DSP Utilization | Power Consumption | Main Advantage |
|---|---|---|---|---|---|---|---|---|---|
| This work (Arty-7 + SW) | Low-Cost/Resource Constrained FPGA | 100 MHz | 151,620,000 cycles (1516.2 ms) | 500,000 cycles (5 ms) | 223,800 cycles (2.238 ms) | 17.7% | 16.7% | FPGA Power 97 mW | Low-Cost Platform Feasibility |
| Lee et al. (ASIC + SW) [8] | Low-Power ASIC Accelerator | 250 MHz (for ASIC core) | 37.82 ms (FALCON-1024) | 1.80 ms (FALCON-512) | N/A | Area: 38 k uM | Core Only | Core Only 5.972 mW | Highest Power Efficiency |
| Ye et al. (RISC-V SoC) [25] | Custom SoC Processor | Fill Freq | N/A | N/A | 179,080 cycles (FALCON-512) | N/A | Core Only | Core Only 3.05 mW | High Integration and Flexibility |
| Pendyala et al. (Zynq FPGA) [26] | High-Performance FPGA | Fill Freq | Fill Cycles | Fill Cycles | Fill Cycles | Fill Freq | Fill Freq | Fill Power | Max Throughput via Pipelining |
| SPHINCS+ [27] | Area-Efficient FPGA | 149–156 MHz | Fill Cycles | Fill Cycles | Fill Cycles | 8.3% (Low Area) | 0% | 400 mW Dynamic (XZU3EG Total) | High Security (Hash-Based) |
| Dilithium-III [28] | High-Performance SoC FPGA | 182–217 MHz | Fill Cycles | Fill Cycles | Fill Cycles | 19,614 LUTs (7.8% Z7000) | 8–10 DSPs | Core Only (Low μW Range) | Highest Throughput PQC |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Nguyen, T.-T.; Nguyen, D.-D.; Dao, T.-T.; Luc, N.-Q. Implementation Efficiency of Falcon Digital Signature Scheme on Arty-7 XC7A35T Board. Electronics 2025, 14, 4504. https://doi.org/10.3390/electronics14224504

