1. Introduction
The emergence of large-scale quantum computing represents a significant threat to modern cryptographic infrastructure. Classical public-key cryptosystems, including RSA, Diffie-Hellman, and elliptic curve cryptography (ECC), derive their security from the computational intractability of integer factorization and discrete logarithm problems. However, quantum algorithms such as Shor’s algorithm [1] can solve these problems in polynomial time, rendering current public-key infrastructure (PKI) fundamentally vulnerable. Recent advancements suggest that within the next decade, quantum computing hardware may reach the threshold required to break 2048-bit RSA and comparable ECC implementations [2,3]. This “harvest now, decrypt later” threat model, in which adversaries store encrypted communications today for future quantum-enabled decryption, has mobilized governments, standards bodies, and industry stakeholders to accelerate development and deployment of post-quantum cryptography (PQC).
In response to this emerging vulnerability, the U.S. National Institute of Standards and Technology (NIST) initiated a comprehensive standardization process for quantum-resistant cryptographic algorithms in 2016 [4,5]. As of 2023, this process has yielded several candidate algorithms for standardization, with CRYSTALS-Kyber selected as the primary key encapsulation mechanism (KEM), and CRYSTALS-Dilithium, Falcon, and SPHINCS+ designated for digital signatures. Parallel efforts by the European Telecommunications Standards Institute (ETSI), the Internet Engineering Task Force (IETF), and other international bodies have further emphasized the urgent need to transition critical digital infrastructure to quantum-resistant cryptography.
Despite these significant standardization efforts, a critical gap persists between theoretical cryptographic security and real-world implementation readiness. While numerous studies have examined the mathematical foundations and security properties of PQC algorithms, comprehensive performance evaluations across heterogeneous computing environments remain limited. Existing benchmarks typically focus on either theoretical complexity analyses or isolated performance measurements on specialized platforms [6,7]. This narrow approach fails to address the diverse nature of modern computing ecosystems, where cryptographic protocols must operate efficiently across environments ranging from high-performance cloud servers to severely resource-constrained IoT devices. Moreover, the practical implications of integrating PQC into established security protocols like TLS, SSH, and IPsec, particularly regarding bandwidth overhead, latency, and backward compatibility, remain inadequately characterized.
To address these critical knowledge gaps, our research pursues three primary objectives:
1. Comprehensive Cross-Platform Performance Analysis: We conduct extensive empirical evaluations of five leading PQC algorithms: CRYSTALS-Kyber, NTRU, and BIKE for key encapsulation mechanisms (KEMs), alongside CRYSTALS-Dilithium and Falcon for digital signatures. Our testing encompasses performance metrics such as computational latency, memory utilization, energy consumption, and cryptographic material sizes across three distinct hardware profiles ranging from high-performance servers to resource-constrained embedded devices, providing insights directly applicable to diverse deployment scenarios.
2. Protocol Integration and Network Impact Assessment: We systematically analyze the practical challenges of integrating PQC into existing cryptographic protocols, with particular emphasis on TLS 1.3 handshakes. This includes quantifying handshake latency increases, packet fragmentation effects under various network conditions, and bandwidth consumption patterns. We also evaluate hybrid classical/post-quantum approaches that maintain compatibility during transition periods and measure performance scaling under high-concurrency loads.
3. Deployment Strategy Framework: Based on our empirical findings, we develop a risk-based migration framework and algorithm selection guidelines tailored to specific deployment environments and use cases. These recommendations account for varying security requirements, performance constraints, and operational considerations across sectors ranging from financial services to IoT infrastructure.
Our work makes several significant contributions to the field. First, we present the most extensive cross-platform benchmarking to date of NIST-selected PQC algorithms together with two further candidates from the NIST process (NTRU and BIKE), covering both lattice-based and code-based approaches across multiple security levels (NIST Levels 1, 3, and 5). Second, we identify and quantify critical performance trade-offs that inform algorithm selection for resource-constrained environments, including specific optimizations that yield up to 40% performance improvement on targeted platforms. Third, we evaluate practical protocol integration challenges, revealing that naive PQC implementation in TLS 1.3 can increase handshake size by up to 7× compared to classical approaches, and demonstrate mitigation strategies that significantly reduce this overhead.
By bridging the gap between cryptographic theory and systems engineering practice, our findings provide actionable guidance for security architects, protocol designers, and policymakers navigating the global transition toward quantum-resistant infrastructure. As quantum computing continues its rapid advancement, the insights presented in this paper help ensure that cryptographic protections evolve in parallel with emerging threats, preserving the confidentiality and integrity of sensitive communications in the post-quantum era.
The remainder of this paper is organized as follows:
Section 2 provides a comprehensive overview of quantum computing threats and the current state of NIST’s PQC standardization process, including a detailed survey of related work.
Section 3 presents our algorithm selection criteria, testing environments, and experimental methodology.
Section 4 details our benchmarking results across various platforms and operational scenarios.
Section 5 analyzes these findings in the context of real-world deployment considerations, offering recommendations for algorithm selection and implementation strategies. This section also summarizes our key contributions and outlines directions for future research in this rapidly evolving field.
3. Methodology and Experimental Setup
This section presents our comprehensive approach to evaluating post-quantum cryptographic (PQC) algorithms across diverse computing environments. We detail the selected algorithms, testing platforms, measurement procedures, and mathematical foundations to ensure both scientific rigor and reproducibility of our results.
3.1. Algorithm Selection and Mathematical Foundations
To provide a representative assessment of the PQC landscape, we selected five algorithms spanning different mathematical approaches and use cases:
CRYSTALS-Kyber: A lattice-based Key Encapsulation Mechanism (KEM) built on the Module Learning With Errors (MLWE) problem. Kyber’s security relies on the difficulty of finding a secret vector $\mathbf{s}$ given a random matrix $\mathbf{A}$ and vector $\mathbf{t} = \mathbf{A}\mathbf{s} + \mathbf{e}$, where $\mathbf{e}$ is a small error vector and operations occur in a polynomial ring $R_q = \mathbb{Z}_q[X]/(X^n + 1)$ with $n = 256$ and $q = 3329$.
NTRU: A lattice-based KEM using a different approach based on polynomial quotient rings. The NTRU problem involves finding small polynomials $f$ and $g$ such that $h = g \cdot f^{-1} \bmod q$, where $h$ is public. Security is based on the difficulty of finding these small polynomials.
BIKE (Bit Flipping Key Encapsulation): A code-based KEM using quasi-cyclic moderate-density parity-check codes. Unlike lattice-based schemes, BIKE’s security is founded on the difficulty of decoding random linear codes, providing algorithmic diversity in our evaluation portfolio.
CRYSTALS-Dilithium: A lattice-based digital signature scheme that shares its mathematical foundation with Kyber. Dilithium employs a “Fiat-Shamir with aborts” paradigm to transform an interactive proof system into a non-interactive signature scheme.
Falcon: A lattice-based signature scheme using NTRU lattices with Fast Fourier sampling. Falcon achieves compact signatures through a sophisticated sampling technique over a structured lattice.
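As a rough illustration of the “Fiat-Shamir with aborts” idea mentioned for Dilithium, the following Python sketch signs over plain integer vectors rather than polynomial rings. It is a toy under stated assumptions, not Dilithium itself: the dimensions, the bounds `GAMMA` and `BETA`, the small scalar challenge, and the omission of the second secret vector and of bit rounding are simplifications chosen for readability, and these parameters offer no security.

```python
import hashlib
import random

Q = 8380417        # odd prime modulus (Dilithium's q, reused only for convenience)
N, M = 4, 4        # toy dimensions: A is an M x N matrix (assumption)
ETA = 1            # secret coefficients lie in [-ETA, ETA]
TAU = 2            # toy challenge is an integer in [-TAU, TAU]
GAMMA = 1 << 10    # masking vector y is uniform in [-GAMMA, GAMMA]
BETA = TAU * ETA   # largest possible |c * s_i|

def mat_vec(A, v):
    """Matrix-vector product modulo Q."""
    return [sum(a * x for a, x in zip(row, v)) % Q for row in A]

def challenge(w, msg):
    """Hash the commitment and message to a small integer challenge."""
    digest = hashlib.sha256(repr(w).encode() + msg).digest()
    return digest[0] % (2 * TAU + 1) - TAU

def keygen(rng):
    A = [[rng.randrange(Q) for _ in range(N)] for _ in range(M)]
    s = [rng.randint(-ETA, ETA) for _ in range(N)]
    return (A, mat_vec(A, s)), s          # public key (A, t = A s), secret s

def sign(pk, s, msg, rng):
    A, _ = pk
    attempts = 0
    while True:
        attempts += 1
        y = [rng.randint(-GAMMA, GAMMA) for _ in range(N)]   # fresh mask
        w = mat_vec(A, y)                                    # commitment
        c = challenge(w, msg)                                # challenge
        z = [yi + c * si for yi, si in zip(y, s)]            # response
        # Abort and retry unless z lies in the range reachable for every secret.
        if max(abs(zi) for zi in z) <= GAMMA - BETA:
            return (z, c), attempts

def verify(pk, msg, sig):
    A, t = pk
    z, c = sig
    if max(abs(zi) for zi in z) > GAMMA - BETA:
        return False
    # A z - c t = A y, so the signer's commitment can be recomputed.
    w = [(az - c * ti) % Q for az, ti in zip(mat_vec(A, z), t)]
    return c == challenge(w, msg)
```

The rejection test in `sign` is the “abort”: a response is released only when it falls in a range that any admissible secret could have produced, so accepted responses are distributed independently of the secret.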
For each algorithm, we evaluated all parameter sets corresponding to NIST security levels 1, 3, and 5, which are calibrated to provide security comparable to AES-128, AES-192, and AES-256 respectively, even against quantum attackers. This comprehensive coverage allows us to analyze performance scaling with increasing security requirements.
The formal mathematical operations for each algorithm are summarized below:
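CRYSTALS-Kyber (KEM):
Key Generation: $\mathbf{t} = \mathbf{A}\mathbf{s} + \mathbf{e}$, where $\mathbf{A}$ is random, $\mathbf{s}$ is secret, and $\mathbf{e}$ is a small error vector.
Encryption: $\mathbf{u} = \mathbf{A}^{T}\mathbf{r} + \mathbf{e}_1$, $v = \mathbf{t}^{T}\mathbf{r} + e_2 + \mathrm{Encode}(m)$, where $\mathbf{r}$ is a random small polynomial vector, $\mathbf{e}_1$ and $e_2$ are small errors, and $m$ is the message.
Decryption: $m = \mathrm{Decode}(v - \mathbf{s}^{T}\mathbf{u})$
NTRU (KEM):
Key Generation: Find small polynomials $f, g$ where $f$ is invertible modulo $q$ and modulo $p$, and compute $h = g \cdot f^{-1} \bmod q$
Encryption: $c = p \cdot r \cdot h + m \bmod q$, where $r$ is a random small polynomial
Decryption: $m = f_p^{-1} \cdot (f \cdot c \bmod q) \bmod p$, where $f_p^{-1}$ is the inverse of $f$ modulo $p$
CRYSTALS-Dilithium (signature, simplified):
Key Generation: $\mathbf{t} = \mathbf{A}\mathbf{s}_1 + \mathbf{s}_2$, where $\mathbf{s} = (\mathbf{s}_1, \mathbf{s}_2)$ is the secret key
Signing: Uses commitment-challenge-response with $\mathbf{w} = \mathbf{A}\mathbf{y}$, $c = H(M \,\|\, \mathrm{HighBits}(\mathbf{w}))$, $\mathbf{z} = \mathbf{y} + c\,\mathbf{s}_1$
Verification: Check $\|\mathbf{z}\|_\infty < \gamma_1 - \beta$ and $c = H(M \,\|\, \mathrm{HighBits}(\mathbf{A}\mathbf{z} - c\,\mathbf{t}))$
Falcon (signature):
Key Generation: Find small $f, g, F, G$ with NTRU relation $fG - gF = q$ and compute a Gram-Schmidt basis; the public key is $h = g \cdot f^{-1} \bmod q$
Signing: Sample a short vector $\mathbf{s} = (s_1, s_2)$ such that $s_1 + s_2 h = H(r \,\|\, M) \bmod q$ using the Gram-Schmidt basis, where $r$ is a random salt
Verification: Check $s_1 + s_2 h = H(r \,\|\, M) \bmod q$ and $\|(s_1, s_2)\|^2 \le \beta^2$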
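The key generation, encryption, and decryption pattern described for Kyber can be sketched with a toy, unstructured LWE scheme in Python. This is an illustrative simplification and not Kyber: it works over integer vectors instead of module lattices, encrypts a single bit, draws all noise from {-1, 0, 1}, and uses a hypothetical dimension `K = 8`; only the modulus 3329 is borrowed from Kyber.

```python
import random

Q = 3329   # Kyber's modulus, reused here for a toy scheme
K = 8      # toy dimension (assumption; far too small for any security)

def small(rng, n):
    """Vector of n small coefficients drawn from {-1, 0, 1}."""
    return [rng.choice((-1, 0, 1)) for _ in range(n)]

def keygen(rng):
    A = [[rng.randrange(Q) for _ in range(K)] for _ in range(K)]
    s, e = small(rng, K), small(rng, K)
    # t = A s + e (mod Q)
    t = [(sum(A[i][j] * s[j] for j in range(K)) + e[i]) % Q for i in range(K)]
    return (A, t), s

def encrypt(pk, bit, rng):
    A, t = pk
    r, e1 = small(rng, K), small(rng, K)
    e2 = rng.choice((-1, 0, 1))
    # u = A^T r + e1 (mod Q)
    u = [(sum(A[i][j] * r[i] for i in range(K)) + e1[j]) % Q for j in range(K)]
    # v = t^T r + e2 + bit * floor(Q / 2) (mod Q)
    v = (sum(t[i] * r[i] for i in range(K)) + e2 + bit * (Q // 2)) % Q
    return u, v

def decrypt(s, ct):
    u, v = ct
    # v - s^T u = noise + bit * floor(Q / 2), with |noise| <= 2K + 1
    d = (v - sum(s[j] * u[j] for j in range(K))) % Q
    return 1 if Q // 4 < d < 3 * Q // 4 else 0
```

With these toy bounds the accumulated noise is at most 2K + 1 = 17, far below Q/4, so decryption never fails; practical parameter sets instead accept a negligible failure probability in exchange for compact keys and ciphertexts.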
3.2. Testing Environments
To capture performance characteristics across the computing spectrum, we established three representative hardware platforms that reflect common deployment scenarios from high-performance servers to resource-constrained edge devices:
These environments represent distinct operational scenarios:
Server E1 represents high-performance cloud or data center infrastructure where computational resources are abundant and processing power is prioritized.
Laptop E2 represents typical end-user devices used by individuals and small businesses, with moderate performance capabilities.
Device E3 represents resource-constrained edge devices, similar to those found in IoT deployments, where power consumption and memory constraints are significant concerns.
This diverse selection allows us to comprehensively evaluate how PQC algorithms perform across the computing spectrum and identify the most suitable algorithms for each environment.
Table 2 lists the hardware specifications of the test environments.
3.3. Software Environment and Implementation Details
To ensure reliable and consistent results, we standardized the software environment across all platforms:
3.4. Measurement Methodology
Our experimental methodology was designed to collect statistically robust data across multiple dimensions:
3.4.1. Core Cryptographic Operations
For each algorithm and security level, we measured the fundamental operations with the following procedure:
3.4.2. Resource Utilization Metrics
Beyond timing measurements, we captured comprehensive resource utilization:
3.4.3. Protocol Integration Assessment
To evaluate practical impacts in network protocols, we conducted detailed TLS benchmarks:
3.5. Statistical Analysis and Validation Procedures
To ensure the validity and significance of our findings, we employed rigorous statistical methods:
This comprehensive methodology provides a robust framework for evaluating PQC algorithms across diverse computing environments. The following section presents the detailed results of our benchmarking efforts and analyzes their implications for real-world deployments.
4. Results and Analysis
This section presents our comprehensive benchmarking results for the five post-quantum cryptographic (PQC) algorithms: CRYSTALS-Kyber, NTRU, BIKE, CRYSTALS-Dilithium, and Falcon across our three hardware environments. We analyze raw operation times, memory usage, network overhead in TLS sessions, concurrency scaling, and energy consumption patterns. Essential interpretations are integrated directly alongside the results to clarify implications for real-world deployments.
4.1. Raw Operation Times
We begin our analysis with the fundamental operations of each PQC algorithm, examining performance across different security levels and hardware environments.
4.1.1. Performance at NIST Security Level 1
Table 3 presents the operation times at NIST Security Level 1, which targets security comparable to AES-128 (approximately 128-bit classical security).
4.1.2. Performance at NIST Security Level 3
Table 4 summarizes operation times at NIST Security Level 3, corresponding to approximately 192-bit classical security, which is generally recommended for long-term security.
4.1.3. Performance at NIST Security Level 5
Table 5 shows operation times at NIST Security Level 5, providing the highest security equivalent to 256-bit classical security, appropriate for protecting highly sensitive information.
Server E1: All operations complete in ≤0.28 ms across all algorithms and security levels, indicating negligible overhead for most cloud or data-center workloads. BIKE’s key generation at Level 5 (0.24 ms) represents the most computationally intensive KEM operation.
Laptop E2: Operations complete in under 1 ms for lattice-based schemes, with BIKE operations at Level 5 approaching 1 ms. This suggests that modern consumer devices can readily support PQC with minimal perceptible impact.
Device E3: Times increase by approximately one order of magnitude compared to Server E1. BIKE shows the most significant slowdown on resource-constrained hardware, with key generation at Level 5 exceeding 4 ms.
Algorithm Comparison: Among KEMs, CRYSTALS-Kyber consistently delivers the best performance across all operations and security levels, followed by NTRU, with BIKE showing higher computational costs. For signature schemes, Falcon offers significantly faster verification than Dilithium (approximately 45% faster on Device E3) but at the cost of much slower signing operations and key generation.
Performance Variability: Standard deviations on Device E3 are higher for BIKE operations (approximately 4–5% of the mean value) compared to lattice-based schemes (typically 3–4%), indicating less predictable execution times. This reflects BIKE’s probabilistic decoding process, which may require varying numbers of iterations.
Scaling with Security Levels: We observe a consistent overhead increase of 25–35% for lattice-based schemes when moving from Level 1 to Level 3, and a further 20–30% increase from Level 3 to Level 5. BIKE shows steeper scaling, with approximately 35–40% increase from Level 1 to Level 3 and 35–45% from Level 3 to Level 5.
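Because the two steps compound, the cumulative cost of moving from Level 1 to Level 5 is larger than either step alone. The short Python calculation below applies the per-step ranges quoted above; it is simple arithmetic on those reported percentages, not additional measurement data.

```python
def compound(step_a_pct, step_b_pct):
    """Cumulative percentage increase after two successive percentage increases."""
    return ((1 + step_a_pct / 100) * (1 + step_b_pct / 100) - 1) * 100

# Per-step ranges from the text: (Level 1 -> 3, Level 3 -> 5).
lattice_low, lattice_high = compound(25, 20), compound(35, 30)
bike_low, bike_high = compound(35, 35), compound(40, 45)

print(f"lattice Level 1 -> 5: {lattice_low:.0f}% to {lattice_high:.0f}%")
print(f"BIKE    Level 1 -> 5: {bike_low:.0f}% to {bike_high:.0f}%")
```

Under these ranges the cumulative Level 1 to Level 5 increase works out to roughly 50–76% for lattice-based schemes and 82–103% for BIKE.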
4.1.4. Impact of Security Level
Figure 1 illustrates how encryption/verification performance scales from NIST Level 1 to Level 5 for each algorithm on Device E3. Each data point represents the mean time per operation over 1000 trials.
Interpretation: Systems with strict real-time constraints may need to limit themselves to Level 1 or 3 if frequent cryptographic operations are required, particularly when using BIKE or Falcon for signing.
Figure 2 demonstrates the same scaling pattern on Server E1, where the absolute times remain small enough that security level can be selected based primarily on security requirements rather than performance constraints.
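A per-operation mean over repeated trials, together with the relative standard deviation used when discussing variability, can be collected with a harness along the following lines. This Python sketch is an assumption about how such a loop might look, not the measurement code used in this study: the warm-up count is arbitrary, and a SHA3-256 hash stands in for the cryptographic operation so that the example runs with the standard library alone.

```python
import hashlib
import statistics
import time

def benchmark(op, trials=1000, warmup=50):
    """Return (mean_ms, stdev_ms, cv_percent) for a zero-argument callable."""
    for _ in range(warmup):            # warm caches before timing (assumed count)
        op()
    samples_ms = []
    for _ in range(trials):
        start = time.perf_counter_ns()
        op()
        samples_ms.append((time.perf_counter_ns() - start) / 1e6)
    mean = statistics.fmean(samples_ms)
    stdev = statistics.stdev(samples_ms)
    return mean, stdev, 100 * stdev / mean

# Stand-in workload; a real run would call a KEM or signature operation here.
payload = bytes(1024)
mean_ms, stdev_ms, cv = benchmark(lambda: hashlib.sha3_256(payload).digest())
print(f"mean={mean_ms:.4f} ms  stdev={stdev_ms:.4f} ms  cv={cv:.1f}%")
```

Reporting the standard deviation as a percentage of the mean makes variability comparable across operations whose absolute times differ by an order of magnitude.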
4.2. Key, Ciphertext, and Signature Sizes
Beyond computational cost, PQC often incurs higher bandwidth usage due to larger key and ciphertext sizes. Table 6, Table 7, and Table 8 list the byte lengths for the tested schemes at different security levels.
Figure 3 illustrates the differences in ciphertext and signature sizes across algorithms and security levels.
Algorithm Comparison: Lattice-based KEMs (Kyber and NTRU) generally offer the most compact public keys and ciphertexts. BIKE shows substantially larger key and ciphertext sizes, with Level 5 parameters exceeding 5 KB, which may present challenges in bandwidth-constrained environments.
Signature Size Comparison: Falcon offers significantly smaller signatures than Dilithium (approximately 63% smaller at Level 3), which could be critical for bandwidth-constrained applications that frequently transmit signed messages.
Network Considerations: Standard 4G/5G or broadband links can generally handle these key and ciphertext sizes. However, narrow-bandwidth channels (e.g., satellite or LPWAN) may face challenges with BIKE’s larger keys and Level 5 parameters or Dilithium signatures.
Embedded Constraints: Many IoT protocols limit packet sizes to 1–2 KB, which would necessitate fragmentation for BIKE at all security levels and for other algorithms at Level 5.
Impact of Security Level: Increasing security levels result in proportionally larger key and signature sizes. This scaling is roughly linear for lattice-based schemes (30–40% increase from Level 3 to Level 5), but BIKE shows a steeper increase of approximately 65–70%.
4.3. Comparison with Classical Cryptography
To contextualize our PQC benchmarks, Table 9 compares the performance of post-quantum algorithms with traditional cryptographic schemes at roughly equivalent security levels (NIST Level 3 for PQC versus 192-bit security for classical algorithms) on Server E1.
This comparison reveals several important insights:
Key Generation Efficiency: All PQC algorithms demonstrate dramatically faster key generation than RSA (over 1000x faster for Kyber), addressing a significant bottleneck in classical asymmetric cryptography.
Encryption Performance: Lattice-based and code-based encryption operations are competitive with or faster than RSA encryption, with Kyber showing the best performance.
Verification Trade-offs: While RSA verification remains the fastest operation, Falcon’s verification approaches RSA’s speed. ECDSA shows the slowest verification times among the tested algorithms.
Overall Balance: PQC schemes generally offer more balanced performance profiles than classical algorithms, which typically excel at either signing (ECDSA) or verification (RSA) but not both.
4.4. Memory Usage
We measured memory footprint (peak resident set size, RSS) across all algorithms to understand their resource requirements. Table 10 reports average RSS during cryptographic operations at NIST Level 3.
These measurements reveal important distinctions in memory efficiency:
Lattice-based KEMs: Kyber and NTRU demonstrate the lowest memory requirements, with Kyber being the most memory-efficient algorithm overall.
Code-based Cryptography: BIKE shows substantially higher memory usage compared to lattice-based KEMs, requiring approximately 50% more memory than Kyber. This reflects the larger matrices and syndrome computation tables needed for BIKE’s decoding operations.
Signature Schemes: Falcon exhibits high memory requirements, comparable to BIKE, primarily due to its Fast Fourier Transform operations and floating-point arithmetic during signing operations.
Implications for Embedded Systems: While all algorithms function on our Device E3 platform (1 GB RAM), highly constrained microcontrollers with 256 KB of RAM or less would struggle with BIKE and Falcon operations without significant implementation optimizations.
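Peak RSS can be read from the operating system after the operation of interest has run. The following Python sketch shows one way to do this on Unix-like systems with the standard `resource` module; it is illustrative only, the buffer allocation is a stand-in for an algorithm's working memory, and the units of `ru_maxrss` differ by platform (kilobytes on Linux, bytes on macOS).

```python
import resource
import sys

def peak_rss_kib():
    """Peak resident set size of the current process, in KiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # macOS reports bytes; Linux reports kilobytes.
    return peak / 1024 if sys.platform == "darwin" else peak

before = peak_rss_kib()
working_memory = bytearray(32 * 1024 * 1024)   # stand-in for an operation's buffers
after = peak_rss_kib()
print(f"peak RSS before: {before:.0f} KiB, after: {after:.0f} KiB")
```

Because `ru_maxrss` is a high-water mark for the whole process, a per-algorithm figure would require running each operation in a fresh process and accounting for the runtime's own baseline.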
4.5. Network Overhead in TLS Handshakes
To measure real-world impact on secure channels, we tested ephemeral key exchange in OpenSSL-OQS TLS 1.3 handshakes. For each KEM algorithm, we measured the total handshake time, including cryptographic operations and protocol message exchanges (Table 11).
Key Observations:
The handshake overhead remains under 3 ms for all tested PQC algorithms. BIKE shows the highest overhead due to its larger key sizes and more complex decryption operations.
The “Hybrid Mode” column shows results when combining classical ECDHE with post-quantum KEMs. This approach adds only a small additional overhead (approximately 0.3–0.4 ms) while offering both classical and post-quantum security.
Post-quantum handshakes are approximately 22–55% slower than pure ECDHE handshakes, with BIKE showing the largest relative increase (55%).
In heavily loaded servers with thousands of concurrent TLS handshakes, the performance differences between algorithms (0.6 ms between Kyber and BIKE) can have meaningful impact on throughput.
Figure 4 demonstrates how the relative impact of cryptographic overhead diminishes as network latency increases. At typical Internet latencies (20–60 ms), the difference between post-quantum and classical handshakes becomes negligible, constituting less than 2.5% of the total handshake time.
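The diminishing relative impact can be made concrete with a simple model in which total handshake time is the sum of a classical compute cost, the added post-quantum cost, and network round trips. The Python sketch below is a back-of-the-envelope illustration, not the model behind the reported figure: it assumes latency is a one-way delay, a single round trip per handshake, and a hypothetical classical baseline of 1.8 ms.

```python
def overhead_share(pq_overhead_ms, classical_ms, one_way_latency_ms, round_trips=1):
    """Fraction of total handshake time attributable to the added PQC cost."""
    network_ms = 2 * one_way_latency_ms * round_trips
    total_ms = classical_ms + pq_overhead_ms + network_ms
    return pq_overhead_ms / total_ms

CLASSICAL_MS = 1.8   # hypothetical ECDHE handshake compute time (assumption)

for latency_ms in (0, 20, 60):
    for name, overhead_ms in (("lower", 0.4), ("upper", 1.0)):
        share = overhead_share(overhead_ms, CLASSICAL_MS, latency_ms)
        print(f"latency={latency_ms:>2} ms  {name} overhead {overhead_ms} ms -> {share:.1%}")
```

Under these assumptions, an added cost of 1.0 ms accounts for a large share of a zero-latency handshake but falls below 2.5% once the one-way latency reaches 20 ms.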
We also investigated the effects of packet fragmentation when using different PQC algorithms. Table 12 shows the number of IP packets required to complete a TLS handshake across different MTU sizes.
These results highlight potential challenges for narrow-bandwidth networks and protocols with restricted packet sizes. BIKE exhibits substantially higher fragmentation, requiring up to 20 packets to complete a handshake on networks with small MTUs. This fragmentation increases both latency and the risk of handshake failures due to packet loss, particularly in congested or unreliable networks.
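The relationship between message size, MTU, and packet count can be approximated with a ceiling division. The Python sketch below counts the packets needed for a single handshake message, not a whole handshake; the 40-byte header allowance (IPv4 plus TCP without options) and the example sizes are assumptions for illustration, with 5120 bytes standing in for a key of about 5 KB and 1184 bytes being the published Kyber-768 public key size.

```python
import math

def packets_needed(message_bytes, mtu=1500, header_bytes=40):
    """Packets required to carry one message when each packet holds mtu - header_bytes."""
    payload = mtu - header_bytes
    if payload <= 0:
        raise ValueError("MTU must exceed the header size")
    return math.ceil(message_bytes / payload)

for mtu in (1500, 576):
    for label, size in (("~1.2 KB key", 1184), ("~5 KB key", 5120)):
        print(f"MTU {mtu:>4}: {label} -> {packets_needed(size, mtu)} packet(s)")
```

A key that fits in one packet at a 1500-byte MTU spans three at 576 bytes, while a 5 KB key grows from four packets to ten, which illustrates why small MTUs amplify the cost of larger parameter sets.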
4.6. Concurrency and Scalability
We evaluated how the PQC algorithms perform under multi-threaded loads to understand their behavior in high-throughput environments. Figure 5 shows how throughput (operations per second) scales with increasing threads on Server E1.
Our concurrency testing reveals several important characteristics:
All algorithms show near-linear scaling up to 8 threads, corresponding to the number of physical cores on Server E1. This indicates efficient parallelization potential for server-class deployments.
Beyond 8 threads, performance gains begin to taper due to shared cache and memory bandwidth limitations, although Falcon verification continues to show good scaling.
Falcon verification demonstrates the best absolute performance and scaling, achieving over 11,000 operations per second with 16 threads. This exceptional throughput makes Falcon particularly suitable for applications requiring high-volume signature verification.
BIKE shows the poorest scaling profile among the tested algorithms, achieving only about 70% of the throughput of Kyber at high thread counts. This is likely due to BIKE’s more complex memory access patterns and higher cache pressure.
The ability to efficiently parallelize cryptographic operations confirms that multi-core servers can handle high-volume PQC tasks without creating major performance bottlenecks.
Table 13 demonstrates the effect of batch processing on throughput, showing potential efficiency gains when multiple operations are processed consecutively. This is particularly relevant for server environments handling thousands of connections. The modest efficiency gains (5–7%) suggest that while batching provides some benefit, the algorithms already achieve good performance even for individual operations.
4.7. Energy Consumption
For resource-constrained environments such as IoT networks, energy consumption is a critical consideration. Figure 6 shows the estimated energy consumption per operation for each algorithm on Device E3 at NIST Security Level 3.
These measurements reveal significant differences in energy profiles:
Kyber consistently demonstrates the lowest energy requirements for KEM operations, making it the most suitable choice for battery-powered devices.
NTRU key generation consumes approximately 57% more energy than Kyber key generation, reinforcing Kyber’s advantage for IoT devices that may need to refresh keys frequently.
BIKE operations show substantially higher energy consumption compared to lattice-based KEMs, with key generation requiring nearly three times the energy of Kyber. This makes BIKE less suitable for energy-constrained environments.
For signature schemes, Falcon presents an interesting trade-off: its verification is extremely energy-efficient (45% less than Dilithium), but its signing operation consumes more than twice the energy of Dilithium signing.
These energy metrics suggest that application-specific algorithm selection can significantly impact device battery life. For example, devices that primarily verify signatures would benefit from Falcon, while those that frequently sign data would achieve better battery life with Dilithium.
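To relate per-operation energy to battery budgets, energy can be estimated as average power multiplied by duration and then converted into operations per watt-hour. The Python sketch below uses the Kyber and BIKE key-generation figures quoted in Section 5.1 (3.5 mJ and 9.2 mJ); the helper names are hypothetical, and the result ignores idle draw, radio activity, and every other consumer of energy on a real device.

```python
MJ_PER_WH = 3_600_000   # 1 Wh = 3600 J = 3.6e6 mJ

def energy_mj(avg_power_w, duration_ms):
    """E = P * t; watts multiplied by milliseconds gives millijoules."""
    return avg_power_w * duration_ms

def ops_per_wh(energy_mj_per_op):
    """Number of operations one watt-hour could fund at the given cost."""
    return MJ_PER_WH / energy_mj_per_op

kyber_keygen_mj = 3.5   # from Section 5.1
bike_keygen_mj = 9.2    # from Section 5.1

print(f"BIKE / Kyber energy ratio: {bike_keygen_mj / kyber_keygen_mj:.2f}")
print(f"Kyber key generations per Wh: {ops_per_wh(kyber_keygen_mj):,.0f}")
print(f"BIKE key generations per Wh:  {ops_per_wh(bike_keygen_mj):,.0f}")
```

On these figures one watt-hour funds roughly a million Kyber key generations against fewer than 400,000 for BIKE, a ratio of about 2.6.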
4.8. Comparison with NIST Benchmarks
To validate our results against established benchmarks, Table 14 compares our measurements with those reported by NIST during their standardization process [4]. This comparison uses NIST Level 3 parameters on systems with similar specifications to our Server E1.
Our measurements show consistently better performance than the NIST benchmark results, with improvements ranging from 10–25%. These differences can be attributed to:
More recent compiler optimizations in GCC 11.3.0 than in the earlier compiler versions used in NIST’s evaluation
Implementation improvements in the latest liboqs library (version 0.7.2)
Hardware-specific optimizations enabled by our compiler flags, particularly on Server E1
Different test methodologies and environmental conditions
This comparison validates our methodology while highlighting the ongoing performance improvements in PQC implementations. It also suggests that actual deployments today may achieve better performance than reported in earlier standardization documentation, potentially accelerating the transition to quantum-resistant cryptography.
4.9. Overall Performance Analysis
Our comprehensive benchmarking reveals several key insights about the practical performance of post-quantum cryptography across different computing environments:
Algorithm Efficiency: Among KEMs, CRYSTALS-Kyber consistently delivers the best overall performance in terms of speed, memory usage, and energy efficiency, while BIKE demonstrates the highest computational and memory costs. For signature schemes, the choice between Dilithium and Falcon presents clear trade-offs between signing speed (favoring Dilithium) and verification performance and signature size (favoring Falcon).
Hardware Scalability: All tested algorithms demonstrate reasonable performance even on resource-constrained hardware (Device E3), though with significant variations. The performance gap between high-end servers and edge devices is approximately one order of magnitude, indicating that while PQC is feasible on IoT-class hardware, careful algorithm selection and security level choices are important for constrained environments.
Security Level Impact: Each step up in security level (Level 1 to Level 3, and Level 3 to Level 5) adds roughly 20–35% to operation times for lattice-based schemes and 35–45% for BIKE, accompanied by proportional increases in key and signature sizes. This scaling behavior should inform security-performance trade-off decisions, particularly in resource-constrained environments.
Protocol Integration: TLS handshakes with post-quantum KEMs show moderate overhead compared to classical ECDHE, but the differences become negligible in realistic network conditions with typical Internet latencies. Packet fragmentation may become a concern with larger parameter sets and constrained MTUs, particularly for BIKE.
Concurrency Benefits: All algorithms demonstrate good parallelization potential, with near-linear scaling up to the number of physical CPU cores. This confirms that PQC is readily deployable in high-throughput server environments without creating significant performance bottlenecks.
Energy Considerations: Energy consumption varies significantly across algorithms, with Kyber and Falcon verification demonstrating the highest efficiency. For battery-powered IoT devices, these energy differences could translate to meaningful impacts on operational lifespan.
These findings collectively support the conclusion that post-quantum cryptography has matured to the point where it can be deployed across a wide spectrum of computing environments, from high-performance servers to constrained IoT devices. While performance characteristics vary across algorithms and security levels, all tested schemes demonstrate practicality for real-world usage, with appropriate selection based on specific application requirements and hardware constraints.
5. Discussion and Conclusions
Our comprehensive benchmarking results demonstrate that post-quantum cryptography (PQC) has matured from theoretical proposals into practical, deployable solutions suitable for diverse computing environments. The performance profiles of CRYSTALS-Kyber, NTRU, BIKE, CRYSTALS-Dilithium, and Falcon across server, laptop, and edge devices provide valuable insights for organizations planning their quantum-resistant security strategies. This section synthesizes our key findings, addresses implementation challenges, and outlines future research directions.
5.1. Algorithm Performance and Environmental Suitability
The experimental data reveals distinct performance characteristics that make certain algorithms more suitable for specific deployment scenarios. In resource-constrained environments such as IoT and edge devices, CRYSTALS-Kyber consistently demonstrates superior performance with the lowest key-generation latency (0.92 ms at Level 1 on Device E3), modest memory footprint (850 KB), and minimal energy consumption (3.5 mJ per key generation). Although NTRU exhibits comparable encryption and decryption times, its key generation is approximately 50% slower than Kyber’s, creating potential bottlenecks for devices that require frequent re-keying.
BIKE, representing the code-based cryptographic approach, shows noticeably higher computational and memory requirements compared to lattice-based KEMs. Its key generation time (2.25 ms at Level 1 on Device E3) is more than twice that of Kyber, and its energy consumption (9.2 mJ) approaches three times Kyber’s requirements. This significant performance gap makes BIKE less suitable for energy-constrained IoT devices, though it provides valuable algorithmic diversity in a cryptographic portfolio.
For signature schemes, our expanded testing presents a nuanced comparison between CRYSTALS-Dilithium and Falcon. Dilithium offers moderate signing times and memory usage, making it a balanced choice for many applications. However, Falcon provides significantly smaller signatures (approximately 63% smaller than Dilithium at Level 3) and substantially faster verification (43% faster on Device E3), albeit with slower signing operations and higher memory requirements. This tradeoff becomes particularly important in bandwidth-constrained environments where signature size directly impacts transmission efficiency, or in applications where verification operations significantly outnumber signing operations.
On high-performance servers, all tested algorithms deliver sub-millisecond operation times at NIST Level 3, with excellent concurrency scaling across multiple threads. The near-linear throughput gains observed with up to eight threads indicate that modern data centers can handle thousands of parallel PQC operations with minimal overhead. Even Falcon’s more complex signing procedure requires only 0.22 ms on our server platform, suggesting that computational constraints are unlikely to be a significant barrier to PQC adoption in enterprise environments.
Our performance comparison with classical cryptography demonstrates that PQC alternatives often outperform their traditional counterparts in key operations. For instance, all PQC key generation operations dramatically outperform RSA-3072 (over 1000x faster for Kyber), addressing a significant performance bottleneck in classical cryptography. This comparison reinforces that the adoption of quantum-resistant algorithms need not entail significant performance penalties and may actually improve system efficiency in many scenarios.
5.2. Practical Deployment Considerations
The integration of PQC into existing systems presents several important challenges that organizations must address. First, the increased key and signature sizes of post-quantum algorithms—ranging from approximately 700 bytes to over 5 KB depending on the algorithm and security level—may create bandwidth and protocol compatibility issues. Our TLS handshake fragmentation analysis revealed that while lattice-based schemes require only minimal additional packets compared to classical algorithms, BIKE can necessitate substantially more packets, particularly on networks with restricted MTUs. In such constrained environments, packet fragmentation becomes likely, potentially increasing handshake complexity, latency, and vulnerability to packet loss.
Legacy infrastructure presents another significant hurdle for PQC deployment. Many industrial systems and embedded devices rely on cryptographic libraries and protocols designed with assumptions about key and signature sizes that no longer hold for post-quantum algorithms. Timing assumptions may need revisiting as well: our measurements of TLS handshakes with post-quantum KEMs show overhead ranging from 0.4 ms (Kyber) to 1.0 ms (BIKE) compared to classical ECDHE. While these differences are modest in absolute terms, they may require adjustments to handshake timeouts and connection management systems in high-throughput services.
To address these challenges, a phased migration approach using hybrid cryptographic solutions offers a pragmatic path forward. By combining classical algorithms (e.g., ECDHE) with post-quantum schemes (e.g., Kyber), organizations can maintain backward compatibility while gradually introducing quantum resistance. Our TLS benchmarks with hybrid modes demonstrate that this approach adds only marginal overhead (approximately 0.3–0.4 ms) compared to pure post-quantum handshakes, making it an attractive transitional strategy.
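The core idea of a hybrid key exchange is that the session key depends on both shared secrets, so it remains protected as long as either component is unbroken. The sketch below shows one common way to achieve this: concatenating the two secrets and passing them through HKDF with SHA-256 (RFC 5869). It is an illustration under stated assumptions and not the combiner used in our TLS benchmarks. The function `combine_hybrid_secret`, the label string, and the length-prefixed encoding are hypothetical choices, and the random byte strings merely stand in for real ECDHE and Kyber outputs.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF with SHA-256 (RFC 5869): extract a pseudorandom key, then expand it."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def combine_hybrid_secret(classical_ss: bytes, pq_ss: bytes, transcript: bytes) -> bytes:
    """Derive one session key from a classical and a post-quantum shared secret.

    Each secret is length-prefixed so that the concatenation is unambiguous.
    The label below is a hypothetical context string for this sketch.
    """
    ikm = (len(classical_ss).to_bytes(2, "big") + classical_ss
           + len(pq_ss).to_bytes(2, "big") + pq_ss)
    return hkdf_sha256(ikm, salt=b"\x00" * 32, info=b"hybrid-kex-sketch" + transcript)


# Placeholder secrets: random bytes standing in for ECDHE and Kyber outputs.
ecdhe_secret = os.urandom(32)
kyber_secret = os.urandom(32)
session_key = combine_hybrid_secret(ecdhe_secret, kyber_secret, b"handshake-transcript")
print(len(session_key), "byte session key derived")
```

Because both secrets feed the same key derivation, the extra cost over a pure post-quantum handshake is dominated by the additional classical key exchange rather than by the combiner itself.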
The comparative analysis with NIST’s benchmark results validates our methodology while highlighting ongoing performance improvements in PQC implementations. The 10–25% performance gains we observed relative to earlier NIST measurements suggest that continued optimization efforts by the cryptographic community are steadily enhancing the practicality of these algorithms. This trend is encouraging for organizations planning large-scale PQC deployments in the near future.
5.3. Security Considerations and Optimizations
While our testing focused primarily on performance metrics, several important security considerations merit attention. The mathematical foundations of the tested algorithms—Module Learning With Errors for Kyber and Dilithium, NTRU lattices for NTRU and Falcon, and quasi-cyclic moderate-density parity-check codes for BIKE—provide different approaches to achieving quantum resistance. This diversity is valuable from a security perspective, as it mitigates the risk of a single cryptanalytic breakthrough compromising all quantum-resistant systems.
However, actual implementations may be vulnerable to side-channel attacks that exploit timing variations, power consumption patterns, or other physical characteristics of the cryptographic operations. These vulnerabilities are of particular concern for lattice-based schemes due to their reliance on polynomial arithmetic operations that can leak information through various side channels. Code-based cryptography like BIKE may present different side-channel characteristics, potentially offering advantages in certain deployment scenarios where specific attack vectors are of concern.
Future implementations should incorporate robust countermeasures such as constant-time operations, blinding techniques, and possibly hardware-based protections. These security enhancements will likely impact performance, and their effects should be carefully measured against the baseline results we have established.
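As a minimal illustration of the constant-time principle, the sketch below contrasts a byte comparison that exits at the first mismatch with the standard-library routine `hmac.compare_digest`, which is designed so that its running time does not depend on where the inputs differ. The example is generic rather than specific to any of the benchmarked schemes. The function names are hypothetical, and an interpreted language offers only weak timing guarantees, so production PQC libraries implement such checks in carefully written C or assembly.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: running time reveals the position of the first mismatch."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns sooner the earlier the mismatch occurs
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison whose timing is designed not to depend on the data contents."""
    return hmac.compare_digest(a, b)


expected = bytes(range(32))
tampered = bytes([0xFF]) + expected[1:]

# Both functions agree on the result; they differ only in timing behavior.
print(leaky_equal(expected, tampered), constant_time_equal(expected, tampered))
print(leaky_equal(expected, expected), constant_time_equal(expected, expected))
```

Comparisons of this kind matter in practice because KEM decapsulation based on the Fujisaki-Okamoto transform, as in Kyber, checks a re-encrypted ciphertext against the received one, and a data-dependent comparison at that step can leak information about the secret key.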
Selecting appropriate security levels represents another key decision point for PQC deployments. Our results show that moving from NIST Level 1 to Level 5 typically increases computation time by 30–50% for lattice-based schemes and 35–80% for BIKE, while enlarging key and signature sizes by similar proportions. Organizations should tailor these security parameters to specific use cases, potentially implementing different security levels within the same system based on risk profiles and performance requirements.
The prospect of hardware acceleration offers promising avenues for further optimizing PQC performance. The polynomial operations central to lattice-based cryptography and the syndrome decoding operations in BIKE could benefit significantly from dedicated hardware implementations or specialized instructions. As PQC becomes more widely deployed, we anticipate increasing vendor support for accelerated implementations in CPUs, FPGAs, and specialized cryptographic modules. Such hardware optimization could substantially reduce the performance gap between classical and post-quantum cryptography, particularly for resource-constrained devices.
5.4. Recommendations and Future Work
Based on our comprehensive evaluation, we offer several recommendations for organizations planning PQC deployments:
For IoT and edge environments with limited computational resources, CRYSTALS-Kyber provides the most efficient key encapsulation mechanism across all measured metrics (speed, memory, and energy). For signature schemes, the choice between Dilithium and Falcon should be based on the relative importance of signature size, verification speed, and memory usage. Applications that verify signatures much more frequently than they generate them may benefit substantially from Falcon’s faster verification times, while those with tighter memory constraints or frequent signing operations would be better served by Dilithium.
When algorithmic diversity is a priority (e.g., to mitigate the risk of cryptanalytic breakthroughs), BIKE provides a viable non-lattice alternative, though with performance trade-offs that must be carefully considered, particularly in resource-constrained environments. The significantly larger key sizes and higher computational requirements of BIKE make it more suitable for server or desktop environments rather than IoT deployments.
Enterprise and cloud deployments can readily adopt any of the tested algorithms without significant performance concerns, as all achieve sub-millisecond operation times on modern server hardware. In these environments, selection criteria should focus on standards compliance, security requirements, and integration with existing systems rather than raw performance. The excellent concurrency scaling observed for all algorithms indicates that high-volume server applications can efficiently handle PQC operations even under significant load.
Hybrid cryptography approaches combining classical and post-quantum algorithms offer a pragmatic transition strategy that maintains compatibility with legacy systems while incrementally introducing quantum resistance. The modest additional overhead demonstrated in our TLS benchmarks supports this approach for organizations that cannot immediately upgrade all endpoints.
Further research is needed in several key areas to continue advancing PQC deployment. More comprehensive benchmarking of hybrid modes and composite ciphersuites would provide valuable guidance for organizations implementing transitional cryptographic protocols. As side-channel attack surfaces become better understood, systematic evaluation of countermeasures and their performance impacts will be essential for securing real-world implementations.
Hardware acceleration technologies, including specialized CPU instructions, GPU-accelerated implementations, and FPGA-based cryptographic accelerators, represent promising avenues for further optimizing PQC performance. Continued development in this area could substantially reduce the overhead of quantum-resistant cryptography, particularly for resource-constrained devices.
In conclusion, our results confirm that post-quantum cryptography has progressed beyond theoretical concern to practical implementation. The benchmarked algorithms (CRYSTALS-Kyber, NTRU, BIKE, CRYSTALS-Dilithium, and Falcon) all demonstrate performance characteristics suitable for real-world deployment across a spectrum of computing environments, with appropriate selection based on specific application requirements and hardware constraints. By understanding the performance profiles, security considerations, and implementation challenges associated with these algorithms, organizations can develop effective strategies for migrating to quantum-resistant cryptography while maintaining system performance and compatibility.
As quantum computing continues to advance, the cryptographic community must accelerate efforts to deploy post-quantum solutions before large-scale quantum computers become capable of breaking classical cryptographic schemes. Our findings provide a foundation for informed decision-making in this critical transition, helping to ensure that digital communications remain secure in the post-quantum era.