Article

Parameter Settings and Efficient Computation for Homomorphic Encryption in CKKS

Department of Electrical and Computer Engineering, University of Windsor, Windsor, ON N9B 3P4, Canada
*
Author to whom correspondence should be addressed.
Electronics 2026, 15(11), 2391; https://doi.org/10.3390/electronics15112391
Submission received: 30 April 2026 / Revised: 25 May 2026 / Accepted: 28 May 2026 / Published: 1 June 2026
(This article belongs to the Section Computer Science & Engineering)

Abstract

This paper investigates the impact of modular reduction backends on the security–performance trade-offs of the CKKS approximate homomorphic encryption scheme. Specifically, we compare a standard Residue Number System–Chinese Remainder Theorem (RNS-CRT) implementation with a Barrett reduction–based backend across representative parameter sets $(N, \log q)$, spanning approximately 80–256-bit security levels as recommended by the Homomorphic Encryption Standard. We evaluate typical CKKS workloads (encoding, encryption, homomorphic multiplication with the Number Theoretic Transform (NTT), rescaling, relinearization, and decryption) by measuring execution time and peak memory usage on a uniform experimental platform. Our results indicate that the parameter pair $(N, \log q)$ primarily determines both security level and computational cost, while the choice of backend significantly influences the trade-offs between performance and memory efficiency. In particular, the Barrett-based backend is competitive and slightly more memory-efficient at lower security levels, whereas the CRT-based approach achieves lower latency at higher security levels. Overall, Barrett reduction provides memory savings at the cost of a 2–4× increase in runtime. Based on these findings, we derive practical guidelines for selecting CKKS parameters and modular reduction backends under varying constraints on security, latency, and memory.

1. Introduction

Modern cryptography provides mathematically grounded tools to ensure data confidentiality, integrity, and authenticity in adversarial environments [1]. As digital health systems and connected medical devices proliferate, healthcare providers increasingly rely on cryptography to protect sensitive clinical records, diagnostic data, and treatment histories while still enabling data-driven decision making [2]. Lattice-based, post-quantum cryptography offers strong security guarantees against both classical and quantum adversaries and forms the foundation for advanced primitives such as homomorphic encryption (HE) [3].
HE enables computation directly on encrypted data and is therefore well suited to privacy-preserving health management, where raw medical data (e.g., lab results, imaging features, or wearable sensor streams) cannot be exposed to cloud or third-party analytics systems. In this context, HE supports tasks such as secure risk scoring, encrypted model inference, and collaborative analysis across institutions without revealing individual patient records. Among existing schemes, the Cheon–Kim–Kim–Song (CKKS) scheme for approximate arithmetic is particularly attractive for healthcare workloads because many clinical and predictive models operate on real-valued vectors and can tolerate small numerical approximation errors [4]. CKKS encodes real or complex-valued health features into ring elements and performs approximate fixed-point arithmetic over ciphertexts, enabling deeper encrypted computations than exact-integer schemes at comparable parameter sizes [4].
The practical deployment of CKKS in health management systems depends critically on parameter settings and implementation choices. In particular, the ring degree N and ciphertext modulus q largely determine security level, numerical precision, and circuit depth, while the choice of arithmetic backend, i.e., modular reduction method, directly impacts runtime and memory usage [5,6].
CRT-based residue number systems and Barrett-based techniques are arguably the two most widely used modular reduction approaches in cryptographic implementations, including CKKS [7,8]. To the best of our knowledge, there is still no comprehensive evaluation of these two backends within a unified homomorphic encryption environment. Most prior studies analyze modular reduction approaches independently or across different software frameworks and parameter settings, making direct and fair comparisons challenging.
In this work, we bridge this gap by evaluating the CRT and Barrett backends within the same HE framework. We measure how different $(N, q)$ configurations and the choice of backend affect computational overhead, memory utilization, and ciphertext-level performance under standard security levels. The goal is to provide experimentally grounded guidelines for designing CKKS-based health management pipelines that respect strict privacy requirements while remaining practical on real-world computing platforms [9,10,11].
The main contributions of this work are (1) a systematic study of CKKS parameters, highlighting the impact of ring degree $N$ and ciphertext modulus size $\log q$ on security, depth, runtime, and memory usage; (2) a comparison of two CKKS modular reduction backends, conventional RNS–CRT and Barrett-based modular reduction; and (3) practical guidelines for parameter and backend selection under different security and application requirements for CPU-based privacy-preserving analytics and machine-learning workloads.
To enhance readability, a list of acronyms and abbreviations is provided at the end of the paper.

2. Math Preliminaries

2.1. Cyclotomic Polynomial Ring

CKKS and many RLWE-based schemes work over a cyclotomic polynomial ring
$$R = \mathbb{Z}[X] / (\Phi_{2N}(X)), \tag{1}$$
where $N$ is a power of two and $\Phi_{2N}(X) = X^N + 1$ is the $(2N)$th cyclotomic polynomial [3,12]. For a modulus $q$, coefficients are reduced to obtain
$$R_q = \mathbb{Z}_q[X] / (X^N + 1).$$
An element of $R_q$ is a polynomial
$$a(X) = a_0 + a_1 X + \cdots + a_{N-1} X^{N-1}, \quad a_i \in \mathbb{Z}_q,$$
with addition and multiplication performed modulo $X^N + 1$ and $q$. The choice $\Phi_{2N}(X) = X^N + 1$ in Equation (1), where $N$ is a power of two, enables efficient Cooley–Tukey FFT-style computation of the Number Theoretic Transform (NTT) and supports packing of complex or real vectors via the canonical embedding, which is essential for CKKS throughput [3,4].
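To make the reduction modulo $X^N + 1$ concrete, the following is a minimal sketch of arithmetic in $R_q$ using schoolbook multiplication. It is illustrative only: practical implementations use the NTT, and the function names `poly_add` and `poly_mul` as well as the toy values $N = 4$, $q = 17$ are assumptions made for this example.

```python
def poly_add(a, b, q):
    """Coefficient-wise addition in R_q = Z_q[X]/(X^N + 1)."""
    return [(x + y) % q for x, y in zip(a, b)]


def poly_mul(a, b, q):
    """Schoolbook product in Z_q[X]/(X^N + 1); O(N^2), for illustration only."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % q
            else:
                # X^N = -1 in this ring, so a wrapped term changes sign.
                out[k - n] = (out[k - n] - ai * bj) % q
    return out


# Toy example with N = 4 and q = 17: X^3 * X = X^4 = -1 = 16 (mod 17).
x3 = [0, 0, 0, 1]
x1 = [0, 1, 0, 0]
print(poly_mul(x3, x1, 17))  # [16, 0, 0, 0]
```

The sign flip on wrapped terms is what distinguishes this negacyclic product from an ordinary cyclic convolution.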

2.2. Ring Learning with Errors (RLWE)

The security of CKKS relies on the Ring Learning With Errors problem, which lifts the LWE hardness assumption from vectors over $\mathbb{Z}_q$ to polynomials in $R_q$.
A typical RLWE cryptosystem can be briefly outlined as follows. For a uniformly sampled polynomial $a(X) \in R_q$ and a sampled small polynomial $s(X)$, compute $b(X)$ from
$$b(X) = a(X)\, s(X) + e(X) \bmod q, \tag{2}$$
where $e(X)$ is a small “error” polynomial. Then the public key is the pair $(a(X), b(X))$, and the secret key is $s(X)$. Given many such noisy equations, recovering $s(X)$ is believed to be as hard as solving certain worst-case problems on ideal lattices in the underlying cyclotomic field [13].
For a plaintext message $m(X)$, encode it into a polynomial in $R_q$. Sample small polynomials $r(X)$, $e_1(X)$, $e_2(X)$ and compute
$$c_1(X) = a(X)\, r(X) + e_1(X) \bmod q, \qquad c_2(X) = b(X)\, r(X) + e_2(X) + \mathrm{Encode}(m(X)) \bmod q. \tag{3}$$
The ciphertext is then $c = (c_1(X), c_2(X))$.
Decryption computes
$$v(X) = c_2(X) - c_1(X)\, s(X) \bmod q, \tag{4}$$
so that $m(X)$ can be recovered from $v(X) = \mathrm{Encode}(m(X)) + (\text{small noise})$.
When instantiated as an encryption scheme, RLWE yields compact public keys and ciphertexts: the public key encodes one noisy relation of the form (2), and encryption masks the plaintext polynomial $m(X)$ with additional small noise polynomials (3). Decryption subtracts the secret-dependent part and recovers $m(X)$ as long as the accumulated noise remains below a modulus-dependent threshold (4). This structure is inherited directly by CKKS, which augments RLWE encryption with approximate fixed-point encoding and rescaling operations [4,14].
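The key generation, encryption, and decryption steps of this RLWE outline can be sketched as a toy program. The parameters ($N = 16$, $q = 12289$), the ternary sampling of small polynomials, and the bit encoding $\mathrm{Encode}(m) = \lfloor q/2 \rfloor \cdot m$ are assumptions chosen for illustration; they are not the paper's settings and offer no security.

```python
import random

N, Q = 16, 12289  # toy ring degree and modulus (illustrative assumptions)


def poly_mul(a, b):
    """Negacyclic schoolbook product in Z_Q[X]/(X^N + 1)."""
    out = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            sign = 1 if k < N else -1  # X^N = -1
            out[k % N] = (out[k % N] + sign * a[i] * b[j]) % Q
    return out


def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def poly_sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]


def small(rng):
    """Small polynomial with coefficients in {-1, 0, 1} (assumed distribution)."""
    return [rng.choice((-1, 0, 1)) for _ in range(N)]


def keygen(rng):
    s, e = small(rng), small(rng)
    a = [rng.randrange(Q) for _ in range(N)]
    b = poly_add(poly_mul(a, s), e)          # b = a*s + e
    return (a, b), s


def encrypt(pk, bits, rng):
    a, b = pk
    r, e1, e2 = small(rng), small(rng), small(rng)
    encoded = [(Q // 2) * m for m in bits]   # assumed Encode: bit -> 0 or floor(Q/2)
    c1 = poly_add(poly_mul(a, r), e1)
    c2 = poly_add(poly_add(poly_mul(b, r), e2), encoded)
    return c1, c2


def decrypt(sk, ct):
    c1, c2 = ct
    v = poly_sub(c2, poly_mul(c1, sk))       # v = Encode(m) + small noise
    # A coefficient near Q/2 decodes to 1, a coefficient near 0 (or Q) to 0.
    return [1 if Q // 4 <= x < 3 * Q // 4 else 0 for x in v]


rng = random.Random(0)
pk, sk = keygen(rng)
message = [rng.randrange(2) for _ in range(N)]
assert decrypt(sk, encrypt(pk, message, rng)) == message
```

With ternary polynomials the noise term $e\,r + e_2 - e_1 s$ has coefficients of magnitude at most $2N + 1 = 33$, far below the decoding margin $q/4$, so decryption is exact in this toy setting.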

2.3. Residue Number System and CRT in $R_q$

To support large ciphertext moduli efficiently, CKKS uses a residue number system based on CRT. A large integer modulus
$$q = q_0\, q_1 \cdots q_\ell,$$
with pairwise coprime primes $q_i$, is represented by its residues
$$a \bmod q \;\mapsto\; (a_0, \ldots, a_\ell), \quad a_i = a \bmod q_i.$$
CRT implies the ring isomorphism
$$\mathbb{Z}_q \cong \mathbb{Z}_{q_0} \times \cdots \times \mathbb{Z}_{q_\ell},$$
so additions and multiplications modulo $q$ can be performed independently in each limb.
This idea extends naturally to the polynomial ring $R_q$, yielding the so-called “double-CRT” representation
$$R_q \cong R_{q_0} \times \cdots \times R_{q_\ell}, \quad R_{q_i} = \mathbb{Z}_{q_i}[X] / (X^N + 1).$$
A ciphertext polynomial in $R_q$ is stored as its components in each $R_{q_i}$, and all homomorphic operations are carried out limb-wise. Full-RNS CKKS variants operate entirely in this representation and are now standard in efficient implementations [5,6,9].
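The integer-level isomorphism can be checked with a short sketch. The helper names `to_rns` and `from_rns` and the three small primes are assumptions for illustration; real CKKS limbs are 30–60-bit primes.

```python
from math import prod


def to_rns(a, moduli):
    """Split an integer into its residues modulo each limb q_i."""
    return [a % qi for qi in moduli]


def from_rns(residues, moduli):
    """Recombine residues into the unique value modulo q = prod(q_i) (CRT)."""
    q = prod(moduli)
    total = 0
    for ai, qi in zip(residues, moduli):
        q_hat = q // qi                          # product of the other limbs
        total += ai * q_hat * pow(q_hat, -1, qi)  # modular inverse (Python 3.8+)
    return total % q


moduli = [97, 193, 257]      # toy pairwise-coprime primes (assumption)
q = prod(moduli)
a, b = 123456, 654321

# Multiply limb by limb, then recombine: the result matches a*b mod q.
limbwise = [(x * y) % qi
            for x, y, qi in zip(to_rns(a, moduli), to_rns(b, moduli), moduli)]
assert from_rns(limbwise, moduli) == (a * b) % q
```

Each limb product touches only numbers smaller than $q_i^2$, which is why limb-wise arithmetic can stay within machine-word operations even when $q$ itself has hundreds of bits.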

2.4. Barrett Modular Reduction

Modular reduction by the small primes $q_i$ is a core operation in NTT butterflies, pointwise multiplications, and rescaling steps. Direct division by $q_i$ is expensive, so CKKS implementations typically employ Barrett modular reduction, which replaces the division with multiplications by a precomputed constant and bit shifts [15,16].
For a fixed modulus $q$ and an integer $k$ with $2^k > q^2$, Barrett’s method precomputes
$$\mu = \left\lfloor \frac{2^k}{q} \right\rfloor.$$
Given an input $x$ with $0 \le x < q^2$, an approximate quotient is obtained by
$$\hat{q} = \left\lfloor \frac{x \cdot \mu}{2^k} \right\rfloor,$$
and the intermediate remainder is
$$r = x - \hat{q} \cdot q.$$
One can show that $0 \le r < 2q$, so a final conditional subtraction yields $x \bmod q$. In the RNS setting, the same idea is applied independently to each limb $q_i$, using precomputed $(k_i, \mu_i)$ for each modulus. Compared to division-based reduction, Barrett’s method significantly reduces latency and is competitive with Montgomery-style techniques in HE workloads [16,17,18].
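The three steps above translate directly into code. The following is a minimal sketch, not the paper's backend: the function names are hypothetical, and choosing $k$ as twice the bit length of $q$ is one simple way (an assumption here) to guarantee $2^k > q^2$.

```python
def barrett_setup(q):
    """Precompute (k, mu) for modulus q, with 2^k > q^2."""
    k = 2 * q.bit_length()     # q < 2^bitlen, hence q^2 < 2^k
    mu = (1 << k) // q         # mu = floor(2^k / q)
    return k, mu


def barrett_reduce(x, q, k, mu):
    """Return x mod q for 0 <= x < q^2 using one multiply, one shift, one subtract."""
    q_hat = (x * mu) >> k      # approximate quotient, off by at most one
    r = x - q_hat * q          # 0 <= r < 2q
    if r >= q:                 # single conditional correction
        r -= q
    return r


q = 12289                      # toy modulus (assumption)
k, mu = barrett_setup(q)
x = 12345 * 6789               # a product of two residues, below q^2
assert barrett_reduce(x, q, k, mu) == x % q
```

In Python the benefit is only conceptual, since the interpreter's big-integer `%` is already implemented natively; the technique pays off in fixed-width arithmetic where hardware division is slow.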

3. Homomorphic Encryption Schemes and CKKS

Since its introduction, HE has evolved from supporting a single operation on ciphertexts to enabling practical evaluation of complex functions over encrypted data. Early schemes were either additively or multiplicatively homomorphic, while later constructions combined both operations and ultimately achieved fully homomorphic encryption (FHE) with support for arbitrary circuits [19]. This evolution has been driven by lattice-based cryptography and noise-management techniques, which improved efficiency enough for HE to be applied in privacy-sensitive domains such as healthcare analytics and clinical decision support [20].

3.1. From PHE and SHE to FHE

Let $\mathrm{Enc}(m)$ and $\mathrm{Dec}(c)$ denote encryption and decryption of a message $m$ and ciphertext $c$, and let $\oplus$ and $\otimes$ denote homomorphic addition and multiplication. A partially homomorphic encryption (PHE) scheme supports only one operation, for example
$$\mathrm{Enc}(m_1) \oplus \mathrm{Enc}(m_2) = \mathrm{Enc}(m_1 + m_2), \tag{6}$$
for additively homomorphic schemes such as Paillier, or
$$\mathrm{Enc}(m_1) \otimes \mathrm{Enc}(m_2) = \mathrm{Enc}(m_1 \cdot m_2), \tag{7}$$
for multiplicatively homomorphic constructions such as textbook RSA without padding.
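As a small illustration of the multiplicative case, the sketch below uses textbook RSA with a well-known toy key ($p = 61$, $q = 53$, $e = 17$). These numbers are assumptions for demonstration only and are trivially breakable; here the operation $\otimes$ is multiplication of ciphertexts modulo $n$.

```python
# Toy textbook RSA key (insecure, for illustration only).
p, q_rsa = 61, 53
n = p * q_rsa                            # 3233
e = 17
d = pow(e, -1, (p - 1) * (q_rsa - 1))    # private exponent (Python 3.8+)


def enc(m):
    return pow(m, e, n)


def dec(c):
    return pow(c, d, n)


m1, m2 = 7, 11
c_prod = (enc(m1) * enc(m2)) % n         # multiply ciphertexts only
assert dec(c_prod) == (m1 * m2) % n      # decrypts to the product of plaintexts
```

The identity holds because $(m_1^e)(m_2^e) = (m_1 m_2)^e \bmod n$; no analogous rule exists for addition, which is why textbook RSA is only partially homomorphic.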
Somewhat homomorphic encryption (SHE) schemes support both addition (6) and multiplication (7) but only up to a bounded depth because each operation increases ciphertext noise, and decryption fails once the noise exceeds a threshold [21]. Fully homomorphic encryption (FHE) removes this depth limitation by periodically refreshing ciphertexts via bootstrapping, allowing the evaluation of arbitrary polynomial-size circuits
$$\mathrm{Eval}\big(C, \mathrm{Enc}(m_1), \ldots, \mathrm{Enc}(m_t)\big) = \mathrm{Enc}\big(C(m_1, \ldots, m_t)\big),$$
for any circuit $C$ [19].

3.2. Generations of HE Schemes

First-generation FHE schemes, starting from Gentry’s original ideal-lattice construction, established feasibility but were orders of magnitude too slow for practice [19]. Second-generation schemes such as BGV and BFV moved to RLWE-based polynomial rings and introduced techniques such as modulus switching, key switching, and SIMD batching, enabling leveled FHE without frequent bootstrapping [14,22]. Third-generation schemes (GSW, FHEW, TFHE) optimized binary gate evaluation and made bootstrapping fast enough for real-time logic on encrypted bits [23,24,25]. A fourth generation focuses on approximate arithmetic, where many applications—including health management—operate on real-valued features and tolerate small numerical errors. In this space, the CKKS scheme is now a de facto standard, enabling efficient encrypted computation over vectors of real or complex numbers with floating-point-like semantics [4].

3.3. CKKS Scheme

CKKS is an RLWE-based homomorphic encryption scheme tailored to approximate arithmetic on real and complex vectors [4]. It operates over the cyclotomic ring
$$R_q = \mathbb{Z}_q[X] / (X^N + 1),$$
where $N$ is a power of two, and $q$ is the ciphertext modulus, often factored into smaller RNS primes. Plaintexts encode vectors in $\mathbb{C}^{N/2}$ using a packing technique based on the canonical embedding, and each ciphertext carries a scale parameter that tracks fixed-point precision [4,26].

3.3.1. Encoding and Approximate Arithmetic

CKKS assumes that the user data form a vector of complex numbers $z \in \mathbb{C}^n$. The encoding interprets the vector entries as the values of a polynomial at special points (roots of unity) and then reconstructs the polynomial $m(X)$ via an inverse transform.
Let $\xi \in \mathbb{C}$ be a primitive $(2N)$-th root of unity. We construct $m(X)$ such that $m(\xi^{k_i}) = z_i$ through
$$z \;\xrightarrow{\ \text{Inverse FFT}\ }\; \text{coefficients of } m.$$
After $m$ is embedded into the cyclotomic field via the canonical embedding, CKKS applies a scaling factor $\Delta$:
$$w = \Delta \cdot \pi^{-1}(m),$$
where $\pi^{-1}$ reconstructs a Hermitian-symmetric vector from the slots [26]. The coordinates of $w$ in a fixed lattice basis are rounded to integers and used as coefficients of the plaintext polynomial
$$m(X) = \sum_{i=0}^{N-1} \tilde{z}_i X^i \in R_q.$$
Decoding inverts this process: after decryption, the polynomial is mapped back through the canonical embedding and divided by $\Delta$, yielding an approximation of the original slot vector. The choice of $\Delta$, modulus chain, and $N$ determines the achievable precision and depth for a given health-management workload.
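The encode and decode path can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: slot $j$ is evaluated at $\xi^{2j+1}$ (production libraries typically order slots by powers of 5 to support rotations), coefficients are rounded directly rather than via a lattice basis, and the reduction modulo $q$ is omitted. The function names are hypothetical.

```python
import cmath


def roots(N):
    """The N roots of X^N + 1: odd powers of xi = exp(i*pi/N)."""
    return [cmath.exp(1j * cmath.pi * (2 * j + 1) / N) for j in range(N)]


def encode(z, N, delta):
    """Map N/2 complex slots to N integer coefficients at scale delta."""
    assert len(z) == N // 2
    zeta = roots(N)
    # Hermitian-symmetric extension, so that the coefficients come out real:
    # zeta[N-1-j] is the conjugate of zeta[j], hence w[N-1-j] = conj(w[j]).
    w = list(z) + [v.conjugate() for v in reversed(z)]
    coeffs = []
    for k in range(N):
        # Inverse transform: m_k = (1/N) * sum_j w_j * conj(zeta_j)^k.
        acc = sum(wj * zj.conjugate() ** k for wj, zj in zip(w, zeta))
        coeffs.append(round(delta * acc.real / N))
    return coeffs


def decode(coeffs, N, delta):
    """Evaluate the polynomial at the first N/2 roots and undo the scaling."""
    zeta = roots(N)[: N // 2]
    return [sum(c * zj ** k for k, c in enumerate(coeffs)) / delta for zj in zeta]


N, DELTA = 8, 2 ** 20
z = [1.5 + 0.5j, -2.0 + 0j, 0.25j, 3.0 - 1.0j]
approx = decode(encode(z, N, DELTA), N, DELTA)
assert all(abs(a - b) < 1e-4 for a, b in zip(approx, z))
```

The round trip is only approximate: rounding each coefficient perturbs every slot by at most about $N / (2\Delta)$, which shows how a larger $\Delta$ buys precision.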

3.3.2. Encryption and Decryption

The secret key is a small polynomial
$$s(X) \leftarrow \chi_{\mathrm{key}} \subset R_q,$$
and the public key $(p_0(X), p_1(X)) \in R_q^2$ is constructed as a noisy RLWE sample encrypting zero [4]. Given an encoded plaintext $m(X) \in R_q$, encryption samples fresh small polynomials $u(X)$, $e_0(X)$, $e_1(X)$ and outputs the ciphertext
$$c_0(X) = p_0(X)\, u(X) + e_0(X) + m(X) \pmod{q}, \tag{8}$$
$$c_1(X) = p_1(X)\, u(X) + e_1(X) \pmod{q}, \tag{9}$$
so that $c = (c_0, c_1) \in R_q^2$.
Decryption uses the linear form in the secret key
$$\tilde{m}(X) = c_0(X) + c_1(X)\, s(X) \pmod{q},$$
which equals $m(X)$ plus a bounded noise polynomial. After decoding and division by $\Delta$, the slots approximate the original real or complex inputs [27].

3.3.3. Homomorphic Addition and Multiplication

Let ciphertexts $c = (c_0, c_1)$ from (8) and (9), and $c' = (c_0', c_1')$ encrypt plaintexts $\mu$ and $\mu'$ at the same level. Homomorphic addition is defined component-wise:
$$c_{\mathrm{add}} = (c_0 + c_0',\; c_1 + c_1'),$$
and decryption yields
$$\mathrm{Dec}_s(c_{\mathrm{add}}) = (c_0 + c_0') + (c_1 + c_1')\, s \approx \mu + \mu'.$$
Plaintext–ciphertext multiplication of $c'$ by a plaintext polynomial $\mu(X)$ is given by
$$c_{\mathrm{ptmult}} = (\mu\, c_0',\; \mu\, c_1'),$$
with
$$\mathrm{Dec}_s(c_{\mathrm{ptmult}}) = \mu\, (c_0' + c_1' s) \approx \mu \cdot \mu'.$$
Ciphertext–ciphertext multiplication takes two ciphertexts $c = (c_0, c_1)$ and $c' = (c_0', c_1')$ and forms a 3-component ciphertext
$$d_0 = c_0\, c_0',$$
$$d_1 = c_0\, c_1' + c_1\, c_0',$$
$$d_2 = c_1\, c_1',$$
so that
$$d_0 + d_1 s + d_2 s^2 \approx \mu \cdot \mu'.$$
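The addition rule and the 3-component product can be checked numerically with a toy instance. Everything specific here is an assumption for illustration: the parameters ($N = 8$, $q = 2^{61} - 1$, $\Delta = 2^{20}$), ternary small polynomials, a public key of the form $(p_0, p_1) = (-a s + e,\ a)$, and plaintexts given directly as scaled integer polynomials rather than encoded slot vectors.

```python
import random

N, Q, DELTA = 8, 2 ** 61 - 1, 2 ** 20   # toy parameters (assumptions)
rng = random.Random(1)


def mul(a, b):
    """Negacyclic schoolbook product in Z_Q[X]/(X^N + 1)."""
    out = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            sign = 1 if k < N else -1
            out[k % N] = (out[k % N] + sign * a[i] * b[j]) % Q
    return out


def add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def small():
    return [rng.choice((-1, 0, 1)) for _ in range(N)]


s = small()
a = [rng.randrange(Q) for _ in range(N)]
p0 = add([-x % Q for x in mul(a, s)], small())   # p0 = -a*s + e
p1 = a


def encrypt(m):
    u, e0, e1 = small(), small(), small()
    return (add(add(mul(p0, u), e0), m), add(mul(p1, u), e1))


def decrypt(ct):
    """Evaluate sum_i ct[i] * s^i, so 2- and 3-component ciphertexts both work."""
    acc, s_pow = [0] * N, [1] + [0] * (N - 1)
    for comp in ct:
        acc = add(acc, mul(comp, s_pow))
        s_pow = mul(s_pow, s)
    return [x - Q if x > Q // 2 else x for x in acc]   # centered representatives


mu = [3 * DELTA] + [0] * (N - 1)                 # plaintext 3
mu_p = [2 * DELTA, DELTA] + [0] * (N - 2)        # plaintext 2 + X
c, c_p = encrypt(mu), encrypt(mu_p)

c_add = (add(c[0], c_p[0]), add(c[1], c_p[1]))
d = (mul(c[0], c_p[0]),
     add(mul(c[0], c_p[1]), mul(c[1], c_p[0])),
     mul(c[1], c_p[1]))

sum_slots = [round(x / DELTA) for x in decrypt(c_add)]
product_slots = [round(x / DELTA ** 2) for x in decrypt(d)]
print(sum_slots)      # [5, 1, 0, 0, 0, 0, 0, 0]
print(product_slots)  # [6, 3, 0, 0, 0, 0, 0, 0]
```

Note that the product decrypts at scale $\Delta^2$ and needs $s^2$, which is exactly what relinearization and rescaling subsequently repair.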

3.3.4. Relinearization

Relinearization maps the 3-component product $(d_0, d_1, d_2)$ back to a standard 2-component ciphertext under the same secret key [28]. Using an evaluation key that encrypts $s^2$ under $s$, relinearization computes
$$(c_0^*, c_1^*) = (d_0, d_1) + \mathrm{RelinMap}(d_2, \mathrm{evk}),$$
and the resulting ciphertext
$$c_{\mathrm{relin}} = (c_0^*, c_1^*)$$
again decrypts with the standard rule $\mathrm{Dec}_s(c_{\mathrm{relin}}) = c_0^* + c_1^* s \approx \mu \cdot \mu'$.

3.3.5. Rescaling and Multiplication Pipeline

After multiplication, both the scale and the noise grow. CKKS controls this growth with a rescaling step that simultaneously reduces the modulus and divides the ciphertext by the scale [6,29]. If $c$ lives at level $\ell$ with modulus $q_\ell$, rescaling to level $\ell - 1$ is defined as
$$\mathrm{Rescale}_{\ell \to \ell - 1}(c) = \left\lfloor \frac{q_{\ell-1}}{q_\ell} \cdot c \right\rceil \bmod q_{\ell-1},$$
which approximately divides both the message and noise by the scale and drops one modulus from the chain.
Combining these steps, the CKKS multiplication pipeline is
$$c_{\mathrm{mult}} = \mathrm{CMult}(c, c') \;\xrightarrow{\ \mathrm{Relin}\ }\; c_{\mathrm{relin}} \;\xrightarrow{\ \mathrm{Rescale}\ }\; c_{\mathrm{rs}},$$
where $c_{\mathrm{rs}}$ is a 2-component ciphertext at the next lower level with restored scale and controlled noise, ready for further homomorphic operations.
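The effect of rescaling on the scale can be seen on plain fixed-point integers, without any encryption. In this minimal sketch the dropped factor $q_\ell / q_{\ell-1}$ is assumed to equal $\Delta = 2^{30}$ exactly; in practice it is a prime close to $\Delta$, so the scale is restored only approximately. The function names are hypothetical.

```python
DELTA = 2 ** 30   # scaling factor, also used as the dropped modulus factor (assumption)


def encode(x):
    """Fixed-point encoding: multiply by the scale and round."""
    return round(x * DELTA)


def rescale(v, p):
    """Rounded integer division by p, mirroring round((q_{l-1} / q_l) * c)."""
    return (v + p // 2) // p


x, y = 1.5, 2.25
product = encode(x) * encode(y)       # scale is now DELTA**2
rescaled = rescale(product, DELTA)    # scale is back to DELTA
print(rescaled / DELTA)               # 3.375
```

Without the rescale, a second multiplication would push the scale to $\Delta^3$, so the number of modulus bits consumed would grow with every level.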

4. Parameter Selection, Simulation and Recommendations

4.1. CKKS Parameters Used in This Work

In the proposed implementation, the CKKS scheme is instantiated with a tuple of parameters ( N , q , Δ , σ ) that jointly determine security, precision, and performance. All experiments use the same parameter sets for both the CRT-based baseline and the Barrett-based variant, so that any performance differences arise purely from the modular arithmetic backend and not from changes in cryptographic strength.

4.1.1. Ring Degree N

The ring degree $N$ is chosen as a power of two (in the range $2^{10}$–$2^{14}$ in this work). It defines the polynomial ring $\mathbb{Z}[X]/(X^N + 1)$, the RLWE lattice dimension, and the number of SIMD slots ($N/2$) [4,6]. Larger $N$ improves security and packing capacity but increases ciphertext and key sizes, and raises the cost of NTT-based operations roughly as $O(N \log N)$.

4.1.2. Ciphertext Modulus q

The ciphertext modulus $q$ is represented in RNS form as
$$q = \prod_{i=0}^{\ell} q_i,$$
where each $q_i$ is a 30–60-bit prime so that each residue fits in a 64-bit machine word. This modulus chain is consumed by homomorphic multiplication and rescaling: every multiplication increases noise and effective scale, and each rescale step drops one $q_i$ and divides by the scaling factor $\Delta$ [4,5]. The total size $\log q$ must be large enough to sustain the required multiplicative depth and precision; too small a $\log q$ causes decryption failures, whereas too large a $\log q$ adds limbs and slows all operations without increasing security [6,30].
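One common way to build such a chain is to search for primes of the desired bit length that satisfy $q_i \equiv 1 \pmod{2N}$, the condition under which a negacyclic NTT of length $N$ exists modulo $q_i$. The sketch below follows that practice; it is an assumption about how a chain might be generated, not a description of the code used in this work, and `rns_primes` is a hypothetical helper.

```python
def is_prime(n):
    """Deterministic Miller-Rabin, valid for all n below about 3.3e24."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


def rns_primes(ring_degree, bits, count):
    """Return `count` primes of exactly `bits` bits with q = 1 (mod 2N)."""
    step = 2 * ring_degree
    q = (1 << bits) - step + 1      # largest candidate below 2**bits
    primes = []
    while len(primes) < count:
        if q.bit_length() != bits:
            raise ValueError("not enough primes of the requested size")
        if is_prime(q):
            primes.append(q)
        q -= step                   # stay in the residue class 1 mod 2N
    return primes


chain = rns_primes(ring_degree=4096, bits=40, count=4)
print(chain, sum(q.bit_length() for q in chain))  # four limbs, 160 bits in total
```

Stepping by $2N$ keeps every candidate NTT-friendly, so only the primality test has to filter the search.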

4.1.3. Scaling Factor Δ

The scaling factor $\Delta$ is chosen as a power of two (e.g., $\Delta = 2^{30}$–$2^{40}$). It is applied during encoding to map real or complex health features into fixed-point integers by multiplying by $\Delta$ and rounding [30]. A larger $\Delta$ improves numerical precision but consumes more modulus bits per multiplication and may require a longer modulus chain; a smaller $\Delta$ saves modulus but may reduce application-level accuracy for clinical workloads.

4.1.4. Noise Distribution Parameter σ

The parameter $\sigma$ controls the discrete Gaussian (or similar) distribution used for RLWE noise in encryption and key-dependent operations [6]. Typical choices (e.g., $\sigma \approx 3$) provide sufficient initial noise for security while keeping it small enough to be tracked within the available modulus. In this work, $(N, q, \Delta, \sigma)$ are selected according to standard CKKS guidelines [6,30] and then fixed across all implementations.

4.2. Parameters ( N , log q ) Versus CKKS Security Levels L

The pair $(N, \log q)$ is the dominant driver of both security and performance in CKKS. Security is based on the hardness of Ring-LWE in the cyclotomic ring $R_q = \mathbb{Z}_q[X]/(X^N + 1)$, so $N$ and $\log q$ define the underlying lattice dimension and modulus [6]. In practice, the security level ($L$, in bits) is estimated via the cost of the best known lattice attacks (typically BKZ-based), which depends primarily on the ring dimension $N$ and the modulus-to-noise ratio (captured through $\log q$ and the error distribution) [31,32,33].

4.2.1. Dependence of Security Level L on N

For RLWE instances with fixed modulus and noise parameters, increasing the polynomial modulus degree $N$ (equivalent to the lattice dimension) increases hardness.
The complexity of the best known attacks (e.g., BKZ with block size $\beta$) scales as
$$\mathrm{Cost} \approx 2^{c \cdot \beta},$$
where $c$ is a constant of the chosen cost model and $\beta$ itself depends on $N$ and the ratio $q/\sigma$ (the noise standard deviation is usually set to $\sigma = 3.2$).
Heuristically, for fixed $\log q$, increasing $N$ increases the required block size $\beta$, and hence the attack cost.
The implication is that security increases with $N$, but sublinearly in terms of security bits. For example, doubling $N$ does not double $L$; instead, it typically yields a moderate increase depending on the modulus size. This behavior is consistent with modern lattice cost models and empirical estimates from tools such as the LWE Estimator.

4.2.2. Dependence of Security Level L on log q

The modulus $q$ directly affects the distinguishability of RLWE samples.
Increasing $q$ (for fixed noise standard deviation $\sigma$) increases the ratio $q/\sigma$, making the problem easier. A common heuristic from LWE/RLWE analysis relates the root-Hermite factor $\delta_0$ to this ratio, where $n$ denotes the lattice dimension:
$$\delta_0^{\,n} \approx \frac{q}{\sigma},$$
which implies that a larger $q$ lets an attack succeed with a larger $\delta_0$, i.e., with weaker and therefore cheaper lattice reduction.
Since attack cost depends exponentially on lattice reduction quality, one may write:
$$\log(\mathrm{Cost}) \propto \frac{n}{\log(q/\sigma)}.$$
As a result, security decreases as $\log q$ increases. However, the dependence is not linear; rather, $q$ appears inside a logarithm within an exponential cost model. Thus, increasing $\log q$ reduces security in a nonlinear and attenuated manner.

4.2.3. Combined Dependence and Practical CKKS Parameter Selection

Combining the above effects yields a commonly used first-order heuristic:
$$L \propto \frac{N}{\log(q/\sigma)}.$$
This expression captures the fundamental trade-off:
  • Increasing N improves security;
  • Increasing q degrades security;
  • The relationship is ratio-based and nonlinear.
However, this is only an approximation. In CKKS, the modulus q grows with multiplicative depth, so parameter selection must respect a security budget constraint:
  • N and log q cannot be scaled independently;
  • For a fixed target security level (e.g., 128 bits), increasing log q requires increasing N.
This explains the structure of parameter tables in implementations such as Microsoft SEAL [34], where larger N permits larger total modulus log q , but the scaling is sublinear rather than proportional.
For fixed $\log q$, increasing $N$ raises the cost of lattice reduction; for fixed $N$, increasing $\log q$ generally weakens the instance and may require a larger $N$ to restore the same security level [6]. The Homomorphic Encryption Standard therefore provides recommended $(N, \log q)$ bounds for 128-, 192-, and 256-bit security, which we follow when selecting parameter sets at those levels [6].
From a performance perspective, nearly all CKKS operations (NTT/INTT, ciphertext multiplication, key switching, and rotations) scale with N and with the number of RNS primes in q [9,11,35]. Larger N increases latency and memory but also provides more slots ( N / 2 ), improving amortized cost per health record or feature vector. Adding primes to q increases depth and precision but enlarges ciphertexts and evaluation keys and raises time and memory. Consequently, both N and log q are kept as small as possible while meeting target security and accuracy for the health-management workloads considered.

4.3. Algorithmic Choices: RNS–CRT and Barrett Reduction

Efficient polynomial arithmetic in R q is essential to make these parameter choices meaningful. All implementations in this work use the same RNS/CRT representation: the large modulus q is split into pairwise coprime primes q i , and each polynomial coefficient is stored as its residues ( a mod q i ) [5]. This “double-CRT” representation enables limb-wise parallelism and efficient rescaling in Full-RNS CKKS [5,36].
The only algorithmic difference between the two backends lies in how modular reduction is performed at each limb. The baseline uses a conventional CRT-centric modular reduction, while the proposed variant replaces division by Barrett reduction, using precomputed reciprocals μ i for each modulus q i [16,37]. In both cases, the RNS structure and CKKS parameters are identical; the experiments therefore isolate how the choice between RNS-CRT (or CRT in short form) and Barrett affects runtime and peak memory across ( N , log q ) and security levels, for representative health-management workloads.

4.4. Simulation Settings

The experiments are run on an AMD Ryzen 7 7700X (8 cores, 16 threads, 4.49 GHz, fabricated by TSMC in Taiwan), 64 GB DDR5-5200 RAM (manufactured by SK Hynix in South Korea), Ubuntu 22.04, and Python 3.12. The CRT baseline is a forked py-fhe implementation; the Barrett backend is our CKKS-Barrette code. Both use the same CKKS parameters and RNS structure, so that differences reflect only the modular reduction backend. For each parameter set ( N , log q ) , we measure end-to-end execution time (seconds) and peak memory (MB) for a representative CKKS workload that includes at least one ciphertext–ciphertext multiplication with rescaling and relinearization.
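A measurement harness of the following kind can collect both metrics in Python. This is a sketch of one plausible setup using only the standard library (`time.perf_counter` and `tracemalloc`); the tooling actually used for the reported numbers is not specified here, and `measure` and the placeholder workload are hypothetical names.

```python
import time
import tracemalloc


def measure(workload, *args):
    """Run workload(*args); return (result, seconds, peak memory in MB)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = workload(*args)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()   # (current, peak)
    tracemalloc.stop()
    return result, elapsed, peak_bytes / 2 ** 20


def placeholder_workload(n):
    """Stand-in for encode -> encrypt -> multiply -> relinearize -> rescale -> decrypt."""
    return sum([i * i for i in range(n)])


result, seconds, peak_mb = measure(placeholder_workload, 100_000)
print(f"time = {seconds:.4f} s, peak memory = {peak_mb:.2f} MB")
```

Because `tracemalloc` tracks only allocations made through Python's allocator and adds tracing overhead, timing and memory are often collected in separate runs when precise latency figures are needed.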

4.5. Simulation Results and Recommendations

Simulations were conducted for selected parameter pairs ( N , log q ) at security levels of 80, 100, 110, 128, 192, and 256 bits. The corresponding results are summarized in Table 1, Table 2, Table 3, Table 4, Table 5 and Table 6, respectively. The parameter sets chosen for the 128-, 192-, and 256-bit security levels follow the recommendations of the Homomorphic Encryption Standard [6].

4.5.1. The 80-Bit Security

An 80-bit security level, commonly regarded as ultra-lightweight security, corresponds to legacy cryptographic systems such as 1024-bit RSA and Triple DES. It remains relevant in legacy communication infrastructures and resource-constrained embedded devices where lightweight security is sufficient.
For ( N , log q ) = ( 1024 , 80 ) , CRT is slightly faster than Barrett, whereas Barrett achieves significantly lower memory usage. For the remaining parameter sets, CRT substantially outperforms Barrett in execution time, while Barrett consistently requires less memory.
With a fixed security level, e.g., 80 bits, the selection of ( N , log q ) depends on the requirements of the target application. Further discussion is provided at the end of this section.

4.5.2. The 100 and 110-Bit Security Levels

Security levels of 100 and 110 bits are generally considered lightweight and are suitable for IoT devices, RFID systems, sensor networks, edge computing platforms, and embedded applications.
Even when higher security levels are desired, lower-security configurations are often explored for latency-sensitive applications and intermediate deployment scenarios.
At 100-bit security with ( N , log q ) = ( 2048 , 100 ) , CRT achieves lower execution time but requires more memory. For ( 4096 , 200 ) , CRT is more than twice as fast as Barrett, whereas Barrett reduces memory usage to approximately 95% of that required by CRT. For ( 8192 , 300 ) , CRT is nearly three times faster, while Barrett reduces memory consumption by about 2%.
At 110-bit security, CRT is approximately twice as fast as Barrett for ( 2048 , 80 ) and ( 4096 , 150 ) , although it requires slightly more memory. For ( 8192 , 250 ) , CRT is nearly three times faster, while Barrett reduces memory usage by approximately 1.5%. For (16,384, 550), CRT is about 4.5 times faster, whereas Barrett provides only a marginal memory reduction of roughly 1%.

4.5.3. The 128-Bit Security

A 128-bit security level is widely regarded as the standard baseline for modern cryptographic applications, offering a practical balance between efficiency and long-term protection against classical adversaries.
In CKKS-based homomorphic encryption, 128-bit security is commonly adopted for privacy-preserving machine learning, encrypted database queries, and secure outsourced computation. It is also the default recommendation in standards such as NIST SP 800-57 for protecting sensitive personal, commercial, and institutional data over medium- to long-term periods.
For high-security settings (128 bits or above), large parameter sets such as N = 16,384 with large moduli are generally required. Although both backends incur substantial runtime and memory overhead, CRT consistently achieves lower execution time, whereas Barrett provides slightly lower memory consumption.

4.5.4. The 192-Bit Security

A 192-bit security level is less commonly deployed but is suitable for applications requiring stronger protection against highly capable adversaries or longer data retention periods. Typical use cases include government, defense, and high-assurance-secure communication systems.
In homomorphic encryption and secure multi-party computation, 192-bit security may be preferred when 256-bit security is considered excessively costly, while additional assurance beyond 128-bit security is still desired.
At the 192-bit security level, CRT consistently achieves lower execution time, whereas Barrett provides slightly lower memory usage. For large ring degrees, such as $N = 2^{13}$ or $2^{14}$, the memory advantage of Barrett becomes marginal compared to the substantial runtime advantage of CRT.

4.5.5. The 256-Bit Security

A 256-bit security level targets applications requiring maximum resilience against future threats, including large-scale quantum adversaries and long-term archival attacks. Typical applications include national security systems, critical infrastructure, and highly sensitive financial systems.
In post-quantum and high-assurance cloud-computing applications, 256-bit security provides a conservative security margin for data requiring decades of confidentiality. Although it significantly increases parameter sizes and computational cost in homomorphic encryption systems, such overhead may be justified for highly sensitive long-term data.
At the 256-bit security level, CRT is approximately 2.2–3.4 times faster than Barrett, while Barrett reduces memory usage by roughly 1–2% compared to CRT.
In summary, the runtime advantage of the CRT-based method over the Barrett-based method arises primarily from the use of reduced-size moduli obtained via the factorization of $q$. This decomposition enables computations to be carried out over smaller integers, significantly lowering arithmetic complexity. In addition, the inherent parallelism of the CRT approach, in which operations across different moduli are independent, aligns well with modern multi-core architectures, further enhancing performance. This advantage becomes more pronounced for larger values of $q$, or equivalently, for larger polynomial degrees $N$ that support a larger modulus.
In terms of memory usage, however, the CRT-based method typically incurs higher overhead, as it must maintain multiple residues corresponding to the factorization of q, leading to increased storage and data movement compared to the Barrett-based approach.

4.6. Application Considerations on Selection of ( N , log q )

In CKKS, once the target security level is fixed, the polynomial modulus degree N and the total modulus size log q are primarily driven by circuit depth D, required numerical precision, and the SIMD vector size.
The circuit depth determines the length of the modulus chain: each multiplication (followed by rescaling) consumes one level, so a deeper circuit requires a larger total log q to accommodate the cumulative modulus drops.
The required precision dictates the size of the scaling factor and the tolerance to noise growth; higher precision necessitates larger intermediate moduli and therefore increases log q as well.
The vector size (i.e., the number of packed slots) is bounded by N/2, so larger batch sizes require larger N. In addition, increasing log q at a fixed security level typically forces an increase in N to maintain the hardness of the underlying RLWE instance.
Consequently, D and precision directly inflate log q, while vector size directly sets a lower bound on N; together they interact so that more demanding depth or precision indirectly pushes N upward to preserve the desired security level. Two practical workload patterns require special attention:
  • Large batches of input data:
Increasing N increases the number of CKKS slots and thus the batch size. For throughput-oriented workloads (e.g., batched inference), a larger N (8192 or 16,384) is recommended even at lower security levels. CRT is preferable when minimizing latency per batch is critical; Barrett becomes attractive when many batches run concurrently and aggregate memory usage limits the number of parallel sessions.
  • High-resolution input data:
Increasing log q allows larger scaling factors and higher fixed-point precision, but increases runtime and memory. For high-precision workloads (e.g., finance or scientific computing), medium-to-large moduli (log q of 220–260 at N = 8192, or 300–350 at N = 16,384) are recommended. In these cases, CRT should be chosen when speed is paramount, while Barrett offers a memory-optimized alternative when infrastructure cost or device RAM is the bottleneck.
  • Suggested practical guidelines:
Our results suggest the following criteria for choosing between CRT and Barrett for practical users:
  • For memory-constrained IoT devices with limited RAM and moderate security requirements (e.g., 80-bit security), Barrett is preferable due to its lower memory usage, even though it may be slower.
  • For cloud servers or high-performance environments requiring high security (e.g., 128-bit or above), CRT is preferable because it provides significantly better time efficiency at these security levels, and memory is less constrained.
  • For applications where both memory and time are critical, the choice depends on the target security level: Barrett is competitive at low-to-moderate security, while CRT becomes more advantageous as security level increases.
These guidelines are summarized in Table 7.
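The interaction between depth, precision, and batch size described in this subsection can be sketched as a small sizing routine. This is a hedged illustration, not a security estimator: the budget rule (one base prime plus one rescaling prime per level) is a common simplification assumed here, the default `base_bits=60` is an assumption, and the caps in `TABLE_4_MAX_LOG_Q` are simply the (N, log q) pairs used at the 128-bit level in Table 4. All names are hypothetical.

```python
# (N -> log q) pairs used at the 128-bit level in Table 4 of this paper.
# Treated here as illustrative upper limits, not as an independent security estimate.
TABLE_4_MAX_LOG_Q = {2048: 60, 4096: 120, 8192: 200, 16384: 450}


def required_log_q(depth, scale_bits, base_bits):
    """Modulus budget: one base prime plus one rescaling prime per multiplicative level."""
    return base_bits + depth * scale_bits


def select_parameters(depth, scale_bits, slots, base_bits=60, caps=TABLE_4_MAX_LOG_Q):
    """Return the smallest listed N (with its log q) that satisfies both constraints.

    - slots <= N / 2   (batch size sets a lower bound on N)
    - log q <= cap[N]  (depth and precision push N upward at fixed security)
    """
    log_q = required_log_q(depth, scale_bits, base_bits)
    for n in sorted(caps):
        if n // 2 >= slots and caps[n] >= log_q:
            return n, log_q
    raise ValueError("no listed N supports this depth, precision, and batch size")


# Depth-driven case: 1024 slots would fit in N = 2048, but the modulus budget does not.
DEPTH_DRIVEN = select_parameters(depth=3, scale_bits=40, slots=1024)

# Batch-driven case: a small modulus budget, but 4096 slots require N >= 8192.
BATCH_DRIVEN = select_parameters(depth=1, scale_bits=30, slots=4096)

print(DEPTH_DRIVEN, BATCH_DRIVEN)
```

Both example calls land on N = 8192 for different reasons, which mirrors the point that depth and precision raise N indirectly through log q while the batch size bounds it directly.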

5. Conclusions

5.1. Summary of Contributions

This work investigated parameter selection and modular arithmetic design choices for the CKKS homomorphic encryption scheme, with a particular focus on how these choices shape the security–performance trade-off in practice. The core contributions are:
  • A systematic experimental study of CKKS parameters, emphasizing the dominant role of the ring degree N and the ciphertext modulus size log q in determining security, depth, runtime, and memory usage [6,9].
  • A controlled comparison of two modular reduction backends within CKKS: a conventional RNS–CRT-centric implementation and an alternative implementation that integrates Barrett modular reduction into performance-critical arithmetic routines [16].
  • Practical recommendations for parameter and backend selection under different security and application scenarios, targeting CPU-based deployments of privacy-preserving analytics and machine-learning workloads.

5.2. Significance and Limitations of the Research

The significance of the results lies in their systems-level implications for homomorphic encryption. CKKS is increasingly used in applications such as privacy-preserving machine learning and secure data analytics, where performance and memory usage often determine feasibility [9,38]. In such settings, asymptotic complexity alone is insufficient to guide design choices.
The experiments show that even when two implementations share the same cryptographic foundations, low-level arithmetic decisions can meaningfully alter practical behavior. In particular, they indicate that Barrett reduction, while typically slower than CRT-based arithmetic in terms of raw execution time at higher security levels, can reduce peak memory usage in several important regimes. This trade-off is especially relevant for cloud and multi-tenant environments, where memory constraints often limit the number of concurrent encrypted workloads more severely than compute throughput [39,40]. The results also reinforce that there is no universally optimal CKKS configuration: parameter and backend choices must be aligned with application requirements such as acceptable latency, available memory, and target security level [38].
  • Discussion relating to SEAL/Palisade:
These findings have direct implications for widely used homomorphic encryption libraries such as Microsoft SEAL and Palisade. In SEAL, modular reduction is implemented as part of the core arithmetic, and the choice of backend can significantly affect performance for CKKS-based workloads. Our results suggest that, for SEAL deployments targeting low-security, memory-constrained scenarios, a Barrett-based modular reduction backend could reduce memory pressure and enable operation on devices with limited RAM. Conversely, for high-security server-side deployments, a CRT-based backend is likely to provide better throughput and lower latency. Similarly, in Palisade, where multiple modular reduction strategies can be configured, our results indicate that CRT should be preferred for high-security parameter sets, while Barrett may be advantageous for lower-security configurations where memory is the primary bottleneck. Practitioners using these libraries can use our decision table and examples to guide parameter and backend selection based on their target deployment environment.
  • Limitations of this work:
Several limitations should be acknowledged. First, the experimental evaluation is based on Python implementations, which do not fully exploit the low-level optimizations available in production-grade C/C++ libraries such as SEAL or OpenFHE [9]. The results are therefore intended to guide backend selection and parameter tuning within CKKS, while a native C/C++ re-implementation is left to future work. Second, the study evaluates performance only on a CPU platform and does not consider GPU or FPGA accelerators, where the relative behavior of CRT and Barrett might differ [39]. Finally, the analysis focuses on execution time and peak memory; other important metrics, such as energy consumption and end-to-end numerical accuracy in concrete applications, are outside the scope of this paper [9].

5.3. Directions for Possible Future Work

The findings suggest several avenues for future research.
An immediate direction is to extend the evaluation to optimized C/C++ CKKS libraries, integrating Barrett-based modular reduction into mature frameworks such as SEAL or OpenFHE. This would clarify whether the memory advantages observed here persist when aggressive compiler optimizations, vectorization, and hand-tuned arithmetic are applied [6].
Another direction is to explore hardware-accelerated implementations on GPUs and FPGAs, where multiplication-heavy algorithms such as Barrett reduction may benefit disproportionately from parallelism [39,40]. A comparative study across CPU, GPU, and hybrid architectures could yield new insights into backend selection for large-scale deployments.
Finally, the methodology developed in this work could be generalized to other homomorphic encryption schemes, such as BFV or TFHE, to assess whether similar trade-offs between modular arithmetic backends arise beyond CKKS [6,38].

Author Contributions

Conceptualization, H.G.; Methodology, H.G.; Validation, H.G.; Formal analysis, H.G.; Writing—original draft, H.G.; Writing—review and editing, H.W.; Supervision, H.W.; Project administration, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

| Acronym | Meaning |
| --- | --- |
| BFV | Brakerski/Fan–Vercauteren |
| BGV | Brakerski–Gentry–Vaikuntanathan |
| BKZ | Block Korkine–Zolotarev |
| CKKS | Cheon–Kim–Kim–Song Scheme |
| CRT | Chinese Remainder Theorem |
| FHE | Fully Homomorphic Encryption |
| FHEW | Fastest Homomorphic Encryption in the West |
| GSW | Gentry–Sahai–Waters |
| HE | Homomorphic Encryption |
| NTT (INTT) | (Inverse) Number Theoretic Transform |
| RLWE | Ring Learning with Errors |
| RNS | Residue Number System |
| SHE | Somewhat Homomorphic Encryption |
| SIMD | Single Instruction Multiple Data |
| TFHE | Fast Fully Homomorphic Encryption over the Torus |

References

  1. Katz, J.; Lindell, Y. Introduction to Modern Cryptography, 3rd ed.; CRC Press: Boca Raton, FL, USA, 2020.
  2. Bernstein, D.J.; Buchmann, J.; Dahmen, E. Post-Quantum Cryptography; Springer: Berlin/Heidelberg, Germany, 2009.
  3. Lin, C. Ring-LWE: Enhanced Foundations and Applications; Columbia University: New York, NY, USA, 2023.
  4. Cheon, J.H.; Kim, A.; Kim, M.; Song, Y. Homomorphic encryption for arithmetic of approximate numbers. In Proceedings of the ASIACRYPT 2017; LNCS; Springer: Berlin/Heidelberg, Germany, 2017; Volume 10624, pp. 409–437.
  5. Cheon, J.H.; Han, K.; Kim, A.; Kim, M.; Song, Y. A full RNS variant of approximate homomorphic encryption. In Proceedings of the International Conference on Selected Areas in Cryptography; Springer: Berlin/Heidelberg, Germany, 2018; pp. 347–368.
  6. Albrecht, M.; Chase, M.; Chen, H.; Ding, J.; Goldwasser, S.; Gorbunov, S.; Halevi, S.; Hoffstein, J.; Laine, K.; Lauter, K.; et al. Homomorphic encryption standard. In Protecting Privacy Through Homomorphic Encryption; Springer: Berlin/Heidelberg, Germany, 2022; pp. 31–62.
  7. Lee, C.; Lee, H.; Satriawan, A.; Lee, H. Configurable arithmetic core architecture for RNS-CKKS homomorphic encryption. IEEE Access 2024, 12, 147220–147234.
  8. Mareta, R.; Lee, H. Compact 2^17 NTT architecture for fully homomorphic encryption. In Proceedings of the 2024 IEEE International Symposium on Circuits and Systems (ISCAS); IEEE: New York, NY, USA, 2024; pp. 1–5.
  9. Gharibyar, M.; Krüger, C.; Schoop, D. Towards Efficient Cloud Data Processing: A Comprehensive Guide to CKKS Parameter Selection. In Proceedings of the 11th International Conference on Information Systems Security and Privacy-Volume 2: ICISSP; INSTICC: Lisboa, Portugal; SciTePress: Setúbal, Portugal, 2025; pp. 502–509.
  10. Al Mamun, A.; Yates, K.; Rakotondrafara, A.; Chowdhury, M.; Cartor, R.; Gao, S. Experimental evaluation of post-quantum homomorphic encryption for privacy-preserving v2x communication. arXiv 2025, arXiv:2508.02461.
  11. Samardzic, N. Making Computation on Encrypted Data Practical Through Hardware Acceleration of Fully Homomorphic Encryption. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 2022.
  12. Mukherjee, T. Cyclotomic Polynomials in Ring-LWE Homomorphic Encryption Schemes. Master’s Thesis, Rochester Institute of Technology, Rochester, NY, USA, 2016.
  13. Lyubashevsky, V.; Peikert, C.; Regev, O. On ideal lattices and learning with errors over rings. In Proceedings of the EUROCRYPT 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 1–23.
  14. Fan, J.; Vercauteren, F. Somewhat Practical Fully Homomorphic Encryption. Technical Report 2012/144; IACR Cryptology ePrint Archive. 2012. Available online: https://eprint.iacr.org/2012/144.pdf (accessed on 27 May 2026).
  15. Barrett, P. Implementing the Rivest Shamir and Adleman public key encryption algorithm on a standard digital signal processor. In Advances in Cryptology—CRYPTO’86; Springer: Berlin/Heidelberg, Germany, 1986; pp. 311–323.
  16. Satriawan, A.; Mareta, R.; Lee, H. Integer modular multiplication with Barrett reduction and its variants for homomorphic encryption applications: A comprehensive review and an empirical study. IEEE Access 2024, 12, 147283–147300.
  17. Halevi, S.; Shoup, V. Design and implementation of a homomorphic-encryption library. IBM J. Res. Dev. 2018, 62, 1–42.
  18. Shivdikar, K.; Jonatan, G.; Mora, E.; Livesay, N.; Agrawal, R.; Joshi, A.; Abellán, J.L.; Kim, J.; Kaeli, D. Accelerating polynomial multiplication for homomorphic encryption on GPUs. In Proceedings of the 2022 IEEE International Symposium on Secure and Private Execution Environment Design (SEED); IEEE: New York, NY, USA, 2022; pp. 61–72.
  19. Gentry, C. A Fully Homomorphic Encryption Scheme. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 2009.
  20. Munjal, K.; Bhatia, R. A systematic review of homomorphic encryption and its contributions in healthcare industry. Complex Intell. Syst. 2022, 9, 3759–3786.
  21. van Dijk, M.; Gentry, C.; Halevi, S.; Vaikuntanathan, V. Fully Homomorphic Encryption over the Integers. In Proceedings of the Advances in Cryptology-EUROCRYPT 2010; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2010; Volume 6110, pp. 24–43.
  22. Brakerski, Z.; Gentry, C.; Vaikuntanathan, V. (Leveled) Fully Homomorphic Encryption Without Bootstrapping. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS); Association for Computing Machinery: New York, NY, USA, 2012.
  23. Gentry, C.; Sahai, A.; Waters, B. Homomorphic Encryption from Learning with Errors: Conceptually-Simpler, Asymptotically-Faster, Attribute-Based. In Proceedings of the CRYPTO 2013; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2013.
  24. Ducas, L.; Micciancio, D. FHEW: Bootstrapping Homomorphic Encryption in Less Than a Second. In Proceedings of the EUROCRYPT 2015; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2015.
  25. Chillotti, I.; Gama, N.; Georgieva, M.; Izabachène, M. TFHE: Fast Fully Homomorphic Encryption over the Torus. J. Cryptol. 2019, 33, 34–91.
  26. Huynh, D. CKKS Explained, Part 2: Full Encoding and Decoding. 2025. Available online: https://openmined.org/blog/ckks-explained-part-2-ckks-encoding-and-decoding/ (accessed on 23 May 2025).
  27. Huynh, D. CKKS Explained, Part 3: Encryption and Decryption. 2025. Available online: https://openmined.org/blog/ckks-explained-part-3-encryption-and-decryption/ (accessed on 23 May 2025).
  28. Huynh, D. CKKS Explained, Part 4: Multiplication and Relinearization. 2025. Available online: https://openmined.org/blog/ckks-explained-part-4-multiplication-and-relinearization/ (accessed on 23 May 2025).
  29. Huynh, D. CKKS Explained, Part 5: Rescaling. 2025. Available online: https://openmined.org/blog/ckks-explained-part-5-rescaling/ (accessed on 23 May 2025).
  30. Microsoft SEAL Team. Microsoft SEAL Homomorphic Encryption Library. Microsoft Research. Parameter Selection and Security Levels. 2023. Available online: https://github.com/Microsoft/SEAL (accessed on 1 January 2026).
  31. Regev, O. On lattices, learning with errors, random linear codes, and cryptography. J. ACM (JACM) 2009, 56, 1–40.
  32. Albrecht, M.R.; Player, R.; Scott, S. On the concrete hardness of learning with errors. J. Math. Cryptol. 2015, 9, 169–203.
  33. Chen, Y.; Nguyen, P.Q. BKZ 2.0: Better lattice security estimates. In Proceedings of the International Conference on the Theory and Application of Cryptology and Information Security; Springer: Berlin/Heidelberg, Germany, 2011; pp. 1–20.
  34. Microsoft SEAL (Release 3.5), Microsoft Research: Redmond, WA, USA, 2020. Available online: https://github.com/Microsoft/SEAL (accessed on 1 January 2026).
  35. Hati, I.N.K. A brief summary of the full RNS variant of some homomorphic encryptions. J. Mod. Technol. Eng. 2023, 8, 112–118.
  36. Duong, P.N.; Lee, H. Pipelined key switching accelerator architecture for CKKS-based fully homomorphic encryption. Sensors 2023, 23, 4594.
  37. Roma, C.A. Hardware Implementation of Barrett Reduction Exploiting Constant Multiplication. Ph.D. Thesis, University of Waterloo, Waterloo, ON, Canada, 2019.
  38. Cabrero-Holgueras, J.; Pastrana, S. Towards automated homomorphic encryption parameter selection with fuzzy logic and linear programming. Expert Syst. Appl. 2023, 233, 120941.
  39. Fan, S.; Deng, X.; Tian, Z.; Hu, Z.; Chang, L.; Hou, R.; Meng, D.; Zhang, M. Taiyi: A High-Performance CKKS Accelerator for Practical Applications. arXiv 2024, arXiv:2403.10188.
  40. Li, Q.; Zong, R. A GPU-Accelerated FHE Framework with Its Application to High-Throughput CKKS Bootstrapping. arXiv 2025, arXiv:2503.22227.
Table 1. Experimental results for 80-bit security level.
| N | log q | Time (s), CRT | Time (s), Barrett | Space (MB), CRT | Space (MB), Barrett |
| --- | --- | --- | --- | --- | --- |
| 1024 | 80 | 1.45 | 1.61 | 231.88 | 187.67 |
| 2048 | 200 | 5.01 | 7.57 | 353.68 | 324.37 |
| 4096 | 300 | 15.53 | 33.64 | 901.37 | 885.30 |
Table 2. Experimental results for 100-bit security level.
| N | log q | Time (s), CRT | Time (s), Barrett | Space (MB), CRT | Space (MB), Barrett |
| --- | --- | --- | --- | --- | --- |
| 2048 | 100 | 4.01 | 6.56 | 335.21 | 328.81 |
| 4096 | 200 | 13.42 | 29.10 | 924.36 | 884.01 |
| 8192 | 300 | 47.34 | 133.57 | 3167.18 | 3116.49 |
Table 3. Experimental results for 110-bit security level.
| N | log q | Time (s), CRT | Time (s), Barrett | Space (MB), CRT | Space (MB), Barrett |
| --- | --- | --- | --- | --- | --- |
| 2048 | 80 | 3.71 | 6.46 | 337.81 | 328.39 |
| 4096 | 150 | 13.12 | 30.34 | 905.58 | 887.46 |
| 8192 | 250 | 45.04 | 125.23 | 3159.32 | 3114.66 |
| 16,384 | 550 | 180.49 | 800.44 | 12,183.48 | 12,020.21 |
Table 4. Experimental results for 128-bit security level.
| N | log q | Time (s), CRT | Time (s), Barrett | Space (MB), CRT | Space (MB), Barrett |
| --- | --- | --- | --- | --- | --- |
| 2048 | 60 | 3.68 | 6.25 | 336.86 | 327.88 |
| 4096 | 120 | 13.06 | 27.62 | 903.83 | 887.44 |
| 8192 | 200 | 43.15 | 117.00 | 3175.11 | 3114.03 |
| 16,384 | 450 | 166.36 | 666.78 | 12,174.28 | 12,030.07 |
Table 5. Experimental results for 192-bit security level.
| N | log q | Time (s), CRT | Time (s), Barrett | Space (MB), CRT | Space (MB), Barrett |
| --- | --- | --- | --- | --- | --- |
| 2048 | 40 | 3.92 | 6.16 | 334.63 | 327.53 |
| 4096 | 80 | 12.09 | 26.26 | 903.63 | 887.39 |
| 8192 | 150 | 41.25 | 121.39 | 3152.82 | 3116.62 |
| 16,384 | 300 | 149.71 | 539.89 | 12,124.83 | 11,873.28 |
Table 6. Experimental results for 256-bit security level.
| N | log q | Time (s), CRT | Time (s), Barrett | Space (MB), CRT | Space (MB), Barrett |
| --- | --- | --- | --- | --- | --- |
| 4096 | 60 | 11.48 | 25.76 | 902.64 | 886.82 |
| 8192 | 120 | 39.75 | 109.07 | 3150.76 | 3116.81 |
| 16,384 | 250 | 155.70 | 533.87 | 12,108.49 | 11,323.03 |
Table 7. Practical guidelines for choosing CRT and Barrett.
| Applications | Backend Choice | Main Reason |
| --- | --- | --- |
| Memory-constrained IoT/low security | Barrett | Lower memory at 80–100 bit |
| High security, server/cloud | CRT | Better time efficiency at ≥128 bit |
| Implementation simplicity | CRT | Uniform scaling with security |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Goganaboina, H.; Wu, H. Parameter Settings and Efficient Computation for Homomorphic Encryption in CKKS. Electronics 2026, 15, 2391. https://doi.org/10.3390/electronics15112391