1. Introduction
Fully Homomorphic Encryption (FHE) is considered the “holy grail” of cryptography, allowing arbitrary computations to be performed on encrypted data without access to the secret decryption key. Since Gentry’s first feasible construction [1], FHE has evolved significantly, with the Torus FHE (TFHE) scheme [2,3] emerging as a leading candidate for practical applications. TFHE relies on the hardness of the Learning With Errors (LWE) problem [4] and allows for the evaluation of exact logic gates with constant-time noise refreshing via fast gate-by-gate bootstrapping.
However, in the era of cloud computing and collaborative privacy-preserving analytics, single-key FHE is often insufficient. Real-world scenarios, such as secure genome analysis or financial auditing, typically involve multiple mutually distrusting parties who wish to compute jointly on their private data. This necessitates Multi-Key Homomorphic Encryption (MKHE) [5,6], which extends FHE to support operations on ciphertexts encrypted under different independent keys. The resulting ciphertext can only be decrypted if all involved parties collaborate. While theoretical constructions for MKHE exist [7,8], adapting them to the efficient TFHE framework presents significant challenges. Current state-of-the-art Multi-Key TFHE (MK-TFHE) schemes [9,10] suffer from three severe scalability bottlenecks:
High Computational Complexity of Blind Rotation: The core blind rotation step requires iterating through every bit of the expanded multi-user key. As the number of participants $k$ grows, the required external product operations increase linearly or even quadratically, leading to prohibitive latency [9]. The computational complexity typically scales as $O(k \cdot n)$, where $n$ is the LWE dimension.
Low Amortized Throughput: Standard TFHE bootstrapping refreshes only one LWE ciphertext per execution. For Boolean circuits requiring massive parallelism, this bit-wise processing is highly inefficient. Although packing techniques (SIMD) [11] exist for the BGV [12] and CKKS [13] schemes, applying them to TFHE’s accumulation-based bootstrapping remains non-trivial [14].
Expensive Format Conversion Overhead: Traditional MKHE pipelines require frequent Sample Extraction operations at the end of bootstrapping. This forces the data out of the ring domain, meaning it must be key-switched back to a format suitable for the accumulator, wasting computational cycles and hindering deep circuit evaluations.
To address these limitations, we propose the Dynamic Multi-Key Block-Binary Ring-Compact Bootstrapping (DMBB-RCB) scheme. Our construction synthesizes three advanced optimization strategies into a unified, scalable framework:
Block Binary Keys for Accelerated Blind Rotation: We adopt a Block Binary Distribution [15] for the LWE secret keys. By structuring the key as a concatenation of $n/\ell$ blocks (each with a Hamming weight of at most one), we redesign the blind rotation algorithm to iterate over $n/\ell$ blocks rather than $n$ bits. This structural sparsity reduces the core complexity from $O(k \cdot n)$ to $O(k \cdot n/\ell)$ (where $n$ is the LWE dimension and $k$ is the number of participants), providing a theoretical speedup proportional to the block length $\ell$.
Amortized Ring Packing: We implement a PackLWE algorithm [15] tailored for the multi-key setting. This enables the packing of multiple scalar LWE ciphertexts into the coefficients of a single RLWE polynomial. Consequently, a single execution of our bootstrapping circuit refreshes multiple messages simultaneously, significantly increasing the amortized throughput.
Dynamic Ring-Compact Architecture: Instead of extracting LWE samples at the end of bootstrapping, we utilize a Scheme Switching technique inspired by recent dynamic MKHE frameworks [16,17] to output Multi-Key RGSW ciphertexts directly. This keeps the data entirely within the Ring domain, enabling seamless, continuous homomorphic evaluation without intermediate LWE-to-RLWE conversions. Furthermore, our scheme supports the dynamic addition of participants without requiring a global parameter reset, making it highly suitable for flexible Multi-Party Computation (MPC) environments.
MKHE has continued to advance in 2024 and 2025. For instance, Kwak et al. [10] introduced parallelizable techniques to reduce quasi-linear complexity, and recent hardware/algorithmic co-designs and algorithmic optimizations [18,19,20] have targeted circuit bootstrapping and automorphism evaluations. While these works significantly improve specific TFHE bottlenecks, they predominantly operate within static configurations or still incur the expensive LWE-to-RLWE format conversion overhead. In contrast, our DMBB-RCB framework combines block-binary sparsity with a dynamic ring-compact architecture, addressing both latency and dynamic participant scalability simultaneously, a gap not fully resolved by the 2024–2025 literature.
2. Preliminaries
In this section, we establish the mathematical notations, algebraic structures, probability distributions, and the fundamental cryptographic primitives underlying the Dynamic Multi-Key Block-Binary Ring-Compact Bootstrapping (DMBB-RCB) scheme.
2.1. Notation
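Let $\mathbb{Z}$ and $\mathbb{R}$ denote the set of integers and real numbers, respectively. We define the real torus as $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ (modulo 1), and the integer ring modulo $q$ as $\mathbb{Z}_q = \mathbb{Z}/q\mathbb{Z}$. Vectors and matrices are denoted by bold lowercase (e.g., $\mathbf{a}$) and bold uppercase (e.g., $\mathbf{A}$) letters, respectively. For two vectors $\mathbf{a}, \mathbf{b}$, their inner product is denoted by $\langle \mathbf{a}, \mathbf{b} \rangle$. The operations $\lfloor \cdot \rceil$, $\lfloor \cdot \rfloor$, and $\lceil \cdot \rceil$ denote the nearest-integer rounding, floor, and ceiling functions, respectively. For a finite set $S$, $x \leftarrow S$ denotes sampling uniformly at random from $S$. Let $N$ be a power of 2. We define the cyclotomic polynomial rings $R = \mathbb{Z}[X]/(X^N + 1)$ and $R_q = R/qR$.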
2.2. Probability Distributions
The security of our scheme relies on properties of specific error distributions and structured sparse keys.
Definition 1 (Error Distributions). Discrete Gaussian (): A discrete Gaussian distribution defined over the integers with standard deviation , where the probability of sampling an integer is proportional to .
Modular Gaussian ( ): A Gaussian error distribution defined over the real torus with standard deviation , concentrated around .
Polynomial Error Distribution ( ): A distribution of polynomials in the ring . A polynomial
is generated by sampling each of its integer coefficients independently from the discrete Gaussian distribution .
Definition 2 (Block Binary Distribution). Let be the block length and be the number of blocks, such that the total dimension is . The block distribution over outputs a vector with a Hamming weight uniformly at random. The Block Binary Distribution over is defined as the concatenation of independent samples from . Formally, for , we have , where for .
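As an illustration of Definition 2, the following Python sketch samples a block-binary key. It assumes that each block is either all-zero or carries a single 1 at a uniformly chosen position, so that the `block_len + 1` outcomes are equally likely. The function names and this exact probability split are assumptions of the sketch, not part of the scheme specification.

```python
import secrets


def sample_block(block_len: int) -> list[int]:
    """Sample one block with Hamming weight at most one.

    Assumption: the all-zero block and each of the block_len unit vectors
    are equally likely (block_len + 1 outcomes in total).
    """
    pos = secrets.randbelow(block_len + 1)  # value block_len means "all zeros"
    return [1 if j == pos else 0 for j in range(block_len)]


def sample_block_binary_key(block_len: int, num_blocks: int) -> list[int]:
    """Concatenate num_blocks independent blocks into one LWE secret key."""
    key: list[int] = []
    for _ in range(num_blocks):
        key.extend(sample_block(block_len))
    return key


if __name__ == "__main__":
    # Hypothetical toy sizes, far below any secure parameter set.
    print(sample_block_binary_key(block_len=3, num_blocks=5))
```

With block length 1 the sampler degenerates to a uniformly random binary key, which is the standard TFHE key distribution.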
2.3. Cryptographic Primitives
We recall the fundamental definitions of the LWE and RLWE encryption schemes, along with the gadget decomposition mechanism utilized in TFHE.
Definition 3 (Gadget Decomposition). Let be an integer base and be the decomposition depth such that . The gadget vector is defined as . We define the decomposition function .
For any input polynomial , the function outputs a vector of polynomials such that: where every integer coefficient of each polynomial is bounded within the interval .
Definition 4 (LWE Encryption). For dimension , a secret key . To encrypt a message , sample a mask vector and an error term . The ciphertext is , where .
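The gadget decomposition of Definition 3 acts coefficient-wise, so it can be illustrated on a single integer. The sketch below is a minimal exact variant that assumes the modulus is q = base^depth, the gadget vector is (1, base, ..., base^(depth-1)), and digits are balanced in [-base/2, base/2) with an even base. These conventions and the function names are assumptions of the sketch; practical TFHE implementations typically use an approximate decomposition.

```python
def gadget_decompose(a: int, base: int, depth: int) -> list[int]:
    """Balanced base-`base` digits of a modulo q = base**depth.

    Digits are returned least-significant first and lie in
    [-base/2, base/2) for an even base.
    """
    q = base ** depth
    a %= q
    digits = []
    for _ in range(depth):
        d = a % base
        a //= base
        if d >= base // 2:
            d -= base   # shift the digit into the balanced range
            a += 1      # and carry into the next digit
        digits.append(d)
    # Any carry left over equals a multiple of q and vanishes modulo q.
    return digits


def gadget_recompose(digits: list[int], base: int) -> int:
    """Inner product of the digit vector with (1, base, base**2, ...)."""
    return sum(d * base ** i for i, d in enumerate(digits))


if __name__ == "__main__":
    digits = gadget_decompose(200, base=4, depth=4)
    print(digits, gadget_recompose(digits, base=4) % 4 ** 4)
```

Keeping the digits balanced roughly halves their magnitude compared with digits in [0, base), which is what limits the noise growth of an external product.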
Remark 1 (Practical Message Encoding and Discretization). While theoretical formulations define the message over the continuous torus , practical implementations cannot process arbitrary real numbers (e.g., ) due to finite machine precision. In practice, we utilize a discrete message space and encode it into .
Let be the finite plaintext space (e.g., for standard Boolean circuits).
Encoding: A discrete message is encoded into the torus via the function .
Decoding: During decryption, the recovered noisy phase is decoded back to the exact message by rescaling and rounding to the nearest integer: .
In our software implementation, the continuous torus is discretized and mapped to standard unsigned integer data types (e.g., 32-bit or 64-bit integers), where the interval is represented by the range or , effectively handling the modulo arithmetic via natural CPU integer overflow.
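The encoding, decoding, and integer discretization of Remark 1 can be sketched together with a toy secret-key LWE encryption. The sketch assumes a 32-bit torus representation (q = 2^32), a plaintext modulus p that divides q, a binary key, the sign convention b = <a, s> + encode(m) + e, and a bounded uniform error as a stand-in for the Gaussian. All names and parameters are hypothetical and far too small to be secure.

```python
import secrets

Q_BITS = 32
Q = 1 << Q_BITS  # the torus [0, 1) is represented by integers in [0, 2**32)


def encode(m: int, p: int) -> int:
    """Map m in Z_p to the torus point m/p, stored as an integer modulo Q."""
    return ((m % p) * (Q // p)) % Q


def decode(phase: int, p: int) -> int:
    """Rescale the noisy phase by p/Q and round to the nearest integer mod p."""
    return ((phase * p + Q // 2) // Q) % p


def keygen(n: int) -> list[int]:
    return [secrets.randbelow(2) for _ in range(n)]


def small_error(bound: int) -> int:
    # Assumption: uniform error in [-bound, bound] instead of a Gaussian.
    return secrets.randbelow(2 * bound + 1) - bound


def encrypt(m: int, p: int, s: list[int], bound: int = 1 << 10):
    a = [secrets.randbelow(Q) for _ in s]
    b = (sum(ai * si for ai, si in zip(a, s)) + encode(m, p) + small_error(bound)) % Q
    return a, b


def decrypt(ct, p: int, s: list[int]) -> int:
    a, b = ct
    phase = (b - sum(ai * si for ai, si in zip(a, s))) % Q
    return decode(phase, p)


if __name__ == "__main__":
    sk = keygen(32)
    print([decrypt(encrypt(m, 4, sk), 4, sk) for m in range(4)])
```

Reducing modulo `Q` plays the role of the natural integer overflow mentioned in the remark; in C or Rust the same effect comes for free from wrapping arithmetic on `uint32_t` or `u32`. Decoding is correct as long as the error magnitude stays below Q/(2p).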
Definition 5 (RLWE Encryption [21]). Let the secret key be a polynomial . To encrypt a message polynomial , sample and . The ciphertext is , defined as .
Definition 6 (RGSW Encryption [22]). Let be the gadget matrix. An RGSW ciphertext encrypting a message under secret key is a matrix . Here, is a matrix where each row is a valid RLWE encryption of 0.
2.4. Standard TFHE Operations
The homomorphic evaluation logic is built upon the External Product and the CMUX gate:
External Product (): Let be an RGSW ciphertext encrypting and be an RLWE ciphertext encrypting . The external product is computed as . The result is a valid RLWE ciphertext encrypting .
CMUX Gate: The Controlled Multiplexer (CMUX) acts as a homomorphic “if-then-else” gate. For an RGSW ciphertext encrypting a selection bit and two RLWE ciphertexts encrypting , the CMUX operation is defined as . This outputs an RLWE ciphertext encrypting .
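At the plaintext level, the two operations above reduce to ring arithmetic: the external product multiplies the two underlying messages, and the CMUX evaluates selector * (d1 - d0) + d0, which is the standard TFHE formulation. The sketch below models only this message-level behaviour in Z_q[X]/(X^n + 1); it performs no encryption, and the function names and the convention that bit 1 selects `d1` are assumptions of the sketch.

```python
def negacyclic_mul(f: list[int], g: list[int], q: int) -> list[int]:
    """Multiply f and g in Z_q[X]/(X^n + 1), where n = len(f) = len(g)."""
    n = len(f)
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = i + j
            if k < n:
                out[k] = (out[k] + fi * gj) % q
            else:
                # X^n = -1, so wrapped terms pick up a minus sign.
                out[k - n] = (out[k - n] - fi * gj) % q
    return out


def cmux_plain(bit: int, d0: list[int], d1: list[int], q: int) -> list[int]:
    """Message-level model of CMUX: returns d1 if bit == 1, else d0."""
    n = len(d0)
    selector = [bit] + [0] * (n - 1)               # the bit as a constant polynomial
    diff = [(x - y) % q for x, y in zip(d1, d0)]
    product = negacyclic_mul(selector, diff, q)    # models the external product
    return [(x + y) % q for x, y in zip(product, d0)]


if __name__ == "__main__":
    q = 1 << 16
    print(cmux_plain(1, [1, 2, 3, 4], [9, 8, 7, 6], q))
```

In the encrypted setting the selector is an RGSW ciphertext and the multiplication is the external product, so the data polynomials never have to be decrypted.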
3. Construction of the DMBB-RCB Scheme
In this section, we present the formal construction of the Dynamic Multi-Key Block-Binary Ring-Compact Bootstrapping (DMBB-RCB) scheme. The protocol is composed of five distinct phases: Setup and Key Generation, Amortized Input Preparation (Packing), Block-Binary Dynamic Blind Rotation, Ring-Compact Extraction, and Distributed Decryption.
3.1. Advanced Building Blocks
Before detailing the main protocol, we formally define the advanced cryptographic primitives tailored for our multi-key and ring-compact setting.
Amortized Packing (PackLWE): This procedure aggregates a set of scalar LWE ciphertexts into the coefficients of a single RLWE ciphertext using a set of key-switching keys. The algorithm evaluates the decryption circuit homomorphically in the ring domain.
Homomorphic Trace: To isolate a specific message encrypted in a packed RLWE ciphertext, we define the trace map relative to a sub-ring via the sum of automorphisms of the ring.
Scheme Switching: This algorithm converts an RLWE ciphertext directly into an RGSW ciphertext using a scheme-switching key. It applies the gadget decomposition and reconstructs the matrix format.
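The trace map described above can be illustrated on plaintext polynomials. The sketch below assumes the full trace onto the constants, computed as the sum of the automorphisms X -> X^j over all odd j modulo 2n in Z_q[X]/(X^n + 1) with n a power of two. It returns n times the constant coefficient and zero elsewhere. The function names are hypothetical, and the homomorphic version would additionally need a key-switch after each automorphism.

```python
def automorphism(f: list[int], j: int, q: int) -> list[int]:
    """Apply X -> X^j (j odd) to f in Z_q[X]/(X^n + 1)."""
    n = len(f)
    out = [0] * n
    for i, c in enumerate(f):
        e = (i * j) % (2 * n)
        if e < n:
            out[e] = (out[e] + c) % q
        else:
            # X^n = -1, so exponents in [n, 2n) flip the sign.
            out[e - n] = (out[e - n] - c) % q
    return out


def trace(f: list[int], q: int) -> list[int]:
    """Sum of all n automorphisms; keeps only n * f[0] in the constant slot."""
    n = len(f)
    acc = [0] * n
    for j in range(1, 2 * n, 2):
        acc = [(x + y) % q for x, y in zip(acc, automorphism(f, j, q))]
    return acc


if __name__ == "__main__":
    print(trace([5, 1, 2, 3, 4, 6, 7, 8], q=1 << 16))
```

The factor n has to be accounted for in the message scaling, and practical implementations usually evaluate the trace with about log2(n) automorphisms in a doubling pattern rather than all n of them.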
3.2. Setup and Key Generation
Let denote the set of active participants in the dynamic MPC environment:
: Given the security parameter , block length , and block count , output the public parameters , where is the LWE dimension, is the polynomial modulus, and is the common reference string.
: Each participant samples an LWE secret key . The corresponding RLWE key is constructed to embed :
where
are small random ternary coefficients. The public key is generated as
:
: Participant generates Blind Rotation Keys , Packing Keys (for LWE to RLWE key-switching), and a Scheme Switching Key .
Remark 2 (Symmetry and Asymmetry in DMBB-RCB). It is worth noting that our framework employs a hybrid cryptographic architecture that leverages both symmetric and asymmetric properties to optimize overall performance. Specifically, the initial encryption of private user data is strictly symmetric (Secret-Key LWE, as per Definition 4) to minimize client-side computational overhead and ciphertext expansion before transmission. In contrast, the homomorphic evaluation phase relies on an asymmetric (Public-Key) paradigm; the evaluation materials, including the Blind Rotation Keys (BRK) and Packing Keys, are published as Public-Key RLWE/RGSW ciphertexts. This allows any untrusted third-party server to process the data without access to the secret keys. Finally, the decryption process is structured as a distributed threshold protocol, where the publicly evaluated ciphertext is jointly decrypted using the participants’ individual symmetric secret keys.
Remark 3 (Cost of Dynamic Participant Addition). A significant advantage of our CRS-based dynamic setup is the minimal overhead required when a new participant joins the computation. Because the global public parameters and the common reference string remain static, existing participants incur zero computational and communication overhead; they do not need to update, regenerate, or re-broadcast their keys. The new participant only needs to locally execute KeyGen and EvalKeyGen once. The cost for strictly consists of generating one LWE/RLWE key pair and producing their local evaluation keys (, and ). The server simply appends these newly broadcasted keys to its storage without halting ongoing independent computations.
3.3. The DMBB-RCB Evaluation Protocol
High-Level Overview of the DMBB-RCB Pipeline:
Before detailing the specific algorithms, we outline how the cryptographic primitives smoothly transition through our evaluation pipeline.
First, users encrypt their private inputs using standard Secret-Key LWE to minimize client-side overhead.
To initiate the bootstrapping, the Amortized Input Preparation phase homomorphically packs multiple such LWE ciphertexts into a single Public-Key RLWE accumulator polynomial.
Subsequently, the Blind Rotation is executed homomorphically over this RLWE accumulator using the users’ public RGSW evaluation keys.
Instead of reverting to LWE via key-switching, our Ring-Compact Extraction natively transforms the RLWE output into a Multi-Key RGSW ciphertext, keeping the data entirely within the ring domain.
Finally, when the computation is complete, the parties invoke the Distributed Decryption protocol to jointly decrypt the resulting ring-based ciphertext using their secret RLWE key shares.
3.3.1. Amortized Input Preparation
To amortize the bootstrapping cost, we aggregate
independent scalar multi-key LWE ciphertexts
into a single RLWE polynomial. The target polynomial slots are defined by a mapping function
. The initialized accumulator
is constructed by homomorphically subtracting the secret-dependent parts using the aggregate Packing Keys
:
This results in a valid multi-key RLWE encryption .
3.3.2. Block-Binary Dynamic Blind Rotation
The core innovation reduces the external product complexity from to by exploiting the block sparsity. The algorithm computes .
3.3.3. Ring-Compact Extraction
To extract a specific target message
from the packed accumulator without leaving the ring domain, we execute the Ring-Compact sequence. First, a trivial rotation
shifts
to the constant term.
Subsequently, we convert the isolated RLWE ciphertext into a multi-key RGSW ciphertext using Scheme Switching:
This closed-loop design ensures can be directly fed into subsequent CMUX gates.
3.4. Distributed Decryption
To recover the computation result while preserving circuit privacy, we apply a distributed decryption protocol with smudging noise [23].
First, a representative multi-key RLWE sample is extracted:
. Each participant
computes a partial decryption share
by masking the exact key multiplication with an independent Gaussian smudging noise
:
To guarantee statistical indistinguishability, the standard deviation must satisfy
. The shares are aggregated to cancel the public mask:
Finally, the exact message is recovered by rounding the constant term of relative to the scaling factor .
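The share-and-aggregate structure of this protocol can be sketched on a scalar multi-key LWE sample, which stands in for the constant term of the RLWE sample. The sketch assumes q = 2^32, a ciphertext of the form (b, a_1, ..., a_k) with b = sum_i <a_i, s_i> + (q/p) * m + e, and a rounded floating-point Gaussian as smudging noise. `mk_encrypt` is only a test fixture that uses all keys at once; in the scheme such ciphertexts arise from homomorphic evaluation. All names and parameters are hypothetical.

```python
import random

Q = 1 << 32


def inner(a: list[int], s: list[int]) -> int:
    return sum(x * y for x, y in zip(a, s)) % Q


def mk_encrypt(m: int, p: int, keys: list[list[int]], noise: int = 8):
    """Test fixture: build a multi-key sample under all secret keys."""
    masks = [[random.randrange(Q) for _ in s] for s in keys]
    e = random.randint(-noise, noise)
    b = (sum(inner(a, s) for a, s in zip(masks, keys)) + m * (Q // p) + e) % Q
    return b, masks


def partial_share(mask: list[int], key: list[int], sigma_smudge: float) -> int:
    """One party's share: -<a_i, s_i> plus independent smudging noise."""
    # Assumption: random.gauss is used for illustration only; a real
    # implementation needs a cryptographically sound Gaussian sampler.
    smudge = round(random.gauss(0.0, sigma_smudge))
    return (-inner(mask, key) + smudge) % Q


def combine(b: int, shares: list[int], p: int) -> int:
    """Add all shares to b to cancel the masks, then rescale and round."""
    phase = (b + sum(shares)) % Q
    return ((phase * p + Q // 2) // Q) % p


if __name__ == "__main__":
    keys = [[random.randrange(2) for _ in range(16)] for _ in range(3)]
    b, masks = mk_encrypt(2, 4, keys)
    shares = [partial_share(a, s, 2.0 ** 16) for a, s in zip(masks, keys)]
    print(combine(b, shares, 4))
```

No party reveals its key: each share hides the exact inner product behind the smudging term, and decoding still succeeds as long as the accumulated noise stays below q/(2p).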
4. Correctness and Security Analysis
In this section, we formally establish the mathematical correctness of the Block-Binary Blind Rotation, derive the strict noise propagation bounds required for correct decryption, and prove the IND-CPA security and circuit privacy of the DMBB-RCB scheme.
4.1. Correctness Analysis
The correctness of DMBB-RCB hinges on the accurate homomorphic evaluation of the decryption phase within the accumulator. We first prove the correctness of the block selectors.
Theorem 1 (Correctness of Block Selectors). Let be the -th block of user secret key sampled from , and let encrypt the bits of this block. The block selector constructed in Algorithm 1 accurately isolates the active rotation factor such that:
Algorithm 1: Block-Binary Dynamic Blind Rotation (steps: Global Block Iteration, User Contribution Loop).
Proof. By definition of the Block Binary Distribution, the Hamming weight satisfies . We analyze the two possible cases for the selector construction :
Case 1 (): For all , . Consequently, all encrypt . The linear combination homomorphically yields , which trivially satisfies .
Case 2 (): There exists a unique index such that , and for all . The homomorphic sum collapses to the single non-zero term corresponding to :
Since , the theorem holds. □
Theorem 2 (Correctness of Blind Rotation). Algorithm 1 transforms the initialized accumulator into .
Proof. The iterative update per block is
. Substituting Theorem 1, the multiplier evaluates to
. By the homomorphism of the external product over
blocks and
users, the total phase shift perfectly accumulates:
This completes the homomorphic evaluation of the decryption phase. □
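The argument of Theorems 1 and 2 can be checked on plaintext polynomials. The sketch below assumes the block selector has the form 1 + sum_j s_j * (X^{a_j} - 1), which collapses to a single monomial when the block has Hamming weight at most one, and it uses one multiplication per block and per user in place of one external product. Nothing is encrypted here, and the function names, the selector form, and the sign convention of the rotation are assumptions of the sketch.

```python
def rotate(poly: list[int], shift: int) -> list[int]:
    """Multiply poly by X**shift in Z[X]/(X**n + 1) (negacyclic rotation)."""
    n = len(poly)
    out = [0] * n
    for i, c in enumerate(poly):
        e = (i + shift) % (2 * n)
        if e < n:
            out[e] += c
        else:
            out[e - n] -= c
    return out


def apply_block_selector(acc, mask_block, key_block):
    """Return acc * (1 + sum_j s_j * (X**a_j - 1)) for one key block."""
    out = list(acc)
    for a_j, s_j in zip(mask_block, key_block):
        rotated = rotate(acc, a_j)
        # Linear in the key bit s_j, as the encrypted selector would be.
        out = [o + s_j * (r - c) for o, r, c in zip(out, rotated, acc)]
    return out


def blind_rotate_plain(acc, masks, keys, block_len):
    """Plaintext model: one selector multiplication per block and per user."""
    n = len(keys[0])
    products = 0
    for start in range(0, n, block_len):        # global block iteration
        for mask, key in zip(masks, keys):      # user contribution loop
            end = start + block_len
            acc = apply_block_selector(acc, mask[start:end], key[start:end])
            products += 1
    return acc, products


if __name__ == "__main__":
    acc0 = [1, 2, 3, 4, 5, 6, 7, 8]
    masks = [[3, 5, 1, 7], [2, 2, 6, 4]]
    keys = [[0, 1, 0, 0], [1, 0, 0, 1]]   # block length 2, weight <= 1 per block
    print(blind_rotate_plain(acc0, masks, keys, block_len=2))
```

In this toy run the accumulator ends up rotated by 5 + 2 + 4 = 11 positions after 4 selector multiplications, whereas a bit-wise loop over the same two 4-bit keys would need 8.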
4.2. Noise Bounding Analysis
We track the worst-case variance bounds of the noise through the protocol to establish the parameter constraints. Let , and denote the noise variances of a fresh LWE ciphertext, an LWE-to-RLWE key-switching operation, and a TRGSW external product, respectively.
Theorem 3 (Total Error Bound). For a packing factor , block count , and participants, correct decryption is guaranteed if the final output variance satisfies , where
Proof. Packing Noise (): Initializing the accumulator via aggregates noise linearly with the input ciphertexts and the required key-switching operations.
Blind Rotation Drift: In standard MK-TFHE, noise accumulates additively over one external product per key bit for each user. Our block-binary optimization executes only one external product per block. Thus, the drift scales with the number of blocks rather than the number of key bits, reducing the dominant noise growth by a factor equal to the block length.
Extraction (): The Scheme Switching introduces controlled noise proportional to its gadget decomposition depth. Combining these terms yields the total variance, which must remain below the 6-sigma decoding gap . □
4.3. Security Analysis
Our security relies on the standard Decision RLWE assumption and the Block-Binary LWE assumption.
Assumption 1 (Block-Binary LWE). Let . The Block-Binary LWE assumption posits that for appropriate parameters, the distribution of samples is computationally indistinguishable from uniform over . Recent cryptanalysis confirms its hardness against hybrid dual attacks with minimal security loss for small block lengths (e.g., ).
Theorem 4 (IND-CPA Security). Under the RLWE and Block-Binary LWE assumptions, the DMBB-RCB scheme is IND-CPA secure against a probabilistic polynomial-time (PPT) adversary .
Proof Sketch (Hybrid Argument). Let be the advantage of in Game :
Game 0 (Real): The standard IND-CPA game.
Game 1 (Random Public Keys): We replace with random pairs from . By the RLWE assumption, .
Games 2 & 3 (Simulated Eval Keys): We replace the Packing Keys and Blind Rotation Keys with encryptions of 0. Since these keys are valid RLWE/RGSW ciphertexts and their underlying RLWE secret is protected (from Game 1), semantic security guarantees . The sparse structure of as the message in the BRK does not compromise the RLWE hardness.
Game 4 (Random Challenge): We replace the challenge LWE ciphertext with a uniform random vector. By Assumption 1, . In Game 4, the adversary’s advantage is exactly 0. Thus, the scheme is IND-CPA secure. □
Theorem 5 (1-Hop Circuit Privacy). If the smudging noise standard deviation satisfies , the partial decryption shares can be simulated given only the final output, ensuring circuit privacy.
Proof. The share computation masks the input noise structure. By the Smudging Lemma, the statistical distance between and is bounded by . Thus, the joint decryption reveals no information about the secret keys beyond the final plaintext result. □
5. Theoretical Performance and Complexity Evaluation
In this section, we evaluate the theoretical performance of the proposed DMBB-RCB scheme. We explicitly note that this section presents an analytical and theoretical evaluation grounded in established cryptographic metrics, rather than an implementation-based software benchmark. We incorporate the complexity analysis deferred from Section 4, establish concrete parameter sets satisfying the derived noise bounds, and compare the asymptotic latency and throughput against existing state-of-the-art Multi-Key FHE schemes.
5.1. Asymptotic Complexity Comparison
The efficiency of a TFHE-style scheme is heavily dominated by the sequence of external products in the Blind Rotation phase. We compare DMBB-RCB against standard Static MK-TFHE and Dynamic MK-TFHE. Let be the number of active participants, be the LWE dimension, be the block length, and be the packing factor. Let and denote the computational cost of a single TRGSW external product and an LWE-to-RLWE key-switching operation, respectively.
Existing schemes mandate iterating through every bit of the expanded multi-key, resulting in a blind rotation complexity of . Furthermore, standard schemes output a scalar LWE sample, requiring an additional key-switching overhead to return to an RLWE format for subsequent gates.
In contrast, DMBB-RCB reduces the blind rotation cost to one external product per key block for each participant by iterating over blocks rather than bits. Concurrently, the amortized cost per bit falls in proportion to the packing factor via the PackLWE mechanism. The Ring-Compact architecture eliminates the post-bootstrapping key-switching overhead entirely, as summarized in Table 1.
5.2. Concrete Parameter Selection
To validate the feasibility of DMBB-RCB, we select parameters targeting 128-bit security. According to recent cryptanalysis, the LWE dimension must be marginally increased compared to standard binary keys to resist hybrid dual attacks when utilizing block-binary keys. Specifically, to maintain a strict 128-bit security level, standard binary keys ( ) typically require an LWE dimension of . Our concrete parameter selection confirms that employing block lengths of (Set-I) and (Set-II) necessitates expanding the dimensions to and , respectively. This precisely accounts for the ~10–15% margin required to offset the structural sparsity. The polynomial modulus parameters and the standard deviations of the error distributions ( for LWE, for RLWE/RGSW) must carefully balance the noise constraints in Theorem 3 and the standard Lattice Estimator security bounds. Specifically, to guarantee a 128-bit security level against known lattice attacks (e.g., uSVP and dual attacks), the error standard deviations are chosen appropriately relative to the dimensions. In our concrete instantiation, we set the LWE error rate to and the RLWE error rate to for Set-I, and adjust them marginally for Set-II to accommodate the larger packing factor. As presented in Table 2, we propose two parameter sets derived from the noise bounds in Theorem 3: Set-I targets low-latency execution, while Set-II maximizes high-throughput packing.
Regarding the packing factor , it directly dictates the initial noise accumulation. As defined in Theorem 3, the noise variance from packing grows linearly with . To ensure the final noise variance remains below the decoding gap (), the theoretical upper bound for the packing factor is roughly bounded by . For our Set-II parameters (), approaches this upper boundary; exceeding it would cause decryption failures due to noise overflow.
5.3. Theoretical Performance Projections
We project the execution time based on standard AVX-512 accelerated CPU single-thread execution, estimating one TRGSW External Product () at approximately 10 ms for and 35 ms for . We consider a collaborative group of participants:
Latency: For standard MK-TFHE (), evaluating one gate takes . Utilizing Set-I (), DMBB-RCB cuts this execution time strictly by half to .
Amortized Throughput: Utilizing Set-II (), one bootstrapping execution takes . However, this single execution refreshes 1024 logic gates simultaneously. The standard scheme achieves an effective throughput of gates/s, whereas DMBB-RCB achieves gates/s. This represents a theoretical throughput improvement of approximately .
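The projections above follow a simple cost model: the blind rotation performs one external product per key block and per participant, and packing divides the cost of one execution over all refreshed messages. The sketch below encodes that model. The 10 ms external product time and the packing factor of 1024 are taken from the text, while the participant count and LWE dimension in the example are hypothetical placeholders rather than the paper's parameter sets. The model ignores the packing and scheme-switching overhead.

```python
def blind_rotation_latency_ms(participants: int, lwe_dim: int,
                              block_len: int, t_ep_ms: float) -> float:
    """Latency of one bootstrapping: one external product per block and user."""
    if lwe_dim % block_len != 0:
        raise ValueError("lwe_dim must be a multiple of block_len")
    num_blocks = lwe_dim // block_len
    return participants * num_blocks * t_ep_ms


def amortized_throughput(latency_ms: float, packing_factor: int) -> float:
    """Messages refreshed per second when one execution refreshes many."""
    return packing_factor * 1000.0 / latency_ms


if __name__ == "__main__":
    k, n, t_ep = 4, 640, 10.0   # hypothetical participants and LWE dimension
    baseline = blind_rotation_latency_ms(k, n, block_len=1, t_ep_ms=t_ep)
    blocked = blind_rotation_latency_ms(k, n, block_len=2, t_ep_ms=t_ep)
    print(baseline, blocked, amortized_throughput(blocked, packing_factor=1024))
```

Block length 1 models the standard bit-wise loop. Because block-binary keys require a somewhat larger LWE dimension at equal security, the realized gain for block length 2 is slightly below the factor of two that this model gives at a fixed dimension.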
5.4. Storage and Communication Trade-Offs
While computation is highly optimized, the Ring-Compact property necessitates specific storage trade-offs:
Key Storage: The Blind Rotation Keys still require encryptions of individual bits to construct the block selector, maintaining a size of , comparable to standard schemes. However, embedding the LWE key into the RLWE key (Section 3.2) saves of the key generation storage overhead compared to separate key formulations.
Ciphertext Expansion: The output is an MK-RGSW ciphertext containing RLWE ciphertexts. For a depth of , the output size is per user share, compared to for a scalar LWE sample. While this represents a 19x ciphertext expansion, it is a highly favorable trade-off in realistic MPC scenarios. In modern cloud environments and wide-area networks (WANs), bandwidth is generally abundant, whereas network round-trip latency is the primary bottleneck for distributed systems. Traditional schemes require frequent, interactive key-switching protocols between all parties after each bootstrapping, causing severe latency degradation. By strictly containing the output within the 48 KB MK-RGSW format, DMBB-RCB achieves zero-interaction during the depth-unbounded circuit evaluation phase. Multi-party communication is explicitly deferred to and exclusively required for the final distributed decryption protocol. Exchanging 48 KB of data per share at the very end of the computation is negligible compared to the massive reduction in interactive communication rounds during the evaluation phase.
Runtime Memory Consumption: Beyond static storage, we must also account for dynamic memory (RAM) consumption during the bootstrapping evaluation. In MK-FHE, the primary memory footprint stems from loading the multi-key Blind Rotation Keys (BRK) and the accumulator states into active memory. For participants, the total evaluation key size scales as . Using our Set-II parameters (), the BRK size per user is approximately several hundred megabytes. While our block-binary logic reduces the computational iterations, the entire key must still reside in memory. Consequently, evaluating deep circuits for multiple users requires gigabytes of RAM. This memory consumption is entirely manageable for modern cloud servers (the intended environment for DMBB-RCB), but it necessitates proper memory provisioning and precludes deployment on highly memory-constrained edge devices.
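The size figures in this subsection can be approximated with a short calculation. The sketch assumes a single-key RGSW layout of 2 * depth RLWE rows, each holding two polynomials of `ring_dim` coefficients. The values ring_dim = 1024, depth = 3, and 4-byte coefficients are assumptions chosen because they reproduce the 48 KB figure quoted above, and the LWE dimension of 630 is a hypothetical placeholder. The multi-key layout used by the scheme may differ.

```python
def rlwe_bytes(ring_dim: int, coeff_bytes: int) -> int:
    """One RLWE ciphertext: two polynomials of ring_dim coefficients."""
    return 2 * ring_dim * coeff_bytes


def rgsw_bytes(ring_dim: int, depth: int, coeff_bytes: int) -> int:
    """One RGSW ciphertext: 2 * depth RLWE rows (assumed single-key layout)."""
    return 2 * depth * rlwe_bytes(ring_dim, coeff_bytes)


def lwe_bytes(lwe_dim: int, coeff_bytes: int) -> int:
    """One scalar LWE sample: lwe_dim mask entries plus the body."""
    return (lwe_dim + 1) * coeff_bytes


if __name__ == "__main__":
    rgsw = rgsw_bytes(ring_dim=1024, depth=3, coeff_bytes=4)
    lwe = lwe_bytes(lwe_dim=630, coeff_bytes=4)
    print(rgsw / 1024, "KiB per RGSW ciphertext")
    print(round(rgsw / lwe, 1), "x expansion over a scalar LWE sample")
```

Under these assumptions the RGSW output occupies exactly 48 KiB and the expansion over the scalar sample is about 19.5, in line with the ratio quoted above.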
5.5. Evaluation Summary
The theoretical evaluation indicates that DMBB-RCB mitigates the primary scalability bottlenecks of MK-TFHE. The sparse key structure reduces computational latency by a factor equal to the block length, while amortized packing increases throughput by a factor equal to the packing factor. The computational cost still grows linearly with the number of participants, but with a shallower slope coefficient (the number of key blocks instead of the full LWE dimension), supporting its suitability for large-scale dynamic MPC tasks.
6. Conclusions
In this paper, we addressed the critical scalability and throughput bottlenecks inherent in Multi-Key Homomorphic Encryption (MKHE) by proposing the Dynamic Multi-Key Block-Binary Ring-Compact Bootstrapping (DMBB-RCB) framework.
By abandoning the conventional bit-wise processing paradigm, DMBB-RCB combines three cryptographic optimizations: (1) adopting a Block-Binary distribution for secret keys to reduce the dominant blind rotation cost from one external product per key bit to one per key block; (2) integrating an amortized multi-key packing mechanism to process independent messages in parallel, thereby increasing the amortized throughput by orders of magnitude; and (3) implementing a Ring-Compact extraction architecture via Scheme Switching, which outputs Multi-Key RGSW ciphertexts natively to enable depth-unbounded, closed-loop evaluation without interactive LWE-to-RLWE key-switching.
Consequently, DMBB-RCB bridges the gap between the programmable logic flexibility of TFHE and the SIMD efficiency of leveled FHE schemes [12,24]. The dynamic common reference string (CRS) design and interaction-free extraction during the evaluation phase (with interaction restricted solely to the final decryption step) make this framework highly scalable and communication-efficient, positioning it as a robust cryptographic foundation for dynamic Secure Multi-Party Computation (MPC) environments, such as federated learning and collaborative cloud analytics.
7. Limitations and Future Research
While DMBB-RCB achieves substantial computational optimizations, we acknowledge specific trade-offs that motivate our future research directions:
Ciphertext Expansion and Bandwidth: The native output of our Ring-Compact scheme is a Multi-Key RGSW ciphertext, which comprises a matrix of polynomials. This is significantly larger than a scalar LWE sample utilized in standard TFHE, necessitating higher transmission bandwidth during the final result retrieval phase.
Latency vs. Throughput Profile: The initial noise term introduced by polynomial packing grows with the packing factor , requiring careful parameter bounds. Furthermore, the overhead of the packing and scheme-switching sub-routines slightly increases the latency of a single bootstrapping execution. Thus, DMBB-RCB is optimized for batched, high-throughput processing rather than ultra-low-latency, real-time control systems.
Building upon these properties, future research will focus on the following trajectories:
Hardware Acceleration: The structured block-wise external products inherent to our block-binary blind rotation are highly amenable to hardware parallelization. Future implementations will explore distributing these operations across FPGAs or GPUs to further minimize absolute latency.
Verifiable MKHE (Malicious Security): To secure the protocol against malicious participants who might supply malformed sparse keys or ciphertexts, integrating Zero-Knowledge Proofs (ZKP) to verify the validity of the keys and the packing step remains a crucial open problem.
Advanced Packing and Hybrid Architectures: We aim to explore automorphism-based slot permutations within the DMBB-RCB accumulator to evaluate complex linear algebra operations directly inside the bootstrapping loop. Additionally, investigating a hybrid framework that switches between DMBB-RCB (for non-linear Boolean logic) and CKKS (for precision arithmetic) could yield a comprehensive solution for Privacy-Preserving Machine Learning as a Service (PPMLaaS).
Software Engineering and Concrete Implementation: The performance evaluations presented in this manuscript are strictly theoretical projections grounded in empirical baseline metrics. Developing a production-ready, highly optimized multi-party execution framework in low-level languages (such as C++ or Rust) falls outside the mathematical scope of this paper. A full-scale implementation, which must rigorously address memory management, side-channel resistance, and network synchronization, is a substantial independent software engineering undertaking. This C++/Rust implementation, alongside hardware co-design, is our immediate next step in moving DMBB-RCB from a mathematical protocol to a deployable library.
Author Contributions
Conceptualization, Q.X. and R.H.; methodology, Q.X.; validation, Q.X. and R.H.; formal analysis, Q.X.; investigation, Q.X.; writing—original draft preparation, Q.X.; writing—review and editing, Q.X. and R.H.; supervision, R.H.; funding acquisition, R.H. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the National Natural Science Foundation Project of China (No. 62062009) and the Guangxi Key Research and Development Program Project (No. AB24010340).
Data Availability Statement
No new data were created or analyzed in this study.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Gentry, C. Fully homomorphic encryption using ideal lattices. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), Bethesda, MD, USA, 31 May–2 June 2009; Association for Computing Machinery: New York, NY, USA, 2009; pp. 169–178.
- Chillotti, I.; Gama, N.; Georgieva, M.; Izabachène, M. Faster fully homomorphic encryption: Bootstrapping in less than 0.1 seconds. In Advances in Cryptology—ASIACRYPT 2016: 22nd International Conference on the Theory and Application of Cryptology and Information Security, Hanoi, Vietnam, 4–8 December 2016, Proceedings, Part I; Springer: Berlin/Heidelberg, Germany, 2016; pp. 3–33.
- Chillotti, I.; Gama, N.; Georgieva, M.; Izabachène, M. TFHE: Fast fully homomorphic encryption over the torus. J. Cryptol. 2020, 33, 34–91.
- Regev, O. On lattices, learning with errors, random linear codes, and cryptography. J. ACM (JACM) 2009, 56, 1–40.
- López-Alt, A.; Tromer, E.; Vaikuntanathan, V. On-the-fly multiparty computation on the cloud via multikey fully homomorphic encryption. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing (STOC), New York, NY, USA, 19–22 May 2012; Association for Computing Machinery: New York, NY, USA, 2012; pp. 1219–1234.
- Clear, M.; McGoldrick, C. Multi-identity and multi-key homomorphic encryption from LWE. In Advances in Cryptology—CRYPTO 2015: 35th Annual Cryptology Conference, Santa Barbara, CA, USA, 16–20 August 2015, Proceedings, Part I; Springer: Berlin/Heidelberg, Germany, 2015; pp. 399–418.
- Mukherjee, P.; Wichs, D. Two round multiparty computation via multi-key FHE. In Advances in Cryptology—EUROCRYPT 2016: 35th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Vienna, Austria, 8–12 May 2016, Proceedings, Part II; Springer: Berlin/Heidelberg, Germany, 2016; pp. 735–763.
- Peikert, C.; Shiehian, S. Multi-key FHE from LWE, revisited. In Theory of Cryptography: 14th International Conference, TCC 2016-B, Beijing, China, 31 October–3 November 2016, Proceedings, Part I; Springer: Berlin/Heidelberg, Germany, 2016; pp. 217–238.
- Chen, H.; Chillotti, I.; Song, Y. Multi-key homomorphic encryption from TFHE. In Advances in Cryptology—ASIACRYPT 2019: 25th International Conference on the Theory and Application of Cryptology and Information Security, Kobe, Japan, 8–12 December 2019, Proceedings, Part II; Springer International Publishing: Cham, Switzerland, 2019; pp. 446–472.
- Kwak, H.; Min, S.; Song, Y. Towards practical multi-key TFHE: Parallelizable, key-compatible, quasi-linear complexity. In Public-Key Cryptography—PKC 2024: 27th IACR International Conference on Practice and Theory of Public-Key Cryptography, Sydney, NSW, Australia, 15–17 April 2024, Proceedings, Part IV; Springer: Berlin/Heidelberg, Germany, 2024; pp. 354–385.
- Smart, N.P.; Vercauteren, F. Fully homomorphic SIMD operations. Des. Codes Cryptogr. 2014, 71, 57–81.
- Brakerski, Z.; Gentry, C.; Vaikuntanathan, V. (Leveled) fully homomorphic encryption without bootstrapping. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS), Cambridge, MA, USA, 8–10 January 2012; Association for Computing Machinery: New York, NY, USA, 2012; pp. 309–325.
- Cheon, J.H.; Kim, A.; Kim, M.; Song, Y. Homomorphic encryption for arithmetic of approximate numbers. In Advances in Cryptology—ASIACRYPT 2017: 23rd International Conference on the Theory and Applications of Cryptology and Information Security, Hong Kong, China, 3–7 December 2017, Proceedings, Part I; Springer: Berlin/Heidelberg, Germany, 2017; pp. 409–437.
- Micciancio, D.; Sorrell, J. Ring packing and amortized FHEW bootstrapping. In Proceedings of the 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018), Prague, Czech Republic, 9–13 July 2018; pp. 111:1–111:14.
- Lee, C.; Min, S.; Seo, J.; Song, Y. Faster TFHE bootstrapping with block binary keys. In Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security (ASIA CCS ’23), Melbourne, VIC, Australia, 10–14 July 2023.
- Biswas, C.; Dutta, R. Dynamic multi-key FHE in symmetric key setting from LWE without using common reference matrix. J. Ambient Intell. Humaniz. Comput. 2022, 13, 1241–1254.
- Xu, K.; Huang, R. Accelerated multi-key homomorphic encryption via automorphism-based circuit bootstrapping. IEEE Access 2025, 13, 1636–1650.
- Wang, R.; Wen, Y.; Li, Z.; Lu, X.; Wei, B.; Liu, K.; Wang, K. Circuit Bootstrapping: Faster and Smaller. In Advances in Cryptology—EUROCRYPT 2024: 43rd Annual International Conference on the Theory and Applications of Cryptographic Techniques, Zurich, Switzerland, 26–30 May 2024, Proceedings, Part II; Springer Nature: Cham, Switzerland, 2024; pp. 342–372.
- Lee, S.; Kim, D.; Shin, D.-J. Fast, Compact and Hardware-Friendly Bootstrapping in Less than 3 ms Using Multiple Instruction Multiple Ciphertext. IACR Cryptology ePrint Archive, Paper 2024/1916. 2024. Available online: https://eprint.iacr.org/2024/1916 (accessed on 10 February 2026).
- Bernard, O.; Joye, M. Bootstrapping (T)FHE Ciphertexts via Automorphisms: Closing the Gap Between Binary and Gaussian Keys. IACR Cryptology ePrint Archive, Paper 2025/163. 2025. Available online: https://eprint.iacr.org/2025/163 (accessed on 10 February 2026).
- Lyubashevsky, V.; Peikert, C.; Regev, O. On ideal lattices and learning with errors over rings. In Advances in Cryptology—EUROCRYPT 2010: 29th Annual International Conference on the Theory and Applications of Cryptographic Techniques, French Riviera, 30 May–3 June 2010, Proceedings; Springer: Berlin/Heidelberg, Germany, 2010; pp. 1–23.
- Gentry, C.; Sahai, A.; Waters, B. Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In Advances in Cryptology—CRYPTO 2013: 33rd Annual Cryptology Conference, Santa Barbara, CA, USA, 18–22 August 2013. Proceedings, Part I; Springer: Berlin/Heidelberg, Germany, 2013; pp. 75–92.
- Asharov, G.; Jain, A.; López-Alt, A.; Tromer, E.; Vaikuntanathan, V.; Wichs, D. Multiparty computation with low communication, computation and interaction via threshold homomorphic encryption. In Advances in Cryptology—EUROCRYPT 2012: 31st Annual International Conference on the Theory and Applications of Cryptographic Techniques, Cambridge, UK, 15–19 April 2012, Proceedings; Springer: Berlin/Heidelberg, Germany, 2012; pp. 483–501.
- Fan, J.; Vercauteren, F. Somewhat Practical Fully Homomorphic Encryption. IACR Cryptol. Eprint Arch. 2012, 2012, 144. Available online: https://eprint.iacr.org/2012/144 (accessed on 10 February 2026).
Table 1.
Asymptotic Complexity and Feature Comparison.
| Scheme | Blind Rotation Cost | Amortized Cost (Per Bit) | Post-PBS Key Switching | Dynamic Support |
|---|---|---|---|---|
| Static MK-TFHE [10] | | | ) | No |
| Dynamic MK-TFHE [17] | | | ) | Yes |
| DMBB-RCB (Ours) | | | Eliminated | Yes |
Table 2.
Recommended Parameters for 128-bit Security.
| Parameter | Symbol | Set-I (Speed) | Set-II (Capacity) |
|---|---|---|---|
| Block Length | | 2 | 4 |
| LWE Dimension | | 650 | 720 |
| Block Count | | 325 | 180 |
| RLWE Degree | | 1024 | 2048 |
| RLWE Modulus | | | |
| Packing Factor | | 1 | 1024 |
| Gadget Base | | | |
| Error Standard Deviations | | | |
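The block-structure entries of Table 2 can be sanity-checked with a few lines of arithmetic. The sketch below assumes the usual block binary key layout, in which the LWE secret is split into equal-length blocks so that block length times block count equals the LWE dimension, and that the RLWE degree is a power of two. The dictionary keys and function names are illustrative and are not symbols from the paper.

```python
# Numeric values copied from Table 2; the key names are illustrative only.
PARAMETER_SETS = {
    "Set-I (Speed)": {
        "block_length": 2,
        "lwe_dimension": 650,
        "block_count": 325,
        "rlwe_degree": 1024,
    },
    "Set-II (Capacity)": {
        "block_length": 4,
        "lwe_dimension": 720,
        "block_count": 180,
        "rlwe_degree": 2048,
    },
}


def is_power_of_two(x):
    """Return True when x is a positive power of two."""
    return x > 0 and (x & (x - 1)) == 0


def check_parameter_set(p):
    """Check that the blocks tile the LWE key and the RLWE degree is a power of two."""
    blocks_cover_key = p["block_length"] * p["block_count"] == p["lwe_dimension"]
    return blocks_cover_key and is_power_of_two(p["rlwe_degree"])


for name, params in PARAMETER_SETS.items():
    print(name, "consistent:", check_parameter_set(params))
```

Both sets pass: 2 × 325 = 650 and 4 × 180 = 720. The check covers structural consistency only and says nothing about the security level of either set.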