1. Introduction
Cryptographers have become increasingly concerned about advances in quantum computing. Once a sufficiently powerful quantum computer becomes available, Shor’s algorithm [1] will break all widely deployed public key cryptosystems (PKC), specifically those based on the hardness of factoring or discrete logarithms, such as RSA and elliptic curve constructions. Consequently, in 2017, NIST called for post-quantum cryptography (PQC) proposals. Lattice-based and code-based cryptographic schemes are two of the main types of proposals submitted to the NIST PQC competition for standardization. The two families are conceptually similar in that both use noise (errors) for security, but they differ in their mathematical foundations. Both leverage the difficulty of certain structured decoding problems [2]. Indeed, lattice-based cryptography replaces the ‘codeword plus random errors’ of code-based cryptography with a ‘lattice point plus random errors’ [3].
In the third round of the NIST PQC competition, Crystal-Kyber [4], a lattice-based post-quantum key encapsulation mechanism (KEM), was selected for standardization. Additionally, four other KEM algorithms were selected to continue to NIST’s fourth round. Among these four submissions, Classic-McEliece [3,5], an important code-based post-quantum KEM candidate, is built on the original McEliece [6] public key encryption (PKE) scheme and its dual variant, Niederreiter [7,8], which have remained robust against attacks for over 40 years.
Private set intersection (PSI) is a cryptographic multiparty computation problem first introduced in [9], in which mutually distrusting parties holding private sets of elements submit their sets as inputs, and one or more parties receive the intersection and nothing else. In the classical formulation, the protocol enables two parties to jointly compute the intersection. However, the problem can be extended to multiparty PSI, which adds a layer of complexity. Moreover, there are many variants of PSI. For example, we can distinguish one-way PSI, where only one party learns the output, from mutual PSI, where all parties receive the output. We can also classify the problem by the sizes of the input sets, such as balanced PSI for inputs of comparable sizes or unbalanced PSI, where one input is much larger than the others. In another variant, the output is a function of the intersection rather than the intersection itself, as in PSI-CA, where the output is the cardinality of the intersection. Moreover, the problem might impose additional requirements, such as threshold PSI, where the parties obtain the output only when the cardinality is above a certain threshold, or size-hiding PSI, where the sizes of the inputs are hidden throughout the protocol.
PSI has many real-world applications, including the comparison of medical and criminal records, anomaly detection, private database queries, and the military sector. The following motivating example illustrates a real-world multiparty PSI (MPSI) scenario.
Example [10]: Various military factories want to compare the quality of a specific precision instrument with others of the same type. To achieve this goal, they need to collect these precision instruments from different military factories. However, no military factory is willing to expose its own private data to others.
Thanks to homomorphic encryption (HE), the sets’ intersection can be computed, and privacy is still preserved. HE is a property of public-key cryptosystems that enables certain computations to be performed on ciphertexts, producing an encrypted result that, when decrypted, complies with the outcome of the same operations on the plaintexts. Most public-key encryption schemes support at least one type of homomorphic operation. For the PSI problem, we are interested in homomorphic schemes that support addition and/or multiplication operations.
Motivated by this, in this work, we investigate and compare the homomorphic capabilities of the underlying public key encryption (PKE) schemes of two important post-quantum KEM candidates, namely Crystal-Kyber and Classic-McEliece, in the context of the two-party PSI problem. While the homomorphic properties of McEliece/Niederreiter PKE have already been studied in the literature stripped of any concrete applications, the homomorphic properties of Kyber have not, to the best of our knowledge, been directly examined. Therefore, we go beyond revisiting these properties by designing two new PSI protocols based on the respective cryptosystems. We stress that our focus is to study and examine the underlying PKEs of the mentioned candidates and to clarify whether these two candidates, in addition to their role in KEM schemes, are appropriate for quantum-safe homomorphic encryption.
Contributions
The main contributions of this study are threefold:
1. We design a two-party PSI protocol that utilizes McEliece’s additive homomorphic property and the Bloom filter data structure.
2. We design a two-party PSI protocol that utilizes Kyber’s multiplicative homomorphic property.
3. We provide an experimental performance evaluation framework to compare the two proposed protocols, considering NIST’s security levels 1, 3, and 5, and focusing on three key metrics: storage overhead, communication overhead, and computation cost.
The rest of the paper is organized as follows. Section 2 presents important background on HE, Crystal-Kyber, and Classic-McEliece. A literature review is conducted in Section 3. The design of the proposed PSI protocols and the theoretical analysis are detailed in Section 4, while the experimental work and results are presented in Section 5. Section 6 provides the conclusion and outlines future work.
2. Background
In this section, we recall the necessary information that helps to understand the rest of the paper. Before delving into the details of the dedicated cryptosystems, we provide the following definitions to distinguish between three closely related concepts: public key cryptosystem (PKC), public key encryption (PKE), and key encapsulation mechanism (KEM).
Definition 1 (Public Key Cryptosystem (PKC)). Any cryptographic scheme that uses a pair of keys (public and private) for performing cryptographic tasks, including encryption, digital signature, and key exchange, is broadly referred to as PKC.
Definition 2 (Public Key Encryption (PKE)). A specific type of PKC designed solely for exchanging secure messages between two or more legitimate parties. It uses the public key for encryption and the private key for decryption.
Definition 3 (Key Encapsulation Mechanism (KEM)). A PKC protocol that can be used by two parties to establish a shared secret key over a public (insecure) channel. The output is not an encrypted message but a secret key used for symmetric key encryption.
Also, we provide the following security definitions that help in the formal proofs of the proposed algorithms.
Definition 4 (IND-CPA Security). A public key encryption scheme is said to achieve indistinguishability under chosen-plaintext attack (IND-CPA) if no probabilistic polynomial-time (PPT) adversary can distinguish between the encryptions of two messages of equal length, even when given access to the public key and the ability to encrypt any plaintext of its choice. Formally, for any PPT adversary $\mathcal{A}$, the advantage in the IND-CPA game is negligible:
$$\mathrm{Adv}^{\mathrm{IND\text{-}CPA}}_{\mathcal{A}}(\kappa) = \left|\Pr[b' = b] - \tfrac{1}{2}\right| \le \mathrm{negl}(\kappa),$$
where κ is the security parameter and the IND-CPA game involves the adversary choosing two messages, $m_0$ and $m_1$, receiving an encryption of $m_b$ for a uniformly random bit $b$, and outputting a guess $b'$ of which message was encrypted.
Definition 5 (IND-CCA Security). A public key encryption scheme achieves indistinguishability under chosen-ciphertext attack (IND-CCA) if it satisfies IND-CPA security and additionally remains secure even when a PPT adversary has access to a decryption oracle that can decrypt any ciphertext except the challenge ciphertext. This represents a stronger security notion than IND-CPA, as it protects against adaptive chosen-ciphertext attacks. Formally, for any PPT adversary $\mathcal{A}$ with access to a decryption oracle $\mathcal{O}_{\mathrm{Dec}}$, the advantage in the IND-CCA game is negligible:
$$\mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\mathcal{A}^{\mathcal{O}_{\mathrm{Dec}}}}(\kappa) = \left|\Pr[b' = b] - \tfrac{1}{2}\right| \le \mathrm{negl}(\kappa),$$
where the adversary can query the decryption oracle on any ciphertext other than the challenge ciphertext during the attack game.
Definition 6 (Semi-Honest Security Model). A two-party PSI protocol $\Pi$ is secure against semi-honest, a.k.a. honest-but-curious, adversaries if for any PPT adversary $\mathcal{A}$ that corrupts at most one party, there exists a PPT simulator $\mathcal{S}$ such that for all sets $X$ and $Y$:
$$\mathrm{REAL}_{\Pi,\mathcal{A}}(X, Y) \approx_c \mathrm{IDEAL}_{\mathcal{S}}(X, Y),$$
where IDEAL represents the ideal functionality in which the parties learn only $X \cap Y$, REAL represents the real protocol execution, and $\approx_c$ denotes computational indistinguishability.
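To make the IND-CPA game of Definition 4 concrete, the following minimal sketch plays it against a deterministic scheme. Textbook RSA with toy parameters stands in for the scheme under attack; it is not one of the schemes studied in this paper, and the helper names (`ind_cpa_game`, `choose`, `guess`) are hypothetical and introduced only for illustration. Because encryption here is deterministic, the adversary wins every round by re-encrypting one of its own messages, which illustrates why IND-CPA security requires probabilistic encryption.

```python
import secrets

# Toy textbook-RSA public key (61 * 53); insecure, for illustration only.
N, E = 3233, 17


def encrypt(m):
    """Deterministic encryption: the same plaintext always gives the same ciphertext."""
    return pow(m, E, N)


def ind_cpa_game(adversary_choose, adversary_guess):
    """Run one round of the IND-CPA game; return True if the adversary wins."""
    m0, m1 = adversary_choose()
    b = secrets.randbelow(2)              # challenger's hidden bit
    challenge = encrypt((m0, m1)[b])
    return adversary_guess(m0, m1, challenge) == b


def choose():
    # Any two distinct messages of equal length work.
    return 2, 3


def guess(m0, m1, challenge):
    # The adversary holds the public key, so it can re-encrypt m0 and compare.
    return 0 if encrypt(m0) == challenge else 1


wins = sum(ind_cpa_game(choose, guess) for _ in range(100))
print(f"adversary won {wins} of 100 rounds")
```

An adversary that wins with probability 1 has advantage 1/2, the maximum possible, so this toy scheme is as far from IND-CPA security as a scheme can be.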
2.1. Homomorphic Encryption
HE is a property of public-key cryptosystems that enables certain arithmetic computations to be performed on ciphertexts, producing an encrypted result that, when decrypted, matches the outcome of the same operations on the plaintexts. Most public-key encryption schemes support at least one type of homomorphic operation. HE can be classified into three categories: (1) Partially homomorphic encryption (PHE), which supports only one type of operation, but an unlimited number of times. For example, the Paillier cryptosystem [11] supports only addition, while the original RSA [12] supports only multiplication. Other PHE schemes are Goldwasser and Micali [13], ElGamal [14], Naccache and Stern [15], Benaloh [16], and Okamoto and Uchiyama [17]. (2) Somewhat homomorphic encryption (SWHE), which supports some types of arithmetic operations a limited number of times. For example, the Boneh–Goh–Nissim cryptosystem [18] can evaluate unlimited additions and one multiplication. Other SWHE schemes are Yao [19], Sander et al. [20], and Ishai and Paskin [21]. (3) Fully homomorphic encryption (FHE), which supports arbitrary operations an unlimited number of times. Constructing a fully homomorphic scheme that can evaluate an arbitrary function remained an open problem until Gentry’s breakthrough in 2009 [22]. Other FHE schemes are BFV [23], BGV [24], CKKS [25], and TFHE [26].
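The two PHE examples named above can be checked numerically. The sketch below uses textbook (unpadded) RSA and Paillier with the common simplification $g = N + 1$, both over the same toy primes; the parameters and function names are assumptions chosen for illustration and are far too small for real use.

```python
from math import gcd

# Toy primes; far too small for real use.
P, Q = 61, 53
N = P * Q

# --- Textbook RSA: multiplicatively homomorphic ---
PHI = (P - 1) * (Q - 1)
E = 17
D = pow(E, -1, PHI)                      # modular inverse (Python 3.8+)


def rsa_enc(m):
    return pow(m, E, N)


def rsa_dec(c):
    return pow(c, D, N)


# --- Paillier with g = N + 1: additively homomorphic ---
N2 = N * N
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)   # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)


def paillier_enc(m, r):
    """Encrypt m with randomness r, where gcd(r, N) must be 1."""
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def paillier_dec(c):
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


# Multiplying RSA ciphertexts multiplies the plaintexts (mod N).
product = rsa_dec(rsa_enc(7) * rsa_enc(11) % N)

# Multiplying Paillier ciphertexts adds the plaintexts (mod N).
total = paillier_dec(paillier_enc(20, 5) * paillier_enc(22, 7) % N2)

print(product, total)
```

In both cases the homomorphic operation is a plain multiplication of ciphertexts; what differs is the operation it induces on the plaintexts.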
2.2. Crystal Kyber Cryptosystem
Crystal-Kyber [4] is a lattice-based public key cryptosystem mainly used as a KEM for establishing shared secret keys for symmetric-key cryptosystems. The Kyber KEM is IND-CCA2 secure and is built upon Kyber PKE, which is IND-CPA secure. Its security is based on the hardness of the module learning with errors (MLWE) problem, which is as hard as several worst-case lattice problems, specifically variants of the shortest vector problem (SVP). MLWE was first introduced in [27], and it offers a better tradeoff between efficiency and security than the learning with errors (LWE) [2] and ring learning with errors (RLWE) [28] problems. Moreover, its keys are small enough to be used in real-world applications.
For detailed information and documentation about Kyber’s IND-CCA2-secure KEM, we refer the reader to [29].
Herein, we describe Kyber’s IND-CPA-secure PKE, as it will be used to investigate the homomorphic properties of the cryptosystem. Let $R$ and $R_q$ denote the rings $\mathbb{Z}[X]/(X^n+1)$ and $\mathbb{Z}_q[X]/(X^n+1)$, respectively, where $n$ is a power of two such that $X^n+1$ is a cyclotomic polynomial. Specifically, Kyber operates over the polynomial ring $R_q$ with $n = 256$ and sets a security parameter $k \in \{2, 3, 4\}$ to achieve better security than RLWE. Namely, $k$ determines the number of polynomials in the module setting. Notably, when $k = 1$, the MLWE instance reduces to a pure RLWE instance [30].
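Kyber’s PKE key generation, encryption, and decryption algorithms [4] are defined in Algorithm 1, Algorithm 2, and Algorithm 3, respectively.
| Algorithm 1 Kyber.KeyGen(): key generation |
- 1: $A \leftarrow R_q^{k \times k}$
- 2: $(s, e) \leftarrow \beta_\eta^k \times \beta_\eta^k$
- 3: $t := \mathrm{Compress}_q(As + e, d_t)$
- 4: return $(pk := (A, t),\ sk := s)$
|
| Algorithm 2 Kyber.Enc(pk = $(A, t)$, $m \in \{0,1\}^n$): encryption |
- 1: $t := \mathrm{Decompress}_q(t, d_t)$
- 2: $(r, e_1, e_2) \leftarrow \beta_\eta^k \times \beta_\eta^k \times \beta_\eta$
- 3: $u := \mathrm{Compress}_q(A^T r + e_1, d_u)$
- 4: $v := \mathrm{Compress}_q(t^T r + e_2 + \lceil q/2 \rfloor \cdot m, d_v)$
- 5: return $c := (u, v)$
|
| Algorithm 3 Kyber.Dec(sk = $s$, c = $(u, v)$): decryption |
- 1: $u := \mathrm{Decompress}_q(u, d_u)$
- 2: $v := \mathrm{Decompress}_q(v, d_v)$
- 3: return $\mathrm{Compress}_q(v - s^T u, 1)$
|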
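The three algorithms can be exercised end to end with a small, unoptimized sketch. It is a toy under stated assumptions, not the reference implementation: the ring degree is reduced to 8 (Kyber uses 256), the modulus 3329 and $\eta = 2$ are assumed, the compression steps are omitted, schoolbook multiplication replaces the NTT, and all function names are hypothetical. With these parameters the accumulated noise is at most 130 per coefficient, well below $q/4$, so decryption never fails.

```python
import random

N, Q, K, ETA = 8, 3329, 2, 2   # toy ring degree; Kyber itself uses n = 256


def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def poly_sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]


def poly_mul(a, b):
    """Schoolbook product in Z_Q[X]/(X^N + 1) (negacyclic convolution)."""
    res = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < N:
                res[i + j] = (res[i + j] + x * y) % Q
            else:                       # X^N = -1 wraps with a sign flip
                res[i + j - N] = (res[i + j - N] - x * y) % Q
    return res


def cbd(rng):
    """Polynomial with centered-binomial coefficients in [-ETA, ETA] (mod Q)."""
    return [(sum(rng.getrandbits(1) for _ in range(ETA))
             - sum(rng.getrandbits(1) for _ in range(ETA))) % Q
            for _ in range(N)]


def dot(u, v):
    """Inner product of two length-K vectors of polynomials."""
    acc = [0] * N
    for a, b in zip(u, v):
        acc = poly_add(acc, poly_mul(a, b))
    return acc


def keygen(rng):
    A = [[[rng.randrange(Q) for _ in range(N)] for _ in range(K)] for _ in range(K)]
    s = [cbd(rng) for _ in range(K)]
    e = [cbd(rng) for _ in range(K)]
    t = [poly_add(dot(A[i], s), e[i]) for i in range(K)]      # t = A s + e
    return (A, t), s


def encrypt(pk, m, rng):
    A, t = pk
    r = [cbd(rng) for _ in range(K)]
    e1 = [cbd(rng) for _ in range(K)]
    e2 = cbd(rng)
    # u = A^T r + e1: column i of A dotted with r.
    u = [poly_add(dot([A[j][i] for j in range(K)], r), e1[i]) for i in range(K)]
    scaled = [((Q + 1) // 2) * bit for bit in m]              # round(Q/2) * m
    v = poly_add(poly_add(dot(t, r), e2), scaled)
    return u, v


def decrypt(s, c):
    u, v = c
    d = poly_sub(v, dot(s, u))
    # A coefficient closer to Q/2 than to 0 (mod Q) decodes to 1.
    return [1 if Q // 4 < x < 3 * Q // 4 else 0 for x in d]


def add_ct(c1, c2):
    """Component-wise ciphertext addition."""
    return [poly_add(a, b) for a, b in zip(c1[0], c2[0])], poly_add(c1[1], c2[1])
```

For example, `decrypt(sk, encrypt(pk, m, rng))` returns `m` for any list `m` of 8 bits, and `decrypt(sk, add_ct(c1, c2))` returns the coefficient-wise XOR of the two plaintexts, since $2\lceil q/2 \rfloor \equiv 1 \pmod q$ is absorbed into the noise.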
2.2.1. Example
For a simple example we set
. Let Bob’s public key be a 2 by 2 matrix
chosen randomly,
Then he also samples the secret key
s and the error vector
e from the centered binomial distribution
:
Calculating
mod
, we obtain
after using the compression function, he obtains
Then the public key will be
, while the secret key is
. Suppose Alice wants to send the message
. For the first sample
, we obtain
To find the ciphertext, Alice computes
, we obtain
after compression, the ciphertext will turn to
To decrypt the message, Bob decompresses the ciphertext and calculates
; then the coefficients that are closer to 3841 than 0 or 7681 will be considered 1, otherwise 0. That is, the decrypted message will be
, Alice’s original message.
2.2.2. Homomorphism
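Let $c_1 = (u_1, v_1)$ and $c_2 = (u_2, v_2)$ be two ciphertexts encrypting the messages $m_1$ and $m_2$, respectively. The following homomorphic relation holds, where the message coefficients are added modulo 2:
$$\mathrm{Dec}(c_1 + c_2) = m_1 + m_2. \quad (4)$$
We define the addition in (4) as follows: $c_1 + c_2 = (u_1 + u_2, v_1 + v_2)$.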
For the multiplication, it is not as straightforward. Consider the following multiplication of two messages
where
are the
i-th component of
and
. Multiplying the LHS yields
Therefore, we write the multiplication as follows
The ciphertext in this form, Equation (
5), is a non-standard ciphertext computed over the rationals
, where
Note that the size of the ciphertext increased, and it is not decrypted as a usual ciphertext. It is possible to reduce it to the usual ciphertext form
such that it is decrypted as
using a computationally costly operation called Relinearization [
31]. Relinearization could be avoided when the multiplicative depth is small using what is called the leveled FHE [
24]. As such, the non-standard ciphertext can be successfully decrypted using Equation (
6).
For instance, in the two-party PSI protocol, the multiplicative depth is exactly equal to one, and Equation (
6) can be efficiently used.
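As a sketch of why the product ciphertext grows, assume uncompressed ciphertexts $c_j = (u_j, v_j)$ with $u_j \in R_q^k$, and write $s_i$ and $u_{j,i}$ for the $i$-th components of $s$ and $u_j$. The component names $d_0$, $d_{1,i}$, and $d_{2,i,l}$ are notation introduced here for illustration and need not match the form of Equation (5). Expanding the product of the two decryption expressions gives:

```latex
\begin{aligned}
(v_1 - s^{T}u_1)(v_2 - s^{T}u_2)
  &= v_1 v_2
     - \sum_{i=1}^{k} \left(v_1 u_{2,i} + v_2 u_{1,i}\right) s_i
     + \sum_{i=1}^{k}\sum_{l=1}^{k} u_{1,i}\,u_{2,l}\, s_i s_l \\
  &= d_0 + \sum_{i=1}^{k} d_{1,i}\, s_i + \sum_{i=1}^{k}\sum_{l=1}^{k} d_{2,i,l}\, s_i s_l ,
\end{aligned}
\qquad
\begin{aligned}
d_0 &= v_1 v_2, \\
d_{1,i} &= -(v_1 u_{2,i} + v_2 u_{1,i}), \\
d_{2,i,l} &= u_{1,i}\, u_{2,l}.
\end{aligned}
```

Under these assumptions, the product ciphertext consists of $1 + k + k^2$ ring elements rather than the $k + 1$ elements of a fresh ciphertext, and decrypting it requires the pairwise products $s_i s_l$ of the secret-key components. Since each factor on the left equals $\lceil q/2 \rfloor m_j$ plus noise, the product carries $\lceil q/2 \rfloor^2 m_1 m_2$, so a rescaling by roughly $2/q$ over the rationals is needed before the usual rounding step, provided the noise stays small.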
2.3. McEliece/Niederreiter and Classic-McEliece Cryptosystem
The Classic-McEliece key-encapsulation mechanism is derived from the code-based public key cryptosystem proposed by R. McEliece in 1978 [6]. More precisely, Classic-McEliece KEM is built upon Niederreiter [7,8], which is a dual variant of the McEliece PKE. The original McEliece PKE has a robust security history: dozens of papers over more than 40 years have tried, without success, to break this system. The main obstacle to its practical adoption is its very large key size compared with that of its number-theoretic PKE counterparts.
McEliece PKE is based on the hardness of decoding a general linear code. The original algorithm uses binary Goppa codes, which can be efficiently decoded using Patterson’s algorithm. Let $G$ be the $k \times n$ generator matrix of a linear code that can correct up to $t$ errors and has an efficient decoding algorithm. The cryptosystem’s key generation, encryption, and decryption algorithms are defined in Algorithms 4–6 as follows:
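| Algorithm 4 McEliece.KeyGen(): key generation |
- 1: Choose a random invertible binary $k \times k$ matrix $S$
- 2: Choose a random $n \times n$ permutation matrix $P$
- 3: $G' := SGP$
- 4: return $(pk := G',\ sk := (S, G, P))$
|
| Algorithm 5 McEliece.Enc(pk = $G'$, $m \in \{0,1\}^k$): encryption |
- 1: Choose a random error vector $e \in \{0,1\}^n$ of weight $t$
- 2: $c := mG' + e$
- 3: return $c$
|
| Algorithm 6 McEliece.Dec(sk = $(S, G, P)$, c): decryption |
- 1: $c' := cP^{-1}$
- 2: Decode $c'$ to obtain $m' = mS$
- 3: $m := m'S^{-1}$
- 4: return $m$
|
In the decryption algorithm, $c' = cP^{-1} = mSG + eP^{-1}$, where $mSG$ is a codeword. Since $eP^{-1}$ has the same weight $t$ as $e$, we can decode $c'$ and obtain $mS$. Therefore, $m = (mS)S^{-1}$.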
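The scramble, permute, and decode steps can be traced with a minimal sketch. To keep it short, the binary Goppa code is swapped for the Hamming(7,4) code, which corrects a single error ($t = 1$) and offers no security at all; the function names and the public-key name `G_pub` are hypothetical and introduced only for illustration.

```python
import random

# Systematic generator and parity-check matrices of the Hamming(7,4) code.
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]


def matmul(A, B):
    """Matrix product over GF(2)."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)] for row in A]


def vecmat(v, M):
    return matmul([v], M)[0]


def inverse_gf2(M):
    """Gauss-Jordan inverse over GF(2); returns None if M is singular."""
    n = len(M)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def keygen(rng):
    while True:                                   # resample until S is invertible
        S = [[rng.getrandbits(1) for _ in range(4)] for _ in range(4)]
        S_inv = inverse_gf2(S)
        if S_inv is not None:
            break
    perm = list(range(7))
    rng.shuffle(perm)
    P = [[int(perm[i] == j) for j in range(7)] for i in range(7)]
    P_inv = [list(col) for col in zip(*P)]        # inverse of a permutation = transpose
    G_pub = matmul(matmul(S, G), P)               # public key S G P
    return G_pub, (S_inv, P_inv)


def encrypt(G_pub, m, rng):
    e = [0] * 7
    e[rng.randrange(7)] = 1                       # error vector of weight t = 1
    return [a ^ b for a, b in zip(vecmat(m, G_pub), e)]


def decode(y):
    """Syndrome decoding: correct at most one error, return the 4 message bits."""
    syndrome = [sum(h * b for h, b in zip(row, y)) % 2 for row in H]
    if any(syndrome):
        columns = [list(col) for col in zip(*H)]
        y = y[:]
        y[columns.index(syndrome)] ^= 1           # the syndrome equals the column of the error
    return y[:4]                                  # G is systematic


def decrypt(sk, c):
    S_inv, P_inv = sk
    y = vecmat(c, P_inv)                          # c P^-1 = (m S) G + e P^-1
    return vecmat(decode(y), S_inv)               # (m S) S^-1 = m
```

Calling `decrypt(sk, encrypt(G_pub, m, rng))` returns `m` for every 4-bit message, because the permuted error still has weight 1 and is removed by the decoder before $S$ is undone.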
2.3.1. Example
Consider the following:
generator matrix of a linear code that can correct up to
errors
And assume Alice wants to send to Bob the message
. Bob generates
S randomly with a permutation matrix
P,
Then
So now Alice computes and sends to Bob the following:
Bob then computes
Notice the new error
, so we can correct the errors and decode
to obtain
. Finally, Bob computes
, Alice’s original message.
2.3.2. Homomorphism
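McEliece PKE has the following additive homomorphism [32]:
$$\mathrm{Enc}(m_1) + \mathrm{Enc}(m_2) = \mathrm{Enc}(m_1 + m_2).$$
However, note the following, where $G'$ denotes the public generator matrix:
$$c_1 + c_2 = (m_1 G' + e_1) + (m_2 G' + e_2) = (m_1 + m_2) G' + (e_1 + e_2).$$
We can see that the error vector $e_1 + e_2$ of $c_1 + c_2$ can have up to twice the weight of a single error vector. Therefore, when performing the addition of ciphertexts, $e_1$ and $e_2$ should be chosen such that each has a weight of less than half of the maximum error-correcting capability of the code. In general, if we have $n$ additions, each of the sum’s components must have a weight less than $1/n$ of the maximum error-correcting capability of the code.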
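The weight budget can be illustrated numerically. The sketch below works on error vectors only, with no encryption; the code length 3488 is an assumption (the length used by the level-1 Classic McEliece parameter set), $t = 64$ is the level-1 value quoted later in the paper, and the helper names are hypothetical. The budget function returns the largest per-ciphertext weight that keeps the summed error within $t$, which is the boundary case of the condition stated above.

```python
import random


def weight(v):
    """Hamming weight of a binary vector."""
    return sum(v)


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def random_error(n, w, rng):
    """Binary vector of length n with exactly w ones."""
    e = [0] * n
    for i in rng.sample(range(n), w):
        e[i] = 1
    return e


def per_summand_budget(t, num_ciphertexts):
    """Largest per-ciphertext error weight that keeps the summed error within t."""
    return t // num_ciphertexts


rng = random.Random(0)
n, t = 3488, 64                       # assumed level-1 code length and error capability
w = per_summand_budget(t, 2)          # each of two ciphertexts may carry weight 32
e1, e2 = random_error(n, w, rng), random_error(n, w, rng)

# The summed error never exceeds the sum of the weights, so it stays decodable.
combined = weight(xor(e1, e2))
print(w, combined, combined <= t)
```

Overlapping error positions cancel under XOR, so the combined weight can be smaller than $2w$ but never larger; the price is that each individual ciphertext now carries a lighter error, which is the security concern discussed in Section 4.3.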
A more efficient dual variant of the McEliece cryptosystem is due to Niederreiter [7,8]. Consider the $(n, k)$ linear binary Goppa code $G$; the Niederreiter cryptosystem is defined as shown in Algorithms 7–9.
| Algorithm 7 Niederreiter.KeyGen(): key generation |
- 1:
- 2:
- 3:
- 4:
- 5:
|
| Algorithm 8 Niederreiter.Enc(pk = , ): encryption |
- 1:
- 2:
- 3:
|
| Algorithm 9 Niederreiter.Dec(sk = (), c): decryption |
- 1:
- 2:
- 3:
- 4:
|
3. Related Works
The homomorphic properties of McEliece PKE were first studied in [32]. However, no attention has been given to leveraging these properties for constructing PSI protocols. Similarly, the homomorphic properties of MLWE lattices, upon which Kyber is built, were investigated in [31]. That study, however, was not dedicated to Kyber itself but to MLWE lattice structures in general.
The problem of constructing post-quantum PSI protocols has been studied extensively in the literature. Most of the proposed protocols are based on lattice problems, specifically the RLWE and LWE problems. In [33], the authors provided a comprehensive literature review on the PSI problem.
In [34], the authors proposed a lattice-based size-hiding protocol for two-party PSI-CA that is secure in the semi-honest setting. Its security is based on the hardness of the decisional learning with errors (DLWE) problem, and its complexity is linear in the size of the inputs. To prevent arbitrary inputs, they also proposed a protocol in which a trusted third party authorizes the client’s input set.
Later in [
35], the authors generalized the protocol to the Multiparty-PSI problem, secure in the semi-honest model with
, where
is the maximum set size and
k and
n are security parameters. The parties are arranged in a star topology so that all parties need not be online at the same time.
The security of the previous protocols is based on DLWE; however, other designs exist whose security is based on RLWE. In [36], the authors proposed protocols based on the fully homomorphic encryption scheme of [37] in the semi-honest model. They also developed an extension that is secure in the malicious model by outsourcing computation to a cloud. Another lattice-based protocol, built on the NTRU fully homomorphic encryption scheme, is proposed in [10].
While prior works have explored PSI protocols based on lattice assumptions such as LWE, RLWE, and NTRU, no existing study has designed PSI protocols built upon the PKEs of Kyber or McEliece. This leaves a clear gap in the literature, which we aim to address in this work. Specifically, we propose and evaluate two new two-party PSI protocols founded on Kyber and McEliece PKEs to examine and assess their homomorphic properties in practice.
4. Proposed PSI Protocols
In this section, the proposed post-quantum PSI protocols are presented.
4.1. Kyber-Based Protocol
Based on the cryptosystem described in Section 2.2, we present the Kyber-based PSI protocol in this section. Let
be two semi-honest parties holding the sets
, respectively, such that
, where
. We can define a vector representation of the
’s set as
such that
where
is a non-zero binary polynomial in
R.
Taking the product of
and
and using (
5), we have
So, after decryption, we obtain
Therefore, Algorithm 10 describes a Kyber-based private intersection protocol:
| Algorithm 10 Kyber-based Private Set Intersection |
Input: Output: - 1:
- 2:
: For each coordinate , draw an independent zero-ciphertext and set . - 3:
: Compute the coordinate-wise product:
where each product is returned in the form , as defined in ( 5). - 4:
- 5:
- 6:
: For , If then else . - 7:
: Output .
|
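The arithmetic that Algorithm 10 evaluates under encryption can be previewed on plaintexts. The sketch below omits encryption, re-randomization, and noise entirely, and replaces the non-zero binary polynomial that marks membership with the bit 1; the function names and the small integer domain are assumptions made for illustration.

```python
def indicator(elements, domain):
    """Vector over the domain: 1 where the element is in the set, 0 elsewhere."""
    return [1 if x in elements else 0 for x in domain]


def psi_plain(set_1, set_2, domain):
    """Plaintext view of the coordinate-wise product used for the intersection."""
    a = indicator(set_1, domain)
    b = indicator(set_2, domain)
    product = [x * y for x, y in zip(a, b)]       # non-zero only where both are non-zero
    return {x for x, bit in zip(domain, product) if bit != 0}


domain = list(range(8))
print(psi_plain({1, 3, 5}, {3, 4, 5}, domain))
```

A coordinate of the product is non-zero exactly when both parties hold the corresponding element, which is the test applied to the decrypted values in the last steps of the protocol. The vectors have one coordinate per domain element, so the cost of this approach scales with the domain size rather than with the set sizes.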
The correctness of this protocol greatly depends on the noise of the underlying encryption scheme. In particular, from
Section 2.2.2, the protocol is correct when every product in STEP 3 has the following property:
That is when
Because of the re-randomization in STEP 2, we differentiate between the distribution of
and
.
Working coefficient-wise, let
be an upper bound on
and
be an upper bound on
. That is an upper bound on the probability that the
j-th coefficient of the polynomials
and
is greater than
and
, respectively. Then, by union bound, we have
and
. Therefore, if we choose
and
such that
, we can have an upper bound on the failure rate of one product in STEP 3. That is
. Taking a union bound over all products in STEP 3, we have an upper bound on the total failure rate
of the protocol
One choice of
and
is
. However, since
is noisier due to re-randomization, we can have a better estimate by choosing
and
such that
is minimized. Using a Chernoff bound, we obtain the following result:
Theorem 1. Algorithm 10 is -correct, where δ satisfiesSuch thatwhere Proof. Let
. We adopt a uniform quantization model, that is
Focusing on
only since
can be derived in a similar manner taking into account the additional noise. The
jth coefficient is
Each one of the terms is a sum of
products of a centered binomial and uniform random variables. Except for
, a centered binomial random variable, and
, modeled by its upper bound
. Then, define the random part:
We will compute the Moment Generating Function (MGF) of
and apply the Chernoff bound. Note that if
, then
. Also, when
we have
. Conditioning on
and using independence:
Then the MGF of each product in
is
Using the identity
And
, and the binomial expansion of
, a straightforward telescoping yields
Similarly for
for
So,
and
From here, it is straightforward to apply the Chernoff bound, and the proof is complete. □
4.2. McEliece-Based Protocol
Since McEliece PKE supports only additive homomorphism, a direct protocol such as Algorithm 10 is not suitable. Strictly speaking, a direct protocol can still be used, but if a peer acts maliciously by pretending that its set includes all the elements in the domain [38], privacy is no longer guaranteed. Hence, a data structure, e.g., a Bloom filter, is needed to design a secure PSI protocol. To perform multiple AND operations, the Sander–Young–Yung (SYY) technique [20] is utilized.
We first define the function , where a bit x encrypted to is expanded to a vector ciphertext with length l as follows:
1. For each i, draw a sample uniformly from .
2. For each element in the vector ciphertext
is set
When the vector ciphertext
is decrypted, the result will be
. If
, then
, so for all
i,
. Otherwise,
will be uniformly distributed in
. Now, we define the sum of two vector ciphertexts as follows:
Observe that if both
x and
y are 1, then all
is 0. However, if one of them is 0, then
will be uniformly distributed in
. So, we have
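The behaviour described above can be simulated on plaintext bits. The sketch below models only what the decrypted vectors look like, not the ciphertext operations, and it assumes the expansion rule implied by the text: each coordinate is either an encryption of 0 or an encryption of $x \oplus 1$, chosen by a fair coin. The names `expand`, `vec_and`, and `decode_and` are hypothetical.

```python
import random


def expand(x, length, rng):
    """Plaintext view of the expansion: all zeros if x == 1, uniform bits if x == 0."""
    out = []
    for _ in range(length):
        coin = rng.getrandbits(1)
        # coin = 0 models a fresh encryption of 0; coin = 1 models E(x) + E(1), i.e. x XOR 1.
        out.append(0 if coin == 0 else x ^ 1)
    return out


def vec_and(u, v):
    """Coordinate-wise sum (XOR) of two expanded vectors, realizing a logical AND."""
    return [a ^ b for a, b in zip(u, v)]


def decode_and(vec):
    """An all-zero vector decodes to 1; anything else decodes to 0."""
    return int(not any(vec))


rng = random.Random(3)
length = 64
for x in (0, 1):
    for y in (0, 1):
        result = decode_and(vec_and(expand(x, length, rng), expand(y, length, rng)))
        print(x, y, result)
```

When the AND is 0, the summed vector is uniformly random, so it is wrongly decoded as 1 only if all of its bits happen to be zero, which occurs with probability $2^{-l}$ for vectors of length $l$.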
A bloom filter is a probabilistic data structure used to test for inclusion [
39]. A bloom filter can be represented as a vector of bits
of size
m and
associated hash functions
. Initially, the Bloom filter’s bits are all set to zero. Then we define the following two functions:
In a bloom filter, the function will add the element x by setting the location in B with index to 1 for . The function will test whether an element x is added by checking whether the locations with indices calculated from the hash functions are set to 1 or not. When returns 0, this implies that x is definitely not added to the bloom filter; on the other hand, if returns 1, this tells us that x is probably in the bloom filter. Therefore, false negatives are impossible, while false positives are allowed. However, given that the probability of false positives () is equal to , increasing the number of hash functions () can lead to a negligible value of . Accordingly, the optimal size of the bloom filter can be set by .
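A minimal Bloom filter with the two operations described above can be sketched as follows. The class and method names, and the use of SHA-256 with an index prefix to derive the hash functions, are assumptions made for illustration. The sizing helper uses the standard result that a filter of about $k \cdot n / \ln 2$ bits for $n$ items and $k$ hash functions has a false-positive probability of roughly $2^{-k}$.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, size, num_hashes):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size            # all bits start at zero

    def _indices(self, item):
        """Derive num_hashes indices from SHA-256 with a distinct prefix per hash."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest, "big") % self.size

    def add(self, item):
        for idx in self._indices(item):
            self.bits[idx] = 1

    def check(self, item):
        """False means definitely absent; True means probably present."""
        return all(self.bits[idx] for idx in self._indices(item))


def optimal_size(num_items, num_hashes):
    """Filter size in bits that yields a false-positive rate of about 2^-num_hashes."""
    return math.ceil(num_hashes * num_items / math.log(2))


bf = BloomFilter(optimal_size(100, 20), 20)
for i in range(100):
    bf.add(f"item-{i}".encode())

print(bf.size, bf.check(b"item-7"), bf.check(b"not-inserted"))
```

Every inserted item is always reported as present, while a query for an item that was never inserted returns True only with the small false-positive probability.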
Therefore, we can, based on [
40], perform a private intersection protocol as shown in Algorithm 11:
| Algorithm 11 Private Set Intersection Protocol |
- 1:
: - 2:
- 3:
- 4:
- 5:
: - 6:
|
In STEP 3, we have the following
This computation will result to
If
is in the bloom filter, then
will all be 1. So
. That is
. However, if
is not in the bloom filter, then
will be randomly chosen, and so,
will be chosen at random.
4.3. Limitations and Discussion
Kyber and McEliece PKEs are probabilistic and achieve indistinguishability under chosen-plaintext attack (IND-CPA) security. However, many real-world applications require a much stronger notion of security against active attacks. Namely, the PKE scheme should achieve indistinguishability under chosen-ciphertext attack (IND-CCA) security. In the literature, IND-CCA-secure variants of McEliece have been proposed, as in [41,42,43]. In contrast, there is no IND-CCA-secure variant of Kyber PKE.
Employing the McEliece cryptosystem as an additive homomorphic scheme primarily affects its security. Hence, we need to choose the error vectors so that they have proper weights. Since we have at most
in STEP 4 and STEP 5 Algorithm 11, then we choose
. This reduction in the weight should be managed carefully; otherwise, the cryptosystem will be vulnerable to information-set decoding attacks [44,45]. In the experimental work section, we multiply t by 1.5 so that the algorithm supports the homomorphic property while retaining a high security level. However, this affects the ciphertext and public key sizes, as shown in Section 5. For the same security reason, the McEliece-based PSI protocol (Algorithm 11) is built upon the less efficient protocol of [40] rather than the one proposed in [46,47]. Regarding correctness, the McEliece-based PSI protocol again depends on the error vector, as well as on the Bloom filter, which may generate false positives.
On the other hand, the homomorphic properties of Kyber are straightforward and do not weaken its security. One limitation of Kyber as a multiplicative homomorphic scheme is that this property becomes computationally costly when many items are involved in a single multiplication operation, since bootstrapping keys and relinearization are required [31]. However, when only a small number of items (e.g., two) are multiplied per operation, as in the Kyber-based PSI protocol (Algorithm 10), the overhead remains manageable, keeping the computational cost within a practical range, as will be shown in the next section. Another limitation of the Kyber-based PSI protocol is that the non-standard ciphertext resulting from the homomorphic multiplication is larger than the standard ciphertext by
. The correctness of the protocol relies on the noise introduced by the Kyber PKE remaining bounded. That is, the protocol is correct as long as the noise accumulated during encryption and homomorphic multiplication does not exceed the decryption threshold.
Moreover, due to the aforementioned limitations, extending the proposed two-party PSI protocols to multiparty PSI protocols is almost impossible. Indeed, in the multiparty environment, these limitations are no longer manageable.
5. Experimental Work and Performance Evaluation
To evaluate our proposed protocols experimentally, we consider a peer-to-peer (P2P) file-sharing network in which each file is identified by a unique serial number, namely a 32-byte integer. Each peer owns some files. Once a peer connects to another peer, the two learn only the set of files that both possess by applying one of the proposed PSI protocols. A peer can repeat this process with several peers in the same network. Ultimately, the peer can determine whether it owns rare files that are held only by itself or by a few other peers in the network, in which case an urgent backup should be performed.
Experimental Environment: Java (JDK version 17) [48] is used as the programming language, Apache NetBeans version 20 [49] is used as the integrated development environment (IDE), and the Bouncy Castle cryptography library [50] is used to implement the proposed cryptographic algorithms. The PC specifications include an Intel Core i7-1255U CPU, 32 GB RAM, and Windows 11 Pro 64-bit OS.
Storage overhead, communication overhead, and computation cost are the metrics of interest, evaluated at three NIST security levels: level 1 corresponds to the security of AES-128, level 3 to that of AES-192, and level 5 to that of AES-256.
Based on the specification above, Table 1 shows reference values for the public/private key and ciphertext sizes of Kyber and McEliece/Niederreiter PKE. Here, McEliece1.5t refers to a tuned version of McEliece in which the t value is increased by a factor of 1.5 to keep the McEliece algorithm secure after applying the additive homomorphic property. For instance, if McEliece’s t value is 64 for level-1 security, the t value of McEliece1.5t becomes 96, and so on. The cost of this increase is a larger public key and ciphertext, as shown in the table. Also, to avoid any decryption failure, the original encrypted texts, i.e., prior to the additive homomorphism, must be encrypted using
and
values for
McEliece and
McEliece1.5t protocols, respectively. We stress that all McEliece parameters and results are due to the Niederreiter variant.
5.1. Storage Overhead
The storage overhead (SO) is the overall sum of the public/private key sizes and the ciphertext ($c$) size of the entire set, $G$, as depicted in (14):
$$SO = |pk| + |sk| + |G| \cdot |c|. \quad (14)$$
The key size depends on the underlying algorithm and the required security level. The overall ciphertext size is the size of a single encrypted element multiplied by the number of elements in the set, $|G|$.
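The storage overhead is a simple linear function of the set size, as the following sketch shows. The byte counts are assumptions rather than values copied from Table 1: 1568 bytes of combined Kyber keys and 768 bytes per ciphertext were chosen because they reproduce the level-1 Kyber figures of 295.53 KB and 406.53 KB quoted in this section. The function name is hypothetical.

```python
def storage_overhead_kb(key_bytes, ciphertext_bytes, set_size):
    """Storage overhead in KB: key material plus one ciphertext per set element."""
    return (key_bytes + set_size * ciphertext_bytes) / 1024


# Assumed level-1 Kyber PKE sizes (inferred from the reported results).
KYBER_L1_KEY_BYTES = 1568      # public key + private key
KYBER_L1_CT_BYTES = 768        # one ciphertext

for set_size in (392, 540):
    so = storage_overhead_kb(KYBER_L1_KEY_BYTES, KYBER_L1_CT_BYTES, set_size)
    print(set_size, round(so, 2))
```

Because the key term is constant and the ciphertext term scales with the set size, a scheme with large keys but small ciphertexts overtakes one with small keys but large ciphertexts once the set is large enough.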
Results for NIST’s level-1 security are depicted in Table 2 and Figure 1. The results show that most of the storage overhead of McEliece comes from its keys, while most of the storage overhead of Kyber comes from its ciphertexts. Thus, it is preferable to use Kyber for small set sizes, approximately fewer than 400 elements when Kyber is compared with McEliece and fewer than 550 elements when compared with McEliece1.5t.
Figure 1 makes this clearer. It shows that McEliece’s overhead grows slowly with the set size, while Kyber’s grows much more steeply. More importantly, it shows the intersection point, i.e., the set size at which it is recommended to switch to McEliece. At a set size of 392, McEliece becomes preferable, as it incurs only 294.33 KB of overhead, whereas Kyber incurs 295.53 KB. Likewise, at a set size of 540, the storage overhead incurred by McEliece1.5t and Kyber is 405.21 KB and 406.53 KB, respectively.
Results for NIST’s level-3 security are depicted in Table 3 and Figure 2. The results show the same tendency as for level-1 security. However, since the public key size of McEliece is almost twice that of level 1, Kyber performs better than McEliece and McEliece1.5t as long as the set size is less than 575 and 764, respectively, as shown in Figure 2.
Table 4 and Figure 3 show the results for NIST’s level-5 security. Again, McEliece’s public key is approximately doubled, and thus so is the storage overhead. Therefore, Kyber outperforms McEliece and McEliece1.5t as long as the set size is less than 779 and 1062, respectively, as shown in Figure 3.
5.2. Communication Overhead
For the Kyber-based protocol (Algorithm 10), Peer 1, who initiates the PSI protocol, sends the encrypted elements of his entire set (including the zero elements) along with the public key. Peer 2 then returns the result of the element-wise homomorphic multiplication of its set with Peer 1’s set. Thus, the overall communication overhead, as given in Equation (15), is the sum of the public key size, the size of Peer 1’s encrypted set, and the size of the encrypted products of Peer 1’s and Peer 2’s sets.
For the McEliece-based protocol (Algorithm 11), Peer 1 sends the encrypted Bloom filter corresponding to his private set, along with the public key. Once Peer 2 receives the encrypted Bloom filter, he evaluates the expansion function on it. Finally, he homomorphically XORs the expanded ciphertexts with his elements and returns the result. Therefore, the overall communication overhead, Equation (16), is the sum of the public key size, the size of the encrypted Bloom filter, and the size of the encrypted set.
Note that for this implementation, we set the number of hash functions to 50, chosen to achieve the target false positive rate, following the recommendations in [46, 47].
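To make the link between the number of hash functions and the false positive rate concrete, the sketch below uses the textbook Bloom filter approximation, p ≈ (1 − e^(−kn/m))^k, where k is the number of hash functions, n the number of inserted items, and m the filter length in bits. The function names are illustrative, and the value k = 50 is the one stated in Section 5.3. Whether the implementation sizes its filter this way is an assumption.

```python
import math


def bloom_false_positive_rate(num_bits, num_items, num_hashes):
    """Textbook approximation of a Bloom filter's false positive rate."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


def bloom_optimal_bits(num_items, num_hashes):
    """Filter length (bits) for which num_hashes is the optimal hash count."""
    return math.ceil(num_hashes * num_items / math.log(2))


num_items, num_hashes = 392, 50          # set size from Section 5.1, k = 50
num_bits = bloom_optimal_bits(num_items, num_hashes)
fpr = bloom_false_positive_rate(num_bits, num_items, num_hashes)
print(num_bits, "bits, false positive rate ~", fpr)
```

With the filter sized this way, the approximation gives a false positive rate close to 2^(−k), so each additional hash function roughly halves the error at the cost of about n/ln 2 extra bits.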
The communication overhead results for all NIST security levels are listed in Table 5, Table 6 and Table 7 and depicted in Figure 4, Figure 5 and Figure 6. The results show that for small set sizes, McEliece incurs higher communication overhead than Kyber. Beyond a certain set size, McEliece starts to outperform Kyber, and the gap between the two algorithms widens rapidly as the set size increases. Overall, the performance of Kyber is highly sensitive to the set size, whereas McEliece is only slightly affected. In the figures, McEliece's curve is almost flat, while Kyber's grows linearly.
5.3. Computation Cost
The dominant cost in the Kyber-based protocol (Algorithm 10) is the multiplication and modular reduction operations, whose number increases with the module size (K). The dominant cost in the McEliece-based protocol (Algorithm 11) is hashing. As mentioned earlier, the number of hash functions is set to 50, following the recommendations in [46, 47]. The results, summarized in Table 8, indicate that the computation time of McEliece is substantially lower than that of Kyber.
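A rough operation count illustrates why the two costs scale differently. The sketch below counts polynomial multiplications in one standard Kyber PKE encryption (a K×K matrix-vector product plus one inner product) and hash evaluations for inserting a set into a Bloom filter. Both functions are illustrative assumptions: they do not reproduce the exact steps of Algorithms 10 and 11, and they say nothing about the measured times in Table 8.

```python
def kyber_enc_poly_mults(module_rank):
    """Polynomial multiplications in one Kyber PKE encryption of rank K."""
    matrix_vector = module_rank * module_rank   # A^T * r
    inner_product = module_rank                 # t^T * r
    return matrix_vector + inner_product


def bloom_hash_calls(set_size, num_hashes=50):
    """Hash evaluations needed to insert every element into a Bloom filter."""
    return set_size * num_hashes


for rank in (2, 3, 4):                          # Kyber-512, -768, -1024
    print(rank, kyber_enc_poly_mults(rank))
print(bloom_hash_calls(392))
```

The Kyber count grows quadratically in K and is paid once per encrypted element, whereas the Bloom filter cost is a fixed number of hash calls per element regardless of the security level.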
6. Conclusions and Future Work
In this paper, we study the homomorphic properties of two important post-quantum public key cryptosystems (PKCs): Crystal-Kyber and Classic-McEliece. The research focuses on examining their underlying public key encryption (PKE) schemes. We begin with a comprehensive illustration of both candidates and their homomorphic properties, supported by examples. Next, we apply these homomorphic properties to the private set intersection (PSI) problem. Two PSI protocols are designed: one based on the additive homomorphic property of McEliece PKE and the other based on the multiplicative homomorphic property of Kyber PKE. Additionally, the limitations of each scheme are discussed and analyzed. To gain clearer insights, a practical performance evaluation under NIST's security levels 1, 3, and 5 is conducted, focusing on three key metrics: storage overhead, communication overhead, and computation cost. Our findings indicate that the Kyber-based PSI protocol is homomorphically secure but suffers from significant computational and communication overhead, which limits its practical applicability. Conversely, the McEliece-based PSI protocol is more efficient in practice but raises fundamental concerns regarding its suitability as a secure homomorphic encryption scheme. These limitations remain manageable within the controlled setting of two-party PSI protocols. Extending to a multiparty environment, however, would introduce significant challenges and make the limitations far more difficult to address in practice.
Future work includes improving the security and efficiency of the Kyber-based PSI protocol. From the security perspective, the IND-CPA Kyber PKE could be transformed into an IND-CCA variant. A more efficient protocol could leverage Kyber's additive homomorphic property rather than the multiplicative one. Another promising direction is to combine Kyber's additive homomorphic property with a Bloom filter to design a multiparty PSI protocol.
Author Contributions
Conceptualization, A.A.A., K.A. and M.F.; Formal analysis, K.A.; Funding acquisition, M.F.; Methodology, A.A.A., K.A. and M.F.; Project administration, A.A.A. and M.F.; Software, A.A.A. and K.A.; Supervision, M.F.; Validation, A.A.A. and K.A.; Visualization, A.A.A.; Writing—original draft, A.A.A. and K.A.; Writing—review and editing, A.A.A. and M.F. All authors have read and agreed to the published version of the manuscript.
Funding
The APC was funded by the Interdisciplinary Research Center of Intelligent Secure Systems (IRC-ISS).
Data Availability Statement
The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.
Acknowledgments
The authors would like to acknowledge the support provided by King Fahd University of Petroleum and Minerals (KFUPM) and the Interdisciplinary Research Center of Intelligent Secure Systems.
Conflicts of Interest
The authors declare no conflicts of interest.
References
1. Shor, P.W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Rev. 1999, 41, 303–332.
2. Regev, O. On lattices, learning with errors, random linear codes, and cryptography. J. ACM (JACM) 2009, 56, 1–40.
3. Chou, T.; Cid, C.; UiB, S.; Gilcher, J.; Lange, T.; Maram, V.; Misoczki, R.; Niederhagen, R.; Paterson, K.; Persichetti, E. Classic McEliece: Conservative Code-Based Cryptography, 10 October 2020. 2020. Available online: https://cryptojedi.org/papers/mceliecenistr3-20201010.pdf (accessed on 1 October 2025).
4. Bos, J.; Ducas, L.; Kiltz, E.; Lepoint, T.; Lyubashevsky, V.; Schanck, J.M.; Schwabe, P.; Seiler, G.; Stehlé, D. CRYSTALS-Kyber: A CCA-secure module-lattice-based KEM. In Proceedings of the 2018 IEEE European Symposium on Security and Privacy (EuroS&P), London, UK, 24–26 April 2018; IEEE: New York, NY, USA, 2018; pp. 353–367.
5. Albrecht, M.R.; Bernstein, D.J.; Chou, T.; Cid, C.; Gilcher, J.; Lange, T.; Maram, V.; Von Maurich, I.; Misoczki, R.; Niederhagen, R.; et al. Classic McEliece: Conservative Code-Based Cryptography. 2022. Available online: https://cr.yp.to/talks/2024.09.17/slides-djb-20240917-mceliece-16x9.pdf (accessed on 1 October 2025).
6. McEliece, R.J. A public-key cryptosystem based on algebraic coding theory. DSN Prog. Rep. 1978, 42-44, 114–116.
7. Niederreiter, H. Knapsack-type cryptosystems and algebraic coding theory. Prob. Contr. Inform. Theory 1986, 15, 157–166.
8. Wang, W.; Szefer, J.; Niederhagen, R. FPGA-based key generator for the Niederreiter cryptosystem using binary Goppa codes. In Proceedings of the International Conference on Cryptographic Hardware and Embedded Systems, Santa Barbara, CA, USA, 17–19 August 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 253–274.
9. Freedman, M.J.; Nissim, K.; Pinkas, B. Efficient private matching and set intersection. In Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques, Interlaken, Switzerland, 2–6 May 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 1–19.
10. Chen, L.; Li, Z.; Chen, Z.; Liu, Y. Two anti-quantum attack protocols for secure multiparty computation. In Proceedings of the Trusted Computing and Information Security: 12th Chinese Conference, CTCIS 2018, Wuhan, China, 18 October 2018; Revised Selected Papers 12. Springer: Berlin/Heidelberg, Germany, 2019; pp. 338–359.
11. Paillier, P. Public-key cryptosystems based on composite degree residuosity classes. In Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques, Prague, Czech Republic, 2–6 May 1999; Springer: Berlin/Heidelberg, Germany, 1999; pp. 223–238.
12. Rivest, R.L.; Shamir, A.; Adleman, L. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 1978, 21, 120–126.
13. Goldwasser, S.; Micali, S. Probabilistic encryption. J. Comput. Syst. Sci. 1984, 28, 270–299.
14. ElGamal, T. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theory 1985, 31, 469–472.
15. Naccache, D.; Stern, J. A new public key cryptosystem based on higher residues. In Proceedings of the 5th ACM Conference on Computer and Communications Security, San Francisco, CA, USA, 3–5 November 1998; pp. 59–66.
16. Benaloh, J. Dense probabilistic encryption. In Proceedings of the Workshop on Selected Areas of Cryptography, Kingston, ON, Canada, 5–6 May 1994; pp. 120–128.
17. Okamoto, T.; Uchiyama, S. A new public-key cryptosystem as secure as factoring. In Proceedings of the Advances in Cryptology—EUROCRYPT’98: International Conference on the Theory and Application of Cryptographic Techniques, Espoo, Finland, 31 May–4 June 1998; Proceedings 17. Springer: Berlin/Heidelberg, Germany, 1998; pp. 308–318.
18. Boneh, D.; Goh, E.J.; Nissim, K. Evaluating 2-DNF formulas on ciphertexts. In Proceedings of the Theory of Cryptography: Second Theory of Cryptography Conference, TCC 2005, Cambridge, MA, USA, 10–12 February 2005; Proceedings 2. Springer: Berlin/Heidelberg, Germany, 2005; pp. 325–341.
19. Yao, A.C. Protocols for secure computations. In Proceedings of the 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982), Chicago, IL, USA, 3–5 November 1982; IEEE: New York, NY, USA, 1982; pp. 160–164.
20. Sander, T.; Young, A.; Yung, M. Non-interactive cryptocomputing for NC¹. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science (Cat. No. 99CB37039), New York, NY, USA, 17–18 October 1999; IEEE: New York, NY, USA, 1999; pp. 554–566.
21. Ishai, Y.; Paskin, A. Evaluating branching programs on encrypted data. In Proceedings of the Theory of Cryptography Conference, Amsterdam, The Netherlands, 21–24 February 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 575–594.
22. Gentry, C. Fully homomorphic encryption using ideal lattices. In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing, Bethesda, MD, USA, 31 May–2 June 2009; pp. 169–178.
23. Brakerski, Z. Fully homomorphic encryption without modulus switching from classical GapSVP. In Proceedings of the Annual Cryptology Conference, Santa Barbara, CA, USA, 19–23 August 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 868–886.
24. Brakerski, Z.; Gentry, C.; Vaikuntanathan, V. (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory (TOCT) 2014, 6, 1–36.
25. Cheon, J.H.; Kim, A.; Kim, M.; Song, Y. Homomorphic encryption for arithmetic of approximate numbers. In Proceedings of the Advances in Cryptology–ASIACRYPT 2017: 23rd International Conference on the Theory and Applications of Cryptology and Information Security, Hong Kong, China, 3–7 December 2017; Proceedings, Part I 23. Springer: Berlin/Heidelberg, Germany, 2017; pp. 409–437.
26. Chillotti, I.; Gama, N.; Georgieva, M.; Izabachene, M. Faster fully homomorphic encryption: Bootstrapping in less than 0.1 seconds. In Proceedings of the Advances in Cryptology–ASIACRYPT 2016: 22nd International Conference on the Theory and Application of Cryptology and Information Security, Hanoi, Vietnam, 4–8 December 2016; Proceedings, Part I 22. Springer: Berlin/Heidelberg, Germany, 2016; pp. 3–33.
27. Langlois, A.; Stehlé, D. Worst-case to average-case reductions for module lattices. Des. Codes Cryptogr. 2015, 75, 565–599.
28. Lyubashevsky, V.; Peikert, C.; Regev, O. On ideal lattices and learning with errors over rings. In Proceedings of the Advances in Cryptology–EUROCRYPT 2010: 29th Annual International Conference on the Theory and Applications of Cryptographic Techniques, French Riviera, France, 30 May–3 June 2010; Proceedings 29. Springer: Berlin/Heidelberg, Germany, 2010; pp. 1–23.
29. Avanzi, R.; Bos, J.; Ducas, L.; Kiltz, E.; Lepoint, T.; Lyubashevsky, V.; Schanck, J.M.; Schwabe, P.; Seiler, G.; Stehlé, D. CRYSTALS-Kyber algorithm specifications and supporting documentation. NIST PQC Round 2019, 2, 1–43. Available online: https://pq-crystals.org/kyber/data/kyber-specification-round3-20210131.pdf (accessed on 10 August 2025).
30. Özeren, S.; Yayla, O. Methods for masking crystals-kyber against side-channel attacks. In Proceedings of the 2023 16th International Conference on Information Security and Cryptology (ISCTürkiye), Ankara, Turkiye, 18–19 October 2023; pp. 71–76.
31. Mukherjee, A.; Aikata, A.; Mert, A.C.; Lee, Y.; Kwon, S.; Deryabin, M.; Roy, S.S. ModHE: Modular Homomorphic Encryption Using Module Lattices: Potentials and Limitations. Cryptology ePrint Archive. 2023. Available online: https://tches.iacr.org/index.php/TCHES/article/download/11261/10803/11220 (accessed on 10 February 2025).
32. Zhao, C.-C.; Yang, Y.-T.; Li, Z.-C. The homomorphic properties of McEliece public-key cryptosystem. In Proceedings of the 2012 Fourth International Conference on Multimedia Information Networking and Security, Nanjing, China, 2–4 November 2012; IEEE: New York, NY, USA, 2012; pp. 39–42.
33. Morales, D.; Agudo, I.; Lopez, J. Private set intersection: A systematic literature review. Comput. Sci. Rev. 2023, 49, 100567.
34. Debnath, S.K.; Stănică, P.; Choudhury, T.; Kundu, N. Post-quantum protocol for computing set intersection cardinality with linear complexity. IET Inf. Secur. 2020, 14, 661–669.
35. Debnath, S.K.; Choudhury, T.; Kundu, N.; Dey, K. Post-quantum secure multi-party private set-intersection in star network topology. J. Inf. Secur. Appl. 2021, 58, 102731.
36. Cai, Y.; Tang, C.; Xu, Q. Two-party privacy-preserving set intersection with FHE. Entropy 2020, 22, 1339.
37. Gao, S. Efficient Fully Homomorphic Encryption Scheme. Cryptology ePrint Archive. 2018. Available online: https://eprint.iacr.org/2018/637 (accessed on 6 June 2025).
38. Camenisch, J.; Zaverucha, G.M. Private intersection of certified sets. In Proceedings of the Financial Cryptography and Data Security: 13th International Conference, FC 2009, Accra Beach, Barbados, 23–26 February 2009; Revised Selected Papers 13. Springer: Berlin/Heidelberg, Germany, 2009; pp. 108–127.
39. Bloom, B.H. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 1970, 13, 422–426.
40. Kerschbaum, F. Outsourced private set intersection using homomorphic encryption. In Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security, Seoul, Republic of Korea, 2–4 May 2012; pp. 85–86.
41. Dottling, N.; Dowsley, R.; Muller-Quade, J.; Nascimento, A.C. A CCA2 secure variant of the McEliece cryptosystem. IEEE Trans. Inf. Theory 2012, 58, 6672–6680.
42. Rastaghi, R. An efficient CCA2-secure variant of the McEliece cryptosystem in the standard model. arXiv 2013, arXiv:1302.0347.
43. Nojima, R.; Imai, H.; Kobara, K.; Morozov, K. Semantic security for the McEliece cryptosystem without random oracles. Des. Codes Cryptogr. 2008, 49, 289–305.
44. Canteaut, A.; Chabaud, F. A new algorithm for finding minimum-weight words in a linear code: Application to McEliece’s cryptosystem and to narrow-sense BCH codes of length 511. IEEE Trans. Inf. Theory 1998, 44, 367–378.
45. Horlemann, A.L.; Puchinger, S.; Renner, J.; Schamberger, T.; Wachter-Zeh, A. Information-set decoding with hints. In Proceedings of the Code-Based Cryptography Workshop, Munich, Germany, 21–22 June 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 60–83.
46. Bay, A.; Erkin, Z.; Hoepman, J.H.; Samardjiska, S.; Vos, J. Practical multi-party private set intersection protocols. IEEE Trans. Inf. Forensics Secur. 2021, 17, 1–15.
47. Davidson, A.; Cid, C. An efficient toolkit for computing private set operations. In Proceedings of the Australasian Conference on Information Security and Privacy, Auckland, New Zealand, 3–5 July 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 261–278.
48. Nita, S.L.; Mihailescu, M.I. Jdk 17: New features. In Cryptography and Cryptanalysis in Java: Creating and Programming Advanced Algorithms with Java SE 17 LTS and Jakarta EE 10; Springer: Berlin/Heidelberg, Germany, 2022; pp. 9–19.
49. Kostaras, I.; Drabo, C.; Juneau, J.; Reimers, S.; Schröder, M.; Wielenga, G.; et al. What Is Apache NetBeans. Pro Apache NetBeans: Building Applications on the Rich Client Platform; Springer: Berlin/Heidelberg, Germany, 2020; pp. 3–28.
50. Bouncy Castle Crypto Library. Available online: https://www.bouncycastle.org (accessed on 11 May 2024).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).