On the Homomorphic Properties of Kyber and McEliece with Application to Post-Quantum Private Set Intersection

Abudaqa, Anas A.; Alshehri, Khaled; Felemban, Muhamad

doi:10.3390/cryptography9040066


by Anas A. Abudaqa ¹,*, Khaled Alshehri ² and Muhamad Felemban ³,⁴,⁵

¹ Department of Computer Science, College of Engineering and Information Technology, Onaizah Colleges (OC), Qassim 56447, Saudi Arabia

² Mathematics & Statistics Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia

³ Interdisciplinary Research Center of Intelligent Secure Systems, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia

⁴ Computer Engineering Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia

⁵ Information and Computer Science Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia

* Author to whom correspondence should be addressed.

Cryptography 2025, 9(4), 66; https://doi.org/10.3390/cryptography9040066

Submission received: 23 August 2025 / Revised: 4 October 2025 / Accepted: 13 October 2025 / Published: 20 October 2025


Abstract

Crystals-Kyber and Classic-McEliece are two prominent post-quantum key encapsulation mechanisms (KEMs) designed to address the challenges posed by quantum computing to classical cryptographic schemes. While the former has been standardized by the National Institute of Standards and Technology (NIST), the latter is well-known for its exceptional robustness and as one of the finalists of the fourth round of post-quantum cryptography standardization. Private set intersection (PSI) is a privacy-preserving technique that enables two parties, each possessing a dataset, to compute the intersection of their sets without revealing anything else. This can be achieved thanks to homomorphic encryption (HE), which allows computations on encrypted data. In this paper, firstly, we study Kyber and McEliece, apart from being KEMs, as post-quantum public key encryption (PKE), and examine their homomorphic properties. Secondly, we design two different two-party PSI protocols that utilize the homomorphic capabilities of Kyber and McEliece. Thirdly, a practical performance evaluation under NIST’s security levels 1, 3, and 5 is conducted, focusing on three key metrics: storage overhead, communication overhead, and computation cost. Insights indicate that the Kyber-based PSI Protocol, which utilizes the multiplicative homomorphic property, is secure but less efficient. In contrast, the McEliece-based PSI protocol, while efficient in practice, raises concerns regarding its security as a homomorphic encryption scheme.

Keywords:

homomorphic encryption; Crystal Kyber; Classic McEliece; post-quantum cryptography; key encapsulation method; private set intersection

1. Introduction

Of late, cryptographers have become increasingly worried about advances in quantum computing. Once a sufficiently powerful quantum computer is available, Shor’s algorithm [1] will break all widely deployed public key cryptosystems (PKC), specifically those based on the hardness of factoring or discrete logarithms, such as RSA and elliptic curve constructions. Consequently, in 2017, NIST called for post-quantum cryptography (PQC) proposals. Lattice-based and code-based cryptographic schemes are two of the main types of proposals submitted to the NIST PQC competition for standardization. The two families are conceptually similar in using noise (error) for security but differ in their mathematical foundations. Both leverage the difficulty of certain structured decoding problems [2]. Indeed, lattice-based cryptography replaces the ‘codeword plus random errors’ of code-based cryptography with a ‘lattice point plus random errors’ [3].

In the third round of the NIST PQC competition, Crystal-Kyber [4], a lattice-based post-quantum key encapsulation mechanism (KEM), was selected for standardization. Additionally, four other KEM algorithms were selected to continue to the fourth round. Among these four submissions, Classic-McEliece [3,5], an important code-based post-quantum KEM candidate, is constructed based on the original McEliece [6] and its dual variant, Niederreiter [7,8], public key encryption (PKE) schemes, which have proved their stability and robustness against attacks for over 40 years.

Private set intersection (PSI) is a cryptographic multiparty computation problem first introduced in [9], where distrusted parties holding private sets of elements submit their sets as inputs, and only one or more parties receive the intersection and nothing else. In the classical formulation, the protocol enables two parties to jointly compute the intersection. However, the problem can be extended to a multiparty PSI, adding a layer of complexity. Moreover, there are many variants of PSI. For example, we can classify PSI into one-way PSI, where only one party learns the output, or mutual PSI, where all parties receive the output. We can also classify the problem by the sizes of the input sets, such as balanced PSI for inputs of comparable sizes or unbalanced PSI, where one input is much larger than the others. In another variant, the output can be other functions than the intersection, like PSI-CA, where the output is the cardinality of the intersection rather than the intersection itself. Moreover, the problem might demand additional requirements, such as threshold-PSI, where the parties obtain the output only when the cardinality is above a certain threshold, or size-hiding PSI, where the size of the inputs is hidden throughout the protocol.

PSI has a wide range of real-world applications, including medical and criminal records, anomaly detection, private database queries, and the military sector. The following motivational example demonstrates one real-world multiparty PSI (MPSI) scenario.

Example [10]: Various military factories want to compare the quality of a specific precision instrument with others of the same type. To achieve this goal, they need to collect these precision instruments from different military factories. However, no military factory is willing to expose its own private data to others.

Thanks to homomorphic encryption (HE), the sets’ intersection can be computed, and privacy is still preserved. HE is a property of public-key cryptosystems that enables certain computations to be performed on ciphertexts, producing an encrypted result that, when decrypted, complies with the outcome of the same operations on the plaintexts. Most public-key encryption schemes support at least one type of homomorphic operation. For the PSI problem, we are interested in homomorphic schemes that support addition and/or multiplication operations.

Motivated by this, in this work, we investigate and compare the homomorphic capabilities of the underlying public key encryption (PKE) schemes of two important post-quantum KEM candidates, namely Crystal-Kyber and Classic-McEliece, in the context of the two-party PSI problem. While the homomorphic properties of McEliece/Niederreiter PKE have already been studied in the literature stripped of any concrete applications, the homomorphic properties of Kyber have not, to the best of our knowledge, been directly examined. Therefore, we go beyond revisiting these properties by designing two new PSI protocols based on the respective cryptosystems. We stress that our focus is to study and examine the underlying PKEs of the mentioned candidates and to clarify whether these two candidates, in addition to their role in KEM schemes, are appropriate for quantum-safe homomorphic encryption.

Contributions

The main contributions of this study are threefold:

We design a two-party PSI protocol that utilizes McEliece’s additive homomorphic property and the Bloom filter data structure.
We design a two-party PSI protocol that utilizes Kyber’s multiplicative homomorphic property.
We provide an experimental performance evaluation framework to compare the two proposed protocols, considering NIST’s security levels 1, 3, and 5, and focusing on three key metrics: storage overhead, communication overhead, and computation cost.

The rest of the paper is organized as follows. Section 2 presents important background on HE, Crystal-Kyber, and Classic-McEliece. A literature review is conducted in Section 3. The design of the proposed PSI protocols and the theoretical analysis are detailed in Section 4, while the experimental work and results are presented in Section 5. Section 6 provides the conclusion and outlines the future work.

2. Background

In this section, we recall the necessary information that helps to understand the rest of the paper. Before delving into the details of the dedicated cryptosystems, we provide the following definitions to distinguish between three closely related concepts: public key cryptosystem (PKC), public key encryption (PKE), and key encapsulation mechanism (KEM).

Definition 1 (Public Key Cryptosystem (PKC)).

Any cryptographic scheme that uses a pair of keys (public and private) for performing cryptographic tasks, including encryption, digital signature, and key exchange, is broadly referred to as PKC.

Definition 2 (Public Key Encryption (PKE)).

A specific type of PKC designed solely for exchanging secure messages between two or more legitimate parties. It uses the public key for encryption and the private key for decryption.

Definition 3 (Key Encapsulation Mechanism (KEM)).

A PKC protocol that can be used by two parties to establish a shared secret key over a public (insecure) channel. The output is not an encrypted message but a secret key used for symmetric key encryption.

Also, we provide the following security definitions that help in the formal proofs of the proposed algorithms.

Definition 4 (IND-CPA Security).

A public key encryption scheme is said to achieve indistinguishability under chosen-plaintext attack (IND-CPA) if no probabilistic polynomial-time (PPT) adversary $A$ can distinguish between the encryptions of two messages of equal length, even when given access to the public key and the ability to encrypt any plaintext of their choice. Formally, for any PPT adversary $A$, the advantage in the IND-CPA game is negligible:

\mathrm{Adv}_{\mathrm{PKE}, A}^{\mathrm{IND\text{-}CPA}} (κ) = \left| \Pr \left[ \mathrm{IND\text{-}CPA}_{\mathrm{PKE}}^{A} (κ) = 1 \right] - \frac{1}{2} \right| \leq \mathrm{negl} (κ)

(1)

where κ is the security parameter and the IND-CPA game involves the adversary choosing two messages, $m_{0}$ and $m_{1}$, receiving an encryption of one of them, and attempting to determine which message was encrypted.
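The IND-CPA experiment of Definition 4 can be sketched as a small test harness. This is only an illustration: the function `ind_cpa_game`, the adversary interface (`choose`, `guess`), and the deliberately insecure toy scheme are all hypothetical names introduced here, not part of Kyber, McEliece, or any library.

```python
import random

def ind_cpa_game(keygen, encrypt, adversary, rng):
    """Run one IND-CPA experiment; return True when the adversary guesses b."""
    pk, _sk = keygen(rng)
    m0, m1, state = adversary.choose(pk)      # adversary picks two messages
    assert len(m0) == len(m1)                 # messages must have equal length
    b = rng.randrange(2)                      # challenger's secret bit
    challenge = encrypt(pk, (m0, m1)[b], rng)
    return adversary.guess(pk, challenge, state) == b

# Hypothetical, deliberately insecure "scheme": the ciphertext is the plaintext.
def toy_keygen(rng):
    return "pk", "sk"

def toy_encrypt(pk, m, rng):
    return bytes(m)

class LeakAdversary:
    """Wins every game against the toy scheme by reading the ciphertext."""
    def choose(self, pk):
        return b"\x00", b"\x01", None

    def guess(self, pk, challenge, state):
        return challenge[0]

def win_rate(trials=200, seed=1):
    rng = random.Random(seed)
    wins = sum(ind_cpa_game(toy_keygen, toy_encrypt, LeakAdversary(), rng)
               for _ in range(trials))
    return wins / trials
```

Against this toy scheme the adversary wins every run, so the empirical advantage |win rate - 1/2| is 1/2, which is far from negligible. For an IND-CPA-secure scheme, the win rate of any efficient adversary should stay close to 1/2.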

Definition 5 (IND-CCA Security).

A public key encryption scheme achieves indistinguishability under chosen-ciphertext attack (IND-CCA) if it satisfies IND-CPA security and additionally remains secure even when a PPT adversary $A$ has access to a decryption oracle that can decrypt any ciphertext except the challenge ciphertext. This represents a stronger security notion than IND-CPA, as it protects against adaptive chosen-ciphertext attacks. Formally, for any PPT adversary $A$ with access to decryption oracle $O_{dec}$, the advantage in the IND-CCA game is negligible:

\mathrm{Adv}_{\mathrm{PKE}, A}^{\mathrm{IND\text{-}CCA}} (κ) = \left| \Pr \left[ \mathrm{IND\text{-}CCA}_{\mathrm{PKE}}^{A, O_{dec}} (κ) = 1 \right] - \frac{1}{2} \right| \leq \mathrm{negl} (κ)

(2)

where the adversary can query the decryption oracle on any ciphertext other than the challenge ciphertext during the attack game.

Definition 6 (Semi-Honest Security Model).

A two-party PSI protocol $π = (P_{1}, P_{2})$ is secure against semi-honest, a.k.a. honest-but-curious, adversaries if for any probabilistic polynomial-time (PPT) adversary $A$ that corrupts at most one party, there exists a PPT simulator $S$ such that for all sets $G_{1}$ and $G_{2}$:

\{ \mathrm{IDEAL}_{S, A} (G_{1}, G_{2}, κ) \} \overset{c}{\equiv} \{ \mathrm{REAL}_{π, A} (G_{1}, G_{2}, κ) \}

(3)

where

IDEAL represents the ideal functionality where parties only learn $G_{1} \cap G_{2}$.
REAL represents the real protocol execution.
$\overset{c}{\equiv}$ denotes computational indistinguishability.

2.1. Homomorphic Encryption

HE is a property of public-key cryptosystems that enables certain arithmetic computations to be performed on ciphertexts, producing an encrypted result that, when decrypted, complies with the outcome of the same operations on the plaintexts. Most public-key encryption schemes support at least one type of homomorphic operation. HE can be classified into three categories: (1) Partially homomorphic encryption (PHE), which can perform only one mathematical function an unlimited number of times. For example, the Paillier cryptosystem [11] supports only the addition operation, while the original RSA [12] can support only the multiplication operation. Other PHE schemes are Goldwasser and Micali [13], ElGamal [14], Naccache and Stern [15], Benaloh [16], and Okamoto and Uchiyama [17]. (2) Somewhat homomorphic encryption (SWHE): it supports some types of arithmetic functions a limited number of times. For example, the Boneh–Goh–Nissim cryptosystem [18] can evaluate unlimited additions and one multiplication. Other SWHE schemes are Yao [19], Sander et al. [20], and Ishai and Paskin [21]. (3) Fully homomorphic encryption (FHE), which supports an unlimited number of operations for an unlimited number of times. Constructing a fully homomorphic scheme that can evaluate an arbitrary function was not possible until Gentry’s breakthrough in 2009 [22]. Other FHE schemes are BFV [23], BGV [24], CKKS [25], and TFHE [26].
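As a concrete instance of a partially homomorphic scheme, the multiplicative property of textbook RSA mentioned above can be checked on small numbers. The sketch below uses a classic toy key (p = 61, q = 53, e = 17); these parameters are illustrative assumptions and far too small for real use.

```python
# Toy textbook-RSA key (illustrative assumption, insecure sizes).
P_PRIME, Q_PRIME = 61, 53
N_MOD = P_PRIME * Q_PRIME                                  # 3233
E_PUB = 17
D_PRIV = pow(E_PUB, -1, (P_PRIME - 1) * (Q_PRIME - 1))     # modular inverse, Python 3.8+

def enc(m):
    """Textbook RSA encryption without padding."""
    return pow(m, E_PUB, N_MOD)

def dec(c):
    """Textbook RSA decryption."""
    return pow(c, D_PRIV, N_MOD)

def mul_ciphertexts(c1, c2):
    """Multiply two ciphertexts; the result decrypts to the product of plaintexts."""
    return (c1 * c2) % N_MOD

product_plain = dec(mul_ciphertexts(enc(7), enc(12)))      # equals 7 * 12 mod N_MOD
```

The product of the two ciphertexts decrypts to 7 · 12 = 84 without the evaluator ever seeing 7 or 12. Addition of plaintexts has no such counterpart in textbook RSA, which is why it is classified as PHE.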

2.2. Crystal Kyber Cryptosystem

Crystal-Kyber [4] is a lattice-based public key cryptosystem mainly used as a KEM for establishing shared secret keys for symmetric-key cryptosystems. The Kyber KEM is IND-CCA2 secure and is built upon Kyber PKE, which is IND-CPA secure. Its security is based on the hardness of the module learning with errors (MLWE) problem, which is as hard as several worst-case lattice problems, specifically, the shortest vector problem (SVP). MLWE was first introduced in [27], and it offers much better efficiency and security tradeoffs when compared with the learning with errors (LWE) [2] and ring learning with errors (RLWE) [28] problems. Moreover, its keys are small enough to be used in real-world applications.

For detailed information and documentation about Kyber’s IND-CCA2-secure KEM, we refer the reader to [29].

Herein, we describe Kyber’s IND-CPA-secure PKE, as it will be used to investigate the homomorphic properties of the cryptosystem. Let $R$ and $R_{q}$ denote the rings $Z [X] / (X^{n} + 1)$ and $Z_{q} [X] / (X^{n} + 1)$, respectively, where n is a power of two such that $X^{n} + 1$ is a cyclotomic polynomial. Specifically, Kyber operates over the polynomial ring $R_{3329} = Z_{3329} [x] / 〈 x^{256} + 1 〉$ and sets a security parameter $k \geq 2$ to achieve better security than RLWE. Namely, k determines the number of polynomials in the module setting. Notably, when $k = 1$, the MLWE instance reduces to a pure RLWE instance [30].

Kyber’s PKE Keygen, Encryption, and Decryption algorithms [4] are defined in Algorithm 1, Algorithm 2, and Algorithm 3, respectively.

Algorithm 1 Kyber.KeyGen(): key generation

1: $A \sim R_{q}^{k \times k}$
2: $(s, e) \sim β_{η}^{k} \times β_{η}^{k}$
3: $t := \mathrm{Compress}_{q} (A s + e, d_{t})$
4: return $(pk := (A, t), sk := s)$

Algorithm 2 Kyber.Enc($pk = (A, t)$, $m \in M$): encryption

1: $t := \mathrm{Decompress}_{q} (t, d_{t})$
2: $(r, e_{1}, e_{2}) \sim β_{η}^{k} \times β_{η}^{k} \times β_{η}$
3: $u := \mathrm{Compress}_{q} (A^{T} r + e_{1}, d_{u})$
4: $v := \mathrm{Compress}_{q} (t^{T} r + e_{2} + ⌈\frac{q}{2}⌋ \cdot m, d_{v})$
5: return $c := (u, v)$

Algorithm 3 Kyber.Dec($sk = s$, $c = (u, v)$): decryption

1: $u := \mathrm{Decompress}_{q} (u, d_{u})$
2: $v := \mathrm{Decompress}_{q} (v, d_{v})$
3: return $\mathrm{Compress}_{q} (v - s^{T} u, 1)$
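Algorithms 1–3 can be sketched in a few lines of Python. The sketch below is a simplified illustration, not the Kyber reference implementation: it omits the Compress/Decompress steps, uses toy sizes (n = 4, k = 2, η = 2) chosen here as assumptions, and samples with Python's non-cryptographic `random` module.

```python
import random
from itertools import product

Q, N, K, ETA = 3329, 4, 2, 2   # toy sizes (assumption); Kyber itself uses n = 256
HALF = (Q + 1) // 2            # round(q / 2) = 1665

def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def poly_sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]

def poly_mul(a, b):
    """Schoolbook product in Z_q[X]/(X^N + 1); X^N wraps around to -1."""
    res = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            sign = 1 if i + j < N else -1
            res[(i + j) % N] = (res[(i + j) % N] + sign * x * y) % Q
    return res

def dot(u, v):
    """Inner product of two vectors of polynomials."""
    acc = [0] * N
    for a, b in zip(u, v):
        acc = poly_add(acc, poly_mul(a, b))
    return acc

def cbd(rng):
    """One polynomial with centered-binomial coefficients in [-ETA, ETA]."""
    return [sum(rng.randint(0, 1) - rng.randint(0, 1) for _ in range(ETA)) % Q
            for _ in range(N)]

def keygen(rng):
    A = [[[rng.randrange(Q) for _ in range(N)] for _ in range(K)] for _ in range(K)]
    s = [cbd(rng) for _ in range(K)]
    e = [cbd(rng) for _ in range(K)]
    t = [poly_add(dot(A[i], s), e[i]) for i in range(K)]   # t = A s + e
    return (A, t), s

def encrypt(pk, m, rng):
    A, t = pk
    r = [cbd(rng) for _ in range(K)]
    e1 = [cbd(rng) for _ in range(K)]
    e2 = cbd(rng)
    # u = A^T r + e1: column j of A dotted with r
    u = [poly_add(dot([A[i][j] for i in range(K)], r), e1[j]) for j in range(K)]
    v = poly_add(poly_add(dot(t, r), e2), [HALF * bit % Q for bit in m])
    return u, v

def decrypt(s, c):
    u, v = c
    noisy = poly_sub(v, dot(s, u))          # = noise + round(q/2) * m
    return [1 if Q / 4 < coeff < 3 * Q / 4 else 0 for coeff in noisy]

def roundtrip_all(seed=0):
    """Encrypt and decrypt every 4-bit message under one fresh key."""
    rng = random.Random(seed)
    pk, sk = keygen(rng)
    return all(decrypt(sk, encrypt(pk, list(m), rng)) == list(m)
               for m in product([0, 1], repeat=N))
```

With these toy sizes the decryption noise is bounded by 66 in absolute value, well below q/4, so `roundtrip_all()` returns `True` for any seed. Real parameter sets have a small but non-zero decryption failure probability, partly because of the compression that is left out here.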

2.2.1. Example

For a simple example, we set $k = 2, n = 4, q = 7681, d_{u} = d_{t} = 11, d_{v} = 3$. Let the matrix $A$ of Bob’s public key be a 2 by 2 matrix chosen randomly,

A = (\begin{matrix} 1917 x^{3} + 2032 x^{2} + 2056 x + 273 & 4818 x^{3} + 3189 x^{2} + 6024 x + 153 \\ 520 x^{3} + 7002 x^{2} + 4588 x + 4276 & 2324 x^{3} + 5975 x^{2} + 3315 x + 1547 \end{matrix})

Then he also samples the secret key s and the error vector e from the centered binomial distribution $β_{4}$:

s = (\begin{matrix} x^{3} - 2 x \\ - x^{3} - x^{2} - 2 x \end{matrix}), \quad e = (\begin{matrix} - x^{3} - x + 2 \\ - x^{2} \end{matrix})

Calculating $t = A s + e \bmod x^{4} + 1$, we obtain

t = (\begin{matrix} 6696 x^{3} + 1950 x^{2} + 5122 x + 5267 \\ 4184 x^{3} + 7493 x^{2} + 5013 x + 2709 \end{matrix})

After applying the compression function, he obtains

t = (\begin{matrix} 1785 x^{3} + 520 x^{2} + 1366 x + 1404 \\ 1116 x^{3} + 1998 x^{2} + 1337 x + 722 \end{matrix})

Then the public key is $(A, t)$, while the secret key is $s$. Suppose Alice wants to send the message $m = x^{3} + x + 1$. She first samples $(r, e_{1}, e_{2}) \sim β_{4}^{2} \times β_{4}^{2} \times β_{4}$ and obtains

r = (\begin{matrix} - 2 x^{3} + x^{2} + 2 x + 1 \\ - 2 x^{3} - 3 x^{2} \end{matrix}), \quad e_{1} = (\begin{matrix} - 3 x^{3} + x^{2} - 3 x - 3 \\ - 3 x^{3} - 3 x \end{matrix}), \quad e_{2} = x + 3

To find the ciphertext, Alice computes $u := A^{T} r + e_{1}$ and $v := t^{T} r + e_{2} + 3841 \cdot m$, obtaining

u = (\begin{matrix} 534 x^{3} + 6145 x^{2} + 4948 x + 5655 \\ 3872 x^{3} + 1990 x^{2} + 3766 x + 888 \end{matrix}), \quad v = 3931 x^{3} + 376 x^{2} + 5841 x + 5799

After compression, the ciphertext becomes

u = (\begin{matrix} 142 x^{3} + 1638 x^{2} + 1319 x + 1508 \\ 1032 x^{3} + 531 x^{2} + 1004 x + 237 \end{matrix}), \quad v = 3931 x^{3} + 376 x^{2} + 5841 x + 5799

To decrypt the message, Bob decompresses the ciphertext and calculates $v - s^{T} u = 3747 x^{3} + 7294 x^{2} + 3769 x + 3824$; each coefficient that is closer to 3841 than to 0 or 7681 is decoded as 1, and as 0 otherwise. That is, the decrypted message is $1 \cdot x^{3} + 0 \cdot x^{2} + 1 \cdot x + 1$, Alice’s original message.
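The key-generation step of this example can be re-derived mechanically. The sketch below recomputes $t = A s + e$ in $Z_{7681} [x] / (x^{4} + 1)$ and its 11-bit compression from the values of $A$, $s$, and $e$ given above. Polynomials are stored as coefficient lists with the constant term first, and `compress` assumes the usual rounding map $x \mapsto ⌈(2^{d} / q) \cdot x⌋ \bmod 2^{d}$; both conventions are choices made for this illustration.

```python
Q, N, D_T = 7681, 4, 11   # parameters of the worked example

def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def poly_mul(a, b):
    """Schoolbook product in Z_q[x]/(x^4 + 1); x^4 wraps around to -1."""
    res = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            sign = 1 if i + j < N else -1
            res[(i + j) % N] = (res[(i + j) % N] + sign * x * y) % Q
    return res

def compress(x, d):
    """Round (2^d / q) * x to the nearest integer, then reduce mod 2^d."""
    return ((x * 2**d + Q // 2) // Q) % 2**d

# Coefficient lists, constant term first (values copied from the example).
A = [[[273, 2056, 2032, 1917], [153, 6024, 3189, 4818]],
     [[4276, 4588, 7002, 520], [1547, 3315, 5975, 2324]]]
s = [[0, -2, 0, 1], [0, -2, -1, -1]]
e = [[2, -1, 0, -1], [0, 0, -1, 0]]

# t_i = A[i][0] * s[0] + A[i][1] * s[1] + e[i]
t = [poly_add(poly_add(poly_mul(A[i][0], s[0]), poly_mul(A[i][1], s[1])), e[i])
     for i in range(2)]
t_compressed = [[compress(c, D_T) for c in row] for row in t]
```

Read from the highest degree down, `t` and `t_compressed` reproduce the two vectors $t$ shown in the example, before and after compression.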

2.2.2. Homomorphism

Let $E (x) = (u_{x}, v_{x})$ and $E (y) = (u_{y}, v_{y})$ be two ciphertexts. The following homomorphic relation holds:

E (x) + E (y) = E (x \oplus y)

(4)

We define the addition in (4) as $E (x) + E (y) = (u_{x} + u_{y}, v_{x} + v_{y})$.
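Why the plaintext operation in (4) is XOR rather than integer addition can be seen on a single coefficient. The sketch below is a scalar illustration under assumed values (q = 3329 and small hand-picked noise terms); `encode` and `decode` are hypothetical helper names that model one coefficient of $v - s^{T} u$.

```python
Q = 3329
HALF = (Q + 1) // 2   # round(q / 2) = 1665

def encode(bit, noise):
    """One noisy coefficient carrying a message bit: noise + round(q/2) * bit."""
    return (HALF * bit + noise) % Q

def decode(coeff):
    """Return 1 when the coefficient is closer to q/2 than to 0 or q."""
    return 1 if Q / 4 < coeff % Q < 3 * Q / 4 else 0

# Adding two encodings and decoding gives the XOR of the bits.
xor_table = {(x, y): decode(encode(x, 5) + encode(y, -7))
             for x in (0, 1) for y in (0, 1)}
```

For x = y = 1 the sum is 2 · 1665 - 2 = 3328, which is congruent to -1 modulo q and therefore decodes to 0. The two message terms cancel modulo q, which is exactly the XOR behaviour. The noise terms add up, so the number of additions that can be performed is limited by the noise budget.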

Multiplication is not as straightforward. Consider the following multiplication of two messages:

\begin{aligned} (w_{x} + ⌈\frac{q}{2}⌋ \cdot x) (w_{y} + ⌈\frac{q}{2}⌋ \cdot y) & \approx (v_{x} - u_{x} \cdot s) \cdot (v_{y} - u_{y} \cdot s) \\ & \approx v_{x} v_{y} - v_{x} \sum_{i = 1}^{k} u_{y i} g_{i} - v_{y} \sum_{i = 1}^{k} u_{x i} g_{i} + \sum_{i = 1}^{k} \sum_{j = 1}^{k} u_{x i} u_{y j} g_{i} g_{j} . \end{aligned}

where $u_{x i}, u_{y i}, g_{i}$ are the i-th components of $u_{x}, u_{y}, s$, respectively, and ${∥w_{i}∥}_{\infty} \leq ⌈\frac{q}{4}⌋$. Expanding the LHS yields

(w_{x} + ⌈ \frac{q}{2} ⌋ \cdot x) (w_{y} + ⌈ \frac{q}{2} ⌋ \cdot y) = w_{x} w_{y} + ⌈ \frac{q}{2} ⌋ (x w_{y} + y w_{x}) + {⌈ \frac{q}{2} ⌋}^{2} x y

Therefore, we write the multiplication as follows

E (x) \cdot E (y) = (d_{1}, d_{2}, d_{3})

(5)

The ciphertext in this form, Equation (5), is a non-standard ciphertext computed over the rationals $Q$, where

d_{1} = v_{x} \cdot v_{y}, \quad d_{2 i} = v_{x} \cdot u_{y i} + v_{y} \cdot u_{x i}, \quad d_{3_{i j}} = u_{x i} \cdot u_{y j}

Note that the size of the ciphertext has increased, and it is not decrypted as a usual ciphertext. It is possible to reduce it to the usual ciphertext form $(d_{1}^{'}, d_{2}^{'})$, which is decrypted as $d_{2}^{'} - s^{T} d_{1}^{'}$, using a computationally costly operation called relinearization [31]. Relinearization can be avoided when the multiplicative depth is small, using what is called leveled FHE [24]. As such, the non-standard ciphertext can be successfully decrypted using Equation (6).

x \cdot y = ⌈ \frac{d_{1} - d_{2} \cdot s + d_{3} s \cdot s}{{⌈ \frac{q}{2} ⌋}^{2}} ⌋

(6)

For instance, in the two-party PSI protocol, the multiplicative depth is exactly equal to one, and Equation (6) can be efficiently used.
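The rounding in Equation (6) can be illustrated on a single coefficient. The sketch below does not implement the ciphertext product $(d_{1}, d_{2}, d_{3})$; it only evaluates the expanded product of two noisy encodings over the rationals and rounds it, with q = 3329 and hand-picked noise values chosen here as assumptions.

```python
from fractions import Fraction

Q = 3329
HALF = (Q + 1) // 2   # round(q / 2) = 1665

def decode_product(w_x, x, w_y, y):
    """Round (w_x + HALF*x) * (w_y + HALF*y) / HALF^2 to the nearest integer."""
    noisy = (w_x + HALF * x) * (w_y + HALF * y)
    return round(Fraction(noisy, HALF**2))

# Small noise: the rounded value equals the product of the message bits.
small_noise = {(x, y): decode_product(100, x, -80, y)
               for x in (0, 1) for y in (0, 1)}

# Large noise: 500/1665 is about 0.3, and 0.3^2 + 2 * 0.3 exceeds 1/2,
# so the rounding lands on the wrong integer.
large_noise = decode_product(500, 1, 500, 1)
```

With noise 100 and -80 all four bit combinations decode to $x \cdot y$, whereas with noise 500 on both sides the product of two 1-bits rounds to 2. The cross terms $x w_{y} + y w_{x}$ are what make the multiplicative noise budget much tighter than the additive one.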

2.3. McEliece/Niederreiter and Classic-McEliece Cryptosystem

The Classic-McEliece key-encapsulation mechanism is derived from the code-based public key cryptosystem proposed by R. McEliece in 1978 [6]. More precisely, Classic-McEliece KEM is built upon Niederreiter [7,8], which is a dual variant of the McEliece PKE. The original McEliece PKE has a robust security history, as dozens of papers over 40 years have tried, with no success, to attack this system. The main reason that keeps McEliece away from practical consideration is its huge key sizes compared to those of its number-theoretic PKE counterparts.

McEliece PKE is based on the hardness of decoding a general linear code. The original algorithm uses binary Goppa codes, which can be efficiently decoded using an algorithm by Patterson. Let G be the $k \times n$ generator matrix of a linear code that can correct up to t errors and has an efficient decoding algorithm. The cryptosystem’s Keygen, Encryption, and Decryption algorithms are defined in Algorithms 4–6 as follows:

Algorithm 4 McEliece.KeyGen(): key generation

1: Select a random binary non-singular matrix $S \sim F_{2}^{k \times k}$
2: Select an $n \times n$ permutation matrix $P$
3: Compute $G_{p} := S G P$
4: return $pk := G_{p}$, $sk := (S, P)$

Algorithm 5 McEliece.Enc($pk = G_{p}$, $m \in F_{2}^{k}$): encryption

1: Choose a random vector $e \in F_{2}^{n}$ with $W_{H} (e) = t$
2: Compute $c = m G_{p} + e$
3: return $c$

Algorithm 6 McEliece.Dec($sk = (S, P)$, $c$): decryption

1: Compute $c^{'} = c P^{- 1}$
2: Decode $m^{'} = \mathrm{decode} (c^{'})$
3: Compute $m = m^{'} S^{- 1}$
4: return $m$

In the decryption algorithm, $c^{'} = c P^{- 1} = m S G + e P^{- 1}$, where $m S G$ is a codeword. Since $W_{H} (e P^{- 1}) \leq t$, we can decode $c^{'}$ and obtain $m^{'} = m S$. Therefore, $m = m^{'} S^{- 1}$.

2.3.1. Example

Consider the following $4 \times 12$ generator matrix of a linear code that can correct up to $t = 2$ errors:

G = (\begin{matrix} 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \end{matrix})

Assume Alice wants to send Bob the message $m = (1, 0, 1, 0)$. Bob generates S randomly, together with a permutation matrix P,

S = (\begin{matrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{matrix}), P = (\begin{matrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \end{matrix})

Then

G_{p} = S G P = (\begin{matrix} 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 \end{matrix}) .

Alice now computes and sends to Bob the following:

m G_{p} + e = (1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0) + (1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = (0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0) .

Bob then computes

(0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0) P^{- 1} = (0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0) P^{T} = (0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0) .

Notice that the new error is $e P^{- 1} = (1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0)$, which still has weight 2, so we can correct the errors and decode $m S G$ to obtain $m^{'} = (1, 1, 0, 1)$. Finally, Bob computes

m^{'} S^{- 1} = (1, 1, 0, 1) (\begin{matrix} 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \end{matrix}) = (1, 0, 1, 0)

, Alice’s original message.
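The example above can be replayed end to end in a short script. The sketch below uses the matrices $G$, $S$, $S^{- 1}$, and the permutation $P$ from the example. The permutation is stored as an index list, and decoding is done by brute-force nearest-codeword search over all 16 messages, which is an assumption made here for brevity and stands in for an efficient decoder.

```python
from itertools import product

G = [[0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0],
     [0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0],
     [1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1],
     [1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]]
S = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 1, 0, 0], [0, 0, 1, 1]]
S_INV = [[1, 1, 1, 0], [0, 0, 1, 0], [0, 1, 1, 1], [0, 1, 1, 0]]
# Row i of P has its single 1 in column PERM[i] (0-indexed).
PERM = [0, 2, 8, 5, 4, 1, 3, 11, 7, 9, 10, 6]

def vec_mat(v, M):
    """Row vector times matrix over F_2."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) % 2
            for j in range(len(M[0]))]

def apply_P(x):
    """Compute x P: entry i of x moves to position PERM[i]."""
    y = [0] * len(x)
    for i, xi in enumerate(x):
        y[PERM[i]] = xi
    return y

def apply_P_inv(c):
    """Compute c P^{-1} = c P^T."""
    return [c[PERM[j]] for j in range(len(c))]

def decode(word):
    """Brute-force nearest codeword of G; returns the message m' with m'G closest."""
    return list(min(product([0, 1], repeat=4),
                    key=lambda m: sum(a != b for a, b in zip(vec_mat(list(m), G), word))))

G_P = [apply_P(vec_mat(row, G)) for row in S]          # public key S G P

def encrypt(m, e):
    return [(a + b) % 2 for a, b in zip(vec_mat(m, G_P), e)]

def decrypt(c):
    return vec_mat(decode(apply_P_inv(c)), S_INV)

e = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
c = encrypt([1, 0, 1, 0], e)
recovered = decrypt(c)
```

Running the script reproduces $G_{p}$, the ciphertext, and the recovered message of the example. The code has minimum distance 5, so the brute-force search finds a unique nearest codeword whenever at most two errors are present.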

2.3.2. Homomorphism

McEliece PKE has the following additive homomorphism [32]:

E (x) + E (y) = E (x \oplus y)

(7)

However, note the following:

\begin{aligned} E (x) + E (y) & = (x S G P + e_{1}) + (y S G P + e_{2}) \\ & = (x + y) S G P + (e_{1} + e_{2}) = E (x \oplus y) \end{aligned}

We can see that the error vector $e_{1} + e_{2}$ of $E (x \oplus y)$ can have up to twice the weight of a single error vector. Therefore, when performing the addition of ciphertexts, $e_{1}$ and $e_{2}$ should be chosen such that each has a weight of at most half of the maximum error-correcting capability of the code. In general, if n ciphertexts are summed, the error vector of each summand must have a weight of at most $1 / n$ of the maximum error-correcting capability of the code.
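The effect of the weight budget can be checked on the toy code of Section 2.3.1. The sketch below adds two ciphertexts whose error vectors have weight 1 each, so that the combined error has weight 2 = t. It uses the public matrix $G_{p}$ from the example and, as a simplifying assumption, decodes by brute force directly against $G_{p}$ instead of using the secret key.

```python
from itertools import product

# Public key G_p = S G P from the example in Section 2.3.1 (t = 2).
G_P = [[1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1],
       [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0],
       [0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1],
       [0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1]]

def encode(m):
    """Message times G_p over F_2."""
    return [sum(m[i] * G_P[i][j] for i in range(4)) % 2 for j in range(12)]

def xor(a, b):
    return [(x + y) % 2 for x, y in zip(a, b)]

def unit(pos):
    """Weight-1 error vector with a single 1 at position pos."""
    return [1 if j == pos else 0 for j in range(12)]

def brute_force_decrypt(c):
    """Stand-in for the secret-key decoder: nearest codeword of G_p."""
    return list(min(product([0, 1], repeat=4),
                    key=lambda m: sum(a != b for a, b in zip(encode(list(m)), c))))

x, y = [1, 0, 1, 0], [0, 1, 1, 0]
c_x = xor(encode(x), unit(0))      # E(x) with error weight 1
c_y = xor(encode(y), unit(5))      # E(y) with error weight 1
c_sum = xor(c_x, c_y)              # E(x) + E(y), error weight 2
decrypted = brute_force_decrypt(c_sum)
```

Here `decrypted` equals $x \oplus y = (1, 1, 0, 0)$. If each summand carried the full weight t = 2, the combined error could reach weight 4, which exceeds what a distance-5 code can correct.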

A more efficient dual variant of the McEliece cryptosystem is due to Niederreiter [7,8]. Consider the $(n, k)$-linear binary Goppa code G. The Niederreiter cryptosystem is defined as shown in Algorithms 7–9.

Algorithm 7 Niederreiter.KeyGen(): key generation

1: Generate the $(n - k) \times n$ parity check matrix $H$ for G
2: Select a random binary non-singular matrix $S \sim F_{2}^{(n - k) \times (n - k)}$
3: Select an $n \times n$ permutation matrix $P$
4: Compute $H_{p} := S H P$
5: return $pk := H_{p}$, $sk := (S, P)$

Algorithm 8 Niederreiter.Enc($pk = H_{p}$, $m \in F_{2}^{n}$): encryption

1: Encode the message m such that $m \in F_{2}^{n}$ and $W_{H} (m) = t$
2: Compute $c = H_{p} m$
3: return $c$

Algorithm 9 Niederreiter.Dec($sk = (S, P)$, $c$): decryption

1: Compute $c^{'} = S^{- 1} c$
2: Decode $m^{'} = \mathrm{SyndromeDecode} (c^{'})$
3: Compute $m = P^{- 1} m^{'}$
4: return $m$

3. Related Works

Exploring the homomorphic properties of McEliece PKE was first studied in [32]. However, no attention has been given to leveraging these properties for constructing PSI protocols. Similarly, the homomorphic properties of the MLWE lattice, upon which Kyber is built, were investigated in [31]. That study, however, was not specifically dedicated to Kyber itself but to the MLWE lattice structures in general.

The problem of constructing post-quantum-based PSI protocols has been studied extensively in the literature. Most of the proposed protocols are based on lattice-based structures; specifically RLWE and LWE problems. In [33], the authors provided a comprehensive literature review on the PSI problem.

In [34], the authors proposed a lattice-based size hiding protocol for two-party PSI-CA secure in a semi-honest environment. The security is based on the hardness of the decisional Learning With Errors (DLWE) problem with linear complexity in the size of the inputs. To prevent arbitrary inputs, they proposed a protocol where a trusted third party authorizes the client’s input set.

Later in [35], the authors generalized the protocol to the multiparty PSI problem, secure in the semi-honest model with complexity $O (n k v_{max})$, where $v_{max}$ is the maximum set size and k and n are security parameters. The parties are arranged in a star topology so that all parties need not be online at the same time.

The security of the previous protocols is based on DLWE; however, other designs exist with security based on Ring-LWE. In [36], the authors proposed protocols based on the fully homomorphic encryption scheme that was proposed in [37], considering the semi-honest model. Also, the authors developed an extension protocol secure under the malicious model by outsourcing computing to a cloud. Another lattice-based protocol based on the NTRU fully homomorphic encryption scheme is proposed in [10].

While prior works have explored PSI protocols based on lattice assumptions such as LWE, RLWE, and NTRU, no existing study has designed PSI protocols built upon the PKEs of Kyber or McEliece. This leaves a clear gap in the literature, which we aim to address in this work. Specifically, we propose and evaluate two new two-party PSI protocols founded on Kyber and McEliece PKEs to examine and assess their homomorphic properties in practice.

4. Proposed PSI Protocols

In this section, the proposed post-quantum PSI protocols are presented.

4.1. Kyber Based Protocol

Based on the algorithm described in Section 2.2, we present the Kyber-based PSI protocol below. Let $P_{1}, P_{2}$ be two semi-honest parties holding the sets $G_{1} = \{g_{11}, g_{12}, \dots, g_{1 v}\}$ and $G_{2} = \{g_{21}, g_{22}, \dots, g_{2 v}\}$, respectively, such that $G_{1}, G_{2} \subseteq U = \{g_{1}, g_{2}, \dots, g_{w}\}$, where $g_{1} < g_{2} < \dots < g_{w}$. We can define a vector representation of $P_{i}$’s set as $G_{i}^{'} = \{g_{i 1}^{'}, g_{i 2}^{'}, \dots, g_{i w}^{'}\}$ such that

g_{i j}^{'} = \begin{cases} p (x), & g_{j} \in G_{i} \\ 0, & g_{j} \notin G_{i} \end{cases} \quad (i = 1, 2, \; j = 1, 2, \dots, w) .

(8)

where $p (x)$ is a non-zero binary polynomial in R.

Taking the product of $E (G_{1}^{'})$ and $E (G_{2}^{'})$ and using (5), we have

\begin{aligned} E (G_{1}^{'}) \cdot E (G_{2}^{'}) & = \{E (g_{11}^{'}) \cdot E (g_{21}^{'}), E (g_{12}^{'}) \cdot E (g_{22}^{'}), \dots, E (g_{1 w}^{'}) \cdot E (g_{2 w}^{'})\} \\ & = \{E (g_{11}^{'} g_{21}^{'}), E (g_{12}^{'} g_{22}^{'}), \dots, E (g_{1 w}^{'} g_{2 w}^{'})\} \end{aligned}

So, after decryption, we obtain

g_{1 j}^{'} g_{2 j}^{'} = \begin{cases} p {(x)}^{2}, & g_{j} \in G_{1} \cap G_{2} \\ 0, & g_{j} \notin G_{1} \cap G_{2} \end{cases} \quad (j = 1, 2, \dots, w) .

(9)

Therefore, Algorithm 10 describes a Kyber-based private set intersection protocol:

Algorithm 10 Kyber-based Private Set Intersection

Input: $G_{1}, G_{2}$

Output: $G_{1} \cap G_{2}$

1: $P_{1} ⟶ P_{2} : E (G_{1}^{'})$.
2: $P_{2}$: For each coordinate $j \in \{1, \dots, w\}$, draw an independent zero-ciphertext $Z_{j} \leftarrow E (0)$ and set $\tilde{E} (g_{1 j}^{'}) \leftarrow E (g_{1 j}^{'}) + Z_{j}$.
3: $P_{2}$: Compute the coordinate-wise product
$\tilde{E} (G_{1}^{'}) \cdot E (G_{2}^{'}) = \{\tilde{E} (g_{11}^{'}) \cdot E (g_{21}^{'}), \dots, \tilde{E} (g_{1 w}^{'}) \cdot E (g_{2 w}^{'})\},$
where each product is returned in the form $(d_{1 j}, d_{2 j}, d_{3 j})$, as defined in (5).
4: $P_{2} ⟶ P_{1} : {\{(d_{1 j}, d_{2 j}, d_{3 j})\}}_{j = 1}^{w}$.
5: $P_{1}$: For each index j, compute ${\tilde{m}}_{j} \leftarrow ⌈ \frac{d_{1 j} - d_{2 j} \cdot s + d_{3 j} s \cdot s}{{⌈ \frac{q}{2} ⌋}^{2}} ⌋$.
6: $P_{1}$: For $j = 1, \dots, w$, if ${\tilde{m}}_{j} = 0$ then $g_{j} \notin G_{1} \cap G_{2}$, else $g_{j} \in G_{1} \cap G_{2}$.
7: $P_{1}$: Output $G_{1} \cap G_{2}$.
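At the plaintext level, the encoding (8) and the coordinate-wise product (9) that Algorithm 10 evaluates under encryption can be sketched as follows. The sketch performs no encryption at all: it fixes $p (x) = 1$ as an assumption and only shows why a non-zero product at coordinate j marks $g_{j}$ as a common element. The function names are hypothetical.

```python
def encode_set(elements, universe):
    """Vector representation (8) with p(x) = 1: one entry per universe element."""
    return [1 if g in elements else 0 for g in universe]

def psi_plaintext(set_1, set_2, universe):
    """Plaintext skeleton of Algorithm 10 (no encryption, no re-randomization)."""
    v1 = encode_set(set_1, universe)
    v2 = encode_set(set_2, universe)
    # In the protocol, P2 computes these products on ciphertexts (Step 3)
    # and P1 recovers them by decryption (Step 5).
    products = [a * b for a, b in zip(v1, v2)]
    # Step 6: a non-zero product means the element is in both sets.
    return {g for g, m in zip(universe, products) if m != 0}

universe = list(range(1, 9))                       # U = {1, ..., 8}, sorted
common = psi_plaintext({1, 3, 5, 8}, {3, 4, 8}, universe)
```

The vectors have one entry per element of the universe U, so the number of ciphertexts exchanged in the real protocol grows with w rather than with the sizes of the input sets.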

The correctness of this protocol greatly depends on the noise of the underlying encryption scheme. In particular, from the expansion of the product in Section 2.2.2, the protocol is correct when every product in STEP 3 has the following property:

∥ \frac{w_{x} w_{y}}{{⌈ \frac{q}{2} ⌋}^{2}} + \frac{x w_{y} + y w_{x}}{⌈ \frac{q}{2} ⌋} ∥_{\infty} < \frac{1}{2}

That is, when

\frac{{∥w_{x}∥}_{\infty}}{⌈ \frac{q}{2} ⌋} \cdot \frac{{∥w_{y}∥}_{\infty}}{⌈ \frac{q}{2} ⌋} + \frac{{∥w_{x}∥}_{\infty}}{⌈ \frac{q}{2} ⌋} + \frac{{∥w_{y}∥}_{\infty}}{⌈ \frac{q}{2} ⌋} < \frac{1}{2}

Because of the re-randomization in STEP 2, we differentiate between the distributions of $w_{x}$ and $w_{y}$.

Working coefficient-wise, let

p_{r} (t_{x})

be an upper bound on

Pr (|w_{x j}| \geq t_{x})

and

p (t_{y})

be an upper bound on

Pr (|w_{y j}| \geq t_{y})

. That is an upper bound on the probability that the j-th coefficient of the polynomials

w_{x}

and

w_{y}

is greater than

t_{x}

and

t_{y}

, respectively. Then, by union bound, we have

Pr ({∥w_{x}∥}_{\infty} \geq t_{x}) \leq n p_{r} (t_{x})

and

Pr ({∥w_{y}∥}_{\infty} \geq t_{y}) \leq n p (t_{y})

. Therefore, if we choose

t_{x}

and

t_{y}

such that

\frac{t_{x}}{⌈ \frac{q}{2} ⌋} * \frac{t_{y}}{⌈ \frac{q}{2} ⌋} + \frac{t_{x}}{⌈ \frac{q}{2} ⌋} + \frac{t_{y}}{⌈ \frac{q}{2} ⌋} \leq \frac{1}{2}

, we can have an upper bound on the failure rate of one product in STEP 3. That is

Pr ({∥w_{x}∥}_{\infty} \geq t_{x} or {∥w_{y}∥}_{\infty} \geq t_{y}) \leq n (p_{r} (t_{x}) + p (t_{y}))

. Taking a union bound over all products in STEP 3, we have an upper bound on the total failure rate

δ

of the protocol

δ \leq n w (p_{r} (t_{x}) + p (t_{y}))

(10)
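The constraint on $t_{x}$ and $t_{y}$ can be explored numerically. The sketch below assumes Kyber's modulus $q = 3329$ (so $\lceil q/2 \rfloor = 1665$), solves the symmetric case $t_{x} = t_{y}$ in closed form, and computes the largest admissible $t_{y}$ for a given $t_{x}$. All function names are illustrative, and the tail probabilities passed to `total_failure_bound` are placeholders rather than values from the paper.

```python
import math

HALF_Q = 1665  # assumed value of q/2 rounded, for Kyber's q = 3329

def constraint_lhs(t_x, t_y, h=HALF_Q):
    """Left-hand side of the threshold constraint; admissible when <= 1/2."""
    a, b = t_x / h, t_y / h
    return a * b + a + b

def symmetric_threshold(h=HALF_Q):
    """Solve a*a + 2*a = 1/2 for a = t/h, giving a = (sqrt(6) - 2) / 2."""
    return (math.sqrt(6) - 2) / 2 * h

def largest_t_y(t_x, h=HALF_Q):
    """Largest t_y that keeps the constraint tight for a given t_x."""
    a = t_x / h
    return (0.5 - a) / (1 + a) * h

def total_failure_bound(n, w, p_r, p):
    """Union bound (10) over n coefficients and w products."""
    return n * w * (p_r + p)

t = symmetric_threshold()
print(t, constraint_lhs(t, t))                   # about 374.2, and 0.5
print(largest_t_y(500.0))                        # a larger t_x forces a smaller t_y
print(total_failure_bound(256, 10, 1e-9, 1e-9))  # placeholder probabilities
```

Raising $t_{x}$ lowers the admissible $t_{y}$, so the two tail probabilities trade off against each other.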

One choice of $t_{x}$ and $t_{y}$ is $t_{x} = t_{y} = \frac{\sqrt{6} - 2}{2} \lceil \frac{q}{2} \rfloor$. However, since $w_{x}$ is noisier due to re-randomization, we can obtain a better estimate by choosing

$$t_{y} = \frac{\frac{1}{2} - \frac{t_{x}}{\lceil \frac{q}{2} \rfloor}}{1 + \frac{t_{x}}{\lceil \frac{q}{2} \rfloor}} \left\lceil \frac{q}{2} \right\rfloor$$

and $t_{x}$ such that $n w (p_{r} (t_{x}) + p (t_{y}))$ is minimized. Using a Chernoff bound, we obtain the following result:

Theorem 1.

Algorithm 10 is $(1 - δ)$-correct, where $δ$ satisfies

$$δ \leq 2 n w (p_{r} (t_{x}) + p (t_{y}))$$

such that

$$p (t_{y}) = \min_{α \geq 0} \exp \left( Ψ (α) - α \max \left\{ t_{y} - \frac{α q}{2^{d_{v} + 1}}, 0 \right\} \right),$$

$$p_{r} (t_{x}) = \min_{α \geq 0} \exp \left( 2 Ψ (α) - α \max \left\{ t_{x} - \frac{α q}{2^{d_{v}}}, 0 \right\} \right),$$

where

$$Ψ (α) := k n (\ln A_{r} (α) + \ln A_{s} (α)) + η \ln \left( \frac{1 + \cosh α}{2} \right),$$

$$A_{r} (α) = \frac{\binom{2 η}{η}}{4^{η}} + \frac{1}{2^{3 η - 1}} \sum_{j = 1}^{η} \binom{2 η}{η - j} (1 + \cosh (j α))^{η} \frac{\sinh \left( j \frac{α q}{2^{d_{t} + 1}} \right)}{j \frac{α q}{2^{d_{t} + 1}}},$$

$$A_{s} (α) = \frac{\binom{2 η}{η}}{4^{η}} + \frac{1}{2^{3 η - 1}} \sum_{j = 1}^{η} \binom{2 η}{η - j} (1 + \cosh (j α))^{η} \frac{\sinh \left( j \frac{α q}{2^{d_{u} + 1}} \right)}{j \frac{α q}{2^{d_{u} + 1}}}.$$

Proof.

Let $Δ_{d} := q / 2^{d}$. We adopt a uniform quantization model, that is,

$$δ_{t j} \sim \mathrm{Unif} \left[ - \frac{Δ_{d_{t}}}{2}, \frac{Δ_{d_{t}}}{2} \right], \quad δ_{u j} \sim \mathrm{Unif} \left[ - \frac{Δ_{d_{u}}}{2}, \frac{Δ_{d_{u}}}{2} \right], \quad δ_{v j} \sim \mathrm{Unif} \left[ - \frac{Δ_{d_{v}}}{2}, \frac{Δ_{d_{v}}}{2} \right].$$

We focus on $p (t)$ only, since $p_{r} (t)$ can be derived in a similar manner by taking the additional noise into account. The $j$-th coefficient is

$$w_{y j} = (e^{⊤} r)_{j} + (δ_{t}^{⊤} r)_{j} - (s^{⊤} e_{1})_{j} - (s^{⊤} δ_{u})_{j} + e_{2 j} + δ_{v j} .$$

Each of these terms is a sum of $k n$ products involving centered binomial and uniform random variables, except for $e_{2 j}$, which is a single centered binomial random variable, and $δ_{v j}$, which is modeled by its upper bound $a_{v} := \frac{Δ_{d_{v}}}{2}$. We then define the random part:

$$G_{j} := w_{y j} - δ_{v j}$$

We will compute the moment generating function (MGF) of $G_{j}$ and apply the Chernoff bound. Note that if $X \sim β_{η}$, then $E [e^{θ X}] = \left( \frac{1 + \cosh θ}{2} \right)^{η}$. Also, when $U \sim \mathrm{Unif} [- Δ / 2, Δ / 2]$, then for any constant $c$ we have

$$E [e^{α c U}] = \frac{\sinh (α c Δ / 2)}{α c Δ / 2} =: \mathrm{sinc}_{h} \left( \frac{α c Δ}{2} \right), \quad \mathrm{sinc}_{h} (x) := \frac{\sinh x}{x} .$$

Conditioning on $r \sim β_{η}$ and using independence:

$$E [e^{α r e} \mid r] = \left( \frac{1 + \cosh (α r)}{2} \right)^{η}, \quad E [e^{α r δ_{t}} \mid r] = \mathrm{sinc}_{h} \left( \frac{α r Δ_{d_{t}}}{2} \right)$$

Then the MGF of each product in $(e^{⊤} r)_{j}$ and $(δ_{t}^{⊤} r)_{j}$ is

$$A_{r} (α) := E_{r \sim β_{η}} \left[ \left( \frac{1 + \cosh (α r)}{2} \right)^{η} \mathrm{sinc}_{h} \left( \frac{α r Δ_{d_{t}}}{2} \right) \right] .$$

Using the identity

$$E_{X \sim β_{m}} [e^{θ X} \mathrm{sinc}_{h} (γ X)] = \frac{1}{2 γ} \int_{- γ}^{γ} \cosh^{2 m} \left( \frac{θ + s}{2} \right) d s$$

and

$$\cosh^{2 m} u = \frac{1}{2^{2 m}} \left( \binom{2 m}{m} + 2 \sum_{r = 1}^{m} \binom{2 m}{m - r} \cosh (2 r u) \right),$$

together with the binomial expansion of $\left( \frac{1 + \cosh (α r)}{2} \right)^{η}$, a straightforward telescoping yields $A_{r} (α)$.

The same argument gives $A_{s} (α)$ for $(s^{⊤} e_{1})_{j}$ and $(s^{⊤} δ_{u})_{j}$.

So,

$$E [e^{α G_{j}}] = (A_{r} (α))^{k n} (A_{s} (α))^{k n} \left( \frac{1 + \cosh α}{2} \right)^{η} = \exp (Ψ (α))$$

and

$$Ψ (α) = k n (\ln A_{r} (α) + \ln A_{s} (α)) + η \ln \left( \frac{1 + \cosh α}{2} \right) .$$

From here, it is straightforward to apply the Chernoff bound, and the proof is complete. □
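The two MGF facts used in the proof can be checked numerically. The sketch below is only a sanity check under stated assumptions: it models $β_{η}$ as the difference of two Binomial($η$, 1/2) variables, and it evaluates a Chernoff bound for a single centered binomial variable over a grid of $α$ values rather than for the full noise term $G_{j}$. All function names are illustrative.

```python
import math

def cbd_pmf(eta):
    """pmf of the centered binomial beta_eta: P(X = k) = C(2*eta, eta + k) / 4**eta."""
    return {k: math.comb(2 * eta, eta + k) / 4**eta for k in range(-eta, eta + 1)}

def cbd_mgf_enumerated(theta, eta):
    """E[exp(theta * X)] computed by summing over the pmf."""
    return sum(p * math.exp(theta * k) for k, p in cbd_pmf(eta).items())

def cbd_mgf_closed(theta, eta):
    """Closed form ((1 + cosh(theta)) / 2) ** eta quoted in the proof."""
    return ((1 + math.cosh(theta)) / 2) ** eta

def sinch(x):
    """sinh(x) / x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0 else math.sinh(x) / x

def uniform_mgf_numeric(alpha, c, delta, steps=20000):
    """Midpoint-rule estimate of E[exp(alpha * c * U)], U ~ Unif[-delta/2, delta/2]."""
    width = delta / steps
    total = sum(math.exp(alpha * c * (-delta / 2 + (i + 0.5) * width))
                for i in range(steps))
    return total * width / delta

def chernoff_tail(t, eta, alphas):
    """Chernoff bound on P(X >= t) for one beta_eta variable, minimized over a grid."""
    return min(math.exp(eta * math.log((1 + math.cosh(a)) / 2) - a * t)
               for a in alphas)

print(cbd_mgf_enumerated(0.3, 2), cbd_mgf_closed(0.3, 2))
print(uniform_mgf_numeric(0.1, 3, 4.0), sinch(0.1 * 3 * 4.0 / 2))
print(chernoff_tail(1, 2, [i / 100 for i in range(1, 500)]))
```

For $η = 2$ the exact tail is $\Pr(X \geq 1) = 5/16$, and the grid-minimized Chernoff value stays above it, as expected of an upper bound.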

Formal Security Proof

Theorem 2.

The Kyber-based Protocol (Algorithm 10) implements a secure two-party PSI in the semi-honest model, assuming the underlying Kyber PKE is IND-CPA secure.

Proof.

We prove security by constructing simulators for both possible corruption cases and demonstrating computational indistinguishability between the real and ideal executions.

Following the semi-honest security model, we require that for any PPT adversary $A$ corrupting at most one party, there exists a PPT simulator $S$ such that

$$\{ \mathrm{IDEAL}_{S, A} (G_{1}, G_{2}, κ) \}_{κ \in \mathbb{N}} \overset{c}{\equiv} \{ \mathrm{REAL}_{Π, A} (G_{1}, G_{2}, κ) \}_{κ \in \mathbb{N}} \qquad (11)$$

Case 1: $P_{1}$ is corrupted. When $P_{1}$ is corrupted, the adversary $A_{1}$’s view consists of (i) its input $G_{1}$, (ii) the message received from $P_{2}$: $\{ (d_{1 j}, d_{2 j}, d_{3 j}) \}_{j = 1}^{w}$, and (iii) the protocol output $G_{1} \cap G_{2}$.

We construct simulator $S_{1}$ that, given $G_{1}$, $G_{1} \cap G_{2}$, and security parameter $κ$, operates as follows:

Generate Kyber key pair $(p k, s k) \leftarrow KeyGen (1^{κ})$
For $j = 1, \dots, w$ :
- If $g_{j} \in G_{1} \cap G_{2}$ : Generate $(d_{1 j}, d_{2 j}, d_{3 j})$ such that the decryption formula $m_{j} \leftarrow ⌈ \frac{d_{1 j} - d_{2 j} \cdot s + d_{3 j} s \cdot s}{{⌈ \frac{q}{2} ⌋}^{2}} ⌋$ yields a non-zero polynomial
- Otherwise: Generate $(d_{1 j}, d_{2 j}, d_{3 j})$ such that $m_{j} = 0$
Output $(G_{1}, {(d_{1 j}, d_{2 j}, d_{3 j})}_{j = 1}^{w}, G_{1} \cap G_{2})$

The indistinguishability follows from the zero-ciphertext blinding performed by $P_{2}$ in Step 2, where $\tilde{E} (g_{1 j}) \leftarrow E (g_{1 j}) + Z_{j}$ with $Z_{j} \leftarrow E (0)$. This randomization ensures that the homomorphic products reveal only intersection membership information. To extract anything more about $P_{2}$’s element, the adversary would need to identify $(u_{y}, v_{y})$ using the triple $(d_{1 j}, d_{2 j}, d_{3 j})$, its own ciphertext, and the secret key. However, the triple equations are invariant under the map $(u_{x}^{'}, v_{x}^{'}, u_{y}, v_{y}) \mapsto (α u_{x}^{'}, α v_{x}^{'}, α^{- 1} u_{y}, α^{- 1} v_{y})$ for any non-zero rational $α$. Let ${\hat{u}}_{x}^{'}$ be a particular solution found from $d_{3}$ (note that $d_{3}$ is independent of the secret key). Then, by randomization, among the set $L = \{ α {\hat{u}}_{x}^{'} - u_{x} \mid α \in \mathbb{Q} \}$, exactly one $α$ satisfies $α {\hat{u}}_{x}^{'} - u_{x} = A^{⊤} r + e$ for some small $r$ and $e$. If $P_{1}$ can find $P_{2}$’s element, then $P_{1}$ can find $α$, thus solving the following problem: given $L$, find $α$ such that $α {\hat{u}}_{x}^{'} - u_{x} = A^{⊤} r + e$ for small $r$ and $e$, which is hard. Thus, under the IND-CPA assumption of Kyber, no PPT adversary can distinguish between real and simulated products.

Case 2: $P_{2}$ is corrupted. When $P_{2}$ is corrupted, the adversary $A_{2}$’s view consists of (i) its input $G_{2}$ and (ii) the message received from $P_{1}$: $E (G_{1}) = \{ E (g_{11}), \dots, E (g_{1 w}) \}$. Note that $P_{2}$ does not receive the final output.

We construct simulator $S_{2}$ that, given $G_{2}$ and security parameter $κ$, operates as follows:

Generate Kyber key pair $(p k, s k) \leftarrow KeyGen (1^{κ})$
For $j = 1, \dots, w$ : Sample $r_{j}$ uniformly at random from the message space and compute $E (g_{1 j}^{'}) \leftarrow Encrypt (p k, r_{j})$
Output $(G_{2}, {E (g_{1 j}^{'})}_{j = 1}^{w})$

By the IND-CPA security of Kyber PKE, encryptions of actual set elements are computationally indistinguishable from encryptions of random elements. Since $P_{2}$ receives no intersection information, the simulation is perfect. □

4.2. McEliece-Based Protocol

Since McEliece PKE supports only the additive homomorphism, a direct algorithm as in Algorithm 10 is not suitable. Strictly speaking, the direct algorithm can be used, but if a peer acts maliciously by pretending that its set includes all the elements in the domain [38], privacy is no longer guaranteed. Hence, a data structure, e.g., a Bloom filter, is needed to design a secure PSI protocol. To perform multiple AND operations, the Sander-Young-Yung (SYY) technique [20] is utilized.

We first define the function $\mathrm{Expand} (E (x))$, where a bit $x$ encrypted to $E (x)$ is expanded to a vector ciphertext of length $l$ as follows:

1. For each $i$, draw a sample $r_{i}$ uniformly from $\{0, 1\}$.

2. Each element of the vector ciphertext $\mathrm{Expand} (E (x)) = (E (e_{1}), E (e_{2}), \dots, E (e_{l}))$ is set to

$$E (e_{i}) = \begin{cases} E (x) + E (1) = E (x \oplus 1) & \text{if } r_{i} = 0 \\ E (0) & \text{if } r_{i} = 1 \end{cases} \qquad (12)$$

When the vector ciphertext $\mathrm{Expand} (E (x))$ is decrypted, the result will be $(e_{1}, e_{2}, \dots, e_{l})$. If $x = 1$, then $x \oplus 1 = 0$, so $e_{i} = 0$ for all $i$. Otherwise, each $e_{i}$ will be uniformly distributed in $\{0, 1\}$. Now, we define the sum of two vector ciphertexts as follows:

$$\begin{aligned} \mathrm{Expand} (E (x)) + \mathrm{Expand} (E (y)) &= (E (e_{1}) + E (f_{1}), \dots, E (e_{l}) + E (f_{l})) \\ &= (E (e_{1} \oplus f_{1}), \dots, E (e_{l} \oplus f_{l})) . \end{aligned}$$

Observe that if both $x$ and $y$ are 1, then all $e_{i} \oplus f_{i}$ are 0. However, if at least one of them is 0, then each $e_{i} \oplus f_{i}$ will be uniformly distributed in $\{0, 1\}$. So, we have

$$\mathrm{Expand} (E (x)) + \mathrm{Expand} (E (y)) = \mathrm{Expand} (E (x \land y))$$
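The AND-by-addition behavior can be illustrated on plaintext bits. The sketch below is a simplification: it skips encryption entirely and assumes that adding two ciphertexts decrypts to the XOR of their plaintexts, so each function works on the decrypted values $e_{i}$ directly. The names `expand`, `add_vectors`, and `is_all_zero` are illustrative.

```python
import secrets

def expand(x, length):
    """Plaintext view of Expand(E(x)): e_i = x XOR 1 if r_i = 0, else 0."""
    out = []
    for _ in range(length):
        r = secrets.randbits(1)          # r_i drawn uniformly from {0, 1}
        out.append((x ^ 1) if r == 0 else 0)
    return out

def add_vectors(u, v):
    """Coordinate-wise sum of two expanded vectors (XOR on plaintext bits)."""
    return [a ^ b for a, b in zip(u, v)]

def is_all_zero(vec):
    """An all-zero vector encodes the bit 1; anything else encodes the bit 0."""
    return not any(vec)

length = 128
for x in (0, 1):
    for y in (0, 1):
        total = add_vectors(expand(x, length), expand(y, length))
        print(x, y, is_all_zero(total))
```

With $l = 128$, the sum is all-zero for $x = y = 1$ and, except with probability $2^{-128}$, non-zero in the other three cases, which matches $\mathrm{Expand}(E(x \land y))$.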

A Bloom filter is a probabilistic data structure used to test for inclusion [39]. A Bloom filter can be represented as a vector of bits $B = (b_{1}, \dots, b_{m})$ of size $m$ and $λ$ associated hash functions $h_{i} : \{0, 1\}^{l} \to \{1, \dots, m\}$. Initially, the Bloom filter’s bits are all set to zero. Then we define the following two functions:

$\mathrm{Add} (x)$ : Return $B$ with $b_{h_{i} (x)} = 1$ for all $1 \leq i \leq λ$ .
$\mathrm{Test} (x)$ : Return $⋀_{i = 1}^{λ} b_{h_{i} (x)}$ .

In a Bloom filter, the function $\mathrm{Add} (x)$ adds the element $x$ by setting the locations in $B$ with indices $h_{i} (x)$ to 1 for $i = 1, \dots, λ$. The function $\mathrm{Test} (x)$ tests whether an element $x$ has been added by checking whether the locations with indices calculated from the hash functions are all set to 1. When $\mathrm{Test} (x)$ returns 0, $x$ has definitely not been added to the Bloom filter; on the other hand, if $\mathrm{Test} (x)$ returns 1, $x$ is probably in the Bloom filter. Therefore, false negatives are impossible, while false positives are possible. However, given that the probability of false positives ($ϵ$) is equal to $2^{- λ}$, increasing the number of hash functions ($λ$) can lead to a negligible value of $ϵ$. Accordingly, the optimal size of the Bloom filter can be set to $m = \frac{n λ}{\ln 2}$, where $n$ is the number of elements added.
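A Bloom filter with the $\mathrm{Add}$ and $\mathrm{Test}$ operations above can be sketched in a few lines. The hash family used here (SHA-256 salted with the index $i$, reduced modulo $m$) is an assumption made for illustration; the text does not specify which hash functions are used. Positions are 0-based in the code, and the class name `BloomFilter` is illustrative.

```python
import hashlib
import math

class BloomFilter:
    def __init__(self, capacity, num_hashes):
        """capacity is n (expected elements); num_hashes is lambda."""
        self.num_hashes = num_hashes
        # Size m = n * lambda / ln 2, rounded up to an integer.
        self.size = math.ceil(capacity * num_hashes / math.log(2))
        self.bits = [0] * self.size

    def _positions(self, item: bytes):
        """Yield h_1(x), ..., h_lambda(x) using salted SHA-256 (assumed hash family)."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest, "big") % self.size

    def add(self, item: bytes):
        """Add(x): set every position h_i(x) to 1."""
        for pos in self._positions(item):
            self.bits[pos] = 1

    def test(self, item: bytes) -> bool:
        """Test(x): logical AND of the bits at all positions h_i(x)."""
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter(capacity=100, num_hashes=10)
bf.add(b"file-0001")
print(bf.size, bf.test(b"file-0001"), bf.test(b"file-9999"))
```

An element that was added always tests positive; an element that was not added tests negative unless all of its positions happen to collide with set bits.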

Therefore, we can, based on [40], perform a private intersection protocol as shown in Algorithm 11:

Algorithm 11 Private Set Intersection Protocol

1:: $P_{1}$ : For $i = 1, \dots, v$ : $\mathrm{Add} (g_{1 i})$
2:: $P_{1} ⟶ P_{2} : G_{p}, (E (b_{1}), \dots, E (b_{m}))$
3:: $P_{2}$ : For $i = 1, \dots, v$ : $E (w_{i}^{'}) = E (g_{2 i}) + \sum_{j = 1}^{λ} \mathrm{Expand} (E (b_{h_{j} (g_{2 i})}))$
4:: $P_{2} ⟶ P_{1} : E (w_{1}^{'}), \dots, E (w_{m}^{'})$
5:: $P_{1}$ : $D (E (w_{1}^{'})), \dots, D (E (w_{m}^{'}))$
6:: $P_{1} : \{g_{11}, \dots, g_{1 n}\} \cap \{w_{1}^{'}, \dots, w_{m}^{'}\}$

In STEP 3, we have the following:

$$E (w_{i}^{'}) = E (g_{2 i}) + \sum_{j = 1}^{λ} \mathrm{Expand} (E (b_{h_{j} (g_{2 i})})) = E (g_{2 i}) + \mathrm{Expand} \left( E \left( ⋀_{j = 1}^{λ} b_{h_{j} (g_{2 i})} \right) \right)$$

This computation results in

$$w_{i}^{'} = (g_{2 i, 1} \oplus e_{1}, \dots, g_{2 i, l} \oplus e_{l}) .$$

If $g_{2 i}$ is in the Bloom filter, then the bits $b_{h_{j} (g_{2 i})}$ will all be 1, so every $e_{j} = 0$. That is, $g_{2 i, j} \oplus e_{j} = g_{2 i, j}$. However, if $g_{2 i}$ is not in the Bloom filter, then the $e_{j}$’s will be random bits, and so $w_{i}^{'}$ will be a random value.
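The data flow of Algorithm 11 can be traced without any encryption. The following sketch is a plaintext simulation only: it assumes elements are 256-bit integers, uses salted SHA-256 as a stand-in hash family, and replaces each ciphertext by its plaintext, so it illustrates correctness of the masking but none of the security properties. The names `positions`, `expand`, and `psi_plaintext` are illustrative.

```python
import hashlib
import math
import secrets

ELEMENT_BITS = 256  # assumed element length l (32-byte identifiers)

def positions(item, num_hashes, size):
    """Bloom filter positions h_1(x), ..., h_lambda(x) (assumed hash family)."""
    data = item.to_bytes(ELEMENT_BITS // 8, "big")
    return [int.from_bytes(hashlib.sha256(j.to_bytes(2, "big") + data).digest(),
                           "big") % size
            for j in range(num_hashes)]

def expand(bit, length=ELEMENT_BITS):
    """Plaintext of Expand: all zeros if bit == 1, else `length` uniform bits."""
    return 0 if bit == 1 else secrets.randbits(length)

def psi_plaintext(set1, set2, num_hashes=20):
    size = max(1, math.ceil(len(set1) * num_hashes / math.log(2)))
    bloom = [0] * size
    for g in set1:                                   # STEP 1: P1 fills the filter
        for pos in positions(g, num_hashes, size):
            bloom[pos] = 1
    masked = []
    for g in set2:                                   # STEP 3: P2 masks each element
        mask = 0
        for pos in positions(g, num_hashes, size):
            mask ^= expand(bloom[pos])               # sum of expansions is an XOR
        masked.append(g ^ mask)
    return set(set1) & set(masked)                   # STEPS 5-6: P1 intersects

print(psi_plaintext({1, 2, 3, 4}, {3, 4, 5, 6}))
```

Elements of $P_{2}$ that pass the filter come back unmasked, while the others come back as random 256-bit values that match an element of $P_{1}$ only with negligible probability.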

Formal Security Proof

Theorem 3.

Assuming the IND-CPA security of McEliece PKE and that the error vector weights are properly managed, the McEliece-based PSI protocol (Algorithm 11) implements a secure private set intersection in the semi-honest model.

Proof.

We prove security by demonstrating that the view of any party in the real protocol can be simulated from its input and output alone, making it indistinguishable from the ideal model. For any PPT adversary $A$ corrupting at most one party, there exists a PPT simulator $S$ such that

$$\{ \mathrm{IDEAL}_{S, A} (G_{1}, G_{2}, κ) \}_{κ \in \mathbb{N}} \overset{c}{\equiv} \{ \mathrm{REAL}_{Π, A} (G_{1}, G_{2}, κ) \}_{κ \in \mathbb{N}} \qquad (13)$$

Case 1: $P_{1}$ is corrupted.

When $P_{1}$ is corrupted, the adversary $A_{1}$’s view consists of (i) its input $G_{1}$, (ii) the message received from $P_{2}$: $\{ E (w_{1}^{'}), \dots, E (w_{m}^{'}) \}$, and (iii) the protocol output $G_{1} \cap G_{2}$.

We construct simulator $S_{1}$ as follows:

1:: Input: $G_{1}$ , $G_{1} \cap G_{2}$ , security parameter $κ$
2:: Generate McEliece key pair $(G_{p}, S) \leftarrow KeyGen (1^{κ})$
3:: for $i = 1, \dots, | G_{2} |$ do
4:: if $g_{2 i} \in G_{1} \cap G_{2}$ then
5:: Generate $E (w_{i}^{'})$ such that decryption yields an element in $G_{1}$
6:: else
7:: Generate $E (w_{i}^{'})$ such that decryption yields a random element not in $G_{1}$
8:: end if
9:: end for
10:: Output: $(G_{1}, {E (w_{1}^{'}), \dots, E (w_{m}^{'})}, G_{1} \cap G_{2})$

Indistinguishability Analysis: The key insight lies in the properties of the Bloom filter and the additive homomorphic operations:

Bloom Filter Masking: For elements $g_{2 i} \in G_{1} \cap G_{2}$ , all hash functions $h_{j} (g_{2 i})$ map to positions set to 1 in the Bloom filter, resulting in $E (w_{i}^{'}) = E (g_{2 i})$ after the expand operations cancel out.
Random Masking for Non-intersection: For elements $g_{2 i} \notin G_{1} \cap G_{2}$ , at least one hash function maps to a position set to 0 in the Bloom filter. The expand operation introduces randomness, making $w_{i}^{'}$ appear random and unrelated to $g_{2 i}$ .

Under the IND-CPA security of McEliece PKE, the encrypted values are computationally indistinguishable from encryptions of the appropriate elements determined by intersection membership.

Therefore, $\mathrm{IDEAL}_{S_{1}, A_{1}} (G_{1}, G_{2}, κ) \overset{c}{\equiv} \mathrm{REAL}_{Π, A_{1}} (G_{1}, G_{2}, κ)$.

Case 2: $P_{2}$ is corrupted.

When $P_{2}$ is corrupted, the adversary $A_{2}$’s view consists of

Its input $G_{2}$
The message received from $P_{1}$ : $G_{p}, \{ E (b_{1}), \dots, E (b_{m}) \}$ (encrypted Bloom filter)

We construct simulator $S_{2}$ as follows:

1:: Input: $G_{2}$ , security parameter $κ$
2:: Generate McEliece key pair $(G_{p}, S) \leftarrow KeyGen (1^{κ})$
3:: Initialize Bloom filter $B = (b_{1}, \dots, b_{m})$ with random bits
4:: for $j = 1, \dots, m$ do
5:: $E (b_{j}) \leftarrow Encrypt (G_{p}, b_{j})$ where $b_{j} \leftarrow {0, 1}$ uniformly at random
6:: end for
7:: Output: $(G_{2}, G_{p}, {E (b_{1}), \dots, E (b_{m})})$

Indistinguishability Analysis: The security follows from the IND-CPA security of McEliece PKE:

Ciphertext Indistinguishability: Under the syndrome decoding assumption, encryptions of the actual Bloom filter bits are computationally indistinguishable from encryptions of random bits.
No Output Leakage: Since $P_{2}$ does not receive the intersection result, the simulator need not ensure consistency with any output information.
Bloom Filter Privacy: The encrypted Bloom filter reveals no information about the underlying set $G_{1}$ beyond what can be inferred from the ciphertexts, which is negligible under the IND-CPA assumption.

Therefore, $\mathrm{IDEAL}_{S_{2}, A_{2}} (G_{1}, G_{2}, κ) \overset{c}{\equiv} \mathrm{REAL}_{Π, A_{2}} (G_{1}, G_{2}, κ)$. □

4.3. Limitations and Discussion

Kyber and McEliece PKEs are probabilistic and achieve indistinguishability under chosen plaintext attack (IND-CPA) security. However, many real-world applications require a much stronger notion of security against active attacks. Namely, the PKE scheme should achieve indistinguishability under chosen ciphertext attack (IND-CCA) security. In the literature, IND-CCA secure variants of McEliece have been proposed, as in [41,42,43]. In contrast, there is no IND-CCA secure variant of Kyber PKE.

Employing the McEliece cryptosystem as an additive homomorphic scheme primarily affects its security. Hence, we need to choose the error vectors so that they have proper weights. Since at most $λ + 2$ error vectors are summed in STEP 4 and STEP 5 of Algorithm 11, we choose $W_{H} (e_{j}) \in (0, \frac{t}{λ + 2}]$. This reduction in the weight should be managed carefully; otherwise, the cryptosystem will be vulnerable to an information-set decoding attack [44,45]. In the experimental work section, we multiply $t$ by 1.5 so that the algorithm supports the homomorphic property while retaining high security. However, this affects the ciphertext and public key sizes, as shown in Section 5. For the same security reason, the McEliece-based PSI protocol (Algorithm 11) is built upon the less efficient protocol [40] rather than the one proposed in [46,47]. From the correctness perspective, the McEliece-based PSI protocol depends, again, on the error vector and on the Bloom filter, which may generate false positives.

On the other hand, the homomorphic properties of Kyber are straightforward and reveal no security breaches. One limitation of Kyber as a multiplicative homomorphic scheme is that the use of this property becomes computationally costly when many items are involved in a single multiplication operation, since bootstrapping keys and relinearization are required [31]. However, when only a small number of items (e.g., two) are multiplied per operation, as in the Kyber-based PSI protocol (Algorithm 10), the overhead remains manageable, keeping the computational cost within a practical range, as will be shown in the next section. Another limitation of the Kyber-based PSI protocol is that the non-standard ciphertext resulting from the homomorphic multiplication is 50% larger than the standard ciphertext. The correctness of the protocol relies on the boundedness of the noise introduced by the Kyber PKE. That is, the protocol is correct as long as the noise accumulated during encryption and homomorphic multiplication does not exceed the decryption threshold.

Moreover, due to the aforementioned limitations, extending the proposed two-party PSI protocols to multiparty PSI protocols is almost impossible. Indeed, in the multiparty environment, these limitations are no longer manageable.

5. Experimental Work and Performance Evaluation

To evaluate our proposed protocols experimentally, we consider a Peer-to-Peer (P2P) file sharing network in which each file is identified by a unique serial number, namely a 32-byte integer. Each peer owns some files. Once a peer connects to another peer, the two learn only the set of files that they both possess, by applying one of the proposed PSI protocols. The peer can repeat this process with several peers in the same network. Ultimately, the peer can learn whether it owns rare files that are held only by itself or by a few other peers in the network, in which case an urgent backup must be performed.

Experimental Environment: Java JDK version 17 [48] is used as a programming language, Apache NetBeans version 20 [49] is used as an integrated development environment (IDE), and the Bouncy Castle cryptography library [50] is used to implement the proposed cryptographic algorithms. The PC specifications include an Intel Core i7-1255U CPU, 32 GB RAM, and Windows 11 Pro 64-bit OS.

Storage overhead, communication overhead, and computation cost are the metrics of interest to compute based on the three NIST security levels: level 1 meets the security of AES-128, level 3 meets the security of AES-192, and level 5 meets the security of AES-256.

Based on the specification above, Table 1 shows reference values for public/private keys and ciphertext sizes of Kyber and McEliece/Niederreiter PKE. Here, McEliece1.5t refers to a tuned version of McEliece in which the $t$ value is increased by a factor of 1.5 to keep the McEliece algorithm secure after applying the additive homomorphic property. For instance, if McEliece’s $t$ value is 64 for level-1 security, McEliece1.5t’s $t$ value becomes 96, and so on. The cost of this increase is a larger public key and ciphertext, as shown in the table. Also, to avoid any decryption failure, the original ciphertexts, i.e., those produced prior to the additive homomorphism, must be encrypted using $0.5 t$ and $0.75 t$ values for the McEliece and McEliece1.5t protocols, respectively. We stress that all McEliece parameters and results refer to the Niederreiter variant.

5.1. Storage Overhead

The storage overhead (SO) is the overall sum of the public/private key sizes and the ciphertext ($c t$) size of the entire set, $G$, as depicted in (14).

$$S O = P_{k} + S_{k} + (c t \cdot | G |) . \qquad (14)$$

The key size depends on the underlying algorithm and the required security level. The overall ciphertext size is the size of a single encrypted element multiplied by the number of elements in the set, $| G |$.

Results for NIST’s level-1 security are depicted in Table 2 and Figure 1. The results show that most of the storage overhead of McEliece comes from its keys, while most of the storage overhead of Kyber comes from its ciphertexts. Thus, it is preferable to use Kyber for small set sizes: approximately fewer than 400 elements when Kyber is compared with McEliece, and fewer than 550 elements when compared with McEliece1.5t. Figure 1 makes this clearer. McEliece’s line rises only slightly with the set size, while Kyber’s line rises much more steeply. More importantly, the figure shows the exact intersection point, i.e., the set size at which it is recommended to switch to McEliece. At a set size of 392, McEliece becomes preferable, as it incurs only 294.33 KB of overhead, whereas Kyber incurs 295.53 KB. Similarly, at a set size of 540, the storage overhead incurred by McEliece1.5t and Kyber is 405.21 KB and 406.53 KB, respectively.

Results for NIST’s level-3 security are depicted in Table 3 and Figure 2. The results show the same tendency as for level-1 security. However, since the public key of McEliece is almost twice as large as at level 1, Kyber performs better than McEliece and McEliece1.5t as long as the set size is less than 575 and 764, respectively, as shown in Figure 2. Table 4 and Figure 3 show the results for NIST’s level-5 security. Again, McEliece’s public key is approximately doubled, and thus the storage overhead is doubled. Therefore, Kyber outperforms McEliece and McEliece1.5t as long as the set size is less than 779 and 1062, respectively, as shown in Figure 3.
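Equation (14) is simple enough to evaluate directly. The sketch below assumes Kyber PKE level-1 sizes of 800 bytes (public key), 768 bytes (secret key), and 768 bytes (ciphertext); Table 1 is not reproduced here, so these values are an assumption, although they do reproduce the 295.53 KB and 406.53 KB figures quoted above. The break-even helper is generic, and the numbers passed to it are purely hypothetical.

```python
def storage_overhead(pk_bytes, sk_bytes, ct_bytes, set_size):
    """Equation (14): SO = Pk + Sk + ct * |G|, in bytes."""
    return pk_bytes + sk_bytes + ct_bytes * set_size

def break_even_set_size(fixed_a, ct_a, fixed_b, ct_b):
    """Set size at which scheme B (larger keys, smaller ciphertexts) matches scheme A.

    fixed_* is the combined key size and ct_* the per-element ciphertext size.
    """
    return (fixed_b - fixed_a) / (ct_a - ct_b)

# Assumed Kyber PKE sizes for NIST level 1, in bytes.
KYBER_L1 = dict(pk_bytes=800, sk_bytes=768, ct_bytes=768)

for size in (392, 540):
    kb = storage_overhead(**KYBER_L1, set_size=size) / 1024
    print(size, round(kb, 2))

# Hypothetical sizes, chosen only to show how a crossover point is computed.
print(break_even_set_size(fixed_a=1000, ct_a=10, fixed_b=4000, ct_b=4))
```

Because both overheads are linear in $|G|$, the crossover is the single point where the key-size gap equals the accumulated ciphertext-size gap.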

5.2. Communication Overhead

For the Kyber-based protocol (Algorithm 10), Peer 1, who initiates the PSI protocol, sends the encrypted elements of his entire set (including the zero elements) along with the public key. Peer 2 then returns the result of the homomorphic multiplication (element-wise) of its set with Peer 1’s set. Thus, the overall communication overhead, as given in Equation (15), is the sum of the public key size, the size of Peer 1’s encrypted set, and the size of the encrypted products of Peer 1’s and Peer 2’s sets.

$$C O_{Kyber} = P_{k} + | E (G_{1}, P_{k}) | + | E (G_{1} \times G_{2}, P_{k}) | . \qquad (15)$$

For the McEliece-based protocol (Algorithm 11), Peer 1 should send the encrypted Bloom filter corresponding to his private set, along with the public key. Once Peer 2 receives the encrypted bloom filter, he evaluates the expansion function on the received bloom filter. Finally, he homomorphically xors the expanded ciphertexts with his elements and returns the result. Therefore, the overall communication overhead, Equation (16), is the total of the public key size, the size of the encrypted bloom filter, and the size of the encrypted set.

$$C O_{McEliece} = P_{k} + | E (B, P_{k}) | + 2 \cdot | E (G, P_{k}) | . \qquad (16)$$

Note that for this implementation, we set the number of hash functions to $λ = 50$, such that the false positive rate ($ϵ$) is $2^{- 50}$, following the recommendations in [46,47].

The results of the communication overhead for all NIST security levels are listed in Table 5, Table 6 and Table 7 and depicted in Figure 4, Figure 5 and Figure 6. The results show that for small set sizes, McEliece incurs higher communication overhead than Kyber. After a certain point, McEliece starts to outperform Kyber, and the gap between the two algorithms increases dramatically as the set size increases. Overall, the performance of Kyber is highly affected by the set size, whereas McEliece is only slightly affected. The figures show that McEliece’s line is almost flat, while Kyber’s line grows linearly.

5.3. Computation Cost

The dominant cost in the Kyber-based protocol (Algorithm 10) is the multiplication/mod operations, whose number increases as the module size ($k$) increases. The dominant cost in the McEliece-based protocol (Algorithm 11) is the hashing. As mentioned earlier, the number of hash functions ($λ$) is set to 50, following the recommendations in [46,47]. The results, summarized in Table 8, indicate that the computation time of McEliece is substantially lower than that of Kyber.

6. Conclusions and Future Work

In this paper, we study the homomorphic properties of two important post-quantum public key cryptosystems (PKCs): Crystals-Kyber and Classic-McEliece. The research focuses on examining their underlying public key encryption (PKE) schemes. We begin with a comprehensive illustration of both candidates and their homomorphic properties, enriched with examples. Next, we apply these homomorphic properties to the private set intersection (PSI) problem. Two different PSI protocols are designed: one based on the additive homomorphic property of McEliece PKE and the other based on the multiplicative homomorphic property of Kyber PKE. Additionally, the limitations of each scheme are discussed and analyzed. To obtain clearer insights, a practical performance evaluation under NIST’s security levels 1, 3, and 5 is conducted, focusing on three key metrics: storage overhead, communication overhead, and computation cost. Our findings indicate that the Kyber-based PSI protocol is homomorphically secure, but it suffers from significant computational and communication overhead, which limits its practical applicability. Conversely, the McEliece-based PSI protocol demonstrates greater efficiency in practice but raises fundamental concerns regarding its suitability as a secure homomorphic encryption scheme. These limitations remain manageable within the controlled setting of two-party PSI protocols. However, extending to a multiparty environment would introduce significant challenges, making the limitations far more difficult to address in practice.

Future work includes improving the security and efficiency of the Kyber-based PSI protocol. From the security perspective, the IND-CPA Kyber PKE could be transformed into an IND-CCA Kyber PKE variant. A more efficient protocol could leverage the additive homomorphic property of Kyber rather than the multiplicative one. Another promising direction is to combine the additive homomorphic property of Kyber with a Bloom filter to design a multiparty PSI protocol.

Author Contributions

Conceptualization, A.A.A., K.A. and M.F.; Formal analysis, K.A.; Funding acquisition, M.F.; Methodology, A.A.A., K.A. and M.F.; Project administration, A.A.A. and M.F.; Software, A.A.A. and K.A.; Supervision, M.F.; Validation, A.A.A. and K.A.; Visualization, A.A.A.; Writing—original draft, A.A.A. and K.A.; Writing—review and editing, A.A.A. and M.F. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by the Interdisciplinary Research Center of Intelligent Secure Systems (IRC-ISS).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to acknowledge the support provided by King Fahd University of Petroleum and Minerals (KFUPM) and the Interdisciplinary Research Center of Intelligent Secure Systems.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Shor, P.W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Rev. 1999, 41, 303–332.
Regev, O. On lattices, learning with errors, random linear codes, and cryptography. J. ACM (JACM) 2009, 56, 1–40.
Chou, T.; Cid, C.; UiB, S.; Gilcher, J.; Lange, T.; Maram, V.; Misoczki, R.; Niederhagen, R.; Paterson, K.; Persichetti, E. Classic McEliece: Conservative Code-Based Cryptography, 10 October 2020. 2020. Available online: https://cryptojedi.org/papers/mceliecenistr3-20201010.pdf (accessed on 1 October 2025).
Bos, J.; Ducas, L.; Kiltz, E.; Lepoint, T.; Lyubashevsky, V.; Schanck, J.M.; Schwabe, P.; Seiler, G.; Stehlé, D. CRYSTALS-Kyber: A CCA-secure module-lattice-based KEM. In Proceedings of the 2018 IEEE European Symposium on Security and Privacy (EuroS&P), London, UK, 24–26 April 2018; IEEE: New York, NY, USA, 2018; pp. 353–367.
Albrecht, M.R.; Bernstein, D.J.; Chou, T.; Cid, C.; Gilcher, J.; Lange, T.; Maram, V.; Von Maurich, I.; Misoczki, R.; Niederhagen, R.; et al. Classic McEliece: Conservative Code-Based Cryptography. 2022. Available online: https://cr.yp.to/talks/2024.09.17/slides-djb-20240917-mceliece-16x9.pdf (accessed on 1 October 2025).
McEliece, R.J. A public-key cryptosystem based on algebraic coding theory. DSN Progress Report 1978, 42-44, 114–116.
Niederreiter, H. Knapsack-type cryptosystems and algebraic coding theory. Prob. Contr. Inform. Theory 1986, 15, 157–166.
Wang, W.; Szefer, J.; Niederhagen, R. FPGA-based key generator for the Niederreiter cryptosystem using binary Goppa codes. In Proceedings of the International Conference on Cryptographic Hardware and Embedded Systems, Santa Barbara, CA, USA, 17–19 August 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 253–274.
Freedman, M.J.; Nissim, K.; Pinkas, B. Efficient private matching and set intersection. In Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques, Interlaken, Switzerland, 2–6 May 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 1–19.
Chen, L.; Li, Z.; Chen, Z.; Liu, Y. Two anti-quantum attack protocols for secure multiparty computation. In Proceedings of the Trusted Computing and Information Security: 12th Chinese Conference, CTCIS 2018, Wuhan, China, 18 October 2018; Revised Selected Papers 12. Springer: Berlin/Heidelberg, Germany, 2019; pp. 338–359.
Paillier, P. Public-key cryptosystems based on composite degree residuosity classes. In Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques, Prague, Czech Republic, 2–6 May 1999; Springer: Berlin/Heidelberg, Germany, 1999; pp. 223–238.
Rivest, R.L.; Shamir, A.; Adleman, L. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 1978, 21, 120–126.
Goldwasser, S.; Micali, S. Probabilistic encryption. J. Comput. Syst. Sci. 1984, 28, 270–299.
ElGamal, T. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theory 1985, 31, 469–472.
Naccache, D.; Stern, J. A new public key cryptosystem based on higher residues. In Proceedings of the 5th ACM Conference on Computer and Communications Security, San Francisco, CA, USA, 3–5 November 1998; pp. 59–66.
Benaloh, J. Dense probabilistic encryption. In Proceedings of the Workshop on Selected Areas of Cryptography, Kingston, ON, Canada, 5–6 May 1994; pp. 120–128.
Okamoto, T.; Uchiyama, S. A new public-key cryptosystem as secure as factoring. In Proceedings of the Advances in Cryptology - EUROCRYPT’98: International Conference on the Theory and Application of Cryptographic Techniques, Espoo, Finland, 31 May–4 June 1998; Proceedings 17. Springer: Berlin/Heidelberg, Germany, 1998; pp. 308–318.
Boneh, D.; Goh, E.J.; Nissim, K. Evaluating 2-DNF formulas on ciphertexts. In Proceedings of the Theory of Cryptography: Second Theory of Cryptography Conference, TCC 2005, Cambridge, MA, USA, 10–12 February 2005; Proceedings 2. Springer: Berlin/Heidelberg, Germany, 2005; pp. 325–341.
Yao, A.C. Protocols for secure computations. In Proceedings of the 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982), Chicago, IL, USA, 3–5 November 1982; IEEE: New York, NY, USA, 1982; pp. 160–164.
Sander, T.; Young, A.; Yung, M. Non-interactive cryptocomputing for NC¹. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science (Cat. No. 99CB37039), New York, NY, USA, 17–18 October 1999; IEEE: New York, NY, USA, 1999; pp. 554–566.
Ishai, Y.; Paskin, A. Evaluating branching programs on encrypted data. In Proceedings of the Theory of Cryptography Conference, Amsterdam, The Netherlands, 21–24 February 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 575–594.
Gentry, C. Fully homomorphic encryption using ideal lattices. In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing, Bethesda, MD, USA, 31 May–2 June 2009; pp. 169–178.
Brakerski, Z. Fully homomorphic encryption without modulus switching from classical GapSVP. In Proceedings of the Annual Cryptology Conference, Santa Barbara, CA, USA, 19–23 August 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 868–886.
Brakerski, Z.; Gentry, C.; Vaikuntanathan, V. (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory (TOCT) 2014, 6, 1–36.
Cheon, J.H.; Kim, A.; Kim, M.; Song, Y. Homomorphic encryption for arithmetic of approximate numbers. In Proceedings of the Advances in Cryptology–ASIACRYPT 2017: 23rd International Conference on the Theory and Applications of Cryptology and Information Security, Hong Kong, China, 3–7 December 2017; Proceedings, Part I 23. Springer: Berlin/Heidelberg, Germany, 2017; pp. 409–437. [Google Scholar]
Chillotti, I.; Gama, N.; Georgieva, M.; Izabachene, M. Faster fully homomorphic encryption: Bootstrapping in less than 0.1 seconds. In Proceedings of the Advances in Cryptology–ASIACRYPT 2016: 22nd International Conference on the Theory and Application of Cryptology and Information Security, Hanoi, Vietnam, 4–8 December 2016; Proceedings, Part I 22. Springer: Berlin/Heidelberg, Germany, 2016; pp. 3–33. [Google Scholar]
Langlois, A.; Stehlé, D. Worst-case to average-case reductions for module lattices. Des. Codes Cryptogr. 2015, 75, 565–599. [Google Scholar] [CrossRef]
Lyubashevsky, V.; Peikert, C.; Regev, O. On ideal lattices and learning with errors over rings. In Proceedings of the Advances in Cryptology–EUROCRYPT 2010: 29th Annual International Conference on the Theory and Applications of Cryptographic Techniques, French Riviera, France, 30 May–3 June 2010; Proceedings 29. Springer: Berlin/Heidelberg, Germany, 2010; pp. 1–23. [Google Scholar]
Avanzi, R.; Bos, J.; Ducas, L.; Kiltz, E.; Lepoint, T.; Lyubashevsky, V.; Schanck, J.M.; Schwabe, P.; Seiler, G.; Stehlé, D. CRYSTALS-Kyber algorithm specifications and supporting documentation. NIST PQC Round 2019, 2, 1–43. Available online: https://pq-crystals.org/kyber/data/kyber-specification-round3-20210131.pdf (accessed on 10 August 2025).
Özeren, S.; Yayla, O. Methods for masking crystals-kyber against side-channel attacks. In Proceedings of the 2023 16th International Conference on Information Security and Cryptology (ISCTürkiye), Ankara, Turkiye, 18–19 October 2023; pp. 71–76. [Google Scholar]
Mukherjee, A.; Aikata, A.; Mert, A.C.; Lee, Y.; Kwon, S.; Deryabin, M.; Roy, S.S. ModHE: Modular Homomorphic Encryption Using Module Lattices: Potentials and Limitations. Cryptology ePrint Archive. 2023. Available online: https://tches.iacr.org/index.php/TCHES/article/download/11261/10803/11220 (accessed on 10 February 2025).
Zhao, C.-C.; Yang, Y.-T.; Li, Z.-C. The homomorphic properties of McEliece public-key cryptosystem. In Proceedings of the 2012 Fourth International Conference on Multimedia Information Networking and Security, Nanjing, China, 2–4 November 2012; IEEE: New York, NY, USA, 2012; pp. 39–42. [Google Scholar]
Morales, D.; Agudo, I.; Lopez, J. Private set intersection: A systematic literature review. Comput. Sci. Rev. 2023, 49, 100567. [Google Scholar] [CrossRef]
Debnath, S.K.; Stănică, P.; Choudhury, T.; Kundu, N. Post-quantum protocol for computing set intersection cardinality with linear complexity. IET Inf. Secur. 2020, 14, 661–669. [Google Scholar] [CrossRef]
Debnath, S.K.; Choudhury, T.; Kundu, N.; Dey, K. Post-quantum secure multi-party private set-intersection in star network topology. J. Inf. Secur. Appl. 2021, 58, 102731. [Google Scholar] [CrossRef]
Cai, Y.; Tang, C.; Xu, Q. Two-party privacy-preserving set intersection with FHE. Entropy 2020, 22, 1339. [Google Scholar] [CrossRef] [PubMed]
Gao, S. Efficient Fully Homomorphic Encryption Scheme. Cryptology ePrint Archive. 2018. Available online: https://eprint.iacr.org/2018/637 (accessed on 6 June 2025).
Camenisch, J.; Zaverucha, G.M. Private intersection of certified sets. In Proceedings of the Financial Cryptography and Data Security: 13th International Conference, FC 2009, Accra Beach, Barbados, 23–26 February 2009; Revised Selected Papers 13. Springer: Berlin/Heidelberg, Germany, 2009; pp. 108–127. [Google Scholar]
Bloom, B.H. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 1970, 13, 422–426. [Google Scholar] [CrossRef]
Kerschbaum, F. Outsourced private set intersection using homomorphic encryption. In Proceedings of the Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security, Seoul, Republic of Korea, 2–4 May 2012; pp. 85–86. [Google Scholar]
Dottling, N.; Dowsley, R.; Muller-Quade, J.; Nascimento, A.C. A CCA2 secure variant of the McEliece cryptosystem. IEEE Trans. Inf. Theory 2012, 58, 6672–6680. [Google Scholar] [CrossRef]
Rastaghi, R. An efficient CCA2-secure variant of the McEliece cryptosystem in the standard model. arXiv 2013, arXiv:1302.0347. [Google Scholar]
Nojima, R.; Imai, H.; Kobara, K.; Morozov, K. Semantic security for the McEliece cryptosystem without random oracles. Des. Codes Cryptogr. 2008, 49, 289–305. [Google Scholar] [CrossRef]
Canteaut, A.; Chabaud, F. A new algorithm for finding minimum-weight words in a linear code: Application to McEliece’s cryptosystem and to narrow-sense BCH codes of length 511. IEEE Trans. Inf. Theory 1998, 44, 367–378. [Google Scholar] [CrossRef]
Horlemann, A.L.; Puchinger, S.; Renner, J.; Schamberger, T.; Wachter-Zeh, A. Information-set decoding with hints. In Proceedings of the Code-Based Cryptography Workshop, Munich, Germany, 21–22 June 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 60–83. [Google Scholar]
Bay, A.; Erkin, Z.; Hoepman, J.H.; Samardjiska, S.; Vos, J. Practical multi-party private set intersection protocols. IEEE Trans. Inf. Forensics Secur. 2021, 17, 1–15. [Google Scholar] [CrossRef]
Davidson, A.; Cid, C. An efficient toolkit for computing private set operations. In Proceedings of the Australasian Conference on Information Security and Privacy, Auckland, New Zealand, 3–5 July 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 261–278. [Google Scholar]
Nita, S.L.; Mihailescu, M.I. Jdk 17: New features. In Cryptography and Cryptanalysis in Java: Creating and Programming Advanced Algorithms with Java SE 17 LTS and Jakarta EE 10; Springer: Berlin/Heidelberg, Germany, 2022; pp. 9–19. [Google Scholar]
Kostaras, I.; Drabo, C.; Juneau, J.; Reimers, S.; Schröder, M.; Wielenga, G.; Kostaras, I.; Drabo, C.; Juneau, J.; Reimers, S.; et al. What Is Apache NetBeans. Pro Apache NetBeans: Building Applications on the Rich Client Platform; Springer: Berlin/Heidelberg, Germany, 2020; pp. 3–28. [Google Scholar]
Bouncy Castle Crypto Library. Available online: https://www.bouncycastle.org (accessed on 11 May 2024).

Figure 1. Storage overhead measured in KB at the 128-bit security level (NIST Level 1).

Figure 2. Storage overhead measured in KB at the 192-bit security level (NIST Level 3).

Figure 3. Storage overhead measured in KB at the 256-bit security level (NIST Level 5).

Figure 4. Communication overhead measured in KB at the 128-bit security level (NIST Level 1).

Figure 5. Communication overhead measured in KB at the 192-bit security level (NIST Level 3).

Figure 6. Communication overhead measured in KB at the 256-bit security level (NIST Level 5).

Table 1. Public key (PK), private key (SK), and ciphertext sizes (in bytes) of McEliece/Niederreiter and Kyber at each NIST security level.

PKC Algorithm	Component	Level-1 (AES-128)	Level-3 (AES-192)	Level-5 (AES-256)
Kyber	PK	800	1184	1568
	SK	768	1152	1536
	Ciphertext	768	1088	1568
McEliece (Niederreiter variant)	PK	261,120	524,160	1,047,319
	SK	6492	13,608	13,948
	Ciphertext	96	156	208
McEliece1.5t (Niederreiter variant)	PK	330,624	640,224	1,344,434
	SK	6551	13,704	14,066
	Ciphertext	144	234	290

Table 2. Storage overhead measured in KB at the 128-bit security level (NIST Level 1). Values in parentheses are the PK and SK sizes in bytes from Table 1.

Set Size	McEliece (PK 261,120, SK 6492)		McEliece1.5t (PK 330,624, SK 6551)		Kyber (PK 800, SK 768)
	Ciphertext Size	Storage Overhead	Ciphertext Size	Storage Overhead	Ciphertext Size	Storage Overhead
100	9.38	266.96	14.06	343.33	75.00	76.53
200	18.75	276.33	28.12	357.40	150.00	151.53
300	28.12	285.71	42.19	371.46	225.00	226.53
350	32.81	290.40	49.22	378.49	262.50	264.03
400	37.50	295.08	56.25	385.52	300.00	301.53
450	42.19	299.77	63.28	392.55	337.50	339.03
500	46.88	304.46	70.31	399.58	375.00	376.53
550	51.56	309.15	77.34	406.62	412.50	414.03
600	56.25	313.83	84.38	413.65	450.00	451.53

Table 3. Storage overhead measured in KB at the 192-bit security level (NIST Level 3). Values in parentheses are the PK and SK sizes in bytes from Table 1.

Set Size	McEliece (PK 524,160, SK 13,608)		McEliece1.5t (PK 640,224, SK 13,704)		Kyber (PK 1184, SK 1152)
	Ciphertext Size	Storage Overhead	Ciphertext Size	Storage Overhead	Ciphertext Size	Storage Overhead
100	15.23	540.40	22.85	661.45	106.25	108.53
200	30.47	555.63	45.70	684.30	212.50	214.78
300	45.70	570.87	68.55	707.16	318.75	321.03
400	60.94	586.10	91.41	730.01	425.00	427.28
500	76.17	601.34	114.26	752.86	531.25	533.53
550	83.79	608.95	125.68	764.29	584.38	586.66
600	91.41	616.57	137.11	775.71	637.50	639.78
650	99.02	624.19	148.54	787.14	690.62	692.91
700	106.64	631.80	159.96	798.56	743.75	746.03
750	114.26	639.42	171.39	809.99	796.88	799.16
800	121.88	647.04	182.81	821.41	850.00	852.28
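
The Level-3 rows above can be reproduced from the byte sizes in Table 1. The following is a minimal sketch, assuming that 1 KB = 1024 bytes and that storage overhead is one key pair plus one ciphertext per set element. Both assumptions are inferred from the tabulated numbers rather than stated in this section, and the function names are illustrative only.

```python
KB = 1024  # assumption: the tables use 1 KB = 1024 bytes

# (public key, private key, ciphertext) sizes in bytes, Level-3 column of Table 1
KYBER_L3 = (1184, 1152, 1088)
MCELIECE_L3 = (524_160, 13_608, 156)


def ciphertext_kb(n, params):
    """Total size in KB of n ciphertexts."""
    _, _, ct = params
    return n * ct / KB


def storage_kb(n, params):
    """Inferred storage overhead in KB: one key pair plus n ciphertexts."""
    pk, sk, _ = params
    return (pk + sk) / KB + ciphertext_kb(n, params)


def break_even(a, b):
    """Set size at which schemes a and b need the same storage."""
    fixed_gap = (b[0] + b[1]) - (a[0] + a[1])  # extra key bytes of b
    per_item_gap = a[2] - b[2]                 # extra ciphertext bytes of a
    return fixed_gap / per_item_gap


if __name__ == "__main__":
    for n in (100, 550, 600):
        print(n,
              round(storage_kb(n, KYBER_L3), 2),
              round(storage_kb(n, MCELIECE_L3), 2))
    print("break-even set size:",
          round(break_even(KYBER_L3, MCELIECE_L3), 1))
```

Under these assumptions the sketch gives 108.53 KB (Kyber) and 540.40 KB (McEliece) for 100 elements, matching the first row of Table 3. The break-even point falls between 574 and 575 elements, which agrees with the table: Kyber needs less storage at 550 elements and more at 600. The relation is checked here only against the Level-3 rows and is not claimed to hold for every tabulated entry.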

Table 4. Storage overhead measured in KB at the 256-bit security level (NIST Level 5). Values in parentheses are the PK and SK sizes in bytes from Table 1.

Set Size	McEliece (PK 1,047,319, SK 13,948)		McEliece1.5t (PK 1,344,434, SK 14,066)		Kyber (PK 1568, SK 1536)
	Ciphertext Size	Storage Overhead	Ciphertext Size	Storage Overhead	Ciphertext Size	Storage Overhead
100	20.31	1056.71	28.32	1354.98	153.12	156.16
200	40.62	1077.02	56.64	1383.30	306.25	309.28
300	60.94	1097.33	84.96	1411.62	459.38	462.41
400	81.25	1117.64	113.28	1439.94	612.50	615.53
500	101.56	1137.96	141.60	1468.26	765.62	768.66
600	121.88	1155.98	140.62	1496.58	918.75	921.78
700	142.19	1158.27	169.92	1524.90	1071.88	1074.91
750	152.34	1178.58	198.24	1539.06	1148.44	1151.47
800	162.50	1188.74	212.40	1553.22	1225.00	1228.03
850	172.66	1206.76	226.56	1567.38	1301.56	1304.59
900	182.81	1198.89	240.72	1581.54	1378.12	1381.16
1000	203.12	1209.05	254.88	1609.86	1531.25	1534.28
1050	213.28	1219.21	283.20	1624.02	1607.81	1610.84
1100	223.44	1239.52	297.36	1638.18	1684.38	1687.41

Table 5. Communication overhead measured in KB at the 128-bit security level (NIST Level 1). Values in parentheses are the PK and SK sizes in bytes from Table 1.

Set Size	McEliece (PK 261,120, SK 6492)	McEliece1.5t (PK 330,624, SK 6551)	Kyber (PK 800, SK 768)
	Communication Overhead	Communication Overhead	Communication Overhead
100	264.94	337.50	188.28
200	274.88	352.13	375.78
300	284.82	366.75	563.28
350	289.78	374.07	657.03
400	294.75	381.38	750.78
450	299.72	388.69	844.53
500	304.69	396.01	938.28
550	309.66	403.32	1032.03
600	314.63	410.63	1125.78

Table 6. Communication overhead measured in KB at the 192-bit security level (NIST Level 3). Values in parentheses are the PK and SK sizes in bytes from Table 1.

Set Size	McEliece (PK 524,160, SK 13,608)	McEliece1.5t (PK 640,224, SK 13,704)	Kyber (PK 1184, SK 1152)
	Communication Overhead	Communication Overhead	Communication Overhead
100	527.68	648.64	213.66
200	543.47	672.05	266.78
300	559.27	695.46	532.41
400	575.07	718.88	798.03
500	590.87	742.30	1063.66
550	598.77	754.00	1329.28
600	606.66	765.71	1462.09
650	614.56	777.42	1594.91
700	622.46	789.12	1727.72
750	630.36	800.84	1860.53
800	638.26	812.54	1993.34

Table 7. Communication overhead measured in KB at the 256-bit security level (NIST Level 5). Values in parentheses are the PK and SK sizes in bytes from Table 1.

Set Size	McEliece (PK 1,047,319, SK 13,948)	McEliece1.5t (PK 1,344,434, SK 14,066)	Kyber (PK 1568, SK 1536)
	Communication Overhead	Communication Overhead	Communication Overhead
100	1043.65	1341.81	384.34
200	1064.53	1370.69	767.16
300	1085.40	1399.58	1149.97
400	1106.28	1428.46	1532.78
500	1127.16	1457.35	1915.59
600	1148.03	1486.23	2298.41
700	1168.91	1515.11	2681.22
750	1179.35	1529.56	2872.62
800	1189.78	1544.00	3064.03
850	1200.22	1558.44	3255.44
900	1210.66	1572.88	3446.84
1000	1231.53	1601.76	3829.66
1050	1241.97	1616.21	4021.06
1100	1252.41	1630.65	4212.47

Table 8. Computation cost measured in ms.

Set Size	McEliece ($λ = 50$)	Kyber ($K = 2$)	Kyber ($K = 3$)	Kyber ($K = 4$)
	Computation Cost	Computation Cost	Computation Cost	Computation Cost
100	3.08	23.9	33.4	39.5
200	6.66	53.2	76.9	102.6
300	8.52	99.6	121.7	133.9
400	11.60	115.5	167	195.9
500	14.07	161.6	190.3	259.1
600	18.14	189.1	236.6	304.3
700	19.70	228.1	271.5	330.3
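
To make the gap in Table 8 easier to read, the per-element cost and the Kyber-to-McEliece cost ratio can be derived directly from the tabulated values. The sketch below only performs arithmetic on the numbers copied from Table 8; the names `TABLE_8`, `per_element_ms`, and `slowdown_vs_mceliece` are illustrative and do not come from the paper.

```python
# Rows of Table 8: set size -> (McEliece, Kyber K=2, Kyber K=3, Kyber K=4), in ms
TABLE_8 = {
    100: (3.08, 23.9, 33.4, 39.5),
    200: (6.66, 53.2, 76.9, 102.6),
    300: (8.52, 99.6, 121.7, 133.9),
    400: (11.60, 115.5, 167.0, 195.9),
    500: (14.07, 161.6, 190.3, 259.1),
    600: (18.14, 189.1, 236.6, 304.3),
    700: (19.70, 228.1, 271.5, 330.3),
}


def per_element_ms(n):
    """Average cost per set element (ms) for each scheme at set size n."""
    return tuple(cost / n for cost in TABLE_8[n])


def slowdown_vs_mceliece(n):
    """Ratio of each Kyber cost (K=2, 3, 4) to the McEliece cost at set size n."""
    mceliece, *kyber = TABLE_8[n]
    return tuple(k / mceliece for k in kyber)


if __name__ == "__main__":
    for n in sorted(TABLE_8):
        ratios = ", ".join(f"{r:.1f}x" for r in slowdown_vs_mceliece(n))
        print(f"n={n}: Kyber/McEliece cost ratios (K=2,3,4) = {ratios}")
```

For example, at 100 elements Kyber with $K = 2$ takes roughly 7.8 times as long as McEliece, and at 700 elements Kyber with $K = 4$ takes roughly 16.8 times as long.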

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

