Article

Privacy-Preserving Set Intersection Protocol Based on SM2 Oblivious Transfer

1
School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
2
College of Computing and Data Science, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
*
Author to whom correspondence should be addressed.
Computers 2026, 15(1), 44; https://doi.org/10.3390/computers15010044
Submission received: 25 December 2025 / Revised: 5 January 2026 / Accepted: 7 January 2026 / Published: 10 January 2026
(This article belongs to the Special Issue Mobile Fog and Edge Computing)

Abstract

Private Set Intersection (PSI) is a fundamental cryptographic primitive in privacy-preserving computation and has been widely applied in federated learning, secure data sharing, and privacy-aware data analytics. However, most existing PSI protocols rely on RSA or standard elliptic curve cryptography, which limits their applicability in scenarios requiring domestic cryptographic standards and often leads to high computational and communication overhead when processing large-scale datasets. In this paper, we propose a novel PSI protocol based on the Chinese commercial cryptographic standard SM2, referred to as SM2-OT-PSI. The proposed scheme constructs an oblivious transfer-based Oblivious Pseudorandom Function (OPRF) using SM2 public-key cryptography and the SM3 hash function, enabling efficient multi-point OPRF evaluation under the semi-honest adversary model. A formal security analysis demonstrates that the protocol satisfies privacy and correctness guarantees assuming the hardness of the Elliptic Curve Discrete Logarithm Problem. To further improve practical performance, we design a software–hardware co-design architecture that offloads SM2 scalar multiplication and SM3 hashing operations to a domestic reconfigurable cryptographic accelerator (RSP S20G). Experimental results show that, for datasets with up to millions of elements, the presented protocol significantly outperforms several representative PSI schemes in terms of execution time and communication efficiency, especially in medium and high-bandwidth network environments. The proposed SM2-OT-PSI protocol provides a practical and efficient solution for large-scale privacy-preserving set intersection under national cryptographic standards, making it suitable for deployment in real-world secure computing systems.

1. Introduction

With the accelerating pace of global digitalization, the issue of privacy information leakage has become increasingly prominent, ultimately emerging as a significant impediment to the healthy development of the digital economy. Privacy-preserving computation [1] has garnered substantial attention from both academia and industry. It refers to a suite of technologies that enable analytical computation without exposing the underlying data, thus achieving the goal of making the data “available but invisible” and facilitating the secure release of the value of the data. Owing to its considerable security advantages, privacy-preserving computation has increasingly become a central focus in both research and practical applications. Among the various paradigms of privacy-preserving computation protocols, the Private Set Intersection (PSI) protocol occupies a pivotal role. Through carefully designed cryptographic mechanisms, PSI ensures that participating parties obtain only the intersection of their respective input sets while preventing the disclosure of any additional information, thus effectively safeguarding the privacy rights of data owners. By integrating data utility with robust security guarantees, PSI enables secure information sharing without compromising privacy. To address the national need for localized, verifiable, and controllable information security, it is crucial to develop PSI protocols based on a national cryptographic system. This will help establish an autonomous and controllable technical ecosystem. This approach also has broad practical significance and promising application prospects. It has important practical value for promoting the compliant circulation and release of value of data elements.
In recent years, research on two-party PSI protocols has evolved into a field where theoretical innovation and engineering practice advance in tandem. To meet the performance requirements of practical applications, researchers have proposed a wide range of technical approaches and construction frameworks. As a fundamental component of many privacy-preserving computation workflows, PSI protocols play an indispensable role. For example, before performing Vertical Federated Learning (VFL) [2,3], PSI is used to align samples between participating parties; similarly, prior to index-based secure query execution, PSI commonly enables the secure acquisition of index data. Moreover, PSI has been independently applied in real-world scenarios such as remote diagnosis [4], record linkage [5], and online advertising effectiveness measurement [6], demonstrating its broad applicability and practical significance. Despite substantial research progress and early application successes, existing PSI protocols still encounter critical bottlenecks that hinder practical deployment. In particular, mainstream cryptographic systems adopted by these protocols, such as RSA and AES, exhibit well-documented algorithmic vulnerabilities, which introduce risks of cryptanalysis and potential privacy leakage.
To address these challenges, we propose the first PSI protocol fully based on the SM2 [7] elliptic curve cryptography (ECC) standard, thereby filling a critical research gap in the application of Secure Multi-Party Computation (SMPC) within the Chinese national cryptographic ecosystem. Although SM2 is less widely adopted than NIST curves in Western cryptographic literature, it serves as a mandatory standard for China’s critical information infrastructure. Anchored by the SM2 public-key cryptographic algorithm and the SM3 hash algorithm [8], the Chinese national cryptographic system offers notable advantages in security, independent controllability, and efficiency, acting as a robust alternative to internationally prevalent but potentially compromised algorithms such as RSA. This study further establishes a foundational theoretical framework for privacy-preserving protocols under this regulatory regime, analogous to the role that Curve25519 plays in non-standardized high-performance cryptographic applications. The proposed protocol, SM2-OT-PSI, is built on SM2 oblivious transfer: it introduces an oblivious transfer extension mechanism tailored to PSI scenarios and constructs a multi-point Oblivious Pseudorandom Function (OPRF), enabling efficient Private Set Intersection under the semi-honest adversary model. Through formal security analysis, we establish the security and correctness of the protocol under the Elliptic Curve Discrete Logarithm Problem (ECDLP) assumption.
At the engineering implementation level, this paper introduces a software–hardware co-design acceleration architecture combined with a domestically developed reconfigurable cryptographic security chip. Computationally intensive operations, such as SM2 point multiplication and SM3 hashing, are offloaded to hardware for execution, thereby reducing protocol runtime and system load. Experimental results demonstrate that under million-scale dataset sizes and medium-to-high bandwidth network environments, the proposed SM2-OT-PSI protocol outperforms representative PSI schemes in terms of runtime efficiency and scalability, demonstrating substantial practical engineering value. The main contributions of this paper can be summarized as follows:
  • A PSI protocol fully built on the Chinese national cryptographic algorithms SM2 and SM3 is presented, satisfying cryptographic compliance requirements for independent controllability.
  • SM2-based oblivious transfer and multi-point OPRF constructions are developed to support an efficient PSI computation workflow.
  • The protocol’s practical performance under large-scale datasets is substantially improved through a software–hardware co-design acceleration architecture, and its effectiveness is validated through experimental evaluation.

2. Related Work

To meet the requirements of diverse practical scenarios, current PSI protocols can be classified into two-party PSI protocols and multi-party PSI protocols, according to the number of participating entities. Two-party PSI, which involves exactly two participants, represents the canonical PSI scenario, and this section therefore focuses on related research in this category. PSI is a representative SMPC problem. However, due to the inherent computational complexity of PSI, directly applying general MPC protocols leads to inefficient or sub-optimal performance. Consequently, both academia and industry have developed a variety of specialized PSI protocols. The core cryptographic primitives underlying existing PSI protocols can be broadly categorized into the following types: Garbled Circuits (GC) [9,10,11,12], Public Key Encryption (PKE) [13,14,15,16,17], Oblivious Transfer (OT) [18,19,20,21,22,23,24,25,26], and PSI schemes incorporating hardware-optimized designs [14,17,23]. GC-based PSI protocols follow the design philosophy of general MPC, constructing PSI schemes using techniques such as Yao’s Garbled Circuits [27] or GMW circuits [28], and subsequently optimizing them for practical deployment. In contrast, OT-based PSI protocols typically achieve significantly better performance. Specifically, OT-based PSI incurs lower computational overhead but generally requires higher communication bandwidth.
However, simply classifying PSI protocols according to their underlying cryptographic primitives does not adequately capture their core design principles or application scenarios. Two-party PSI involves two mutually distrustful participants holding datasets $X$ and $Y$, who seek to compute the intersection $X \cap Y$ while preserving the confidentiality of the elements in their respective input sets. Meadows et al. [13] first proposed a two-party PSI protocol based on the Elliptic Curve Diffie–Hellman (ECDH) key exchange, which relies on the hardness of the Elliptic Curve Discrete Logarithm Problem. This protocol enables privacy-preserving verification of set overlap without revealing either party’s input. Nevertheless, when processing large-scale datasets, this scheme encounters significant computational bottlenecks due to the large number of elliptic curve point multiplications required, which limits its suitability for high-performance applications. To overcome this bottleneck, Wu et al. [14] proposed an optimized variant of the ECDH-based PSI protocol, referred to as DH-IPP PSI. This scheme introduces extensive optimizations to elliptic curve operations using the Intel Integrated Performance Primitives Cryptography (IPP-Crypto), achieving high compatibility and efficiency on Intel Central Processing Unit (CPU) architectures. Systematic testing on an Intel Xeon Platinum 8369B processor (2.70 GHz) demonstrated substantial improvements in computational efficiency. In this paper, we design a PSI protocol based on the SM2 elliptic curve cryptographic standard. Although SM2 is less widely adopted than NIST curves in Western cryptographic literature, it is a mandatory standard within China’s critical information infrastructure. This study establishes a theoretical foundation for privacy-preserving protocols within this regulatory context, serving a role analogous to that of Curve25519 in non-standardized high-performance applications.
In our previous work, we designed and implemented a hardware-acceleration architecture for elliptic curve computations [29,30], aiming to mitigate algorithmic overhead through hardware-level optimization. However, despite these efficiency gains, the fundamental computational burden imposed by the protocol’s native algorithmic structure remains unresolved, and its performance bottleneck in large-scale dataset scenarios persists. The PSI protocol proposed by Cristofaro et al. [15] employs RSA-based homomorphic encryption, augmented by performance evaluations and the integration of Bloom filters (BFs) [16]. This protocol supports intersection computation, intersection-cardinality estimation, and verifiable computation, making it one of the most widely deployed PSI protocols to date. Because RSA is increasingly vulnerable to cryptanalytic advances, the protocol strengthens security by incorporating private-key blinding and 2048-bit key lengths. However, because the number of RSA encryption and decryption operations scales linearly with the set size, the resulting computational overhead becomes prohibitive, rendering RSA-based PSI protocols impractical for large real-world deployments. Zhang et al. [17] developed the robust federated-learning framework FLASH and implemented a FLASH-RSA-PSI hardware–software co-optimization architecture on the Xilinx VU13P FPGA. Experimental evaluations show that when both parties hold million-scale datasets, the privacy-preserving intersection computation time exceeds 400 s. Due to its relatively low hardware speedup—only a two-fold speedup compared with software execution—the framework cannot meet the real-time requirements of large-scale data scenarios.
At present, the development of high-performance PSI protocols is closely linked to advances in OT. First proposed by Rabin et al. [18] in 1981, OT is a secure two-party communication protocol in which the sender holds multiple input items, while the receiver obtains exactly one selected item. The protocol guarantees that the sender cannot infer the receiver’s selection, and the receiver cannot access any information about the unselected items. Pinkas et al. [19] initially proposed a method for privacy-preserving set intersection computation based on OT, and subsequent work [20] implemented this approach by constructing an OPRF using enhanced OT extension techniques. Extensive theoretical and experimental evaluations show that, among semi-honest PSI protocols, two-party PSI based on OT extension achieves higher efficiency. The GMR [21] protocol employs Multi-query Reverse Private Membership Testing (mqRPMT) and OT, preventing either party from linking test results to specific elements and thereby enabling secure, intersection-only computation; however, it exhibits limited performance in practice. The KKRT [22] protocol implements PSI using Cuckoo hashing combined with a single-point auxiliary OPRF, reducing the sender’s computational burden, but the single-point evaluation capability leads to substantially increased communication overhead. To address this limitation, Pinkas et al. [23] proposed the SpOT-Light PSI, which builds on OT extension techniques and functions as an efficient multi-point OPRF optimized for cloud-server CPUs. Nevertheless, its computational complexity of $O(n \log^2 n)$ is prohibitively high. PRTY19 [12] proposes a batch processing and amortization design scheme for OPPRF, which reduces the communication overhead.
Other representative protocols include PaXoS (PRTY20) [24], which achieves security in the OT-hybrid model while maintaining linear communication complexity, and CM20 [25], which implements multi-point OPRF through matrix construction to avoid excessive communication costs. RA17 [26] leverages OT extension techniques and provides an efficient solution for balanced PSI scenarios, but incurs substantial communication overhead in unbalanced settings.
Table 1 summarizes representative two-party PSI protocols and highlights their main advantages and limitations. Existing schemes typically face a trade-off between computational efficiency and communication overhead. RSA-based designs suffer from expensive public-key operations, while ECDH-based PSI requires a large number of elliptic curve multiplications. OT/OPRF-based PSI significantly improves performance but often incurs substantial communication costs. In contrast, the proposed SM2-OT-PSI protocol achieves a balanced trade-off between efficiency and communication while fully complying with Chinese national cryptographic standards, making it suitable for large-scale practical deployment.

3. Preliminaries

3.1. Semi-Honest Adversary Model

The protocol proposed in this paper is developed within the semi-honest adversary model and is provably secure within this framework, meaning that neither party learns the other party’s set contents [31]. A formal, simulation-based security proof establishes the security of SM2-OT-PSI. While the malicious model offers stronger security guarantees, recent comprehensive surveys on MPC security models indicate that the semi-honest assumption remains the dominant and most practical choice for large-scale industrial applications due to its superior efficiency [32]. Furthermore, cutting-edge research, such as the efficient PSI protocol presented at ACM CCS 2024, confirms that protocol designs under the semi-honest model can achieve orders-of-magnitude performance improvements, which is critical for supporting large-scale, high-concurrency real-time privacy services [33]. The semi-honest adversary model assumes that adversaries follow all protocol-specified steps while passively observing the execution to extract information such as inputs or outputs. Unlike the malicious adversary model, it presumes that adversaries do not actively alter information or deviate from the prescribed protocol during execution. Security of multi-party computation under the semi-honest adversary model is formally defined as follows. Let $n$ participants $P_1, P_2, \ldots, P_n$, holding input datasets $X_1, X_2, \ldots, X_n$, jointly compute the function $f = f(X_1, X_2, \ldots, X_n)$ via protocol $\Pi$, and let a semi-honest adversary $A$ control a subset $I$ of one or more participants. The protocol is secure if and only if, for any such adversary $A$, there exists a simulator $S$ such that the view $\mathrm{View}_A^{\Pi}(X_1, X_2, \ldots, X_n)$ generated by $A$ during protocol execution is computationally indistinguishable from the simulated view output by $S$, for all input datasets $X_1, X_2, \ldots, X_n$ and all adversary-controlled subsets $I$.
To formalize computational indistinguishability, we introduce a probability function $\Pr$, which denotes the probability that adversary $A$ observes a given view:
$$\Pr\left[\mathrm{View}_A^{\Pi}(X_1, X_2, \ldots, X_n)\right] = \Pr\left[S\left(f(X_1, X_2, \ldots, X_n)\right)\right] \tag{1}$$
where $X_I$ represents the inputs of the participants controlled by adversary $A$. The security of the SMPC model under the semi-honest adversary model ensures that adversary $A$ cannot deduce the private inputs of honest participants from intercepted protocol data. Specifically, there exists a simulator $S$ that produces a simulated view, computationally indistinguishable from the adversary’s real view, using only the protocol output and the inputs of the corrupted participants.

3.2. Oblivious Transfer

Oblivious Transfer (OT) is a core cryptographic primitive in the field of secure computation, first introduced by Rabin [18]. The 1-out-of-2 OT protocol describes a scenario in which a sender holding two input strings $(m_0, m_1)$ interacts with a receiver who possesses a selection bit $b$. In this interaction, the receiver learns $m_b$ without learning $m_{1-b}$, and the sender remains unaware of the value of $b$. In 2003, Ishai et al. [34] proposed an OT extension protocol, commonly referred to as IKNP, which enables the execution of a large number of OT instances by performing only a small number of computationally expensive base OTs. In this paper, all OT implementations employed in the tests are based on the ALSZ scheme [28].

3.3. SM2 Algorithm

The SM2 algorithm is a public-key cryptographic algorithm based on elliptic curves whose security is primarily founded on the intractability of the Elliptic Curve Discrete Logarithm Problem (ECDLP). Compared with other public-key cryptographic schemes (e.g., RSA, Paillier) offering comparable security levels, SM2 provides advantages including shorter key lengths and smaller parameter sizes. Moreover, the number of logic gates required to implement SM2 in hardware (e.g., on chips) is significantly smaller than that required by RSA, resulting in reduced power consumption. This paper adopts the Chinese national cryptographic standard SM2, as defined in GB/T 32918-2016 [7], which specifies the use of an elliptic curve over a 256-bit prime field. To ensure clarity and standardization, this paper adopts the recommended parameters defined in Part 4 of the standard:
  • Curve Equation: The algorithm operates on an elliptic curve over a prime field $\mathbb{F}_p$, defined by the equation $y^2 = x^3 + ax + b$;
  • Key Parameters: It utilizes a 256-bit prime field, offering security comparable to NIST P-256 but with independent parameter generation;
  • Efficiency: Compared to RSA-2048, SM2 employs shorter 256-bit keys and requires fewer logic gates for hardware implementation, resulting in lower power consumption and higher processing speed.

3.4. SM3 Algorithm

SM3, a Chinese national cryptographic standard, is a cryptographic hash function independently developed in China and is primarily used in scenarios such as digital signatures, message authentication codes, and other data-integrity verification tasks. The parameters of the SM3 hash algorithm adopted in this paper are defined as follows:
  • Input and Padding: The input message length must be less than $2^{64}$ bits. The message is padded and parsed into 512-bit message blocks;
  • Compression Function: The algorithm utilizes a 32-bit word size and produces 132 message words from each block. The compression function executes 64 rounds of operations;
  • Output Digest: After processing all blocks sequentially, the algorithm produces a fixed 256-bit hash value, which ensures high collision resistance and compatibility with the SM2 algorithm.
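The block structure listed above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming SM3's padding rule (one `1` bit, then zero bits until the length is 448 mod 512, then a 64-bit length field); the function names are illustrative and do not come from the paper or from any SM3 library.

```python
def sm3_padded_length_bits(msg_bits: int) -> int:
    """Length in bits after appending a 1 bit, k zero bits, and a 64-bit length field."""
    if not 0 <= msg_bits < 2**64:
        raise ValueError("message length must be below 2**64 bits")
    # Smallest k >= 0 such that msg_bits + 1 + k is congruent to 448 modulo 512.
    k = (448 - (msg_bits + 1)) % 512
    return msg_bits + 1 + k + 64


def sm3_block_count(msg_bits: int) -> int:
    """Number of 512-bit blocks fed to the compression function."""
    return sm3_padded_length_bits(msg_bits) // 512


# Message expansion yields 68 words W_0..W_67 and 64 words W'_0..W'_63 per block.
WORDS_PER_BLOCK = 68 + 64
ROUNDS_PER_BLOCK = 64
```

Under this rule a 447-bit message still fits in one block, while a 448-bit message already needs two, which is why short identifiers hash in a single compression call.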

3.5. Non-Interactive Zero-Knowledge Proof (NIZK)

Non-interactive zero-knowledge proof (NIZK) is a specialized form of zero-knowledge proof in which only a single message is exchanged between the prover and the verifier: the prover sends a proof that the verifier can validate without any multi-round interaction. Based on the binary relation formed by points on the SM2 elliptic curve and their discrete logarithms, defined as $R_{DL} = \{(G, Q, P, w) \mid P = wG\}$, this paper constructs the non-interactive zero-knowledge functionality $\mathcal{F}_{\text{com-ZK}}^{R}$ using SM2. Only a single message exchange via a program interface is required to implement key-correctness verification. For the non-interactive zero-knowledge functionality corresponding to relation $R$, the participating entities are $P_1$ and $P_2$, and the procedure is outlined as follows:
  • Upon receiving the instruction $(\text{prove}, \text{sid}, x, w)$ from participant $P_i$ ($i \in \{1, 2\}$), the system first checks whether the session identifier sid has been previously used. If it has, the message is ignored; if it has not, the system sends $(\text{proof-receipt}, \text{sid}, x)$ to $P_{3-i}$. If $(x, w) \in R$, the system records $(\text{sid}, i, x)$.
  • Upon receiving the instruction $(\text{response-proof}, \text{sid})$ from $P_i$, if a record of the form $(\text{sid}, i, x)$ exists, the system sends $(\text{response-proof}, \text{sid}, x)$ to $P_{3-i}$.
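The two rules above describe a small state machine, which can be sketched as follows. This is an illustrative model only: the class name `ComZKFunctionality`, the `outbox` list, and the use of a Python predicate for the relation are assumptions introduced here, not part of the paper.

```python
class ComZKFunctionality:
    """Toy model of the ideal functionality for a relation R between parties 1 and 2."""

    def __init__(self, relation):
        self.relation = relation   # predicate (x, w) -> bool, standing in for R
        self.used_sids = set()     # session identifiers seen so far
        self.records = {}          # (sid, i) -> x for accepted statements
        self.outbox = []           # (recipient, message) pairs, in sending order

    def prove(self, i, sid, x, w):
        if sid in self.used_sids:
            return                 # a reused sid is ignored
        self.used_sids.add(sid)
        self.outbox.append((3 - i, ("proof-receipt", sid, x)))
        if self.relation(x, w):
            self.records[(sid, i)] = x   # record only valid (x, w) pairs

    def response_proof(self, i, sid):
        x = self.records.get((sid, i))
        if x is not None:
            self.outbox.append((3 - i, ("response-proof", sid, x)))
```

For example, with the hypothetical relation `lambda x, w: pow(5, w, 23) == x`, a call `prove(1, "sid1", 17, 7)` followed by `response_proof(1, "sid1")` delivers both messages to party 2, whereas a wrong witness produces only the receipt.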

4. Design of a Private Set Intersection Protocol Based on Chinese National Cryptographic Algorithm

4.1. Computational Problem Description of Private Set Intersection

This paper assumes the existence of two participants, Alice and Bob, holding datasets denoted $U_A$ and $U_B$, respectively. The datasets $U_A$ and $U_B$ store the unique identifiers (IDs) of users in the sets owned by Alice and Bob, enabling each participant to locate the corresponding users within their respective collections. Specifically, $U_A$ contains $i$ elements, denoted $U_{A_i}$, while $U_B$ contains $j$ elements, denoted $U_{B_j}$. Alice and Bob wish to compute the intersection of their respective sets without revealing the contents of $U_A$ or $U_B$; the overall framework of the SM2-OT-PSI protocol for this task is shown in Figure 1.
Encryption Initialization Phase: Both parties generate the required encryption parameters based on SM2 and perform initial encryption and obfuscation of the user IDs.
Oblivious Transfer Phase: Without direct communication, both parties verify the correctness of the submitted public and private keys via signature generation and verification mechanisms. This approach not only prevents man-in-the-middle attacks but also avoids the substantial communication overhead of interactive verification, thereby ensuring that no private-key information is leaked throughout the process.
Private Set Intersection Computation Phase: Since identical user IDs encrypted using the SM3-based OPRF yield identical ciphertexts, the two parties can directly compare the encrypted outputs. Thus, the intersection of the sets can be efficiently derived without requiring decryption of the ciphertexts.

4.2. Protocol Flow of SM2-Based Non-Interactive Zero-Knowledge Proof

This paper designs a Non-Interactive Zero-Knowledge (NIZK) proof functionality, denoted as $\mathcal{F}_{\text{com-ZK}}^{R}$, based on the SM2 algorithm. Under this functionality, the Sender (S) transmits a ciphertext $c_1 = [k]G = (x_1, y_1)$ containing the critical key information $k$ to the Receiver (R) and proves the correctness of the key $k$ via $\mathcal{F}_{\text{com-ZK}}^{R}$.
  • Proof Generation: Sender S first sends the instruction $(\text{prove}, \text{sid} \,\|\, 1, c_1, k)$ to the functionality $\mathcal{F}_{\text{com-ZK}}^{R}$, then selects a random number $r$ and computes $R = [r]G$. Subsequently, Sender S sends the instruction $(\text{response-proof}, \text{sid} \,\|\, 1)$ to $\mathcal{F}_{\text{com-ZK}}^{R}$ and computes the following based on the returned result:
    $$c = \mathrm{Hash}(R, c_1, G), \tag{2}$$
    $$s = r + kc. \tag{3}$$
    This generates the proof ciphertext $c = (R, s)$. Finally, Sender S transmits the proof ciphertext $c$ and the corresponding ciphertext set to Receiver R.
  • Proof Reception and Recording: Upon receiving the ciphertext set and the proof ciphertext $c = (R, s)$ from Sender S, Receiver R simultaneously receives a return message $(\text{proof-receipt}, \text{sid} \,\|\, 1)$ from the functionality $\mathcal{F}_{\text{com-ZK}}^{R}$ and records $(\text{sid} \,\|\, 1, 1, c_1)$, where $c_1 = [k]G = (x_1, y_1)$.
  • Proof Verification: Receiver R verifies the ciphertext $c_1$ using the proof ciphertext $c = (R, s)$. First, Receiver R computes $c$ as defined in Equation (2). Subsequently, Receiver R receives the message $(\text{response-proof}, \text{sid} \,\|\, 1, c_1)$ returned by the functionality $\mathcal{F}_{\text{com-ZK}}^{R}$. If this message is not received, the protocol is aborted; if received, the zero-knowledge proof $(G, R, c_1) \in R_{DL}$ is verified. If the verification fails, the protocol is likewise aborted.
  • Result Determination: When all the aforementioned verifications pass, Receiver R further computes and checks whether the following equation holds:
    $$[s]G = R + [c]c_1. \tag{4}$$
    If the equation does not hold, the protocol is aborted. If the equation holds, the correctness of $c_1 = [k]G = (x_1, y_1)$ is accepted, thereby completing the verification of the discrete logarithm relationship.
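The proof generation and the final check $[s]G = R + [c]c_1$ can be exercised end to end with a small sketch. It is illustrative only and makes loud assumptions: a textbook toy curve ($y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$, base point $(5, 1)$ of order 19) replaces the SM2 curve, SHA-256 replaces SM3, $s$ is reduced modulo the group order, and the helper names (`add`, `mul`, `challenge`, `prove`, `verify`) are hypothetical.

```python
import hashlib
import secrets

# Toy parameters for illustration only, NOT the SM2 recommended curve.
P, A, N = 17, 2, 19
G = (5, 1)


def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, pt):
    """Double-and-add scalar multiplication [k]pt."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def challenge(R, c1):
    """c = Hash(R, c1, G) reduced modulo N (SHA-256 stands in for SM3)."""
    data = repr((R, c1, G)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def prove(k, c1):
    """Sender side: pick r, publish R = [r]G and s = r + k*c (mod N)."""
    r = secrets.randbelow(N - 1) + 1
    R = mul(r, G)
    s = (r + k * challenge(R, c1)) % N
    return R, s


def verify(c1, proof):
    """Receiver side: accept iff [s]G == R + [c]c1."""
    R, s = proof
    return mul(s, G) == add(R, mul(challenge(R, c1), c1))
```

The check succeeds for an honest sender because $[s]G = [r + kc]G = R + [c]([k]G)$. A 19-element group offers no security, so the sketch only demonstrates the algebra.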

4.3. Protocol Flow of SM2-Based Oblivious Transfer

This paper constructs a 1-out-of-$i$ OT protocol using the SM2 public-key cryptographic algorithm. The sender, Alice, holds a message set $U_A = \{U_{A_1}, \ldots, U_{A_i}\}$, while the receiver, Bob, generates a private-key set $d_B = \{d_{B_1}, \ldots, d_{B_i}\}$ and selects one key to obtain the corresponding message. The sender remains unaware of which message the receiver selects, and the receiver learns only the chosen message without obtaining any information about the remaining ones. OPRF enables two parties to jointly compute a pseudorandom function value while ensuring that neither party learns the other’s private input. In Section 4.4, the SM2-based OT scheme is extended to construct an SM2-OPRF scheme used within the PSI protocol. The protocol flow is as follows:
  • Operations of Bob
    • According to the GB/T 32918-2016 standard, Bob selects a private key $d_{B_i}$ in the range $[1, n-2]$ and derives the corresponding public key $P_B$ using the elliptic curve base point $G$.
  • Operations of Alice
    • Alice obtains the public parameters.
    • For each message, Alice generates a random value $k_i$ and computes
      $$c_{1i} = [k_i]G = (x_{1i}, y_{1i}). \tag{5}$$
    • Alice computes
      $$[k_i]P_B = (x_{2i}, y_{2i}). \tag{6}$$
    • Alice generates the key material
      $$t_i = \mathrm{KDF}(x_{2i} \,\|\, y_{2i}, \text{klen}).$$
    • Alice encrypts each message $m_i \in U_A$ as
      $$c_{2i} = m_i \oplus t_i.$$
    • Alice computes a signature-like verification tag
      $$c_{3i} = \mathrm{Hash}(x_{2i} \,\|\, m_i \,\|\, y_{2i}).$$
    • Alice transmits the ciphertext tuple
      $$C_i = (c_{1i} \,\|\, c_{3i} \,\|\, c_{2i})$$
      to Bob.
  • Operations of Bob
    • Bob receives the ciphertexts and selects the ciphertext $C_i$ corresponding to the message he intends to obtain.
    • Bob computes
      $$(x_{2i}, y_{2i}) = [d_{B_i}]c_{1i} = [k_i]P_B.$$
    • Bob recomputes $t_i$ from $(x_{2i}, y_{2i})$ and recovers the message as
      $$c_{2i} \oplus t_i = m_i.$$
    • Bob verifies the correctness of the message by checking whether
      $$\mathrm{Hash}(x_{2i} \,\|\, m_i \,\|\, y_{2i}) = c_{3i}.$$
      If this verification equation holds, the message is deemed valid and is accepted. Otherwise, the ciphertext has been tampered with or the sender has failed to follow the protocol. In this case, the receiver immediately executes the abort procedure, terminates the session, and returns the failure identifier ⊥ to prevent any potential information leakage.
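The sender and receiver steps above follow the shape of SM2 public-key encryption and can be traced with a short sketch. It is a toy illustration under stated assumptions, not the authors' implementation: the curve is a textbook example over $\mathbb{F}_{17}$ rather than SM2, SHA-256 stands in for SM3, `klen` is counted in bytes, coordinates are encoded as 32-byte big-endian strings, and all function names are hypothetical.

```python
import hashlib
import secrets

# Toy curve y^2 = x^3 + 2x + 2 over F_17, base point of order 19 (NOT SM2).
P, A, N = 17, 2, 19
G = (5, 1)


def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, pt):
    """Double-and-add scalar multiplication [k]pt."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def coord(v):
    """Fixed-width big-endian encoding of a field element."""
    return v.to_bytes(32, "big")


def kdf(z, klen):
    """Counter-mode KDF: hash z || ct for ct = 1, 2, ... and keep klen bytes."""
    out, ct = b"", 1
    while len(out) < klen:
        out += hashlib.sha256(z + ct.to_bytes(4, "big")).digest()
        ct += 1
    return out[:klen]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(m, pb):
    """Alice: C = (c1, c3, c2) with c2 = m XOR t and c3 = Hash(x2 || m || y2)."""
    k = secrets.randbelow(N - 1) + 1
    c1 = mul(k, G)
    x2, y2 = (coord(v) for v in mul(k, pb))
    c2 = xor(m, kdf(x2 + y2, len(m)))
    c3 = hashlib.sha256(x2 + m + y2).digest()
    return c1, c3, c2


def decrypt(d, ct):
    """Bob: recompute (x2, y2) = [d]c1, recover m, and check the tag c3."""
    c1, c3, c2 = ct
    x2, y2 = (coord(v) for v in mul(d, c1))
    m = xor(c2, kdf(x2 + y2, len(c2)))
    if hashlib.sha256(x2 + m + y2).digest() != c3:
        return None  # abort and return the failure identifier
    return m
```

With a private key `d = 5` and `pb = mul(d, G)`, `decrypt(d, encrypt(b"id-002", pb))` returns the original bytes, while flipping any bit of `c2` makes the tag check fail and the function returns `None`.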

4.4. SM2-Based Oblivious Pseudorandom Function via OT Extension

This section builds upon the SM2 Oblivious Transfer (OT) protocol introduced in Section 4.3, extending it to an Oblivious Pseudorandom Function (OPRF) protocol. In the OPRF protocol, the sender, Alice, generates a pseudorandom output for each of her inputs, determined by the implicit choice derived from the receiver, Bob’s private key; meanwhile, Bob is only able to retrieve the output corresponding to the key he has selected. OPRF acts as a fundamental primitive in the construction of privacy-preserving protocols, including PSI.
Protocol Goal: Alice possesses an input set $U_A = \{U_{A_1}, \ldots, U_{A_j}\}$, while Bob holds a private key $d_B$ and its corresponding public key $P_B$. The two parties jointly compute the following: Alice obtains the OPRF values $c_{2i}$ corresponding to all inputs $U_{A_j}$, while Bob learns only the plaintext $U_{A_i}$ corresponding to a specific value $c_{2i}$ and gains no information about the other values $c_{2j}$ with $j \neq i$. Moreover, Alice is unable to discern which index $i$ Bob has selected. The protocol flow is as follows:
  • Operations of Bob
    • Bob selects (or generates) an SM2 private key $d_{B_i}$ and computes the corresponding public key $P_B = [d_{B_i}]G$. He transmits $P_B$ to Alice while keeping $d_{B_i}$ confidential.
  • Operations of Alice
    • Alice obtains the public parameters.
    • For each message, Alice samples a random value $k_i$ and computes $c_{1i}$ as defined in Equation (5).
    • Alice computes $[k_i]P_B$ as defined in Equation (6).
    • Alice derives the key material using the key-generation function:
      $t_i = \mathrm{KDF}(x_{2i} \,\|\, y_{2i}, klen)$.
      Here, KDF is the key derivation function defined in the GB/T 32918.4-2016 standard [7], which generates one or more shared secret keys by operating on the shared secret and other parameters known to both participating parties.
    • Alice maps each message $m_i \in U_A = \{U_{A_1}, \ldots, U_{A_i}\}$ to
      $c_{2i} = \mathrm{SM3}(t_i \,\|\, U_{A_i})$
      This step implements the specific instantiation of the pseudorandom function
      $\mathrm{OPRF}(t_j, U_{A_j})$
      and the output $c_{2j}$ serves as the final OPRF value corresponding to the input $U_{A_j}$.
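
The key-derivation and hashing steps on Alice's side can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `_hash`, `kdf`, and `oprf_value` are hypothetical, the shared-point coordinates are passed in as plain integers, and SHA-256 is used as a stand-in whenever the local OpenSSL build does not expose SM3. The counter-mode structure of `kdf` follows the general shape of the KDF in GB/T 32918.4 (hash of the shared secret concatenated with a 32-bit counter starting at 1).

```python
import hashlib


def _hash(data: bytes) -> bytes:
    """Return SM3(data) if available, otherwise a SHA-256 stand-in."""
    try:
        return hashlib.new("sm3", data).digest()
    except ValueError:
        # Stand-in only: same 256-bit output length, but NOT the SM3 function.
        return hashlib.sha256(data).digest()


def kdf(z: bytes, klen_bits: int) -> bytes:
    """Counter-mode KDF: hash Z || ct for ct = 1, 2, ... and keep klen bits."""
    out = b""
    ct = 1
    while len(out) * 8 < klen_bits:
        out += _hash(z + ct.to_bytes(4, "big"))
        ct += 1
    # Sketch simplification: klen_bits is assumed to be a multiple of 8.
    return out[: klen_bits // 8]


def oprf_value(x2: int, y2: int, item: bytes, klen_bits: int = 256) -> bytes:
    """Compute c2 = H(t || item) with t = KDF(x2 || y2, klen)."""
    z = x2.to_bytes(32, "big") + y2.to_bytes(32, "big")
    t = kdf(z, klen_bits)
    return _hash(t + item)


if __name__ == "__main__":
    # Toy coordinates and a hypothetical identifier, for illustration only.
    print(oprf_value(5, 7, b"user-0001").hex())
```

The same inputs always produce the same value, so a party that can recompute $(x_2, y_2)$ obtains a matching output, while a different shared point yields an unrelated digest.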

4.5. Protocol Flow of Private Set Intersection

This paper proposes a PSI protocol based on a public-key cryptosystem that uses SM2 and SM3 throughout the computation. This section describes the protocol steps and operational procedures; the overall workflow of the national-cryptography-based PSI protocol is presented in Table 2. Computation scheduling is performed on the software side, whereas the point-multiplication operations of SM2 and the hash computations of SM3 are executed on the hardware side. This software–hardware co-design improves computational efficiency and reduces communication overhead: it minimizes the number of communication rounds and the amount of transmitted data, offloads computationally intensive operations to the chip-integrated arithmetic module, and delegates branch-control logic to the software layer. Each participant deploys a server equipped with the domestic reconfigurable cryptographic security chip RSP S20G, which provides hardware acceleration for national cryptographic algorithms. Owing to its programmability, the RSP S20G can accommodate new computational requirements through algorithm updates or optimizations without hardware replacement. It achieves a throughput of 30 Gbps for SM3 hashing and 300,000 SM2 encryption operations per second. On the software side, the protocol interacts with the chip’s arithmetic module through the CAP interface, while the Linux kernel driver layer and the PCIe 2.0 ×8 interface handle the transmission of data streams, control signals, and status signals. The eight-channel read/write mechanism supports a data rate of up to 4 GB/s, enabling tight software–hardware integration and efficient data flow within the system.
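
To give a feel for the throughput figures quoted above, the following back-of-the-envelope sketch estimates how long the accelerator would need for a given workload. It is only an illustration under stated assumptions: one SM2 operation per set element is assumed (the actual operation count per element is not specified here), and the helper names are hypothetical.

```python
SM2_OPS_PER_SEC = 300_000   # SM2 encryption throughput quoted for the RSP S20G
LINK_BYTES_PER_SEC = 4e9    # 4 GB/s read/write rate quoted for the PCIe 2.0 x8 path


def sm2_seconds(n_items: int, ops_per_item: int = 1) -> float:
    """Time to run n_items * ops_per_item SM2 operations at the quoted rate."""
    return n_items * ops_per_item / SM2_OPS_PER_SEC


def transfer_seconds(n_bytes: float) -> float:
    """Time to move n_bytes between host and chip at the quoted link rate."""
    return n_bytes / LINK_BYTES_PER_SEC


if __name__ == "__main__":
    print(f"SM2 work for 2^20 items: {sm2_seconds(2**20):.2f} s")
    print(f"Moving 32 MB over the link: {transfer_seconds(32e6) * 1e3:.1f} ms")
```

Under these assumptions, $2^{20}$ SM2 operations take roughly 3.5 s, whereas moving 32 MB across the host–chip link takes about 8 ms, which suggests that the scalar multiplications rather than the PCIe transfers would dominate the accelerator-side cost.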

4.6. Security Analysis

The security of the SM2-OT-PSI protocol relies on the Chinese National Cryptographic Algorithms, non-interactive zero-knowledge proofs, and the security mechanisms provided by hardware security chips. In this section, we analyze the security of the protocol using a simulation-based proof in the semi-honest adversary model. From the hardware perspective, the RSP S20G security chip incorporates physical attack-resistant features, such as anti-disassembly protections and countermeasures against side-channel attacks. Its operational logic and internal data-processing mechanisms have undergone rigorous security certification. By optimizing the implementations of the SM2 and SM3 algorithms, the chip prevents leakage of intermediate computational states. Additionally, private keys are generated internally and stored within a dedicated secure region, so that they cannot be accessed or tampered with by external entities. From the software perspective, the protocol workflow is designed in accordance with the security guarantees required under the semi-honest adversary model. All ciphertexts transmitted during the protocol are encrypted using SM2, whose security is based on the intractability of the ECDLP, and are further protected by the cryptographic hash function SM3. Data is transferred to the security chip through the high-speed and stable PCIe 2.0 ×8 interface, which supports encryption and identity authentication. Both parties establish a secure communication channel through a cryptographic handshake protocol, which ensures the confidentiality and integrity of transmitted data, provides resistance against collusion attacks, and prevents replay attacks. This architecture follows the privacy-preserving computation model under the semi-honest adversary assumption: during protocol execution, a semi-honest adversary cannot obtain any information beyond the intended output. 
With the combined support of hardware security chips, national cryptographic algorithms, and the oblivious transfer mechanism, collusion among multiple adversaries to infer private data is effectively mitigated. The formal proofs of correctness and computational indistinguishability for the PSI protocol based on Chinese National Cryptographic Algorithms are presented below:
Theorem (Security). If the Elliptic Curve Discrete Logarithm Problem on the SM2 curve is intractable and the SM3 hash function is cryptographically secure, then the Private Set Intersection protocol based on Chinese National Cryptographic Algorithms is computationally indistinguishable. Proof: Suppose there exists a probabilistic polynomial-time adversary $\mathcal{A}$ that can distinguish the ciphertexts of the Chinese National Cryptographic Algorithm-based PSI protocol with a non-negligible advantage. In that case, we can construct a challenger $\mathcal{C}$ that breaks the intercepted ciphertext with a non-negligible advantage. Below, we prove (following the security standards and methods specified in the privacy-preserving computation model against semi-honest adversaries) that during the secure computation of $U_A \cap U_B$, the adversary gains no private information about the other party beyond the intersection result $U_A \cap U_B$.
Construction of Simulator $S_1^{\Pi}$: Let the input data of parties $P_1$ and $P_2$ be $X_1$ and $X_2$, respectively, and let $f = f(X_1, X_2)$ denote the function computed via the SM2-OT-PSI protocol. For an execution of the SM2-OT-PSI protocol on inputs $X_1$ and $X_2$, the view of party $P_1$ is $\mathrm{View}_1^{\Pi}(X_1) = (X_1, c, m_1, \ldots, m_t)$, where $c$ denotes the ciphertext set and $m_1, \ldots, m_t$ denote the received messages. The view of party $P_2$ is $\mathrm{View}_2^{\Pi}(X_2) = (X_2, d_B, m_1, \ldots, m_t)$, where $d_B$ denotes the private key. The challenger $\mathcal{C}$ runs the system setup algorithm to obtain the system parameters and the public key $P_B$, and then randomly selects a dataset of unique user ID values that satisfy the intersection protocol strategy.
Challenge Phase: When the adversary knows the output of $\mathcal{C}$’s protocol, and the input of the participant controlled by $\mathcal{C}$ is $c = (c_1, c_3, c_2)$, $\mathcal{C}$ outputs an arbitrary ID value $b$ from its dataset. Because $c_2$ is a hash value generated by the SM3 hash algorithm, the adversary cannot obtain $b$ by inverting $c_2$. Even if the adversary uses the same SM3 algorithm and is given $c_1$, the private key $d_B$ remains unknown. Therefore, the adversary cannot compute $[d_B]c_1 = [d_B][k]G = (x_2, y_2)$, and thus cannot obtain $b$ via brute-force methods (e.g., rainbow table attacks). When the adversary controls $\mathcal{C}$’s input, $\mathcal{C}$ does not output the obtained intersection result, thus preventing the adversary from learning $b$. For the security of $\mathcal{C}$’s private information: if, under the control of the simulator $S_1^{\Pi}$, the view that an adversary $\mathcal{A}$ can obtain in polynomial time contains no more information than the view obtained by $\mathcal{C}$, and $\Pr[\mathrm{View}_{\mathcal{A}}^{\Pi}(X_1, X_2, c)] = \Pr[S_f(X_1, X_2, c)]$, then $\mathcal{C}$’s private information remains secure. It follows that, in the presence of a semi-honest adversary, the protocol securely computes $f$ with no information leakage, thereby satisfying the security requirement.
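
The identity $[d_B]c_1 = [d_B][k]G = [k]P_B$ used in the argument above can be checked numerically on a toy curve. The sketch below is purely illustrative and rests on loud assumptions: it uses the small textbook curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with base point $(5, 1)$, which is not the SM2 curve and offers no security, and the helper names `add` and `mul` are hypothetical.

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 with base point G = (5, 1).
# Illustration only: these are NOT the SM2 parameters.
P, A = 17, 2
G = (5, 1)


def add(p1, p2):
    """Add two affine points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def mul(k, pt):
    """Double-and-add scalar multiplication [k]pt."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


if __name__ == "__main__":
    d_b, k = 7, 11                 # toy private key and ephemeral scalar
    p_b = mul(d_b, G)              # Bob's public key
    c1 = mul(k, G)                 # first ciphertext component
    print(mul(k, p_b), mul(d_b, c1))  # both sides give the same point
```

The holder of $d_B$ reaches the shared point from $c_1$ alone, whereas a party that only sees $c_1$ and $P_B$ would have to recover $k$ or $d_B$, which is the discrete-logarithm problem on the real curve.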

5. Performance Analysis

5.1. Experimental Setup

This section presents a comprehensive performance evaluation of the implementations of SM-PSI and several representative PSI protocols. The experiments are conducted on a server configured with an Intel(R) Core(TM) i5-12400F processor, 32 GB of memory, and the CentOS 7 operating system. All implementations are developed in C and Verilog, with the SM2/SM3 software libraries built on OpenSSL. To simulate real-world network conditions, bandwidth and RTT latency are manually configured using the Linux tc command. We first summarize the salient features of the experimental setup. The evaluation covers both symmetric and asymmetric PSI scenarios, with dataset sizes ranging from $2^{12}$ to $2^{20}$, including a highly unbalanced setting with $2^{20}$ elements on one side and $2^{10}$ on the other. To reflect practical deployment conditions, experiments are conducted under multiple network environments, including a high-speed LAN (10 Gbps) and several WAN settings with different bandwidth and RTT configurations. Both single-threaded and multi-threaded executions are evaluated to examine scalability and parallelism. Representative PSI protocols from different design paradigms, including OT/OPRF-based, mqRPMT-based, and hardware-accelerated schemes, are selected as baselines for comparison. For the proposed protocol, computationally intensive SM2 scalar multiplication and SM3 hashing operations are offloaded to a reconfigurable cryptographic accelerator, enabling an accurate assessment of the benefits of software–hardware co-design.
In each experimental round, multiple PSI protocols are evaluated. The comparison baselines represent diverse technical paradigms: the hardware-accelerated optimization schemes DH-IPP [14], FLASH-RSA [17], and SpOT-Light [23]; the blinding OPRF-based paradigm RA17 [26]; the single-point OPRF-based paradigm KKRT [22]; the multi-point OPRF-based paradigms PRTY19 [12], CM20 [25], and PRTY20 [24]; and the mqRPMT-based paradigm GMR21 [21]. All protocols are implemented and evaluated under the semi-honest security model. Among them, the hardware configurations adopted by DH-IPP, FLASH-RSA, and PRTY19 are summarized in Table 3, and their corresponding hardware speedups are illustrated in Figure 2. RA17 is implemented through the construction of a blinding OPRF based on ECC, and the SECP256 curve is selected to satisfy security requirements under the standard model. In the implementations of KKRT and GMR21, hash bucket partitioning employs a three-way cuckoo hashing scheme. The OT instances required by KKRT, PRTY19, CM20, GMR21, and PRTY20 are uniformly instantiated using the ALSZ [28] scheme.
All protocols can be divided into two primary phases: the initialization phase and the online phase. The initialization phase comprises one-time preprocessing procedures conducted prior to the execution of the actual PSI protocol, including key generation, baseline OT, and supplementary operations; the online phase implements the formal workflow of the PSI protocol on the input sets provided by both participating parties. To assess the scalability of PSI, the input set sizes used in the experiments are $2^{12}$, $2^{16}$, and $2^{20}$. Furthermore, we tested a scenario with asymmetric dataset sizes, in which the set sizes of the sender and receiver are $2^{20}$ and $2^{10}$, respectively. The network environments simulated with the tc command comprise a LAN environment (bandwidth: 10 Gbps, RTT: 0.02 ms) and three WAN environments (bandwidth: 1 Gbps/100 Mbps/10 Mbps; corresponding RTT: 20 ms/40 ms/80 ms). We also assessed the performance under both single-threaded (T = 1) and multi-threaded (T = 7) configurations. The full performance test results for symmetric set sizes are presented in Table 4 and Figure 3, where “R” and “S”, respectively, denote the communication volumes transmitted by the two participating parties. The running time is calculated from the timestamps recorded at the initiation and termination of the protocol (with the receiver’s time adopted as the reference), whereas the communication volume is calculated by summing the sizes of the data packets generated during communication. For a more intuitive visualization of the data, Figure 3 illustrates how running time varies as bandwidth decreases.
The axes of the figure use a logarithmic scale, and the legend is shared across all subfigures: Subfigures (a, b, c) correspond to the single-threaded configuration, whereas Subfigures (d, e, f) correspond to the multi-threaded configuration. From left to right, each column corresponds to the set sizes $n = 2^{12}$, $2^{16}$, and $2^{20}$, respectively. The full performance test results for asymmetric set sizes are presented in Table 5 and Figure 4, where Subfigure (a) corresponds to the single-threaded configuration and Subfigure (b) corresponds to the multi-threaded configuration.
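
The four network profiles above could be reproduced with commands along the following lines. This is a hedged sketch rather than the configuration actually used in the experiments: it assumes the `netem` queueing discipline is used for shaping, that the same command is applied on both endpoints (so each side contributes half of the target RTT), and that `eth0` is a placeholder interface name. The helper `netem_command` is hypothetical and only builds the command strings.

```python
# Network profiles from the experimental setup: (bandwidth, RTT in ms).
PROFILES = {
    "LAN": ("10gbit", 0.02),
    "WAN-1G": ("1gbit", 20),
    "WAN-100M": ("100mbit", 40),
    "WAN-10M": ("10mbit", 80),
}


def netem_command(dev: str, profile: str) -> str:
    """Build one tc/netem command for a single endpoint.

    Assumption: the command is applied on both endpoints, so each side
    adds half of the target RTT as one-way egress delay.
    """
    rate, rtt_ms = PROFILES[profile]
    return f"tc qdisc replace dev {dev} root netem rate {rate} delay {rtt_ms / 2:g}ms"


if __name__ == "__main__":
    for name in PROFILES:
        print(netem_command("eth0", name))  # "eth0" is a placeholder interface
```

Generating the commands from a single table keeps the bandwidth and RTT pairs in one place, which makes it harder for the two endpoints to drift out of sync between runs.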

5.2. Experimental Data and Evaluation

Among semi-honest model protocols, the proposed protocol demonstrates notable advantages over SpOT-Light, DH-IPP, and FLASH-RSA with respect to computational and communication complexity. By reducing the frequency with which datasets are involved in complex encryption computations and decreasing the number of communication rounds that transmit encrypted datasets, the proposed protocol effectively reduces the communication data volume and the number of computational operations. This performance merit is particularly critical for applications handling large-scale datasets and operating in low-bandwidth environments, which is consistent with the demands of practical deployments. Notably, SM-PSI integrates the national cryptographic algorithms SM2 and SM3 into its operational workflow. This research furnishes critical support for the realization of independent and controllable cybersecurity within China’s cyberspace.
Table 3 presents a comparative analysis of the complexity of PSI protocols developed upon hardware-optimized architectures. Here, $N_X$ and $N_Y$ denote the sizes of the client-side and server-side datasets, respectively, with $t = N_X + N_Y$; $pk$ represents the number of asymmetric encryption operations; $\rho$ and $\phi$ denote the asymmetric security parameter and the elliptic curve size, respectively; $\ell$ denotes the width of the OT extension matrix; $\lambda$ is the statistical security parameter; and $k$ denotes the security parameter employed for the hash function. SpOT-Low and SpOT-Fast exhibit an identical total communication complexity of $O((\lambda + \log t) \cdot N_X + \ell \cdot N_X)$. The computational complexity of SpOT-Low comprises $O(pk)$ asymmetric public-key operations, $O(t \log^2 t)$ finite field operations, and $O(tk)$ symmetric cryptographic operations; the computational complexity of SpOT-Fast comprises $O(pk)$ public-key operations, $O(t\lambda)$ finite field operations, and $O(tk)$ symmetric cryptographic operations. For the DH-IPP PSI protocol, the communication complexity is $O((N_X + 2N_Y) \cdot \phi) = O(t \cdot \phi)$, and it performs $2t$ ECC point multiplications, yielding a computational complexity of $O(2t)$. For the FLASH-RSA PSI protocol, the communication complexity is $O(N_Y \cdot (\lambda + \log^2 t)) = O(t \cdot \lambda)$, and its computational complexity comprises $O(t \log t)$ cryptographic operations. For the proposed SM-PSI protocol, the communication complexity is $O(N_X \cdot k + \rho) = O(t \cdot k)$, and the computational complexity is $O(t \cdot k)$.
In Figure 2, the hardware acceleration achieved by these protocols is compared across input set sizes ranging from 10 K to 50 K for both participating parties. Specifically, for SpOT-Light, its execution on an Intel Xeon vCPU cloud server is compared with that on an Intel Xeon E5-2699 v3 CPU; for the DH-IPP-PSI protocol (implemented with IPP-Crypto and adapted for the Intel Xeon Platinum 8369B CPU), the acceleration is measured relative to the unadapted ECDH-PSI protocol running on the same CPU; for the FLASH-RSA-PSI framework implemented on the Xilinx VU13P FPGA, the acceleration is measured relative to its execution on a general-purpose CPU; and for the proposed SM-PSI protocol (implemented on the RSP S20G chip), the acceleration is measured relative to its execution on a general-purpose Intel CPU. Overall, the hardware-accelerated SM-PSI protocol consistently outperforms its counterpart executed on an Intel CPU, reaching a speedup of 9.0×. The performance-optimized DH-IPP attains a nearly 10× improvement; however, owing to its higher computational complexity, its overall computational speed remains significantly lower than that of SM-PSI.
Comparison of Communication Overhead for Equal-Size Datasets: As shown in Figure 3 and Table 4, the proposed SM-PSI protocol, which follows the multi-point OPRF paradigm, achieves the lowest communication overhead in most test scenarios. In the large-scale symmetric setting with $n = 2^{20}$, the proposed SM2-OT-PSI protocol requires only 32 MB of total communication and achieves an online runtime of 6.27 s under a 10 Gbps LAN environment with multi-threaded execution. Compared with representative baselines, this corresponds to a 2.9× speedup over KKRT, an 8.0× speedup over CM20, and more than an order-of-magnitude improvement over RA17 and GMR21.
Moreover, when transitioning from a 10 Gbps LAN to a 1 Gbps WAN environment, the multi-threaded runtime of the proposed protocol increases only from 6.27 s to 6.78 s, an 8.1% increase, which demonstrates low sensitivity to bandwidth variations owing to reduced communication complexity.
For the asymmetric setting reported in Table 5 ($2^{20}$ vs. $2^{10}$), the proposed protocol achieves an online runtime of 0.35 s under a 10 Gbps LAN environment with multi-threaded execution, significantly outperforming PRTY20 (4.41 s) and GMR21 (129.64 s). These results confirm that the proposed SM2-OT-PSI protocol maintains strong efficiency advantages in both balanced and unbalanced large-scale scenarios. The communication overhead of the evaluated schemes can be gauged from the slope of the curves presented in the figure. Notably, PSI schemes with larger communication volumes, such as KKRT16 (single-point OPRF) and PRTY20, generally incur higher communication overhead, and their running time declines markedly with increasing bandwidth. For the experiment employing a large-scale dataset ($n = 2^{20}$), the total communication volume of the proposed SM-PSI protocol is merely 32 MB: this corresponds to a 74.9% reduction relative to KKRT16 (127.4 MB, roughly 4.0 times that of SM-PSI), a 79.9% reduction relative to PRTY20 (159 MB, approximately 5.0 times that of SM-PSI), and a 97.4% reduction relative to GMR21 (1239 MB, roughly 38.7 times that of SM-PSI). This communication overhead advantage remains consistent across varying dataset sizes.
Comparison of Single-Threaded vs. Multi-Threaded Performance for Equal-Size Datasets: By comparison, the other evaluated PSI schemes demonstrate lower bandwidth sensitivity. Among PSI schemes based on the multi-point OPRF paradigm, the proposed SM-PSI protocol demonstrates more stable computational efficiency, which is particularly evident in experiments on large-scale ID datasets. PSI schemes with distinct architectural designs exhibit varying degrees of sensitivity to multi-threaded execution. As illustrated in Figure 3, most OPRF-based PSI schemes perform better under multi-threaded execution than under single-threaded execution; however, schemes with limited parallelism (e.g., KKRT16) perform better in single-threaded mode when the input dataset is small. In such cases, the latency reduction afforded by parallelism does not offset the overhead incurred by multi-thread scheduling. Furthermore, as the dataset size increases, parallelism generally yields more substantial improvements: when $n = 2^{20}$, multi-threaded execution exhibits a pronounced advantage over single-threaded execution (see Figure 3c,f); in contrast, when $n = 2^{12}$, this advantage is nearly negligible (see Figure 3a,d).
Performance of Equal-Size Datasets in LAN Environment: In a 10 Gbps LAN environment, the proposed SM-PSI protocol demonstrates excellent computational efficiency in terms of running time. When $n = 2^{12}$, the single-threaded running time of the proposed SM-PSI protocol is merely 0.03 s, a roughly 11.3-fold improvement over the second-best scheme, KKRT16 (0.34 s). When $n = 2^{20}$, the multi-threaded running time of the proposed SM-PSI protocol is 6.27 s. As shown in Table 6, this represents improvements ranging from 5.4-fold to 39.6-fold relative to other state-of-the-art schemes (RA17, PRTY19-F, CM20, PRTY20, and GMR21). Even when compared with KKRT16 (18.39 s), the scheme whose performance is closest to that of the proposed SM-PSI protocol, the proposed protocol retains a 2.9-fold advantage.
Performance of Equal-Size Datasets in WAN Environment: In a 1 Gbps WAN environment, the performance advantage of the proposed SM-PSI protocol is more pronounced. When $n = 2^{20}$, the multi-threaded running time of the proposed SM-PSI protocol is 6.78 s, delivering improvements of 7.7-fold, 5.6-fold, and 3.3-fold relative to CM20 (51.95 s), PRTY20 (38.15 s), and KKRT16 (22.26 s), respectively. Notably, the multi-threaded running time of the proposed SM-PSI protocol increases by only 8.1% when transitioning from the LAN environment (6.27 s) to the WAN environment (6.78 s), indicating low sensitivity to variations in network bandwidth, a trait attributed to its low communication complexity.
The proposed SM-PSI protocol demonstrates favorable scalability for large-scale datasets. As the dataset size increases from $2^{12}$ to $2^{20}$ (a 256-fold increase), the multi-threaded running time of the proposed SM-PSI protocol increases from 0.05 s to 6.27 s in the 10 Gbps LAN environment, a roughly 125-fold increase. The running time therefore grows more slowly than the dataset size, reflecting favorable asymptotic efficiency. By comparison, the running time of certain OPRF-based schemes (e.g., PRTY19-F) exhibits a more substantial increase on large-scale datasets.
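
The sub-linear growth noted above can be summarized as an empirical scaling exponent, i.e., the slope of running time against set size on a log-log scale. The sketch below is an illustrative calculation from the two quoted data points only; the helper name `scaling_exponent` is hypothetical, and a two-point estimate involving a 0.05 s measurement should be read as indicative rather than precise.

```python
import math


def scaling_exponent(n_small: float, t_small: float,
                     n_large: float, t_large: float) -> float:
    """Slope of log(time) versus log(n) between two measurements."""
    return math.log(t_large / t_small) / math.log(n_large / n_small)


if __name__ == "__main__":
    # Multi-threaded running times in the 10 Gbps LAN setting.
    alpha = scaling_exponent(2**12, 0.05, 2**20, 6.27)
    print(f"empirical exponent: {alpha:.2f}")
```

An exponent of 1 would indicate running time growing in direct proportion to the set size; the value obtained here is about 0.87, consistent with the observation that a 256-fold larger input costs only about 125 times as long.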
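Table 4. Performance Test Results of PSI Protocol under the Condition of $n_1 = n_2$. Communication columns (R, S, Total) are in MB; running-time columns are in seconds, given per bandwidth for single-thread (T=1) and multi-thread (T=7) execution.
| n | Protocol | R Setup | R Online | S Setup | S Online | Total | 10 Gbps T=1 Setup | 10 Gbps T=1 Online | 10 Gbps T=7 Setup | 10 Gbps T=7 Online | 1 Gbps T=1 Setup | 1 Gbps T=1 Online | 1 Gbps T=7 Setup | 1 Gbps T=7 Online | 100 Mbps T=1 Setup | 100 Mbps T=1 Online | 100 Mbps T=7 Setup | 100 Mbps T=7 Online | 10 Mbps T=1 Setup | 10 Mbps T=1 Online | 10 Mbps T=7 Setup | 10 Mbps T=7 Online |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $2^{12}$ | RA17 | 0 | 0.14 | 0 | 0.19 | 0.33 | 0 | 1.55 | 0 | 1.03 | 0 | 1.68 | 0 | 1.86 | 0 | 1.59 | 0 | 1.75 | 0 | 2.01 | 0 | 1.90 |
| $2^{12}$ | KKRT16 | 0.01 | 0.33 | 0.01 | 0.14 | 0.47 | 0.13 | 0.21 | 0.12 | 0.40 | 0.22 | 0.28 | 0.39 | 0.89 | 0.23 | 0.33 | 0.20 | 0.93 | 0.22 | 0.63 | 0.20 | 1.17 |
| $2^{12}$ | PRTY19-L | 0.01 | 0.23 | 0.01 | 0.05 | 0.28 | 0.03 | 24.00 | 1.24 | 15.04 | 0.03 | 24.17 | 1.30 | 15.14 | 0.03 | 24.30 | 1.19 | 15.38 | 0.03 | 24.48 | 1.23 | 15.79 |
| $2^{12}$ | PRTY19-F | 0.01 | 0.23 | 0.01 | 0.10 | 0.33 | 0.04 | 3.09 | 1.81 | 3.18 | 0.03 | 3.35 | 1.73 | 3.51 | 0.02 | 4.16 | 1.82 | 3.56 | 0.03 | 4.74 | 2.01 | 4.12 |
| $2^{12}$ | PRTY20 | 0.01 | 0.56 | 0.01 | 0.05 | 0.62 | 0.21 | 0.40 | 0.18 | 0.62 | 0.37 | 0.43 | 0.55 | 0.98 | 0.37 | 0.51 | 0.71 | 1.19 | 0.39 | 0.76 | 0.72 | 1.59 |
| $2^{12}$ | CM20 | 0.01 | 0.31 | 0.01 | 0.05 | 0.37 | 0.02 | 0.47 | 0.02 | 0.65 | 0.20 | 0.59 | 0.02 | 0.93 | 0.03 | 0.64 | 0.01 | 0.94 | 0.02 | 0.84 | 0.19 | 1.12 |
| $2^{12}$ | GMR21 | 0.02 | 2.16 | 0.02 | 1.40 | 3.60 | 0.63 | 1.21 | 0.47 | 1.19 | 1.13 | 2.10 | 0.87 | 1.32 | 1.12 | 2.20 | 0.88 | 2.38 | 0.97 | 3.95 | 0.90 | 3.85 |
| $2^{12}$ | SM-PSI (proposed) | 0 | 0.0001 | 0 | 0.13 | 0.13 | 0 | 0.03 | 0 | 0.05 | 0 | 0.25 | 0 | 0.27 | 0 | 0.25 | 0 | 0.27 | 0 | 1.52 | 0 | 0.97 |
| $2^{16}$ | RA17 | 0 | 2.29 | 0 | 3.01 | 5.31 | 0 | 22.11 | 0 | 5.91 | 0 | 22.88 | 0 | 6.95 | 0 | 22.35 | 0 | 6.94 | 0 | 24.86 | 0 | 9.68 |
| $2^{16}$ | KKRT16 | 0.01 | 5.31 | 0.01 | 2.16 | 7.48 | 0.13 | 1.63 | 0.12 | 1.98 | 0.21 | 1.61 | 0.22 | 2.82 | 0.21 | 1.85 | 0.20 | 3.27 | 0.21 | 7.08 | 0.20 | 8.12 |
| $2^{16}$ | PRTY19-F | 0.01 | 3.76 | 0.01 | 1.45 | 5.22 | 0.05 | 42.61 | 1.53 | 21.04 | 0.04 | 44.52 | 1.63 | 21.22 | 0.05 | 46.98 | 1.82 | 26.01 | 0.03 | 47.92 | 1.89 | 26.50 |
| $2^{16}$ | PRTY20 | 0.01 | 8.82 | 0.01 | 0.72 | 9.55 | 0.21 | 2.21 | 0.39 | 2.81 | 0.37 | 2.66 | 0.71 | 4.00 | 0.37 | 3.62 | 0.72 | 4.53 | 0.56 | 10.54 | 0.55 | 11.15 |
| $2^{16}$ | CM20 | 0.01 | 4.99 | 0.01 | 0.73 | 5.73 | 0.02 | 5.52 | 0.01 | 3.54 | 0.02 | 6.36 | 0.03 | 4.95 | 0.03 | 7.71 | 0.01 | 4.60 | 0.02 | 10.24 | 0.01 | 9.11 |
| $2^{16}$ | GMR21 | 0.02 | 39.65 | 0.02 | 27.39 | 67.07 | 0.54 | 10.88 | 0.57 | 8.74 | 0.94 | 13.09 | 1.05 | 11.37 | 0.95 | 16.97 | 0.87 | 13.25 | 1.15 | 62.87 | 0.90 | 60.70 |
| $2^{16}$ | SM-PSI (proposed) | 0 | 0.0001 | 0 | 2.00 | 2.00 | 0 | 0.83 | 0 | 0.28 | 0 | 0.97 | 0 | 0.60 | 0 | 1.12 | 0 | 0.71 | 0 | 2.77 | 0 | 2.97 |
| $2^{20}$ | RA17 | 0 | 36.70 | 0 | 50.33 | 87.03 | 0 | 329.4 | 0 | 75.16 | 0 | 339.3 | 0 | 78.72 | 0 | 349.3 | 0 | 79.72 | 0 | 380.1 | 0 | 118.5 |
| $2^{20}$ | KKRT16 | 0.01 | 86.51 | 0.01 | 40.89 | 127.4 | 0.13 | 20.25 | 0.15 | 18.24 | 0.22 | 23.73 | 0.21 | 22.05 | 0.21 | 27.33 | 0.20 | 26.44 | 0.21 | 84.55 | 0.19 | 82.64 |
| $2^{20}$ | PRTY19-F | 0.01 | 61.26 | 0.01 | 27.27 | 88.54 | 0.17 | 655.8 | 3.84 | 244.6 | 0.39 | 667.0 | 4.03 | 240.3 | 0.37 | 715.1 | 4.11 | 306.4 | 0.36 | 727.8 | 5.48 | 317.1 |
| $2^{20}$ | PRTY20 | 0.01 | 145.3 | 0.01 | 13.63 | 159.0 | 0.21 | 37.74 | 0.50 | 33.40 | 0.39 | 44.17 | 0.72 | 37.43 | 0.38 | 48.99 | 0.72 | 45.22 | 0.41 | 159.8 | 0.87 | 157.4 |
| $2^{20}$ | CM20 | 0.01 | 81.40 | 0.01 | 13.64 | 95.04 | 0.13 | 125.4 | 0.10 | 50.02 | 0.02 | 133.6 | 0.19 | 51.76 | 0.03 | 136.0 | 0.19 | 54.56 | 0.02 | 198.2 | 0.01 | 123.6 |
| $2^{20}$ | GMR21 | 0.02 | 717.6 | 0.02 | 521.0 | 1239 | 0.65 | 205.7 | 0.51 | 136.4 | 1.12 | 228.9 | 0.87 | 160.7 | 1.12 | 277.1 | 0.87 | 213.1 | 0.91 | 1073 | | |
| $2^{20}$ | SM-PSI (proposed) | 0 | 0.0001 | 0 | 32.00 | 32.00 | 0 | 10.08 | 0 | 6.27 | 0 | 11.07 | 0 | 6.78 | 0 | 16.31 | 0 | 10.15 | 0 | 39.76 | 0 | 31.03 |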
The proposed SM-PSI protocol responds well to multi-threaded parallelization on larger inputs. For $n = 2^{20}$ and a 10 Gbps network bandwidth, switching the proposed SM-PSI protocol from single-threaded execution (10.08 s) to multi-threaded execution (6.27 s) reduces the running time by 37.8%. According to Table 4, the benefit also appears at $n = 2^{16}$ in the same setting (0.83 s versus 0.28 s), whereas at $n = 2^{12}$ the multi-threaded run (0.05 s) is slightly slower than the single-threaded run (0.03 s), so the gain from parallelization is confined to the medium and large scales.
In the unequal-size dataset scenario ($n_1 = 2^{10}$, $n_2 = 2^{20}$), the proposed SM-PSI protocol demonstrates notable performance advantages, especially in high-bandwidth network environments.
Comparison of Communication Overhead for Unequal-Size Datasets: The total communication volume of the proposed SM-PSI protocol is 32 MB, which ranks in the middle among all evaluated protocols. Compared with the GMR21 protocol (1087 MB), the scheme with the highest communication overhead among the evaluated protocols, the communication volume of the proposed SM-PSI protocol is reduced by 97.1%; relative to the KKRT16 protocol (37.84 MB), it is reduced by 15.4%. However, the communication volume of the proposed SM-PSI protocol is higher than that of other multi-point OPRF-based protocols, including PRTY20 (12.75 MB), CM20 (12.68 MB), and PRTY19-L (12.65 MB), with an increase of approximately 151–153%. This increase in communication overhead is primarily attributed to the higher ciphertext expansion rate of the SM2/SM3 cryptographic algorithms relative to other lightweight cryptographic primitives.
Comparison of Performance for Unequal-Size Datasets in LAN Environment: In the LAN environment (10 Gbps, RTT: 0.02 ms), the proposed SM-PSI protocol delivers the best running time. Under the single-threaded configuration, the online phase of the proposed SM-PSI protocol requires only 0.74 s: compared with the PRTY20 protocol (6.03 s), the second-best scheme among the evaluated protocols, the proposed SM-PSI protocol attains an 8.1-fold speedup, reducing the running time by 87.7%; relative to the blinded OPRF-based RA17 protocol (109 s), it attains a 147.3-fold speedup, reducing the running time by 99.3%; relative to the mqRPMT-based GMR21 protocol (129.64 s), it attains a 175.2-fold speedup. Under the multi-threaded configuration ($T = 7$), the running time of the online phase of the proposed SM-PSI protocol is further reduced to 0.35 s: the speedup relative to PRTY20 (4.41 s) rises to 12.6-fold, and the speedup relative to GMR21 reaches 295.0-fold.
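Table 5. Performance Test Results under the Condition of Unequal $n_1$ and $n_2$. Communication columns (R, S, Total) are in MB; running-time columns are in seconds, given per bandwidth for single-thread (T=1) and multi-thread (T=7) execution.
| $n_1$ | $n_2$ | Protocol | R Setup | R Online | S Setup | S Online | Total | 10 Gbps T=1 Setup | 10 Gbps T=1 Online | 10 Gbps T=7 Setup | 10 Gbps T=7 Online | 1 Gbps T=1 Setup | 1 Gbps T=1 Online | 1 Gbps T=7 Setup | 1 Gbps T=7 Online | 100 Mbps T=1 Setup | 100 Mbps T=1 Online | 100 Mbps T=7 Setup | 100 Mbps T=7 Online | 10 Mbps T=1 Setup | 10 Mbps T=1 Online | 10 Mbps T=7 Setup | 10 Mbps T=7 Online |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $2^{10}$ | $2^{20}$ | RA17 | 0 | 0.04 | 0 | 12.62 | 12.65 | 0 | 7.86 | 0 | 23.77 | 0 | 8.90 | 0 | 24.44 | 0 | 10.14 | 0 | 24.16 | 0 | 34.18 | 0 | 31.92 |
| $2^{10}$ | $2^{20}$ | KKRT16 | 0.01 | 0.09 | 0.01 | 37.75 | 37.84 | 0.30 | 7.86 | 0.22 | 8.09 | 0.21 | 8.90 | 0.37 | 8.59 | 0.21 | 10.14 | 0.22 | 9.59 | 0.39 | 34.18 | 0.39 | 34.38 |
| $2^{10}$ | $2^{20}$ | PRTY19-L | 0.01 | 0.06 | 0.01 | 12.59 | 12.65 | 0.12 | 832.9 | 0.02 | 177.9 | 0.09 | 868.3 | 0.20 | 182.8 | 0.03 | 843.0 | 0.01 | 162.8 | 0.07 | 850.5 | 0.02 | 170.5 |
| $2^{10}$ | $2^{20}$ | PRTY19-F | 0.01 | 0.06 | 0.01 | 25.17 | 25.24 | 0.02 | 335.4 | 0.12 | 126.3 | 0.03 | 343.1 | 0.20 | 124.2 | 0.03 | 337.2 | 0.02 | 123.7 | 0.03 | 350.1 | 0.19 | 134.5 |
| $2^{10}$ | $2^{20}$ | PRTY20 | 0.01 | 0.15 | 0.01 | 12.58 | 12.75 | 0.35 | 5.68 | 0.39 | 4.02 | 0.38 | 5.91 | 0.71 | 4.99 | 0.37 | 6.15 | 0.72 | 5.75 | 0.49 | 17.19 | 0.72 | 14.49 |
| $2^{10}$ | $2^{20}$ | CM20 | 0.01 | 0.08 | 0.01 | 12.59 | 12.68 | 0.09 | 28.19 | 0.01 | 15.25 | 0.04 | 30.21 | 0.02 | 15.96 | 0.06 | 35.05 | 0.19 | 17.36 | 0.03 | 38.85 | 0.02 | 25.22 |
| $2^{10}$ | $2^{20}$ | GMR21 | 0.01 | 566 | 0.02 | 520.96 | 1087 | 0.09 | 129.1 | 0.47 | 103.8 | 0.94 | 151.0 | 0.90 | 126.8 | 1.12 | 193.3 | 0.88 | 163.2 | 0.92 | 919.8 | | |
| $2^{10}$ | $2^{20}$ | SM-PSI | 0 | 0.0001 | 0 | 32.00 | 32.00 | 0 | 0.74 | 0 | 0.35 | 0 | 1.42 | 0 | 1.03 | 0 | 4.00 | 0 | 3.60 | 0 | 28.29 | 0 | 27.88 |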
Comparison of Performance for Unequal-Size Datasets in WAN Environment: In the WAN deployment environment, the proposed SM-PSI protocol retains leading performance at 1 Gbps and 100 Mbps. At 1 Gbps, the single-threaded running time of the online phase of the proposed SM-PSI protocol is 1.42 s, a 4.4-fold speedup relative to PRTY20 (6.29 s) and a 78.5-fold speedup relative to RA17 (111.4 s); under the multi-threaded configuration, the running time of the online phase is reduced to 1.03 s, a 5.5-fold speedup relative to PRTY20. At 100 Mbps, the single-threaded running time of the online phase of the proposed SM-PSI protocol is 4.0 s, still retaining a 1.6-fold advantage relative to PRTY20 (6.52 s). The performance advantage of the proposed SM-PSI protocol gradually diminishes as bandwidth decreases. In the low-bandwidth environment of 10 Mbps, the higher communication volume of the proposed SM-PSI protocol (32 MB) relative to that of PRTY20 (12.75 MB) makes transmission the performance bottleneck, so that the online running time of the proposed SM-PSI protocol (28.29 s) is about 1.6 times that of PRTY20 (17.68 s). This suggests that the proposed SM-PSI protocol is more suitable for deployment in medium-to-high bandwidth network environments.
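
The bandwidth bottleneck at 10 Mbps can be made concrete with a simple lower bound on transmission time. The sketch below is illustrative only: it assumes 1 MB = 10^6 bytes and 1 Mbps = 10^6 bits per second, ignores protocol overhead and latency, and uses the hypothetical helper `min_transfer_seconds`.

```python
def min_transfer_seconds(volume_mb: float, bandwidth_mbps: float) -> float:
    """Lower bound on the time needed to push volume_mb through the link."""
    return volume_mb * 8 / bandwidth_mbps


if __name__ == "__main__":
    for name, volume in [("SM-PSI", 32.0), ("PRTY20", 12.75)]:
        print(f"{name}: at least {min_transfer_seconds(volume, 10):.1f} s at 10 Mbps")
```

Under these assumptions, transmitting 32 MB at 10 Mbps takes at least 25.6 s, which accounts for most of the 28.29 s measured for SM-PSI, whereas the 12.75 MB of PRTY20 needs at least 10.2 s of its 17.68 s.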
Comprehensive analysis of all evaluated metrics reveals that the proposed SM-PSI protocol performs best in medium-to-high bandwidth network deployment environments. Across the network environments ranging from LAN to 1 Gbps WAN, the average speedups attained by the proposed SM-PSI protocol relative to each benchmark protocol are as follows: RA17 (58.6-fold), KKRT16 (9.1-fold), PRTY19-L (445.7-fold), PRTY19-F (215.3-fold), PRTY20 (5.7-fold), CM20 (22.0-fold), and GMR21 (132.6-fold). The proposed SM-PSI protocol delivers an order-of-magnitude improvement in computational efficiency at the expense of a moderate increase in communication overhead, and thus offers notable advantages in scenarios with sufficient bandwidth.
Table 6. Comparison of Performance Speedup of SM2-OT-PSI Relative to Other Protocols ($n = 2^{20}$, LAN Environment).
| Protocol | Protocol Type | Running Time (s) | Speedup |
|---|---|---|---|
| PRTY19-F | Multi-point OPRF | 248.44 | 39.6× |
| GMR21 | mqRPMT | 136.91 | 21.8× |
| RA17 | Blinded OPRF | 75.16 | 12.0× |
| CM20 | Multi-point OPRF | 50.12 | 8.0× |
| PRTY20 | Multi-point OPRF | 33.9 | 5.4× |
| KKRT16 | Single-point OPRF | 18.39 | 2.9× |
| SM-PSI | Proposed | 6.27 | 1.0× (Baseline) |
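
The running times in Table 6 match the sum of the setup and online times reported in Table 4 for the multi-threaded 10 Gbps configuration at $n = 2^{20}$, and the speedups follow by dividing by the SM-PSI total. The short sketch below reproduces this arithmetic; the dictionary and function names are hypothetical, and the values are copied from Table 4.

```python
# (setup, online) seconds for n = 2^20, 10 Gbps, multi-threaded (Table 4).
LAN_MULTI = {
    "PRTY19-F": (3.84, 244.6),
    "GMR21": (0.51, 136.4),
    "RA17": (0.0, 75.16),
    "CM20": (0.10, 50.02),
    "PRTY20": (0.50, 33.40),
    "KKRT16": (0.15, 18.24),
    "SM-PSI": (0.0, 6.27),
}


def speedups(times: dict, baseline: str = "SM-PSI") -> dict:
    """Total (setup + online) time of each protocol divided by the baseline total."""
    base = sum(times[baseline])
    return {name: round(sum(t) / base, 1) for name, t in times.items()}


if __name__ == "__main__":
    for name, factor in speedups(LAN_MULTI).items():
        print(f"{name}: {factor}x")
```

Computing the ratios from the raw setup and online figures, rather than from rounded totals, keeps the table reproducible if individual measurements are updated.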

6. Conclusions

This paper proposes SM2-OT-PSI, an efficient and practical PSI protocol based on SM2, the Chinese national commercial cryptography standard. The scheme addresses limitations of existing PSI protocols that rely on RSA or standard elliptic curve cryptography, particularly in scenarios that require compliance with China’s national cryptographic specifications and high-performance processing of large-scale datasets. By constructing an OT-based OPRF from SM2 public-key cryptography and the SM3 hash function, the protocol enables efficient multi-point OPRF evaluation under the semi-honest adversary model. The formal security analysis shows that, assuming the hardness of the ECDLP and the security of SM3, the protocol satisfies correctness and privacy. To improve practical performance, the protocol is paired with a software–hardware co-design architecture that offloads the computationally intensive SM2 and SM3 operations to a domestically developed reconfigurable cryptographic accelerator. Experiments in LAN and WAN environments show that, relative to several representative PSI schemes, SM2-OT-PSI improves both execution time and scalability on million-scale datasets. These results indicate that the protocol is well suited to deployment in medium-to-high bandwidth network environments, and that high-performance PSI can be realized under national cryptographic standards through careful protocol design and system-level optimization.
Future work will extend the scheme to stronger adversary models (e.g., the malicious security model) and investigate further optimizations for heterogeneous computing platforms and a broader range of privacy-preserving data analysis applications.

Author Contributions

Conceptualization, Z.G. and H.H.; methodology, H.H., Z.G. and H.Y.; software, Z.G., Q.J. and K.C.; validation, Z.G., Q.J. and H.Y.; formal analysis, H.H., Z.G. and B.Y.; investigation, H.H. and H.Y.; writing—original draft preparation, Z.G.; writing—review and editing, B.Y., M.G. and C.M.; visualization, Z.G., K.C. and Q.J.; funding acquisition, H.H. and C.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Program of Heilongjiang Province (2022ZX01A36), Harbin Manufacturing Science and Technology Innovation Talent Project (2022CXRCCG004), and the National Key Research and Development Plan Project (2023YFB4403500).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhao, C.; Zhao, S.; Zhao, M.; Chen, Z.; Gao, C.; Li, H.; Tan, Y. Secure multi-party computation: Theory, practice and applications. Inf. Sci. 2019, 476, 357–372.
  2. He, Y.; Tan, X.; Ni, J.; Yang, L.T.; Deng, X. Differentially private set intersection for asymmetrical id alignment. IEEE Trans. Inf. Forensics Secur. 2022, 17, 3479–3494.
  3. Gao, Y.; Xie, Y.; Deng, H.; Zhu, Z.; Zhang, Y. A Privacy-preserving Data Alignment Framework for Vertical Federated Learning. J. Electron. Inf. Technol. 2024, 46, 3419–3427.
  4. Brickell, J.; Porter, D.E.; Shmatikov, V.; Witchel, E. Privacy-preserving remote diagnostics. In Proceedings of the 14th ACM Conference on Computer and Communications Security, Alexandria, VA, USA, 29–31 October 2007; ACM: New York, NY, USA, 2007; pp. 498–507.
  5. He, X.; Machanavajjhala, A.; Flynn, C.; Srivastava, D. Composing differential privacy and secure computation: A case study on scaling private record linkage. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; ACM: New York, NY, USA, 2017; pp. 1389–1406.
  6. Ion, M.; Kreuter, B.; Nergiz, E.; Patel, S.; Saxena, S.; Seth, K.; Shanahan, D.; Yung, M. Private Intersection-Sum Protocol with Applications to Attributing Aggregate ad Conversions. Cryptology ePrint Archive, Paper 2017/738. 2017. Available online: https://eprint.iacr.org/2017/738 (accessed on 17 December 2025).
  7. GB/T 32918.4-2016; Information Security Technology—Public Key Cryptographic Algorithm SM2 Based on Elliptic Curves—Part 4: Public Key Encryption Algorithm. China Standards Press: Beijing, China, 2016.
  8. GB/T 32905-2016; Information Security Techniques—SM3 Cryptographic Hash Algorithm. China Standards Press: Beijing, China, 2016.
  9. Kales, D.; Rechberger, C.; Schneider, T.; Senker, M.; Weinert, C. Mobile private contact discovery at scale. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19), Santa Clara, CA, USA, 14–16 August 2019; USENIX Association: Berkeley, CA, USA, 2019; pp. 1447–1464. Available online: https://www.usenix.org/conference/usenixsecurity19/presentation/kales (accessed on 17 December 2025).
  10. Huang, Y.; Evans, D.; Katz, J. Private set intersection: Are garbled circuits better than custom protocols? In Proceedings of the 19th Network and Distributed System Security Symposium (NDSS 2012), San Diego, CA, USA, 5–8 February 2012; Internet Society: Reston, VA, USA, 2012.
  11. Kiss, Á.; Liu, J.; Schneider, T.; Asokan, N.; Pinkas, B. Private set intersection for unequal set sizes with mobile applications. In Proceedings of the Privacy Enhancing Technologies Symposium (PoPETS 2017), Munich, Germany, 19–21 July 2017; De Gruyter: Berlin, Germany, 2017; pp. 177–197.
  12. Pinkas, B.; Schneider, T.; Tkachenko, O.; Yanai, A. Efficient circuit-based PSI with linear communication. In Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT 2019), Darmstadt, Germany, 19–23 May 2019; Springer: Cham, Switzerland, 2019; pp. 122–153.
  13. Meadows, C. A more efficient cryptographic matchmaking protocol for use in the absence of a continuously available third party. In Proceedings of the 1986 IEEE Symposium on Security and Privacy (S & P 1986), Toronto, ON, Canada, 27–29 October 1986; IEEE: Piscataway, NJ, USA, 1986; p. 134.
  14. Wu, G.; He, Q.; Jiang, J.; Zhang, Z.X.; Zhao, Y.; Zou, Y.C.; Zhang, J.; Wei, C.Z.; Yan, Y.; Zhang, H. Topgun: An ECC accelerator for private set intersection. ACM Trans. Reconfigurable Technol. Syst. 2023, 16, 1–30.
  15. Hazay, C.; Nissim, K. Efficient set operations in the presence of malicious adversaries. In Proceedings of the International Workshop on Public Key Cryptography (PKC 2010), Paris, France, 28–30 May 2010; Springer: Berlin, Heidelberg, 2010; pp. 312–331.
  16. De Cristofaro, E.; Tsudik, G. Practical private set intersection protocols with linear complexity. In Proceedings of the International Conference on Financial Cryptography and Data Security (FC 2010), Bridgetown, Barbados, 22–26 February 2010; Springer: Berlin, Heidelberg, 2010; pp. 143–159.
  17. Zhang, J.X.; Cheng, X.D.; Wang, W.; Yang, L.; Hu, J.B.; Chen, K. FLASH: Towards a high-performance hardware acceleration architecture for cross-silo federated learning. In Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23), Boston, MA, USA, 25–28 April 2023; USENIX Association: Berkeley, CA, USA, 2023; pp. 1057–1079. Available online: https://www.usenix.org/conference/nsdi23/presentation/zhang-junxue (accessed on 6 January 2026).
  18. Rabin, M.O. Transaction protection by beacons. J. Comput. Syst. Sci. 1983, 27, 256–267.
  19. Pinkas, B.; Schneider, T.; Zohner, M. Faster private set intersection based on OT extension. In Proceedings of the 23rd USENIX Security Symposium (USENIX Security 14), San Diego, CA, USA, 20–22 August 2014; USENIX Association: Berkeley, CA, USA, 2014; pp. 797–812. Available online: https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/pinkas (accessed on 6 January 2026).
  20. Pinkas, B.; Schneider, T.; Segev, G.; Zohner, M. Phasing: Private set intersection using permutation-based hashing. In Proceedings of the 24th USENIX Security Symposium (USENIX Security 15), Washington, DC, USA, 12–14 August 2015; USENIX Association: Berkeley, CA, USA, 2015; pp. 515–530. Available online: https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/pinkas (accessed on 6 January 2026).
  21. Garimella, G.; Mohassel, P.; Rosulek, M.; Sadeghian, S.; Singh, J. Private set operations from oblivious switching. In Proceedings of the IACR International Conference on Public Key Cryptography (PKC 2021), Virtual Event, 17–20 May 2021; Springer: Cham, Switzerland, 2021; pp. 591–617.
  22. Kolesnikov, V.; Kumaresan, R.; Rosulek, M.; Trieu, N. Efficient batched oblivious PRF with applications to private set intersection. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS 2016), Vienna, Austria, 24–28 October 2016; ACM: New York, NY, USA, 2016; pp. 818–829.
  23. Pinkas, B.; Rosulek, M.; Trieu, N.; Yanai, A. SpOT-light: Lightweight private set intersection from sparse OT extension. In Proceedings of the Annual International Cryptology Conference (CRYPTO 2019), Santa Barbara, CA, USA, 18–22 August 2019; Springer: Cham, Switzerland, 2019; pp. 401–431. Available online: https://link.springer.com/chapter/10.1007/978-3-030-26954-8_13 (accessed on 6 January 2026).
  24. Pinkas, B.; Rosulek, M.; Trieu, N.; Yanai, A. PSI from PaXoS: Fast, malicious private set intersection. In Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT 2020), Zagreb, Croatia, 10–14 May 2020; Springer: Cham, Switzerland, 2020; pp. 739–767. Available online: https://link.springer.com/chapter/10.1007/978-3-030-45724-2_25 (accessed on 6 January 2026).
  25. Chase, M.; Miao, P. Private set intersection in the internet setting from lightweight oblivious PRF. In Proceedings of the Annual International Cryptology Conference (CRYPTO 2020), Santa Barbara, CA, USA, 17–21 August 2020; Springer: Cham, Switzerland, 2020; pp. 34–63. Available online: https://link.springer.com/chapter/10.1007/978-3-030-56877-1_2 (accessed on 6 January 2026).
  26. Resende, A.C.D.; Aranha, D.F. Faster unbalanced private set intersection. In Proceedings of the International Conference on Financial Cryptography and Data Security (FC 2018), Bridgetown, Barbados, 19–23 February 2018; Springer: Berlin, Heidelberg, 2018; pp. 203–221. Available online: https://link.springer.com/chapter/10.1007/978-3-662-58387-6_11 (accessed on 20 December 2025).
  27. Yao, A.C. Protocols for secure computations. In Proceedings of the 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982), Chicago, IL, USA, 25–27 October 1982; IEEE: Piscataway, NJ, USA, 1982; pp. 160–164.
  28. Goldreich, O.; Micali, S.; Wigderson, A. How to play any mental game, or a completeness theorem for protocols with honest majority. In Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali; ACM: New York, NY, USA, 2019; pp. 307–328.
  29. Yu, B.; Huang, H.; Liu, Z.W.; Zhao, S.L.; Na, N. High-performance hardware architecture design and implementation of Ed25519 algorithm. J. Electron. Inf. Technol. 2021, 43, 1821–1827.
  30. Liu, Z.W.; Zhang, Q.; Huang, H.; Yang, X.Q.; Chen, G.B.; Zhao, S.L.; Yu, B. Design of high area efficiency elliptic curve scalar multiplier based on fast modulo reduction of bit reorganization. J. Electron. Inf. Technol. 2024, 46, 344–352.
  31. Bay, A.; Erkin, Z.; Hoepman, J.-H.; Samardjiska, S.; Vos, J. Practical multi-party private set intersection protocols. IEEE Trans. Inf. Forensics Secur. 2021, 17, 1–15.
  32. Zhou, I.; Tofigh, F.; Piccardi, M.; Abolhasan, M.; Franklin, D.; Lipman, J. Secure multi-party computation for machine learning: A survey. IEEE Access 2024, 12, 53881–53899.
  33. Gao, Y.; Luo, Y.; Wang, L.; Liu, X.; Qi, L.; Wang, W.; Zhou, M. Efficient scalable multi-party private set intersection (-Variants) from bicentric zero-sharing. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, Salt Lake City, UT, USA, 14–18 October 2024; Association for Computing Machinery: New York, NY, USA, 2024; pp. 4137–4151.
  34. Ishai, Y.; Kilian, J.; Nissim, K.; Petrank, E. Extending oblivious transfers efficiently. In Proceedings of the Annual International Cryptology Conference (CRYPTO 2003), Santa Barbara, CA, USA, 17–21 August 2003; Springer: Berlin, Heidelberg, 2003; pp. 145–161.
Figure 1. SM2-OT-PSI Framework Flowchart.
Figure 2. Hardware Speedup of the PSI Protocol.
Figure 3. Comparison of Performance Test Results Under the Scenario of Equal-Scale Sets ($n_1 = n_2$).
Figure 4. Comparison of Performance Test Results Under the Scenario of Unequal Sizes of $n_1$ and $n_2$.
Table 1. Comparison of Representative Two-Party PSI Protocols.

| Protocol | Cryptographic Primitive | Security Model | Main Advantages | Main Limitations |
| --- | --- | --- | --- | --- |
| Meadows [13] | ECDH | Semi-honest | Simple construction; early use of ECC for PSI | Requires a large number of ECC scalar multiplications; poor scalability for large datasets |
| DH-IPP [14] | ECDH + ECC optimization | Semi-honest | Highly optimized ECC operations on Intel CPUs; improved runtime over naive ECDH-PSI | Still requires $2n$ ECC multiplications; limited scalability and CPU-dependent optimization |
| FLASH-RSA [17] | RSA-based homomorphic encryption | Semi-honest | Hardware–software co-design; supports large-scale federated learning scenarios | RSA operations incur high computational cost; limited speedup and large runtime for million-scale datasets |
| RA17 [26] | Blinded OPRF (ECC-based) | Semi-honest | Efficient for unbalanced datasets; avoids expensive hashing structures | Communication overhead increases significantly in balanced or large-scale settings |
| KKRT [22] | Single-point OPRF + OT extension | Semi-honest | Low sender-side computation; well-studied OT-based construction | Single-point OPRF leads to high communication overhead and bandwidth sensitivity |
| SpOT-Light [23] | Sparse OT extension + multi-point OPRF | Semi-honest | Reduced sender computation; optimized for cloud CPU environments | Computational complexity grows as $O(n \log n)$; still communication-intensive |
| PRTY19/PRTY20 [12,24] | Multi-point OPRF + OT extension | Semi-honest | Linear communication complexity; improved scalability over single-point OPRF | Complex protocol structure; communication overhead remains non-negligible |
| CM20 [25] | Lightweight OPRF + matrix construction | Semi-honest | Reduced communication compared to earlier OT-based PSI | Increased implementation complexity; relies on non-national cryptographic primitives |
| GMR21 [21] | mqRPMT + OT | Semi-honest | Strong privacy guarantees; intersection-only output | Very high communication overhead; limited practicality for large datasets |
| Proposed SM2-OT-PSI | SM2-based OT + SM3-based OPRF | Semi-honest | Fully compliant with Chinese national cryptographic standards; low communication complexity; efficient multi-point OPRF; hardware-accelerated implementation | Communication overhead higher than some lightweight OPRF schemes in very low-bandwidth networks |
Table 2. Workflow of the Private Set Intersection protocol based on SM2-OT.

| Stage | Description |
| --- | --- |
| Input | Alice and Bob hold private datasets $U_A = \{U_A^{1}, \ldots, U_A^{i}\}$ and $U_B = \{U_B^{1}, \ldots, U_B^{j}\}$, respectively. |
| Output | The intersection set $U_A \cap U_B$. |
| Preparation | Alice and Bob jointly select a common set of SM2 elliptic curve parameters. Specifically, a large prime $p$ is chosen to define the finite field $\mathbb{F}_p$. An elliptic curve $E(\mathbb{F}_p)$ is constructed over $\mathbb{F}_p$, and a generator $G = (x_G, y_G)$ of order $n$ is selected. For $k \in \mathbb{Z}_n^*$, $[k]G$ denotes elliptic curve scalar multiplication. The cryptographic hash function SM3 is adopted. The symbol ‖ denotes concatenation of byte strings. All subsequent computations are performed without revealing any additional information. |
| Bob’s Operations | (1) Bob selects a random number $d_B \in \mathbb{Z}_n^*$ as his private key. |
| | (2) Bob computes the public key $P_B = [d_B]G$ and publishes $P_B$. |
| Alice’s Operations | (1) Alice receives $P_B$ and computes $S = [h]P_B$. If $S$ is the point at infinity, the protocol aborts. |
| | (2) Alice samples a random number $k \in \mathbb{Z}_n^*$ and computes $c_1 = [k]G = (x_1, y_1)$. |
| | (3) Alice computes the obfuscated point $[k]P_B = (x_2, y_2)$. |
| | (4) Alice derives the symmetric key $t = \mathrm{KDF}(x_2 \parallel y_2, \mathit{Klen})$. |
| | (5) Alice computes $c_{2A}^{i} = \mathrm{OPRF}(t, U_A^{i})$ and forms the ciphertext set $C_{2A} = \{c_{2A}^{1}, \ldots, c_{2A}^{i}\}$. |
| | (6) Alice computes $c_{2A}^{i} = \mathrm{Hash}(x_2 \parallel U_A^{i} \parallel y_2)$. |
| | (7) Alice sends the instruction $(\text{prove}, \mathit{sid} \parallel 1, c_1, k)$ to the Non-Interactive Zero-Knowledge Proof Functionality $\mathcal{F}_{\text{com-ZK}}^{R}$. |
| Bob’s Operations | (1) Bob receives the message $(\text{proof-receipt}, \mathit{sid} \parallel 1)$ returned by $\mathcal{F}_{\text{com-ZK}}^{R}$. |
| Alice’s Operations | (1) Alice selects a random number $r$ according to the zero-knowledge proof protocol and computes the commitment point $R = [r]G$. |
| | (2) Alice computes the challenge value $c = \mathrm{Hash}(R, c_1, G)$ and the response value $s = r + kc$, and obtains the proof pair $c_3 = (R, s)$. |
| | (3) Alice sends $C_i = c_1 \parallel c_3 \parallel c_{2A}^{i}$ to the receiver Bob, and proves via a zero-knowledge proof that $(G, R, c_1) \in R_{DL}$. |
| Bob’s Operations | (1) Bob receives the ciphertext set $C_i$ from Alice, uses $c_3 = (R, s)$ to check that $c_1 = [k]G = (x_1, y_1)$ was formed correctly, and computes the challenge value $c = \mathrm{Hash}(R, c_1, G)$. |
| | (2) Bob receives the message $(\text{response-proof}, \mathit{sid} \parallel 1)$ returned by $\mathcal{F}_{\text{com-ZK}}^{R}$. If he fails to receive the message or the zero-knowledge proof relation $(G, R, c_1) \in R_{DL}$ does not hold, Bob terminates the protocol; otherwise, he continues. Bob then checks whether $[s]G = R + [c]c_1$ holds. If it does not, he terminates the protocol; otherwise, he accepts that $c_1 = [k]G = (x_1, y_1)$ is correct. |
| | (3) Bob computes $[d_B^{j}]c_1 = [d_B^{j} k]G = (x_2, y_2)$. |
| | (4) Bob derives the symmetric key $t = \mathrm{KDF}(x_2 \parallel y_2, \mathit{Klen})$. |
| | (5) Bob computes $c_{2B}^{j} = \mathrm{OPRF}(t, U_B^{j})$ and obtains the ciphertext set $c_{2B}^{j} = \mathrm{Hash}(x_2 \parallel U_B^{j} \parallel y_2)$. |
| | (6) Bob obtains the preliminary intersection $C = c_{2A} \cap c_{2B}$. |
| | (7) Bob verifies the result by checking $c_{3B}^{j} = c_{3A}^{i}$, where $c_{3B}^{j} = \mathrm{Hash}(x_2 \parallel U_B^{j} \parallel y_2)$. |
| | (8) If the verification holds, the final intersection result $I = c_{2A} \cap c_{2B}$ is obtained and shared by both parties. |
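
The workflow in Table 2 can be illustrated with a minimal, non-authoritative sketch. It is not the authors' implementation and is not secure: it assumes a textbook toy curve ($y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$, generator $(5, 1)$ of order 19) in place of the SM2 curve, uses SHA-256 as a stand-in for SM3, skips the KDF/OPRF step, and reduces the response $s$ modulo the group order. The function names (`add`, `mul`, `prove`, `verify`, `run_psi`) are hypothetical. It shows only the three ideas the table relies on: both parties reach the same point $(x_2, y_2)$, the check $[s]G = R + [c]c_1$ accepts an honest proof, and matching hash tags reveal the intersection.

```python
import hashlib
import secrets

# ASSUMPTION: toy curve y^2 = x^3 + 2x + 2 over F_17, not the SM2 curve.
P, A = 17, 2
G = (5, 1)   # generator of the toy group
N = 19       # order of G
INF = None   # point at infinity

def add(p1, p2):
    """Affine point addition on the toy curve."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Double-and-add scalar multiplication [k]pt."""
    acc = INF
    while k > 0:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def h(*parts):
    """ASSUMPTION: SHA-256 stands in for SM3; parts are joined as text."""
    return hashlib.sha256("|".join(str(x) for x in parts).encode()).digest()

def challenge(R, c1):
    """c = Hash(R, c1, G), mapped to an integer modulo the group order."""
    return int.from_bytes(h(R, c1, G), "big") % N

def prove(k, c1):
    """Alice: commitment R = [r]G and response s = r + k*c (mod N here)."""
    r = secrets.randbelow(N - 1) + 1
    R = mul(r, G)
    return R, (r + k * challenge(R, c1)) % N

def verify(c1, proof):
    """Bob: accept if [s]G == R + [c]c1."""
    R, s = proof
    return mul(s, G) == add(R, mul(challenge(R, c1), c1))

def run_psi(alice_items, bob_items):
    d_b = secrets.randbelow(N - 1) + 1   # Bob's private key d_B
    p_b = mul(d_b, G)                    # Bob's public key P_B
    k = secrets.randbelow(N - 1) + 1     # Alice's random k
    c1 = mul(k, G)
    x2, y2 = mul(k, p_b)                 # Alice's view of (x2, y2)
    alice_tags = {h(x2, u, y2) for u in alice_items}
    proof = prove(k, c1)
    if not verify(c1, proof):            # Bob aborts on a bad proof
        raise ValueError("zero-knowledge proof rejected")
    bx2, by2 = mul(d_b, c1)              # Bob's view: [d_B]c1 = [d_B k]G
    return sorted(u for u in bob_items if h(bx2, u, by2) in alice_tags)

print(run_psi(["alice@x", "bob@x", "carol@x"], ["bob@x", "carol@x", "dave@x"]))
```

Because $[k]P_B = [k d_B]G = [d_B]c_1$, both sides hash their elements under the same coordinates, so equal elements produce equal tags. A production version would replace the toy curve and hash with SM2 and SM3 and would need the OT/OPRF layer described in the paper.
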
Table 3. Complexity Comparison of PSI Protocols.

| PSI Protocol | Algorithm Framework | Hardware Architecture | Communication Complexity | Computational Complexity |
| --- | --- | --- | --- | --- |
| SpOT-Low | OT | Intel Xeon vCPU | O( · t) | O(t log 2 t) |
| SpOT-Fast | OT | Intel Xeon vCPU | O( · t) | O(t · λ) |
| DH-IPP | ECDH, SHA256 | Intel Xeon Platinum 8369B CPU | O(t · ϕ) | O(2t) |
| FLASH-RSA | RSA, SHA256 | Xilinx VU13P FPGA | O(t · λ) | O(t log t) |
| Proposed | SM2, SM3 | Muchuang RSPS20 Chip | O(t · k) | O(t · k) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Guan, Z.; Huang, H.; Yao, H.; Jia, Q.; Cheng, K.; Ge, M.; Yu, B.; Ma, C. Privacy-Preserving Set Intersection Protocol Based on SM2 Oblivious Transfer. Computers 2026, 15, 44. https://doi.org/10.3390/computers15010044

AMA Style

Guan Z, Huang H, Yao H, Jia Q, Cheng K, Ge M, Yu B, Ma C. Privacy-Preserving Set Intersection Protocol Based on SM2 Oblivious Transfer. Computers. 2026; 15(1):44. https://doi.org/10.3390/computers15010044

Chicago/Turabian Style

Guan, Zhibo, Hai Huang, Haibo Yao, Qiong Jia, Kai Cheng, Mengmeng Ge, Bin Yu, and Chao Ma. 2026. "Privacy-Preserving Set Intersection Protocol Based on SM2 Oblivious Transfer" Computers 15, no. 1: 44. https://doi.org/10.3390/computers15010044

APA Style

Guan, Z., Huang, H., Yao, H., Jia, Q., Cheng, K., Ge, M., Yu, B., & Ma, C. (2026). Privacy-Preserving Set Intersection Protocol Based on SM2 Oblivious Transfer. Computers, 15(1), 44. https://doi.org/10.3390/computers15010044
