Article

Communication Efficient Secure Three-Party Computation Using Lookup Tables for RNN Inference

1 School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China
2 Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Shenzhen 518067, China
3 School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(5), 985; https://doi.org/10.3390/electronics14050985
Submission received: 27 January 2025 / Revised: 20 February 2025 / Accepted: 25 February 2025 / Published: 28 February 2025
(This article belongs to the Special Issue Security and Privacy in Distributed Machine Learning)

Abstract

Many leading technology companies now offer Machine Learning as a Service (MLaaS) platforms, enabling developers and organizations to access the inference capabilities of pre-trained models via API calls. However, owing to concerns over user data privacy, inter-enterprise competition, and legal and regulatory constraints, directly using pre-trained models in the cloud for inference faces security challenges. In this paper, we propose communication-efficient secure three-party protocols for recurrent neural network (RNN) inference. First, we design novel three-party secret-sharing protocols for digit decomposition and B2A conversion, enabling efficient transformation of secret shares between Boolean and arithmetic rings. Then, we propose a lookup table-based secure three-party protocol. Unlike the intuitive approach of directly looking up tables to obtain results, we compute the results by exploiting the inherent mathematical properties of binary lookup tables, so that the communication complexity of the lookup table protocol depends only on the output bit width. We also design secure three-party protocols for the key functions in RNNs, including matrix multiplication and the sigmoid and Tanh functions. Our protocols divide the computation into online and offline phases and place most of the computation locally. Theoretical analysis shows that the number of online communication rounds is reduced from four to one. Experimental results show that, compared with the current state-of-the-art SIRNN, the online communication overhead of the sigmoid and tanh functions decreases by 80.39% and 79.94%, respectively.

1. Introduction

In recent years, many leading technology companies have introduced “Machine Learning as a Service” (MLaaS) platforms, such as Amazon SageMaker, Microsoft Azure Machine Learning, Google Cloud AI Platform, Alibaba Cloud Platform of Artificial Intelligence, and Tencent Cloud TI Platform. These platforms provide users with easy access to pre-trained neural network models through simple API calls, without requiring a deep understanding of the underlying technical details. As a result, users can quickly perform inference tasks using their own data. This approach has significantly lowered the barrier to using neural network technologies, making it possible for more businesses and individuals to benefit from the advantages of neural network models.
However, behind the convenience of these services lie significant privacy risks and challenges that cannot be ignored. On the one hand, when users' data are uploaded to cloud servers for processing, there is a risk that these private data will be leaked or misused. For example, in 2023, the short-video platform TikTok faced widespread privacy concerns globally due to its data-collection and -processing practices; governments and regulatory authorities in multiple countries investigated its data-handling methods, fearing that private user data might be employed for improper purposes. On the other hand, when users employ machine learning models provided by service providers for inference tasks, there is a risk of leakage of model parameters and training data. Malicious users can potentially reverse-engineer model parameters and even deduce parts of the training dataset through repeated access to the model. This poses a significant threat to enterprises that rely on highly confidential data to train their models. Therefore, ensuring the privacy of both users' input data and neural network models during the inference phase has become a current research hotspot.
Secure Multi-Party Computation (MPC) allows multiple data owners to perform arbitrary computational tasks without revealing their private data, providing an effective method for addressing these challenges. Current research on privacy-preserving deep neural network inference using MPC primarily focuses on convolutional neural networks (CNNs) that handle spatially structured data [1,2,3,4,5,6,7,8,9,10,11,12]; these efforts have advanced secure inference for image and grid-structured tasks. With the rapid development of large language models in recent years, models that can handle sequence problems have become a research focus. Recurrent Neural Networks (RNNs) are an architecture specifically designed for time series such as speech signals, physiological time series, and financial transaction flows, yet works on secure RNN inference remain relatively limited [13,14,15]. RNNs are structurally more complex than CNNs, particularly due to the inclusion of complex nonlinear activation functions (e.g., the sigmoid/Tanh functions). On the one hand, existing work on RNNs only supports secure inference between two parties, without considering the high communication overhead associated with involving multiple parties. On the other hand, existing methods often use piecewise linear functions for approximate computation, which can result in lower accuracy. To address these issues, we propose a communication-efficient secure three-party RNN inference method. In summary, we make the following contributions:
(1)
We construct novel three-party secret-sharing-based digit decomposition and B2A conversion protocols. These protocols complement the existing three-party secret-sharing scheme and effectively reduce the communication overhead of the online phase.
(2)
We propose a lookup table-based secure three-party protocol that exploits the inherent mathematical properties of binary lookup tables; the communication complexity of the LUT protocol depends only on the output bit width.
(3)
We propose lookup table-based secure three-party protocols for RNN inference, covering key functions in RNNs such as matrix multiplication and the sigmoid and Tanh functions, achieving lower online communication overhead than the current state-of-the-art SIRNN.
The rest of this paper is organized as follows: Section 2 reviews related work. Section 3 introduces the preliminaries of the secret sharing scheme and the lookup table. Section 4 presents the protocol constructions, including the system model, conversions of secret sharings, and the lookup table-based 3PC protocol. Section 5 gives the basic building blocks of RNNs based on the secure protocols of Section 4. Section 6 provides the security analysis. Section 7 provides performance analysis from theoretical and experimental perspectives, and Section 8 summarizes the conclusions.

2. Related Work

In recent years, privacy preservation has become a crucial concern across various domains, reflecting the growing need to protect sensitive data in increasingly interconnected environments. In cloud computing, significant advances have been made in enabling secure and verifiable machine learning model training [16], as well as in developing efficient frameworks for privacy-preserving neural network training and inference [17]. In the realm of path planning, researchers have introduced techniques to manage complex multi-task assignments while maintaining data privacy [18]. Additionally, the blockchain sector has progressed in privacy governance by ensuring data integrity and controlling access within decentralized databases [19]. Despite these diverse applications, there remains a critical need to specifically address the security of sensitive data and private models in neural network computation.
There are already many works targeting CNNs in the privacy-preserving machine learning field, designing secure computation protocols for specific structures in CNNs. Among works focusing on secure two-party neural network computation, many use different techniques to evaluate secure inference [1,2,3,4,5,6,7,8,9]. Some works ignore nonlinear mathematical functions [1,6,10,11], since in some cases omitting a monotonic mathematical function does not affect the inference result. In 2PC settings, some works use higher-degree polynomials to approximate nonlinear mathematical functions [12] to ensure accuracy. Other previous works use ad hoc approximations for nonlinear functions, which may cost model accuracy while achieving higher computational efficiency [3,9]. Rathee et al. [13] utilized lookup tables for some complex nonlinear math functions, improving model accuracy to a certain extent. Flute [20] likewise used lookup tables for complex nonlinear math functions and improved their accuracy under 2PC settings.
In order to improve the performance of two-party secure neural network computation, many works introduce a third party that provides randomness during the offline phase to assist the computation, or that participates in the computation itself. Similar to previous work under 2PC settings, 3PC works still use polynomial approximation to compute nonlinear math functions [21,22,23,24]. Chameleon [21] utilized a semi-honest third-party server to assist in generating correlated random numbers during the offline phase and Yao's garbled circuits for nonlinear functions. To reduce the expensive computation caused by garbled circuits, some works (e.g., [25]) used ad hoc approximations instead of GC to calculate nonlinear functions such as maxpool and ReLU. Moreover, ref. [26] proposed an improved replicated secret sharing scheme that improved online communication efficiency. In addition, some multi-party works target application scenarios with malicious adversaries, thus requiring more communication and computation overhead [27,28,29,30].
However, in many practical scenarios, such as those involving sequential data, Recurrent Neural Networks (RNNs) are essential for effective computation. Early efforts in secure RNN inference include SIRNN [13], which used lookup tables to calculate commonly used nonlinear activation functions like sigmoid and tanh in RNNs. This approach achieved high accuracy in a two-server setup, a significant step forward in secure RNN inference, particularly for speech and time-series sensor data. Building on this, Zheng et al. [14] introduced a three-party secure computation protocol to handle nonlinear activation functions and their derivatives in RNNs, leveraging Fourier series approximations to balance precision and computational efficiency.
RNNs still face significant challenges in secure computation, particularly in real-world applications. One major issue is the high communication overhead associated with secure RNN inference, which is caused by the complexity of nonlinear activation functions and the recurrent structure of RNNs. Our work focuses on designing secure computation protocols for complex nonlinear functions in RNNs to reduce online communication.

3. Preliminaries

This section provides the fundamental theory required by our method. We introduce the 3PC secret sharing scheme underlying the whole framework of our work, and the lookup table technique used for the nonlinear layers. The notations used in this work are listed in Table 1.

3.1. Secret Sharing

In this paper, we utilize the optimized 3PC secret sharing scheme provided in Meteor [26] to calculate the linear layers of the RNN, which can accelerate fixed-point multiplication of two inputs and integer multiplication of $N$ inputs, and this scheme can be easily extended to settings with more parties. The scheme is executed by three parties $\{P_i\}_{i \in \{0,1,2\}}$ over the Boolean ring $\mathbb{Z}_2$ (or the arithmetic ring $\mathbb{Z}_{2^l}$, where $l$ is the bit width). For private bits $x, y$, Boolean sharing refers to the secret sharing scheme of bit size $l = 1$ with logical AND, XOR, and NOT operations, denoted $x \wedge y$, $x \oplus y$, and $\bar{x}$, respectively.

3.1.1. $[\cdot]$-Sharing

  • For $[\cdot]$-sharing over the Boolean ring $\mathbb{Z}_2$, any Boolean value $v \in \{0,1\}$ is secret shared by three random values $v_0, v_1, v_2 \in \mathbb{Z}_2$, and $P_i$ holds its share $[v]_i = v_i$, such that $v = [v]_0 \oplus [v]_1 \oplus [v]_2$.
  • For $[\cdot]$-sharing over the arithmetic ring $\mathbb{Z}_{2^l}$, any arithmetic value $v \in \mathbb{Z}_{2^l}$ is secret shared by three random values $v_0, v_1, v_2 \in \mathbb{Z}_{2^l}$, and $P_i$ holds its share $[v]_i = v_i$, such that $v = v_0 + v_1 + v_2 \pmod{2^l}$.

3.1.2. $\langle\cdot\rangle$-Sharing

$\langle\cdot\rangle$-sharing is similar to $[\cdot]$-sharing, but it is a three-party 2-out-of-3 replicated secret sharing scheme.
For $\langle\cdot\rangle$-sharing over the Boolean ring $\mathbb{Z}_2$, any Boolean value $v \in \{0,1\}$ is secret shared by three random values $v_0, v_1, v_2 \in \mathbb{Z}_2$, where $v = v_0 \oplus v_1 \oplus v_2$. $P_i$ holds $\langle v \rangle_i = (v_i, v_{i+1})$, such that any two parties can reconstruct the secret value $v$.
For $\langle\cdot\rangle$-sharing over the arithmetic ring $\mathbb{Z}_{2^l}$, any arithmetic value $v \in \mathbb{Z}_{2^l}$ is secret shared by three random values $v_0, v_1, v_2 \in \mathbb{Z}_{2^l}$, where $v = v_0 + v_1 + v_2 \pmod{2^l}$. Likewise, $P_i$ holds $\langle v \rangle_i = (v_i, v_{i+1})$, such that any two parties can reconstruct the secret value $v$.
  • Linear Operation: For secret-shared Boolean values $\langle x \rangle$ and $\langle y \rangle$, the three parties $\{P_i\}_{i \in [3]}$ can compute $\langle z \rangle$ for $z = c_1 x \oplus c_2 y \oplus c_3$, where $c_1, c_2, c_3$ are public constant bits. In the Boolean ring $\mathbb{Z}_2$, each party $P_i$ can locally compute its secret share $\langle z \rangle_i$, such that any two parties of $\{P_i\}_{i \in [3]}$ can recover the secret value $z$. The arithmetic ring $\mathbb{Z}_{2^l}$ case is similar, replacing the XOR operation $\oplus$ with the ADD operation $+$ and reducing the result modulo $2^l$.
  • AND Operation: For secret-shared Boolean values $\langle x \rangle$ and $\langle y \rangle$, the three parties $\{P_i\}_{i \in [3]}$ can compute the secret-shared value $\langle z \rangle$ for $z = x \wedge y$ as follows: (a) each party $P_i$ locally computes $z_i = x_i y_i \oplus x_{i+1} y_i \oplus x_i y_{i+1}$; (b) each party $P_i$ locally computes $z'_i = z_i \oplus \alpha_i$, where $\alpha_0 \oplus \alpha_1 \oplus \alpha_2 = 0$ (for random number generation, please refer to Appendix A); (c) all three parties perform re-sharing by sending $z'_i$ to $P_{i-1}$, so that each $P_i$ holds $\langle z \rangle_i = (z'_i, z'_{i+1})$.
  • Multiplication: For secret-shared arithmetic values $\langle x \rangle$ and $\langle y \rangle$, the three parties $\{P_i\}_{i \in [3]}$ can compute the secret-shared value $\langle z \rangle$ for $z = x \times y$ as follows: (a) each party $P_i$ locally computes $z_i = x_i y_i + x_{i+1} y_i + x_i y_{i+1}$; (b) each party $P_i$ computes $z'_i = z_i + \alpha_i$, where $\alpha_0 + \alpha_1 + \alpha_2 = 0$ (for the generation of $\alpha_0, \alpha_1, \alpha_2$, please refer to Appendix A); (c) all three parties perform re-sharing so that each party $P_i$ holds its own secret share $\langle z \rangle_i = (z'_i, z'_{i+1})$, as sketched below.
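To make the local multiplication step concrete, the following Python sketch reproduces steps (a) and (b) of the replicated multiplication in the clear for $l = 64$. It is a minimal illustration only: the re-sharing round (c) and the PRF-based generation of the $\alpha_i$ values (Appendix A) are omitted, and helper names such as `share` and `mult_local` are our own, not from the paper's implementation.

```python
import random

L = 64
MOD = 1 << L

def share(v):
    """Split v into three additive shares; P_i holds the replicated pair (v_i, v_{i+1})."""
    v0, v1 = random.randrange(MOD), random.randrange(MOD)
    parts = [v0, v1, (v - v0 - v1) % MOD]
    return [(parts[i], parts[(i + 1) % 3]) for i in range(3)]

def mult_local(xs, ys, alphas):
    """Steps (a)+(b): P_i computes z'_i = x_i*y_i + x_{i+1}*y_i + x_i*y_{i+1} + alpha_i."""
    return [(x[0] * y[0] + x[1] * y[0] + x[0] * y[1] + a) % MOD
            for x, y, a in zip(xs, ys, alphas)]

# alphas form a fresh zero-sharing: alpha_0 + alpha_1 + alpha_2 = 0 (mod 2^l)
a0, a1 = random.randrange(MOD), random.randrange(MOD)
alphas = [a0, a1, (-a0 - a1) % MOD]

xs, ys = share(7), share(9)
zs = mult_local(xs, ys, alphas)
assert sum(zs) % MOD == 7 * 9  # the three z'_i still reconstruct to x*y
```

The nine cross terms $x_i y_j$ are covered exactly once across the three parties, which is why the locally computed $z'_i$ remain a valid 3-out-of-3 sharing of $x \times y$.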

3.2. Secret Sharing Semantics of This Work

In order to improve the computational efficiency of $[\cdot]$-sharing and $\langle\cdot\rangle$-sharing, we provide a novel and efficient three-party secret sharing scheme, denoted $[\![\cdot]\!]$-sharing, inspired by the schemes in [9,26]. With the aid of $[\![\cdot]\!]$-sharing, we further divide the computation into online and offline phases.
For $[\![\cdot]\!]$-sharing over the Boolean ring $\mathbb{Z}_2$, any Boolean value $v \in \{0,1\}$ is secret shared with a mask value $m_v \in \mathbb{Z}_2$ and the $\langle\cdot\rangle$-sharing of a random value $\lambda_v \in \mathbb{Z}_2$; $P_i$ holds $[\![v]\!]_i = (m_v, \langle \lambda_v \rangle_i)$, where $m_v = v \oplus \lambda_v$ is known to all three parties, $i \in \{0,1,2\}$.
For $[\![\cdot]\!]$-sharing over the arithmetic ring $\mathbb{Z}_{2^l}$, any arithmetic value $v \in \mathbb{Z}_{2^l}$ is secret shared with a mask value $m_v \in \mathbb{Z}_{2^l}$ and the $\langle\cdot\rangle$-sharing of a random value $\lambda_v \in \mathbb{Z}_{2^l}$; $P_i$ holds $[\![v]\!]_i = (m_v, \langle \lambda_v \rangle_i)$, where $m_v = v - \lambda_v \pmod{2^l}$ is known to all three parties, $i \in \{0,1,2\}$. In addition, the complement of a shared bit $v$ can be computed as $[\![\bar{v}]\!]$ by setting $m_{\bar{v}} = \overline{m_v}$ while leaving $\lambda_v$ unchanged.
  • Linear operation: This sharing is linear over both Boolean and arithmetic rings. For example, for Boolean sharing, assume $c_1, c_2, c_3$ are public constant bits, $[\![x]\!], [\![y]\!]$ are two secret-shared values, and $z = c_1 x \oplus c_2 y \oplus c_3$; then each party $P_i$ can compute its share $[\![z]\!]_i = (m_z, \langle \lambda_z \rangle_i)$ locally by setting $m_z = c_1 m_x \oplus c_2 m_y \oplus c_3$ and $\langle \lambda_z \rangle_i = c_1 \langle \lambda_x \rangle_i \oplus c_2 \langle \lambda_y \rangle_i$, for $i \in \{0,1,2\}$.
  • Secret share operation: The secret share operation enables a private data owner (assume $P_i$ is the data owner) to generate the $[\![\cdot]\!]$-sharing of its private input $x$. In the offline phase, all parties jointly invoke $F_{Rand}^{\langle\cdot\rangle}$ in Appendix A to sample a random value $\lambda_x$, where the data owner $P_i$ learns $\lambda_x$ in the clear. In the online phase, $P_i$ computes and reveals $m_x = x \oplus \lambda_x$ in the Boolean ring and $m_x = x - \lambda_x \pmod{2^l}$ in the arithmetic ring. The above process is carried out by the $F_{Share}^{[\![\cdot]\!]}$ function.
  • Reconstruct operation: $F_{Rec}^{[\![\cdot]\!]}$ reconstructs $x$ by first invoking $F_{Rec}^{\langle\cdot\rangle}$ to obtain $\lambda_x$, after which the parties can locally compute $x = m_x \oplus \lambda_x$ in the Boolean ring and $x = m_x + \lambda_x \pmod{2^l}$ in the arithmetic ring.
  • AND operation: The functionality $F_{2AND}^{[\![\cdot]\!]}$ takes two Boolean secret values $[\![x]\!], [\![y]\!]$ and outputs $[\![z]\!]$, where $z = x \wedge y$. We have:
    $m_z = (x \wedge y) \oplus \lambda_z = (m_x \oplus \lambda_x)(m_y \oplus \lambda_y) \oplus \lambda_z = m_x m_y \oplus \lambda_x m_y \oplus \lambda_y m_x \oplus \lambda_x \lambda_y \oplus \lambda_z$    (1)
    As shown in Equation (1), all terms except $\lambda_x \lambda_y$ can be computed locally, since $m_x, m_y$ are known to all parties. Therefore, the main difficulty becomes calculating $\lambda_x \lambda_y$ given $\langle \lambda_x \rangle$ and $\langle \lambda_y \rangle$. Since $\lambda_x$ and $\lambda_y$ are input-independent, $\langle \lambda_x \lambda_y \rangle$ can be computed in the offline phase using $F_{2AND}^{\langle\cdot\rangle}$. In the offline phase, the parties interactively generate the randomness $\lambda_z$ using the method in Appendix A, and in the online phase, the parties compute $m_z = z \oplus \lambda_z$. The protocol realizing $F_{2AND}^{[\![\cdot]\!]}$ is shown in Algorithm 1.
    Algorithm 1  $F_{2AND}^{[\![\cdot]\!]}$: two-input AND in the Boolean ring
    • Input: $[\![\cdot]\!]$-shares of $x$ and $y$.
    • Output: $[\![\cdot]\!]$-shares of $z$.
    •  Offline Phase:
       1. $P_i$ samples random shares $\langle \lambda_z \rangle_i$, where $i \in \{0,1,2\}$.
       2. Parties mutually generate $\langle \lambda_x \lambda_y \rangle$ using $F_{2AND}^{\langle\cdot\rangle}$.
    •  Online Phase:
       1. Parties locally set $\langle m_\Delta \rangle_i = m_y \langle \lambda_x \rangle_i \oplus m_x \langle \lambda_y \rangle_i \oplus \langle \lambda_x \lambda_y \rangle_i \oplus \langle \lambda_z \rangle_i$, where $i \in \{0,1,2\}$.
       2. Parties reveal $m_\Delta$ and set $m_z = m_\Delta \oplus m_x m_y$.
       3. $P_i$ holds $[\![z]\!]_i = (m_z, \langle \lambda_z \rangle_i)$.
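The correctness of Algorithm 1 can be checked in the clear with a few lines of Python. The sketch below, under our own naming, evaluates Equation (1) given the mask bits and the offline product $\lambda_x \lambda_y$, and verifies that unmasking $m_z$ with $\lambda_z$ yields $x \wedge y$:

```python
import random

def and_online(m_x, m_y, lam_x, lam_y, lam_xy, lam_z):
    """Equation (1): m_z = m_x m_y ^ m_y lam_x ^ m_x lam_y ^ lam_x lam_y ^ lam_z.
    Only lam_xy = lam_x & lam_y needs the offline F_2AND; the rest is local."""
    return (m_x & m_y) ^ (m_y & lam_x) ^ (m_x & lam_y) ^ lam_xy ^ lam_z

for _ in range(1000):
    x, y = random.randint(0, 1), random.randint(0, 1)
    lam_x, lam_y, lam_z = (random.randint(0, 1) for _ in range(3))
    m_x, m_y = x ^ lam_x, y ^ lam_y          # public masked values
    m_z = and_online(m_x, m_y, lam_x, lam_y, lam_x & lam_y, lam_z)
    assert m_z ^ lam_z == (x & y)            # reconstruct z = m_z ^ lam_z
```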
  • Multi-input AND operation: For multi-input AND gates, $F_{NAND}^{[\![\cdot]\!]}$ takes $N$ Boolean values $(x_1, x_2, \dots, x_N)$ as input and outputs $z = \bigwedge_{i=1}^{N} x_i$; then we have:
    $m_z = \Big(\bigwedge_{i=1}^{N} x_i\Big) \oplus \lambda_z = \bigwedge_{i=1}^{N} (m_{x_i} \oplus \lambda_{x_i}) \oplus \lambda_z = \bigoplus_{I \subseteq \{1,\dots,N\}} \Big(\prod_{j \notin I} m_{x_j} \cdot \prod_{k \in I} \lambda_{x_k}\Big) \oplus \lambda_z$    (2)
    As in the two-input AND gate, the parties compute the input-independent $\langle\cdot\rangle$-sharings of $\prod_{k \in I} \lambda_{x_k}$ for every $I \subseteq \{1,\dots,N\}$ by invoking $F_{2AND}^{\langle\cdot\rangle}$ in a tree-like combinatorial manner in the offline phase.
  • Two-input multiplication: the multiplication of two numbers in the arithmetic ring, $F_{2Mult}^{[\![\cdot]\!]}$, is analogous to the Boolean case, simply replacing the XOR operation in Boolean sharing with addition and the AND operation with multiplication. The multiplication of two numbers under arithmetic sharing is depicted in Equation (3):
    $m_z = x \times y - \lambda_z = (m_x + \lambda_x)(m_y + \lambda_y) - \lambda_z = m_x m_y + \lambda_x m_y + \lambda_y m_x + \lambda_x \lambda_y - \lambda_z$    (3)
    Similarly, the shares of $\lambda_x \lambda_y$ can be calculated using $F_{2Mult}^{\langle\cdot\rangle}$ from Section 3.1.2. As introduced in Section 3.3, secure inference uses fixed-point representation in $\mathbb{Z}_{2^l}$; after an exact multiplication, the decimal part of the result doubles in size, i.e., $x \cdot 2^d \times y \cdot 2^d = xy \cdot 2^{2d}$. Therefore, we must truncate the last $d$ bits of the product to obtain an approximate result. In a secret sharing scheme, truncation must account for two types of probabilistic error: a small error caused by a carry-bit error and a large error caused by overflow during the multiplication. Truncation schemes whose output shares satisfy $z' = z/2^d$ with probability 1 are called faithful truncation. We use the faithful truncation methods from [9,26]: in the offline phase, the parties mutually generate the $\langle\cdot\rangle$-sharings of $\lambda_{z'}$ and $\lambda_z$, where $\lambda_z = \lambda_{z'}/2^d$. During the online phase, all parties $P_i$ locally compute shares of $m_{z'} = x \times y - \lambda_{z'}$ and reveal $m_{z'}$, and then set $m_z = m_{z'}/2^d$. The protocol realizing $F_{2Mult}^{[\![\cdot]\!]}$ is shown in Algorithm 2.
    Algorithm 2  $F_{2Mult}^{[\![\cdot]\!]}$: two-input multiplication in the arithmetic ring
    • Input: $[\![\cdot]\!]$-shares of $x$ and $y$.
    • Output: $[\![\cdot]\!]$-shares of $z$.
    •  Offline Phase:
       1. $P_i$ samples random shares $\langle \lambda_{z'} \rangle_i$ and sets $\langle \lambda_z \rangle_i$ such that $\lambda_z = \lambda_{z'}/2^d$, where $i \in \{0,1,2\}$.
       2. Parties mutually generate $\langle \lambda_x \lambda_y \rangle$ using $F_{2Mult}^{\langle\cdot\rangle}$.
    •  Online Phase:
       1. Parties locally set $\langle m_\Delta \rangle_i = m_y \langle \lambda_x \rangle_i + m_x \langle \lambda_y \rangle_i + \langle \lambda_x \lambda_y \rangle_i - \langle \lambda_{z'} \rangle_i$, where $i \in \{0,1,2\}$.
       2. Parties reveal $m_\Delta$ and set $m_{z'} = m_\Delta + m_x m_y$.
       3. $P_i$ holds $[\![z]\!]_i = (m_{z'}/2^d, \langle \lambda_z \rangle_i)$.
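The online/offline split of Algorithm 2 can likewise be exercised in the clear. The sketch below (truncation omitted; plain 3-out-of-3 additive shares stand in for the replicated ones) shows that a single reveal of $m_\Delta$ completes the product, which is where the one-round online cost comes from:

```python
import random

MOD = 1 << 64

def add_shares(v):
    """3-out-of-3 additive shares of v (replication omitted for brevity)."""
    s = [random.randrange(MOD) for _ in range(2)]
    return s + [(v - sum(s)) % MOD]

# offline phase: sample lambdas and share the input-independent product
lam_x, lam_y, lam_z = (random.randrange(MOD) for _ in range(3))
lx, ly = add_shares(lam_x), add_shares(lam_y)
lxy, lz = add_shares(lam_x * lam_y % MOD), add_shares(lam_z)

# online phase: masked inputs are public; one reveal of m_Delta finishes
x, y = 11, 7
m_x, m_y = (x - lam_x) % MOD, (y - lam_y) % MOD
m_delta = sum(m_y * lx[i] + m_x * ly[i] + lxy[i] - lz[i] for i in range(3)) % MOD
m_z = (m_delta + m_x * m_y) % MOD
assert (m_z + lam_z) % MOD == x * y  # z = m_z + lambda_z
```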
  • Multi-input multiplication: Similar to the multi-input AND operation under Boolean sharing, the multiplication of multiple numbers under arithmetic sharing is shown in Equation (4):
    $m_z = \prod_{i=1}^{N} x_i - \lambda_z = \prod_{i=1}^{N} (m_{x_i} + \lambda_{x_i}) - \lambda_z = \sum_{I \subseteq \{1,\dots,N\}} \Big(\prod_{j \notin I} m_{x_j} \cdot \prod_{k \in I} \lambda_{x_k}\Big) - \lambda_z$    (4)

3.3. Fixed-Point Representation

Practical applications such as machine learning and mathematical statistics usually require floating-point numbers for calculation, whereas MPC generally operates over finite rings or fields, so floating-point numbers must be encoded as fixed-point numbers [3,11,21,24,31]. The conversion between floating-point and fixed-point numbers is as follows: given a floating-point number $x \in \mathbb{R}$, its corresponding fixed-point number is $\hat{x} = \lfloor x \cdot 2^d \rceil \pmod{2^l}$, where $l$ is the bit width and $d$ is the precision. We use $[0, 2^{l-1})$ and $[2^{l-1}, 2^l)$ to represent positive and negative numbers, respectively.
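For illustration, a plain-Python sketch of this encoding (using hypothetical parameters $l = 64$, $d = 12$) shows both the sign convention and why multiplication requires a $d$-bit truncation:

```python
L, D = 64, 12          # bit width l and precision d, as defined above
MOD = 1 << L

def encode(x: float) -> int:
    """Map a real to Z_{2^l}; negatives land in [2^{l-1}, 2^l)."""
    return round(x * (1 << D)) % MOD

def decode(v: int) -> float:
    if v >= MOD // 2:   # upper half of the ring encodes negative values
        v -= MOD
    return v / (1 << D)

a, b = encode(1.5), encode(-2.25)
prod = (a * b) % MOD                       # scale is now 2^{2d}
signed = prod - MOD if prod >= MOD // 2 else prod
trunc = (signed >> D) % MOD                # drop the extra d bits
print(decode(trunc))                       # -> approximately -3.375
```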

3.4. Lookup Table

The lookup table (LUT) structure is used to pre-compute and store the inputs and corresponding results of the function. The corresponding results can be found by looking up the input in the table, without the need to calculate again.
The lookup table of a function in this paper refers to the set of all inputs and their corresponding outputs within a certain range, i.e., $f: \{0,1\}^\delta \rightarrow \{0,1\}^\sigma$ [32], where $\delta$ and $\sigma$ are the bit widths of the input and output, respectively. Using this representation, any function can be turned into a corresponding lookup table over a bounded range, and the computational complexity of the lookup table is related to its size. Therefore, logically complex functions are, in theory, particularly well suited to evaluation via lookup tables.
An instance of an LUT is demonstrated in Figure 1. In this work, $\{E^u\}_{u \in [\delta]}$ are the input columns of the table and $\{\xi^w\}_{w \in [\sigma]}$ are the output columns, each of length $2^\delta$; for example, $E^2$ is 00110011 and $\xi^1$ is 00101101. In practical scenarios, the bit-width design of lookup tables (LUTs) requires a balance between computational efficiency and numerical accuracy. Our lookup table uses 12-bit input/12-bit output precision, balancing the practical deployability of RNNs against model accuracy. In latency-critical scenarios like real-time ECG anomaly detection (sampling rates ≤ 250 Hz), the 12-bit input limits the LUT size to 4 KB. For applications such as speech recognition that are less sensitive to real-time efficiency, 12 bits also ensures good accuracy. However, users who want to apply lookup tables to models with large bit widths or high accuracy requirements need to optimize the lookup table structure with an automated toolchain to reduce the depth of lookup table circuits and achieve higher performance.
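To fix the notation, the sketch below builds the column encodings $\{E^u\}$ and $\{\xi^w\}$ for a toy 3-to-2 table (the function `f` and the bit ordering are hypothetical, chosen only for illustration):

```python
def build_lut_encodings(f, delta, sigma):
    """Column encodings of a delta-to-sigma LUT: E[u][j] is input bit u of
    row j, xi[w][j] is output bit w of row j; each column has 2^delta bits."""
    rows = range(1 << delta)
    E = [[(j >> u) & 1 for j in rows] for u in range(delta)]
    xi = [[(f(j) >> w) & 1 for j in rows] for w in range(sigma)]
    return E, xi

# toy table for f(x) = 3x mod 4 with delta = 3 input bits, sigma = 2 output bits
E, xi = build_lut_encodings(lambda x: (3 * x) % 4, delta=3, sigma=2)
print(E[0])   # column of the lowest input bit: [0, 1, 0, 1, 0, 1, 0, 1]
print(xi[0])  # column of the lowest output bit
```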

4. Protocol Constructions

In this section, to provide a clearer exposition of the constructed protocols, we first present the system model of this paper. Then, we provide the digit decomposition and B2A conversion protocols based on $[\![\cdot]\!]$-sharing. Finally, we propose a lookup table-based secure three-party protocol.

4.1. System Model

Consider a computing scenario with three mutually independent servers. We execute a secure computing protocol under a semi-honest model. As shown in Figure 2, there are three roles in the system model: model owner, data owner, and computation participants.
(1)
Model owner: This party owns the model's architecture and parameters, typically holding a machine learning model that has been trained or needs to be trained. In the initial stage of computation, the model owner must send the model parameters to the computation participants in the form of secret shares, but it does not receive the computation results.
(2)
Data owner: This party holds the real input data and aims to conduct joint inference without exposing its privacy. The data owner sends the secret shares of its private data to the computation participants at the beginning of the calculation, and receives the result shares sent by the computation participants after the secure computation is completed.
(3)
Computation participants: These act as the actual “computing executors” within the secure computation protocol, typically serving as third-party platforms, service providers, or distributed nodes with multi-party secure computation capabilities. In the context of this paper, the computation participant servers are three servers that initially receive the model shares from the model owner and the data shares from the data owner. After executing the secure computation protocol, three servers send their result shares back to the data owner, respectively.
To demonstrate this system model's practical applicability, we take the healthcare field as an example. Some medical institutions use speech-recognition technology to build electronic medical record (EMR) management platforms, which enable physicians to complete clinical documentation through voice dictation, significantly reducing documentation burdens and improving workflow efficiency. Concurrently, the technology assists clinicians in rapidly retrieving patients' historical medical records, thereby providing robust support for accurate diagnosis. However, since patient medical information involves sensitive personal privacy, the system model in this work can be applied. In this case, the model owner in Figure 2 is the medical institution; it can train the model locally or via other methods using relevant datasets, and the model parameters are then distributed to three servers through secret sharing. The data owner is a physician (or patient); they can split their private personal data into secret shares and transmit them to the same three servers. Following the secure computation protocols, the servers return the result shares to the physician. This process effectively prevents leakage of both the medical institution's model parameters and the patient's private information.

4.2. Conversions of Sharing

In the lookup table protocol, the inputs are Boolean $[\![\cdot]\!]$-shares of each $\delta$-bit chunk. Therefore, a secure digit decomposition function $F_{DigDec}^{[\![\cdot]\!]}$ under the 3PC $[\![\cdot]\!]$-sharing scheme is required before each invocation of the lookup table protocol. This protocol invokes the wrap function and the 3PC private-compare function (the same as the protocols in [31]) under the $\langle\cdot\rangle$-sharing scheme. The following describes the protocol in detail.

4.2.1. Wrap Function

The wrap function $\mathrm{Wrap3}$ calculates the carry when the secret shares held by the three parties are added together, and the output may be 0, 1, or 2; i.e., given three secret shares $a_0, a_1, a_2 \in \mathbb{Z}_{2^l}$, $\mathrm{Wrap3}(a_0, a_1, a_2)$ is:
$\mathrm{Wrap3}(a_0, a_1, a_2) = \begin{cases} 0, & \text{if } a_0 + a_1 + a_2 < 2^l \\ 1, & \text{if } 2^l \le a_0 + a_1 + a_2 < 2 \cdot 2^l \\ 2, & \text{if } 2 \cdot 2^l \le a_0 + a_1 + a_2 < 3 \cdot 2^l \end{cases}$
But since the operands in the lookup table protocol are all bits, the wrap function used here reduces the original result modulo 2, i.e., $F_{Wrap3}^{\langle\cdot\rangle} = \mathrm{Wrap3} \bmod 2$. For the details of the secure wrap protocol, refer to [31].
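In the clear, the wrap function and its mod-2 variant are just a few lines (a sketch with our own naming; the secure shared-input version is the protocol of [31]):

```python
def wrap3(a0: int, a1: int, a2: int, l: int) -> int:
    """How many times a0 + a1 + a2 wraps around 2^l: 0, 1, or 2."""
    return (a0 + a1 + a2) >> l

def f_wrap3(a0: int, a1: int, a2: int, l: int) -> int:
    """The variant used by the LUT protocol: the wrap count reduced mod 2."""
    return wrap3(a0, a1, a2, l) & 1
```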

4.2.2. Private Compare

In the digit decomposition protocol, we need to obtain bit shares of the comparison result between a secret value $x$ and a public number $r$. The 3PC private compare function $F_{PC}^{\langle\cdot\rangle}$ can be realized as in [31].

4.2.3. Digit Decomposition

$F_{DigDec}^{[\![\cdot]\!]}$ converts secret shares in the arithmetic ring $\mathbb{Z}_{2^l}$ into shares in the Boolean ring $\mathbb{Z}_2$. Given the private input $[\![x]\!]_i = (m_x, \langle \lambda_x \rangle_i)$, $x \in \mathbb{Z}_{2^l}$, $i \in \{0,1,2\}$, it outputs $[\![x_j]\!]_i = (m_{x_j}, \langle \lambda_{x_j} \rangle_i)$, $j \in [l]$, where $x_j$ is the $j$th bit of $x$ (i.e., $x = \sum_{j=0}^{l-1} x_j \cdot 2^j$). For ease of calculation, each party locally decomposes the value $m_x$ to obtain public bits $\{m_{x_j}\}_{j \in [l]}$, where $m_x = \sum_{j=0}^{l-1} m_{x_j} \cdot 2^j$. Next, the parties interactively convert the $\langle\cdot\rangle$-shares into Boolean shares.
The challenge of the three-party decomposition protocol under $[\![\cdot]\!]$-sharing lies in how to construct the bit sharings $\lambda_{x_j} \in \mathbb{Z}_2$ from the existing arithmetic-ring value $\lambda_x \in \mathbb{Z}_{2^l}$. Based on a mathematical observation, we propose a three-party decomposition protocol under $[\![\cdot]\!]$-sharing that moves most of the decomposition work to the offline phase. Below, we describe the observation and the corresponding protocol in detail. Given the secret sharing $[\![x]\!]_i = (m_x, \langle \lambda_x \rangle_i)$, where $m_x, \lambda_x \in \mathbb{Z}_{2^l}$, $i \in \{0,1,2\}$, let $\lambda_{b,j}$ denote the $j$th bit of $P_b$'s share $\lambda_b$ and let $g_{b,j} = \lambda_{b,j} \| \lambda_{b,j-1} \| \cdots \| \lambda_{b,0}$, $j < l$, be the integer formed by its $j+1$ low bits. Then the $\langle\cdot\rangle$-sharing follows Formula (5):
$\lambda_{x_j} = \lambda_{0,j} + \lambda_{1,j} + \lambda_{2,j} + c_j \pmod 2$    (5)
where $c_j$ is the carry bit of $g_{0,j-1}, g_{1,j-1}, g_{2,j-1}$, i.e., $c_j = F_{Wrap3}^{\langle\cdot\rangle}(g_{0,j-1}, g_{1,j-1}, g_{2,j-1}, 2^j)$.
Similarly, let $Y_j = \lambda_{x,j} \| \lambda_{x,j-1} \| \cdots \| \lambda_{x,0}$ and $B_j = m_{x_j} \| m_{x_{j-1}} \| \cdots \| m_{x_0}$, $j < l$. The secret sharing $[\![x]\!]$ satisfies:
$x_j = \lambda_{x_j} + m_{x_j} + c'_j \pmod 2 = \lambda_{0,j} + \lambda_{1,j} + \lambda_{2,j} + c_j + m_{x_j} + c'_j \pmod 2$    (6)
where $c'_j$ is the carry bit of $Y_{j-1} + B_{j-1}$, i.e., $c'_j = \mathrm{Wrap2}(Y_{j-1}, B_{j-1}, 2^j)$. However, in the actual computation, $Y_{j-1}$ is secret, so $c'_j$ cannot be computed directly. By observation, we can deduce $c'_j = \mathrm{Wrap2}(Y_{j-1}, B_{j-1}, 2^j) = (Y_{j-1} + B_{j-1} < 2^j)\ ?\ 0 : 1 = \big((g_{0,j-1} + g_{1,j-1} + g_{2,j-1} \bmod 2^j) < 2^j - B_{j-1}\big)\ ?\ 0 : 1$.
Since $B_{j-1}$ is a public value, it can be moved to the right side of the inequality, i.e., $c'_j = F_{PC}^{\langle\cdot\rangle}(g_{0,j-1}, g_{1,j-1}, g_{2,j-1}, 2^j - B_{j-1})$. Since $\mathrm{Wrap2}$ is thus reduced to a private comparison, no online communication is required beyond the $F_{PC}^{\langle\cdot\rangle}$ invocation. The protocol realizing $F_{DigDec}^{[\![\cdot]\!]}$ is described in Algorithm 3.
Algorithm 3  $F_{DigDec}^{[\![\cdot]\!]}$: 3PC digit decomposition
  • Input: $[\![\cdot]\!]$-shares of $x$, where $x \in \mathbb{Z}_{2^l}$.
  • Output: $[\![\cdot]\!]$-shares of $x_j$, where $x_j \in \mathbb{Z}_2$, $j \in [l]$.
  •  Offline Phase:
     1. Each party locally converts the shares $\lambda_b$ into bits $\{\lambda_{b,j}\}$ and computes $g_{b,j-1}$ locally, where $b \in \{0,1,2\}$, $j \in \{0, \dots, l-1\}$;
     2. The parties invoke the function $F_{Wrap3}^{\langle\cdot\rangle}(g_{0,j-1}, g_{1,j-1}, g_{2,j-1}, 2^j)$ to obtain $\langle c_j \rangle$;
     3. Each party locally computes $u_{b,j} = \lambda_{b,j} + \langle c_j \rangle_b$.
  •  Online Phase:
     1. Each party locally converts the public $m_x$ into bits $m_{x_j}$ and prefixes $B_{j-1}$;
     2. The parties invoke the function $F_{PC}^{\langle\cdot\rangle}(g_{0,j-1}, g_{1,j-1}, g_{2,j-1}, 2^j - B_{j-1})$ to obtain $\langle c'_j \rangle$;
     3. Each party locally computes $\psi_{b,j} = u_{b,j} + \langle c'_j \rangle_b$. The final share after bit conversion is $[\![x_j]\!] = (m_{x_j}, \psi_{b,j})$.

4.2.4. B2A Conversion

The Boolean-to-arithmetic function $F_{B2A}^{[\![\cdot]\!]}$ is the inverse operation of digit decomposition. It converts the bit shares $(x_0, x_1, \dots, x_{l-1}) \in \mathbb{Z}_2$ into a value $x \in \mathbb{Z}_{2^l}$ such that $x = \sum_{j=0}^{l-1} x_j \cdot 2^j$. We can first convert each bit $x_j$ to the arithmetic ring, and then combine the results using the linear property of $[\![\cdot]\!]$-sharing. For a secret bit $x \in \mathbb{Z}_2$ in $[\![\cdot]\!]$-sharing with $x = m_x \oplus \lambda_x = m_x \oplus \lambda_0 \oplus \lambda_1 \oplus \lambda_2$, let $x^A, m_x^A, \lambda_x^A$ denote the corresponding arithmetic values in $\mathbb{Z}_{2^l}$. From the identity $x^A = m_x^A + \lambda_x^A - 2 m_x^A \lambda_x^A$ we can derive Equation (7):
$x^A = m_x^A + \lambda_x^A - 2 m_x^A \lambda_x^A = m_x^A + (1 - 2 m_x^A)\lambda_x^A = m_x^A + (1 - 2 m_x^A)\big(\lambda_0^A + \lambda_1^A + \lambda_2^A - 2\lambda_0^A\lambda_1^A - 2\lambda_1^A\lambda_2^A - 2\lambda_0^A\lambda_2^A + 4\lambda_0^A\lambda_1^A\lambda_2^A\big)$    (7)
All terms except $\lambda_0^A \lambda_1^A \lambda_2^A$ can be computed locally; for the triple product, one participant $P_i$, $i \in \{0,1,2\}$, can locally compute $\lambda_i \lambda_{i+1}$, after which $\lambda_0^A \lambda_1^A \lambda_2^A$ is computed securely using the Du-Atallah protocol [21].
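The bit-to-arithmetic identity behind Equation (7) is easy to sanity-check in the clear; the following sketch exhaustively verifies it over all $2^4$ combinations of $m_x, \lambda_0, \lambda_1, \lambda_2$ (plaintext only, with the Du-Atallah step replaced by a direct product):

```python
from itertools import product

MOD = 1 << 64

def b2a_clear(m_x, l0, l1, l2):
    """Equation (7) evaluated in the clear for x = m_x ^ l0 ^ l1 ^ l2."""
    lam = (l0 + l1 + l2 - 2*l0*l1 - 2*l1*l2 - 2*l0*l2 + 4*l0*l1*l2) % MOD
    return (m_x + (1 - 2*m_x) * lam) % MOD

for m_x, l0, l1, l2 in product((0, 1), repeat=4):
    assert b2a_clear(m_x, l0, l1, l2) == (m_x ^ l0 ^ l1 ^ l2)
```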

4.3. Lookup Table-Based 3PC Protocol

We assume that the function to be evaluated is represented as a lookup table, and that the LUT is public, so all parties know the input encoding $\{E^u\}_{u \in [\delta]}$ and the output encoding $\{\xi^w\}_{w \in [\sigma]}$, while the parties hold Boolean secret sharings of the private input. In secure two-party computation, there are several approaches to evaluating a lookup table; the most widely applied are the One-Time Truth Table (OTTT) [33], Online-LUT (OP-LUT) [32], and Setup-LUT (SP-LUT) [32]. But in the 3PC scenario, there is no efficient computation scheme. In order to increase the efficiency of the online phase, we propose a 3PC lookup table protocol, extending the approach of Flute [20] from the 2PC to the 3PC setting.
This approach converts the lookup process into the evaluation of a function of the private input expressed as an inner product. Previous lookup table methods enumerated all inputs [32,33] or all outputs [32] and then obtained the final lookup table output via oblivious transfer (OT). This is straightforward, but the computation and communication complexity is too high because the whole table must be sent to the other party. Therefore, we evaluate only the required part of the lookup table rather than the entire table. There are four steps involved in this conversion, shown below.
(1) The first step: compute the full disjunctive normal form (DNF) of the input. When evaluating a Boolean $\delta$-to-$\sigma$ lookup table, we only need to focus on the rows whose result is 1; the output of the LUT can then be represented as the full DNF over the corresponding inputs of those rows. Taking a single output bit as an example in Figure 1, assume there are $\alpha$ rows whose results are 1; then for each $j \in [\alpha]$ we form the term $\bigwedge_{i=1}^{\delta} L_j^i$, where $L_j^i = x_i$ if the $i$th input bit of row $j$ is 1 and $L_j^i = \bar{x}_i$ if it is 0, for all $i \in [\delta]$, and we connect all terms with the OR operation, yielding $LUT(x_1, x_2, x_3) = \bigvee_{j=1}^{\alpha} \bigwedge_{i=1}^{\delta} L_j^i$.
Using the lookup table depicted in Figure 1 as an illustration, four rows have output 1, i.e., $\alpha = 4$. Assuming the input is $(x_1, x_2, x_3) = (0, 1, 1)$, the output of the LUT is 0; evaluating the formula indeed gives $LUT(x_1, x_2, x_3) = (\bar{x}_1 \wedge x_2 \wedge \bar{x}_3) \vee (x_1 \wedge \bar{x}_2 \wedge \bar{x}_3) \vee (x_1 \wedge \bar{x}_2 \wedge x_3) \vee (x_1 \wedge x_2 \wedge x_3) = 0$.
(2) The second step: replace the OR operation with the XOR operation. Evaluating the above DNF expression requires $\delta \cdot \alpha - 1$ AND and OR operations, which incurs high online communication. However, we can remove the OR operations using the following important property: given the input $(x_1, \dots, x_\delta)$, at most one term $\bigwedge_{i=1}^{\delta} L_j^i$ of the DNF can evaluate to 1 [20]; it is impossible for two different terms to both evaluate to 1. Based on this property, we obtain Equation (8):
$\bigvee_{j=1}^{\alpha} \bigwedge_{i=1}^{\delta} L_j^i = \bigoplus_{j=1}^{\alpha} \bigwedge_{i=1}^{\delta} L_j^i$    (8)
Still taking the lookup table in Figure 1 as an example, with $\alpha = 4$ and input $(x_1, x_2, x_3) = (0, 1, 1)$, the output of the LUT is 0. Computing the rewritten formula gives $LUT(x_1, x_2, x_3) = (\bar{x}_1 \wedge x_2 \wedge \bar{x}_3) \oplus (x_1 \wedge \bar{x}_2 \wedge \bar{x}_3) \oplus (x_1 \wedge \bar{x}_2 \wedge x_3) \oplus (x_1 \wedge x_2 \wedge x_3) = 0$.
(3) The third step: replace the previous formula with an equivalent inner product computation. The purpose of this step is to transform the equation from the previous step into a more easily and efficiently computable form. When $\alpha = 1$, the equation is $\bigwedge_{i=1}^{\delta} L_1^i$, which is exactly the multi-input AND gate of Section 3.2. When $\delta = 2$, the transformed equation is $\bigoplus_{j=1}^{\alpha} L_j^1 \wedge L_j^2$, which is equivalent to a vector inner product over the Boolean ring. Therefore, we use the aforementioned protocols to calculate the inner product and the multi-input AND gate, inspired by [9,20]. Finally, the equation is as follows in Equation (9):
$LUT(x_1, \dots, x_\delta) = \bigvee_{j=1}^{\alpha} \bigwedge_{i=1}^{\delta} L_j^i = \bigoplus_{j=1}^{\alpha} \bigwedge_{i=1}^{\delta} L_j^i = \bigodot_{i=1}^{\delta} L^i$    (9)
where $\bigodot$ denotes a vector inner product operation and $L^i = (L_1^i, \dots, L_\alpha^i)$. With $\alpha = 4$ and input $(x_1, x_2, x_3) = (0, 1, 1)$, the output of the LUT is 0; computing the rewritten formula gives $LUT(x_1, x_2, x_3) = (\bar{x}_1, x_1, x_1, x_1) \odot (x_2, \bar{x}_2, \bar{x}_2, x_2) \odot (\bar{x}_3, \bar{x}_3, x_3, x_3) = 0$, and we can see that the result is correct.
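The three rewriting steps can be replayed in plain Python. The sketch below evaluates a 1-bit-output LUT as the XOR of per-row AND terms over the rows whose output is 1, using the four rows inferred from the DNF of the Figure 1 example:

```python
from functools import reduce

def lut_as_xor_of_terms(one_rows, x):
    """Equations (8)/(9) in the clear: XOR the AND-terms of the rows with
    output 1; since at most one term is 1, XOR coincides with OR."""
    out = 0
    for row in one_rows:
        # L_j^i = x_i if the row's i-th input bit is 1, else its complement
        term = reduce(lambda a, b: a & b,
                      (xi if e else xi ^ 1 for xi, e in zip(x, row)))
        out ^= term
    return out

one_rows = [(0, 1, 0), (1, 0, 0), (1, 0, 1), (1, 1, 1)]  # the alpha = 4 rows
print(lut_as_xor_of_terms(one_rows, (0, 1, 1)))          # -> 0, as in the text
```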

4.3.1. Mathematical Expression of Multi-Input LUT

Consider $\delta$ input vectors $I = (x^1, x^2, \dots, x^\delta)$ with a 1-bit output, where the dimension of each component $x^i$ is $d$, $i \in \{1, 2, \dots, \delta\}$. Based on the above conclusion, we need to calculate $y = x^1 \odot x^2 \odot \cdots \odot x^\delta$. Let $I_j = (x_j^1, x_j^2, \dots, x_j^\delta)$ be the set of the $j$th elements of each $x^i \in I$. As mentioned in Section 3.2, the secret values are $[\![\cdot]\!]$-shared in this work, so we have:
$y = x^1 \odot x^2 \odot \cdots \odot x^\delta = \bigoplus_{j=1}^{d} \bigwedge_{i=1}^{\delta} x_j^i = \bigoplus_{j=1}^{d} \bigwedge_{i=1}^{\delta} \big(m_{x_j^i} \oplus \lambda_{x_j^i}\big) = \bigoplus_{j=1}^{d} \bigoplus_{S_j \in 2^{I_j}} \big(m_{S_j} \wedge \lambda_{I_j \setminus S_j}\big)$    (10)
In Equation (10), the first three steps follow directly from Section 4, while in the fourth step, $\bigoplus_{S_j \in 2^{I_j}} (m_{S_j} \wedge \lambda_{I_j \setminus S_j})$ is the expansion of $\bigwedge_{i=1}^{\delta} (m_{x_j^i} \oplus \lambda_{x_j^i})$ by the distributive law; the expansion has $2^\delta$ terms in total, the power set $2^{I_j}$ indexes all expanded terms, $S_j$ is one subset in $2^{I_j}$, and $I_j \setminus S_j$ denotes the set difference, where $j \in \{1, \dots, d\}$. Here $m_{S_j}$ is the product of the $m$-values over $S_j$ and $\lambda_{I_j \setminus S_j}$ the AND of the $\lambda$-values over $I_j \setminus S_j$. Although the expression contains $d \cdot (2^\delta - \delta - 1)$ AND gates, these terms lead to low online communication: $\langle \lambda_{I_j \setminus S_j} \rangle$ can be computed by $F_{NAND}^{\langle\cdot\rangle}$ in the offline phase, and $m_{S_j}$ is known to all parties.

4.3.2. Lookup Table Protocol

Given a $\delta$-to-$\sigma$ LUT $T$, we calculate the lookup table results bit by bit, so the goal is to compute $LUT(x^1, \dots, x^\delta) = (y^1, \dots, y^\sigma)$, where $y^t$, $t \in [\sigma]$, is $\bigodot_{i=1}^{\delta} L^i = \bigoplus_{j=1}^{\alpha} \bigoplus_{S_j \in 2^{I_j}} m_{S_j} \wedge \lambda_{I_j \setminus S_j}$ as in Equation (10), with $\alpha$ the number of rows whose output is 1. To filter out the rows with output 1, we can multiply this expression by the output encoding $\{\xi^w\}_{w \in [\sigma]}$ of the LUT, so the rows with output 0 are removed. Since the lookup table is public, the output encoding is also public, so no additional communication is required to obtain all rows with output 1, as in Equation (11):
$y^t = \bigoplus_{j \in [2^\delta],\ \xi_j^t = 1} \bigwedge_{i=1}^{\delta} L_j^i = L^1 \odot \cdots \odot L^\delta \odot \xi^t$    (11)
In Equation (11), we also need to handle the mapping between the input vector $(x^1, \dots, x^\delta)$ and the vectors $L^i$ for $i \in [\delta]$. As described in Section 4.3, $L_j^i = x_i$ if the corresponding input bit is 1 and $L_j^i = \bar{x}_i$ if it is 0, for all $i \in [\delta]$. Since the lookup table is public, we can also use the input encoding $\{E^u\}_{u \in [\delta]}$, obtaining $L^i$ as in Equation (12):
$L_j^i = x_i \oplus (1 \oplus E_j^i) = (m_{x_i} \oplus \lambda_{x_i}) \oplus (1 \oplus E_j^i) = (m_{x_i} \oplus 1 \oplus E_j^i) \oplus \lambda_{x_i}$    (12)
where $i \in [\delta]$, $j \in [2^\delta]$. Because the input encoding $E^i$ is public and the secret sharing scheme of Section 3.2 is linear, no additional communication is required for this step.
Furthermore, in our secret sharing scheme, calculating the complement $\bar{x}_i$ only requires complementing $m_{x_i}$, while $\lambda_{x_i}$ remains unchanged, such that $[\![\bar{x}_i]\!] = (\overline{m_{x_i}}, \langle \lambda_{x_i} \rangle)$. Thus, the term $\lambda_{I_j \setminus S_j}$ for all $j \in [2^\delta]$ in Equation (13) is the same for every row with input vector $I = \{x^1, \dots, x^\delta\}$, so the same random value $\lambda_{I \setminus S}$ can be reused. Therefore, we obtain the derivation in Equation (13):
$y^w = L^1 \odot \cdots \odot L^\delta \odot \xi^w = \bigoplus_{j=1}^{2^\delta} \Big(\bigwedge_{i=1}^{\delta} L_j^i\Big) \wedge \xi_j^w = \bigoplus_{j=1}^{2^\delta} \bigwedge_{i=1}^{\delta} \big(m_{L_j^i} \oplus \lambda_{L_j^i}\big) \wedge \xi_j^w = \bigoplus_{j=1}^{2^\delta} \bigoplus_{S_j \in 2^{I_j}} \big(m_{S_j} \wedge \lambda_{I_j \setminus S_j}\big) \wedge \xi_j^w = \bigoplus_{S \in 2^I} \Big(\bigoplus_{j=1}^{2^\delta} m_{S_j} \wedge \xi_j^w\Big) \wedge \lambda_{I \setminus S} = \bigoplus_{S \in 2^I} m_S^{\xi^w} \wedge \lambda_{I \setminus S}$    (13)
where $\xi^w$ denotes the output encoding of the LUT for $w \in [\sigma]$, $S_j$ denotes the set obtained by replacing each $x^i$ with $L_j^i$, $m_{L_j^i}$ and $\lambda_{L_j^i}$ are the secret shares of $L_j^i$ for $i \in [\delta]$, $j \in [2^\delta]$, and $m_{S_j}, \lambda_{I_j \setminus S_j}$ are the shares of each $S_j$, where $m_{S_j}$ is built from the $m_{L_j^i}$ while $\lambda_{I_j \setminus S_j}$ stays the same across rows. The step from the third to the fourth line follows the distributive law, and since $m_{S_j}$ and $\xi_j^w$ are public parameters, these two items are combined into $m_S^{\xi^w} = \bigoplus_{j=1}^{2^\delta} m_{S_j} \wedge \xi_j^w$. Then, according to Equation (9), the final form of Equation (13) is obtained.
Because the $\lambda_{I_j \setminus S_j}$ are reused, in the offline phase we just need to invoke $F_{2AND}^{\langle\cdot\rangle}$ $2^\delta - \delta - 1$ times.
The protocol details are shown in Algorithm 4. In the second step of the online phase of Algorithm 4, to avoid the parties repeatedly counting the term $m_I$, we defer this term to the fourth step.
Algorithm 4  $F_{LUT}^{[\![\cdot]\!]}$: protocol of LUT
  • Input: a public $\delta$-to-$\sigma$ LUT $T$ with input encoding $\{E^u\}_{u \in [\delta]} \subset \{0,1\}^{2^\delta}$ and output encoding $\{\xi^w\}_{w \in [\sigma]} \subset \{0,1\}^{2^\delta}$, and $[\![\cdot]\!]$-shares of the input vector $([\![x^1]\!], \dots, [\![x^\delta]\!])$.
  • Output: $[\![y]\!]$, where $y = (y^1, \dots, y^\sigma) = LUT(x^1, \dots, x^\delta)$.
  •  Offline Phase:
     1. Each party $\{P_t\}_{t \in [0,2]}$ samples random values $\langle \lambda_{y^j} \rangle_t \in \{0,1\}$ using $F_{Rand}^{\langle\cdot\rangle}$ for $t \in \{0,1,2\}$, $j \in [\sigma]$.
     2. The parties $\{P_t\}_{t \in [0,2]}$ interactively generate $\langle \lambda_S \rangle$ for $S \in 2^I$ by invoking $F_{2AND}^{\langle\cdot\rangle}$.
  •  Online Phase:
     1. All parties locally set $L_j^i = x_i \oplus (1 \oplus E_j^i)$ as in Equation (12) for $i \in [\delta]$, $j \in [2^\delta]$.
     2. Each party $\{P_t\}_{t \in [0,2]}$ locally computes:
        $v_w^t = \bigoplus_{S \in 2^I,\ S \neq I} m_S^{\xi^w} \wedge \langle \lambda_{I \setminus S} \rangle_t \oplus \langle \lambda_{y^w} \rangle_t$
     3. All parties reconstruct $v_w$ by exchanging $v_w^t$ and computing $v_w = v_w^0 \oplus v_w^1 \oplus v_w^2$.
     4. All parties locally compute $m_{y^w} = v_w \oplus m_I^{\xi^w}$.

5. The 3PC Protocols of Secure RNNs Operators

In this section, we introduce secure matrix multiplication and the secure activation functions of nonlinear layers commonly used in RNNs, such as sigmoid and tanh, as shown in Table 2. Computers can represent only finitely many values, so complex mathematical functions cannot be represented exactly; floating-point numbers are therefore usually used to approximate real-valued quantities [34].

5.1. Matrix Multiplication

The linear part of the model is usually matrix computation. The protocol for matrix multiplication is similar to $F_{2Mult}^{[\![\cdot]\!]}$. An $m \times o$ matrix $[\![X]\!]_i = (m_X, \langle \lambda_X \rangle_i)$ is multiplied by an $o \times n$ matrix $[\![Y]\!]_i = (m_Y, \langle \lambda_Y \rangle_i)$, yielding the $m \times n$ matrix $[\![Z]\!]_i = (m_Z, \langle \lambda_Z \rangle_i)$. In the offline phase, all parties $P_i$ mutually generate $\langle \lambda_X \lambda_Y \rangle$ by invoking $F_{2Mult}^{\langle\cdot\rangle}$, and then in the online phase, all parties $P_i$ reveal $m_\Delta$ and compute $m_Z$ locally. The details are shown in Algorithm 5.
Algorithm 5  Matrix multiplication $F_{MatMul}^{[\![\cdot]\!]}$
  • Input: $[\![\cdot]\!]$-shares of matrices $X$ and $Y$.
  • Output: $[\![\cdot]\!]$-shares of $Z$, where $Z = X \cdot Y$.
  •  Offline Phase:
     1. $P_i$ samples random shares $\langle \lambda_{Z'} \rangle_i$ and sets $\langle \lambda_Z \rangle_i$, where $i \in \{0,1,2\}$.
     2. Parties mutually generate $\langle \lambda_X \lambda_Y \rangle$ using $F_{2Mult}^{\langle\cdot\rangle}$.
  •  Online Phase:
     1. $P_i$ locally sets $\langle m_\Delta \rangle_i = \langle \lambda_X \rangle_i m_Y + m_X \langle \lambda_Y \rangle_i + \langle \lambda_X \lambda_Y \rangle_i - \langle \lambda_{Z'} \rangle_i$, where $i \in \{0,1,2\}$.
     2. Parties reveal $m_\Delta$ and set $m_{Z'} = m_\Delta + m_X m_Y$.
     3. $P_i$ holds $[\![Z]\!]_i = (m_{Z'}/2^d, \langle \lambda_Z \rangle_i)$.

5.2. Exponential

Consider the exponential function $rExp(x) = e^{-x}$, $x \in \mathbb{R}^+$. Due to the properties of exponential functions, this is equivalent to $rExp(x) = e^{-x} = rExp(2^{d(k-1)} x_{k-1}) \cdots rExp(2^d x_1) \cdot rExp(x_0)$, where $x$ of length $l$ is divided into $k$ parts, each of length $d$ [13]. This can be easily computed by first invoking $F_{DigDec}^{[\![\cdot]\!]}$ of Section 4 and then invoking the lookup table protocol of Algorithm 4 for each part separately. To reduce communication and computation costs as well as memory usage, the private inputs from the larger arithmetic ring are decomposed into a smaller one, $\mathbb{Z}_{2^8}$. The details are in Algorithm 6.
Algorithm 6  Functionality of exponential $F_{rExp}([\![x]\!])$
  • Input: $[\![\cdot]\!]$-shares of $x$.
  • Output: $[\![\cdot]\!]$-shares of $y \leftarrow e^{-x} = rExp(x)$.
  •  Offline Phase:
     1. Each party $P_i$ invokes the offline phase of $F_{DigDec}^{[\![\cdot]\!]}$ with input $[\![x]\!] \in \mathbb{Z}_{2^l}$ to obtain $u_{b,j}$, where $i, b \in \{0,1,2\}$, $j \in \{0, \dots, l-1\}$, and $l$ is the bit width of $x$.
     2. After obtaining $[\![x_j]\!] \in \mathbb{Z}_2$ (available from the first steps of the offline and online phases of $F_{rExp}([\![x]\!])$), each party $P_i$ invokes the offline phase of $F_{LUT}^{[\![\cdot]\!]}$ with input $[\![x_j]\!] \in \mathbb{Z}_2$ and the input and output encodings corresponding to the exponential function, obtaining the randomness $\langle \lambda_{y^j} \rangle_t$ and $\langle \lambda_S \rangle$, where $t \in \{0,1,2\}$, $j \in \{0, \dots, l-1\}$.
  •  Online Phase:
     1. Each party $P_i$ invokes the online phase of $F_{DigDec}^{[\![\cdot]\!]}$ to obtain the output $[\![x_j]\!] \in \mathbb{Z}_2$, where $i \in \{0,1,2\}$, $j \in [l]$.
     2. Each party $P_i$ invokes the online phase of $F_{LUT}^{[\![\cdot]\!]}$: for each chunk $t \in [k]$, compute $[\![y_j]\!] = F_{LUT}^{[\![\cdot]\!]}([\![x_j^t]\!])$ after building a lookup table with $\delta = d$, $\sigma = d$, where $j \in [l]$ and $x_j^t$ denotes the input bits for the $j$th bit of the output.
     3. Each party $P_i$ converts the bit shares to arithmetic shares $[\![z]\!]$ using $F_{B2A}^{[\![\cdot]\!]}$.
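The digit-decomposition trick behind $F_{rExp}$ can be illustrated in the clear. The sketch below (hypothetical parameters: $l = 32$, chunk width $d = 8$, 12-bit fixed-point precision) splits the fixed-point input into $d$-bit digits and multiplies per-chunk table lookups; the secure protocol replaces each table read with $F_{LUT}^{[\![\cdot]\!]}$ on secret shares:

```python
import math

def r_exp_clear(x_fixed, l=32, d=8, prec=12):
    """e^{-x} via digit decomposition: e^{-x} = prod_i e^{-(2^{d*i} x_i)}."""
    # one lookup table per chunk, indexed by the d-bit digit value
    tables = [[math.exp(-(digit << (d * i)) / (1 << prec))
               for digit in range(1 << d)]
              for i in range(l // d)]
    result = 1.0
    for i in range(l // d):
        digit = (x_fixed >> (d * i)) & ((1 << d) - 1)
        result *= tables[i][digit]
    return result

x = 1.25
print(r_exp_clear(round(x * (1 << 12))), math.exp(-x))  # both approx. 0.2865
```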

5.3. Sigmoid

The sigmoid function can be viewed as the composition of an exponential $e^{-x}$ and a reciprocal: we sequentially calculate the exponential function and then the reciprocal to obtain the sigmoid result. For the exponential $e^{-x}$, we use $F_{rExp}([\![x]\!])$ and obtain an accurate approximation (the accuracy depends on the size of the lookup table). For the reciprocal, we use the Goldschmidt iteration method (similar to the method in [13]); the accuracy of this method largely depends on the initial iteration value. To obtain a closer initial value, we construct a lookup table that yields a reliable approximation of the reciprocal, and then iterate from it to improve accuracy. The details are in Algorithm 7.
Algorithm 7  Functionality of sigmoid $F_{sigmoid}([\![x]\!])$
  • Input: $[\![\cdot]\!]$-shares of $x$.
  • Output: $[\![\cdot]\!]$-shares of $y \leftarrow \frac{1}{1+e^{-x}} = \mathrm{sigmoid}(x)$.
  •  Offline Phase:
     1. Each party $P_i$ invokes the offline phase of $F_{DigDec}^{[\![\cdot]\!]}$ with input $[\![x]\!] \in \mathbb{Z}_{2^l}$ to obtain $u_{b,j}$, where $i, b \in \{0,1,2\}$, $j \in \{0, \dots, l-1\}$, and $l$ is the bit width of $x$.
     2. Each party $P_i$ invokes the second step of the offline phase of $F_{rExp}^{[\![\cdot]\!]}$ to obtain the randomness $\langle \lambda_{y^j} \rangle_t$ and $\langle \lambda_S \rangle$, where $t \in \{0,1,2\}$, $j \in \{0, \dots, l-1\}$.
  •  Online Phase:
     1. Each party $P_i$ invokes the online phase of $F_{DigDec}^{[\![\cdot]\!]}$ to obtain the output $[\![x_j]\!] \in \mathbb{Z}_2$, where $i \in \{0,1,2\}$, $j \in [l]$.
     2. Each party $P_i$ invokes the online phase of $F_{rExp}^{[\![\cdot]\!]}$ to obtain $[\![z]\!]$, where $i \in \{0,1,2\}$, and then obtains the output by executing the Goldschmidt iteration.
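For reference, a plaintext sketch of the Goldschmidt reciprocal used after $F_{rExp}$: starting from a LUT-provided initial guess $w_0 \approx 1/a$, each iteration uses only additions and multiplications, which is what makes it MPC-friendly (the concrete iteration count and initial-table design follow [13]; the numbers here are illustrative):

```python
import math

def goldschmidt_reciprocal(a: float, w0: float, iters: int = 4) -> float:
    """Approximate 1/a from an initial guess w0 using only add/mul."""
    x, y = a * w0, w0          # invariant: y/x stays 1/a; x is driven to 1
    for _ in range(iters):
        e = 2.0 - x            # correction factor
        x, y = x * e, y * e
    return y

a = 1.0 + math.exp(-1.0)       # 1 + e^{-x} for x = 1, as in sigmoid
print(goldschmidt_reciprocal(a, w0=0.7), 1.0 / a)  # both approx. 0.7311
```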

5.4. Tanh

The hyperbolic tangent (tanh) has many application scenarios in neural networks, mainly as an activation function for hidden layers. It is a variant of the sigmoid function that maps the input to a continuous value in the range $-1$ to $1$. By definition, the identity $\tanh(x) = 2\,\mathrm{sigmoid}(2x) - 1$ holds; thus, tanh can be computed by invoking the sigmoid function.

6. Security Analysis

We prove the security of our protocols under the real-world/ideal-world simulation paradigm, and consider security in the semi-honest model, where all three parties follow the protocol exactly.
Proof sketch: assume $\mathcal{A}$ is a semi-honest adversary in the real world that can corrupt at most one party at a time, and $\mathcal{S}$ is a simulator in the ideal world. In the real world, all parties execute the protocol in the presence of the adversary $\mathcal{A}$; in the ideal world, all parties send their inputs to the simulator $\mathcal{S}$, which executes the protocol honestly. Any knowledge that the adversary $\mathcal{A}$ can obtain in the real world can also be produced by the simulator $\mathcal{S}$ in the ideal world, so the real world and the ideal world are indistinguishable.
Security of $F_{Share}^{\langle\cdot\rangle}$, $F_{Rec}^{\langle\cdot\rangle}$, $F_{AND}^{\langle\cdot\rangle}$, $F_{Mult}^{\langle\cdot\rangle}$, $F_{Wrap3}^{\langle\cdot\rangle}$, $F_{PC}^{\langle\cdot\rangle}$, $F_{Share}^{[\![\cdot]\!]}$, $F_{Rec}^{[\![\cdot]\!]}$, $F_{AND}^{[\![\cdot]\!]}$, $F_{Mult}^{[\![\cdot]\!]}$: Since we use the 3PC protocol of [26] as our secret sharing primitive, the security of these protocols is inherited from [26].
Security of $F_{LUT}^{[\![\cdot]\!]}$: let $\mathcal{A}$ corrupt $S_0$ during the protocol $F_{LUT}^{[\![\cdot]\!]}$; the cases of $S_1, S_2$ are symmetric. In the offline phase, each party invokes $F_{Rand}^{\langle\cdot\rangle}$ to generate correlated randomness and invokes $F_{2AND}^{\langle\cdot\rangle}$ to obtain the multi-input inner product terms, so security is guaranteed by these two functionalities. In both the real and ideal protocols, the only key information the corrupted party $S_0$ obtains from an honest party is the bit $v_w^t$. Whether in the real or the ideal execution, this bit is masked by a random value $\langle \lambda_{y^w} \rangle$ chosen by the honest party, which is invisible to $S_0$, making it a uniformly distributed random bit in $S_0$'s view. Since this is the only masked message $S_0$ observes, from its perspective the real-world execution and the ideal-world simulation are indistinguishable.
Security of $F_{DigDec}^{[\![\cdot]\!]}$: let $\mathcal{A}$ corrupt $S_0$ during the protocol $F_{DigDec}^{[\![\cdot]\!]}$; the cases of $S_1, S_2$ are symmetric. Interaction between the parties only occurs when invoking the protocols $F_{Wrap3}^{\langle\cdot\rangle}$ and $F_{PC}^{\langle\cdot\rangle}$; all other calculations are completed locally. Therefore, this protocol's security is guaranteed by the security of $F_{Wrap3}^{\langle\cdot\rangle}$ and $F_{PC}^{\langle\cdot\rangle}$.

7. Evaluation

This section provides the experimental results and the corresponding experimental settings for the proposed protocols.
Experiment setting: The experiments were run on an Intel(R) Xeon(R) Platinum 8176M CPU @ 2.10 GHz with 48 GB RAM, single-threaded, under both LAN and WAN settings; we created three Docker containers representing the three servers on Ubuntu 20.04. The bandwidth in the LAN was approximately 1 GB/s with a round-trip time (RTT) of 1 ms, while the bandwidth in the WAN was 40 MB/s with an RTT of 70 ms. The code was implemented in C++.
For our work’s benchmark, we compared the state-of-the-art SIRNN [13] in current RNN secure inference. In theoretical analysis, we compared the complexity of online communication and online communication rounds. Then, we compared the running time respectively under LAN and WAN, and communication volume of key building blocks. Finally, we applied our sigmoid and tanh protocols to end-to-end RNN secure inference and compared it with SIRNN [13] under the same setting. We simulated the fastGRNN [35] model on the Speech Command dataset [36], which identified keywords in short speech (such as digits, simple command, or directions). Its primary goal is to provide a way to build and test models that detect when a single word is spoken, from a set of ten target words, and the datasize size is 8.17 GiB. The FastGRNN model contains 99 sigmoid and 99 tanh layers; each layers has 100 instances.

7.1. Comparison of Theoretical Communication Cost on Building Blocks

As shown in Table 3 and Table 4, we compared the online communication complexity and communication rounds of the basic building blocks of our 3PC work with the prior state-of-the-art RNN work [13]. Note that the sigmoid and tanh functions of SIRNN are in the 2PC setting. Moreover, we also compared with the typical three-party computation (3PC) frameworks ABY3 [24] and SecureNN [25], which are not specifically designed for RNNs, to obtain a more comprehensive comparison. For $F_{2Mult}^{[\![\cdot]\!]}$, compared to SIRNN with a communication complexity of $\lambda(l+3) + 1.5l^2 + 2.5l + 2$, our method reduces the communication complexity by an order of magnitude to $l$, and the number of communication rounds drops from 4 to 1. Our work reduces the online communication complexity by 10 times and 4 times compared to ABY3 and SecureNN, respectively, while also requiring one round fewer than SecureNN and the same number of rounds as ABY3. For $F_{DigDec}^{[\![\cdot]\!]}$, the online communication complexity and round count of our work are close to those of ref. [13]. For $F_{B2A}^{[\![\cdot]\!]}$, SIRNN and SecureNN provide no implementation; although the communication complexity of our work is only slightly lower than that of ABY3, the number of communication rounds is reduced from $1 + \log l$ to 1. For $F_{LUT}^{[\![\cdot]\!]}$, the communication complexity of our work is reduced so that it depends only on the output bit width $\sigma$, and the protocol needs only a single communication round.

7.2. Online Cost of RNN Building Blocks and End-to-End Inference

We tested the basic building blocks of the RNN, including matrix multiplication, sigmoid, and tanh. Note that our work focuses on reducing online communication overhead, so the communication cost and runtime we report are for the online phase. Similarly, we also evaluated the online communication overhead and runtime of SIRNN. However, since SIRNN does not strictly divide the computation into online/offline phases, all of its computation depends on the private input; we therefore take the total communication cost and runtime of SIRNN as its online communication cost and online runtime. Compared with SIRNN [13], the online communication of the key RNN building blocks matrix multiplication, sigmoid, and tanh in our work is reduced by 27.56%, 80.39%, and 79.94%, respectively, as shown in Table 5. The online runtimes of the RNN building blocks under LAN and WAN are shown in Table 6. In the LAN setting, for the sigmoid and tanh functions, the online runtime of our work is similar to that of SIRNN. However, in the WAN setting, our work is 8% faster than SIRNN for the sigmoid function and 3% faster for the tanh function. We also conducted end-to-end secure RNN inference using the Google-30 dataset [36] on FastGRNN [35]. As shown in Table 7, although our work is slightly slower than SIRNN in online time (LAN: SIRNN 10.29 s vs. ours 10.31 s; WAN: SIRNN 522.07 s vs. ours 598.10 s), our online communication overhead is reduced by 39.45% compared to SIRNN.

8. Conclusions

In this work, we introduce innovative protocols that improve the efficiency of three-party secret sharing and secure inference for recurrent neural networks (RNNs). The experimental results show that, compared with SIRNN, the online communication of the core building blocks of the RNN model is significantly reduced (matmul, 27.56%; sigmoid, 80.39%; tanh, 79.94%), making our protocol suitable for delay-sensitive applications such as real-time healthcare diagnosis, motion detection, financial risk control, and smart voice assistants.
However, it is important to acknowledge potential limitations. When our protocol is applied to RNN models with more layers or more complex structures, the total and online communication overhead may grow linearly. In addition, although our lookup table protocol requires only one round of online communication, its performance in a distributed environment remains sensitive to network latency, which can lengthen the online-phase communication time in real deployments.
In summary, this research presents significant advancements in reducing communication costs for secure three-party computation in RNNs, while also highlighting areas for further improvement. Balancing communication efficiency against security, our work currently supports only semi-honest security; we plan to extend it to malicious security in the future. Future work will also focus on integrating these protocols into real-world model inference tasks, addressing scalability and latency challenges, and enhancing the privacy and security of lookup tables. In addition, multi-valued logic (MVL) may, in theory, reduce the communication overhead of the lookup table, so future research will consider MVL-based improvements. By exploring these directions, we aim to broaden the utility and robustness of secure computation protocols in practical applications.

Author Contributions

Conceptualization, Y.W. and C.L.; Methodology, Y.W. and C.L.; Software, C.L., X.S. and Y.S.; Validation, Y.S.; Formal Analysis, X.S. and T.W.; Writing—Original Draft Preparation, Y.W. and C.L.; Writing—review & editing, Y.W., C.L. and T.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the Colleges and Universities Stable Support Project of Shenzhen, China (No. GXWD20220811170225001), the National Natural Science Foundation of China (No. 62402142), and the Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies (No. 2022B1212010005).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Randomness Generation

Throughout this paper, two types of random numbers are utilized for ⟨·⟩-sharing, generated using pseudo-random functions. The details are as follows:
  • Three-out-of-three random numbers: a pairwise random shared key k_i is held between P_i and P_{i+1} (i.e., P_0 and P_1 hold k_0, P_1 and P_2 hold k_1, and P_2 and P_0 hold k_2). Then P_i can compute α_i = F(k_i, cnt) − F(k_{i−1}, cnt), where cnt is a self-incrementing counter, indices are taken mod 3, and i ∈ {0, 1, 2}, so that α_0 + α_1 + α_2 ≡ 0.
  • Two-out-of-three random numbers: similar to the process above, let P_i and P_{i+1} keep k_i; then each party generates the pair (α_i, α_{i−1}), where α_i = F(k_i, cnt), i ∈ {0, 1, 2}, and cnt is an incremented counter.
These two methods of generating random numbers are established in both arithmetic and Boolean rings. The 3-out-of-3 random numbers can be applied to matrix inner products in linear layers and to the lookup table protocol; the 2-out-of-3 randomness can be applied to [[·]]-sharing.
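As a concrete illustration of the two methods above, the following sketch is a minimal assumed implementation (not the authors' code): the PRF F(k, cnt) is realized here with HMAC-SHA256 and the ring is fixed to Z_{2^64}, both of which are illustrative choices.

```python
import hmac
import hashlib

RING = 1 << 64  # arithmetic ring Z_{2^64}; the modulus is an assumed choice


def prf(key: bytes, cnt: int) -> int:
    """F(k, cnt): a PRF, realized here with HMAC-SHA256 (illustrative)."""
    mac = hmac.new(key, cnt.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(mac[:8], "big") % RING


# Pairwise keys: P_i and P_{i+1} share k_i (indices mod 3), so P_0 and P_1
# hold k_0, P_1 and P_2 hold k_1, and P_2 and P_0 hold k_2.
# Toy keys for illustration only; real keys come from a secure setup phase.
keys = [b"k0" * 8, b"k1" * 8, b"k2" * 8]
cnt = 0  # self-incrementing counter, kept in sync by all three parties

# 3-out-of-3 randomness: P_i locally computes
#   alpha_i = F(k_i, cnt) - F(k_{i-1}, cnt)  (mod 2^64),
# so alpha_0 + alpha_1 + alpha_2 = 0 without any communication.
alphas = [(prf(keys[i], cnt) - prf(keys[(i - 1) % 3], cnt)) % RING
          for i in range(3)]
assert sum(alphas) % RING == 0

# 2-out-of-3 randomness: P_i holds the pair (alpha_i, alpha_{i-1}), where
# alpha_i = F(k_i, cnt) is known exactly to the two parties sharing k_i.
r = [prf(keys[i], cnt) for i in range(3)]
pairs = [(r[i], r[(i - 1) % 3]) for i in range(3)]  # pairs[i] belongs to P_i
```

Once the pairwise keys are distributed, both kinds of randomness are derived purely locally, which is what allows the online phase to avoid any extra communication for randomness generation.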

References

  1. Dowlin, N.; Gilad-Bachrach, R.; Laine, K.; Lauter, K.; Naehrig, M.; Wernsing, J. CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016. [Google Scholar]
  2. Hesamifard, E.; Takabi, H.; Ghasemi, M. CryptoDL: Deep Neural Networks over Encrypted Data. arXiv 2017, arXiv:1711.05189. [Google Scholar]
  3. Mohassel, P.; Zhang, Y. SecureML: A system for scalable privacy-preserving machine learning. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 22–26 May 2017; pp. 19–38. [Google Scholar]
  4. Liu, J.; Juuti, M.; Lu, Y.; Asokan, N. Oblivious neural network predictions via minionn transformations. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 619–631. [Google Scholar]
  5. Rouhani, B.D.; Riazi, M.S.; Koushanfar, F. DeepSecure: Scalable Provably-Secure Deep Learning. In Proceedings of the 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 24–29 June 2018. [Google Scholar]
  6. Juvekar, C.; Vaikuntanathan, V.; Chandrakasan, A. GAZELLE: A Low Latency Framework for Secure Neural Network Inference. In Proceedings of the 27th USENIX Security Symposium (USENIX Security 18), Baltimore, MD, USA, 15–17 August 2018. [Google Scholar]
  7. Chandran, N.; Gupta, D.; Rastogi, A.; Sharma, R.; Tripathi, S. EzPC: Programmable and Efficient Secure Two-Party Computation for Machine Learning. In Proceedings of the 2019 IEEE European Symposium on Security and Privacy (EuroS&P), Stockholm, Sweden, 17–19 June 2019. [Google Scholar]
  8. Demmler, D.; Schneider, T.; Zohner, M. ABY - A Framework for Efficient Mixed-Protocol Secure Two-Party Computation. In Proceedings of the 2015 Network and Distributed System Security Symposium, San Diego, CA, USA, 8–11 February 2015. [Google Scholar]
  9. Patra, A.; Schneider, T.; Suresh, A.; Yalame, H. ABY2.0: Improved Mixed-Protocol Secure Two-Party Computation. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Vancouver, BC, Canada, 11–13 August 2021; pp. 2165–2182. [Google Scholar]
  10. Mishra, P.; Lehmkuhl, R.; Srinivasan, A.; Zheng, W.; Popa, R.A. Delphi: A cryptographic inference system for neural networks. In Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice, Virtual, 9 November 2020; pp. 27–30. [Google Scholar]
  11. Rathee, D.; Rathee, M.; Kumar, N.; Chandran, N.; Gupta, D.; Rastogi, A.; Sharma, R. CrypTFlow2: Practical 2-party secure inference. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 9–13 November 2020; pp. 325–342. [Google Scholar]
  12. Chen, V.; Pastro, V.; Raykova, M. Secure Computation for Machine Learning with SPDZ. arXiv 2019, arXiv:1901.00329. [Google Scholar]
  13. Rathee, D.; Rathee, M.; Goli, R.K.K.; Gupta, D.; Sharma, R.; Chandran, N.; Rastogi, A. SiRNN: A math library for secure RNN inference. In Proceedings of the 2021 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 24–27 May 2021; pp. 1003–1020. [Google Scholar]
  14. Zheng, Y.; Zhang, Q.; Chow, S.S.; Peng, Y.; Tan, S.; Li, L.; Yin, S. Secure softmax/sigmoid for machine-learning computation. In Proceedings of the 39th Annual Computer Security Applications Conference, Austin, TX, USA, 8–12 December 2023; pp. 463–476. [Google Scholar]
  15. Feng, Q.; Xia, Z.; Xu, Z.; Weng, J.; Weng, J. OPAF: Optimized Secure Two-Party Computation Protocols for Nonlinear Activation Functions in Recurrent Neural Network. arXiv 2024, arXiv:2403.00239. [Google Scholar]
  16. Hu, C.; Zhang, C.; Lei, D.; Wu, T.; Liu, X.; Zhu, L. Achieving Privacy-Preserving and Verifiable Support Vector Machine Training in the Cloud. IEEE Trans. Inf. Forensics Secur. 2023, 18, 3476–3491. [Google Scholar] [CrossRef]
  17. Zhang, C.; Hu, C.; Wu, T.; Zhu, L.; Liu, X. Achieving Efficient and Privacy-Preserving Neural Network Training and Prediction in Cloud Environments. IEEE Trans. Dependable Secur. Comput. 2023, 20, 4245–4257. [Google Scholar] [CrossRef]
  18. Zhang, C.; Luo, X.; Liang, J.; Liu, X.; Zhu, L.; Guo, S. POTA: Privacy-Preserving Online Multi-Task Assignment with Path Planning. IEEE Trans. Mob. Comput. 2024, 23, 5999–6011. [Google Scholar] [CrossRef]
  19. Zhang, C.; Zhao, M.; Liang, J.; Fan, Q.; Zhu, L.; Guo, S. NANO: Cryptographic Enforcement of Readability and Editability Governance in Blockchain Database. IEEE Trans. Dependable Secur. Comput. 2024, 21, 3439–3452. [Google Scholar] [CrossRef]
  20. Brüggemann, A.; Hundt, R.; Schneider, T.; Suresh, A.; Yalame, H. FLUTE: Fast and secure lookup table evaluations. In Proceedings of the 2023 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 22–24 May 2023; pp. 515–533. [Google Scholar]
  21. Riazi, M.S.; Weinert, C.; Tkachenko, O.; Songhori, E.M.; Schneider, T.; Koushanfar, F. Chameleon: A hybrid secure computation framework for machine learning applications. In Proceedings of the 2018 on Asia Conference on Computer and Communications Security, Incheon, Republic of Korea, 4–8 June 2018; pp. 707–721. [Google Scholar]
  22. Knott, B.; Venkataraman, S.; Hannun, A.; Sengupta, S.; Ibrahim, M.; van der Maaten, L. CrypTen: Secure multi-party computation meets machine learning. Adv. Neural Inf. Process. Syst. 2021, 34, 4961–4973. [Google Scholar]
  23. Kumar, N.; Rathee, M.; Chandran, N.; Gupta, D.; Rastogi, A.; Sharma, R. CrypTFlow: Secure TensorFlow Inference. In Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 18–21 May 2020. [Google Scholar]
  24. Mohassel, P.; Rindal, P. ABY3: A mixed protocol framework for machine learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 35–52. [Google Scholar]
  25. Wagh, S.; Gupta, D.; Chandran, N. SecureNN: 3-party secure computation for neural network training. Proc. Priv. Enhancing Technol. 2019, 3, 26–49. [Google Scholar] [CrossRef]
  26. Dong, Y.; Chen, X.; Jing, W.; Li, K.; Wang, W. Meteor: Improved secure 3-party neural network inference with reducing online communication costs. In Proceedings of the ACM Web Conference 2023, Austin, TX, USA, 30 April–4 May 2023; pp. 2087–2098. [Google Scholar]
  27. Patra, A.; Suresh, A. BLAZE: Blazing Fast Privacy-Preserving Machine Learning. In Proceedings of the 2020 Network and Distributed System Security Symposium, San Diego, CA, USA, 23–26 February 2020. [Google Scholar]
  28. Dalskov, A.; Escudero, D.; Keller, M. Fantastic Four: Honest-Majority Four-Party Secure Computation with Malicious Security. IACR Cryptology ePrint Archive, 2020. Available online: https://eprint.iacr.org/2020/1330 (accessed on 24 February 2025).
  29. Byali, M.; Chaudhari, H.; Patra, A.; Suresh, A. FLASH: Fast and Robust Framework for Privacy-Preserving Machine Learning. IACR Cryptology ePrint Archive, 2019. Available online: https://eprint.iacr.org/2019/1365 (accessed on 24 February 2025).
  30. Koti, N.; Patra, A.; Rachuri, R.; Suresh, A. Tetrad: Actively Secure 4PC for Secure Training and Inference. In Proceedings of the 2022 Network and Distributed System Security Symposium, San Diego, CA, USA, 24–28 April 2022. [Google Scholar]
  31. Wagh, S.; Tople, S.; Benhamouda, F.; Kushilevitz, E.; Mittal, P.; Rabin, T. Falcon: Honest-Majority Maliciously Secure Framework for Private Deep Learning. Proc. Priv. Enhancing Technol. 2021, 2021, 188–208. [Google Scholar] [CrossRef]
  32. Dessouky, G.; Koushanfar, F.; Sadeghi, A.R.; Schneider, T.; Zeitouni, S.; Zohner, M. Pushing the Communication Barrier in Secure Computation using Lookup Tables. In Proceedings of the 2017 Network and Distributed System Security Symposium, San Diego, CA, USA, 26 February–1 March 2017. [Google Scholar]
  33. Ishai, Y.; Kushilevitz, E.; Meldgaard, S.; Orlandi, C.; Paskin-Cherniavsky, A. On the power of correlated randomness in secure computation. In Theory of Cryptography, Proceedings of the 10th Theory of Cryptography Conference, TCC 2013, Tokyo, Japan, 3–6 March 2013; Proceedings; Springer: Berlin/Heidelberg, Germany, 2013; pp. 600–620. [Google Scholar]
  34. Goldberg, D. What every computer scientist should know about floating-point arithmetic. ACM Comput. Surv. (CSUR) 1991, 23, 5–48. [Google Scholar] [CrossRef]
  35. Kusupati, A.; Singh, M.; Bhatia, K.; Kumar, A.; Jain, P.; Varma, M. FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network. arXiv 2019, arXiv:1901.02358. [Google Scholar] [CrossRef]
  36. Warden, P. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition. arXiv 2018, arXiv:1804.03209. [Google Scholar] [CrossRef]
Figure 1. Example of a function with δ = 3 inputs and σ = 1 output, represented as a Boolean circuit and as a lookup table.
Figure 2. System model of secure 3-party RNN inference.
Table 1. Notation table.
Notation | Description
⟨·⟩ | 3-out-of-3 sharing
[·] | 2-out-of-3 replicated secret sharing
[[·]] | 3PC secret sharing in this work
δ-to-σ LUT | LUT T with δ inputs and σ outputs
2^I | Power set of the set I
{E_u}_{u∈[δ]} | Input encoding of LUT T with bit size 2^δ
{ξ_w}_{w∈[σ]} | Output encoding of LUT T with bit size 2^σ
v̄ | Complement of bit v ∈ {0, 1}, v̄ = 1 − v
Table 2. RNN model and corresponding activation function.
Model | Activation Function
vanilla RNN | Sigmoid, Softmax
LSTM | Sigmoid, Tanh
GRU | Sigmoid, Tanh
FastGRNN | Sigmoid, Tanh
Table 3. Online communication rounds.
Operator | SIRNN | ABY3 | SecureNN | Ours
2-Mult | 4 | 1 | 2 | 1
DigDec | l + 1 | - | - | l + log l + 1
B2A | - | 1 + log l | - | 1
LUT | d + 1 | - | - | 1
Note: In the table, l is the bit width. Assuming a δ-to-σ lookup table is divided into several sub-blocks according to the input length δ, the length of each sub-block is d. "-" means that the baseline does not implement this function.
Table 4. Online communication complexity.
Operator | SIRNN | ABY3 | SecureNN | Ours
2-Mult | λ(l + 3) + 1.5l² + 2.5l + 2 | 11l | 5l | l
DigDec | l + log l + 1 | - | - | 2l
B2A | - | l + l·log l | - | l/2
LUT | 2λ + Nl | - | - | σ
Note: In the table, λ is the computational security parameter and l is the bit width. Assuming a δ-to-σ lookup table is divided into several sub-blocks according to the input length δ, the length of each sub-block is d. "-" means that the baseline does not implement this function.
Table 5. Online communication of RNN operators.
Building Block | Size | SIRNN Comm (MB) | Ours Comm (MB)
MatMul | (128, 500, 100) | 0.156 | **0.109**
sigmoid | (128, 128) | 0.933 | **0.183**
tanh | (128, 128) | 0.947 | **0.190**
Note: The data with better performance in the table is displayed in bold.
Table 6. Online runtime of RNN operators under LAN and WAN.
Building Block | SIRNN LAN (ms) | Ours LAN (ms) | SIRNN WAN (s) | Ours WAN (s)
sigmoid | 302 | 296 | 6.15 | 5.65
tanh | 322 | 297 | 6.03 | 5.81
Table 7. Online cost of end-to-end RNN inference with FastGRNN on speech command dataset.
Work | Comm (MB) | LAN (s) | WAN (s)
SIRNN | 102.80 | 10.29 | 522.07
Ours | 62.25 | 10.31 | 598.10