1. Introduction
In recent years, many leading technology companies have introduced “Machine Learning as a Service” (MLaaS) platforms, such as Amazon SageMaker, Microsoft Azure Machine Learning, Google Cloud AI Platform, Alibaba Cloud Platform of Artificial Intelligence, and Tencent Cloud TI Platform. These platforms provide users with easy access to pre-trained neural network models through simple API calls, without requiring a deep understanding of the underlying technical details. As a result, users can quickly perform inference tasks using their own data. This approach has significantly lowered the barrier to using neural network technologies, making it possible for more businesses and individuals to benefit from the advantages of neural network models.
However, behind the convenience of these services lie significant privacy risks and challenges that cannot be ignored. On one hand, when users’ data are uploaded to cloud servers for processing, there is a risk of these private data being leaked or misused. For example, in 2023, the short-video platform TikTok faced widespread privacy concerns globally due to its data-collection and -processing practices; governments and regulatory authorities in multiple countries investigated its data-handling methods, fearing that private user data might be employed for improper purposes. On the other hand, when users employ machine learning models provided by service providers for inference tasks, there is a risk of leakage of model parameters and training data: malicious users can potentially reverse-engineer model parameters and even deduce parts of the training dataset through repeated access to the model. This poses a significant threat to enterprises that rely on highly confidential data to train their models. Therefore, ensuring the privacy of both users’ input data and neural network models during the model inference phase has become one of the current research hotspots.
Secure Multi-Party Computation (MPC) allows multiple data owners to perform arbitrary computational tasks without revealing their private data, providing an effective method for addressing these challenges. Current research on privacy-preserving deep neural network inference using MPC primarily focuses on convolutional neural networks (CNNs) that handle spatially distributed data [1,2,3,4,5,6,7,8,9,10,11,12]; these efforts have advanced secure inference for image and grid-structured tasks. With the rapid development of large language models in recent years, models that can handle sequence problems have become a focus of research. Recurrent Neural Networks (RNNs) are an architecture specifically designed for time series such as speech signals, physiological time series, and financial transaction flows, yet there are relatively few works on RNNs [13,14,15]. RNNs are structurally more complex than CNNs, particularly due to the inclusion of complex nonlinear activation functions (e.g., the sigmoid/Tanh functions). On one hand, existing work for RNNs only supports secure inference between two parties, without considering the high communication overhead associated with involving multiple parties. On the other hand, existing methods often use piecewise linear functions for approximate computations, which can result in lower accuracy. To address these issues, we propose a communication-efficient secure three-party RNN inference method. In summary, we make the following contributions:
- (1)
We construct novel three-party secret-sharing-based digit decomposition and B2A conversion protocols. These protocols complement the existing three-party secret sharing scheme, effectively reducing the communication overhead of the online phase.
- (2)
We propose a lookup table-based secure three-party protocol that exploits the inherent mathematical properties of binary lookup tables; the communication complexity of the LUT protocol is related only to the output bit width.
- (3)
We propose lookup table-based secure three-party protocols for RNN inference, covering key functions in RNNs such as matrix multiplication and the sigmoid and Tanh functions, and achieving lower online communication overhead compared to the current state-of-the-art SIRNN.
The rest of this paper is organized as follows: Section 2 describes related work. Section 3 introduces the preliminaries of the secret sharing scheme and the lookup table. Section 4 provides the protocol constructions, including the system model, conversions of secret sharing, and the lookup table-based 3PC protocol. Section 5 gives the basic building blocks of RNNs based on the secure protocols of Section 4. Section 6 provides the security analysis. Section 7 provides performance analysis from theoretical and experimental perspectives, and Section 8 summarizes the conclusions.
2. Related Work
In recent years, privacy preservation has become a crucial concern across various domains, reflecting the growing need to protect sensitive data in increasingly interconnected environments. In cloud computing, significant advancements have been made in enabling secure and verifiable machine learning model training [16], as well as in developing efficient frameworks for privacy-preserving neural network training and inference [17]. In the realm of path planning, researchers have introduced techniques to manage complex multi-task assignments while maintaining data privacy [18]. Additionally, the blockchain sector has progressed in privacy governance by ensuring data integrity and controlling access within decentralized databases [19]. Despite these diverse applications, there remains a critical need to specifically address the security of sensitive data and private models in neural network computation.
There are already many works aimed at CNNs in the privacy-preserving machine learning field, designing secure computing protocols for specific structures in CNNs. Among works focusing on two-party secure neural network computation, many use different techniques to evaluate secure inference [1,2,3,4,5,6,7,8,9]. Some works ignore nonlinear mathematical functions [1,6,10,11], since whether a monotonic mathematical function is applied or not does not affect the inference result in some cases. In 2PC settings, some works use higher-degree polynomials to approximate nonlinear mathematical functions [12] to ensure accuracy. Some previous works use ad hoc approximations for nonlinear functions, which may cause a loss in model accuracy while achieving higher computational efficiency [3,9]. Rathee et al. [13] utilized lookup tables for some complex nonlinear math functions, and the model accuracy was improved to a certain extent. FLUTE [20] also utilized lookup tables for complex nonlinear math functions, improving the accuracy of nonlinear math functions under the 2PC setting.
In order to improve the performance of two-party secure neural network computation, many works introduce a third party to provide randomness during the offline phase to assist the computation or to participate in it. Similar to previous work under 2PC settings, 3PC works still use polynomial approximation to compute nonlinear math functions [21,22,23,24]. Chameleon [21] utilized a semi-honest third-party server to assist in generating correlated random numbers during the offline phase and Yao’s garbled circuits for nonlinear functions. To reduce the expensive computing operations caused by garbled circuits, some works (e.g., [25]) used ad hoc approximations instead of GC to calculate nonlinear functions such as maxpool and ReLU. Moreover, ref. [26] proposed an improved replicated secret sharing scheme that increases online communication efficiency. In addition, some multi-party works focus on application scenarios where malicious adversaries exist, thus requiring more communication and computation overhead [27,28,29,30].
However, in many practical scenarios, such as those involving sequential data, Recurrent Neural Networks (RNNs) are essential for effective computation. Early efforts in the secure inference of RNNs include SIRNN [13], which used lookup tables to calculate commonly used nonlinear activation functions such as sigmoid and tanh in RNNs. This approach enabled high accuracy with a two-server setup, making it a significant step forward in secure RNN inference, particularly for speech and time-series sensor data. Building on this, Zheng et al. [14] introduced a three-party secure computation protocol to handle nonlinear activation functions and their derivatives in RNNs, leveraging Fourier series approximations to balance precision and computational efficiency.
RNNs still face significant challenges in secure computation, particularly in real-world applications. One major issue is the high communication overhead associated with secure RNN inference, which is caused by the complexity of nonlinear activation functions and the recurrent structure of RNNs. Our work focuses on designing secure computation protocols for complex nonlinear functions in RNNs to reduce online communication.
3. Preliminaries
This section provides the fundamental theory required by our method. We introduce the related 3PC secret sharing scheme underlying the whole framework of our work, and the lookup table technology for nonlinear layers. The notations used in this work are listed in Table 1.
3.1. Secret Sharing
In this paper, we utilize the optimized 3PC secret sharing scheme provided in Meteor [26] to calculate the linear layers of the RNN, which accelerates fixed-point multiplication of two inputs and integer multiplication of $N$ inputs, and this scheme can be easily extended to settings with more parties. The scheme is executed by three parties $P = \{P_0, P_1, P_2\}$ over a Boolean ring $\mathbb{Z}_2$ (or an arithmetic ring $\mathbb{Z}_{2^l}$, where $l$ is the bit width). Given two private bits $x, y \in \mathbb{Z}_2$, Boolean sharing refers to the secret sharing scheme over $\mathbb{Z}_2$ with logical AND, XOR, and NOT operations, denoted $\wedge$, $\oplus$, and $\neg$, respectively.
3.1.1. $[\cdot]$-Sharing
For $[\cdot]$-sharing over the Boolean ring $\mathbb{Z}_2$, any Boolean value $v$ is secret shared by three random values $v_0$, $v_1$, $v_2$, and $P_i$ holds its share $[v]_i = v_i$, such that $v = v_0 \oplus v_1 \oplus v_2$.
For $[\cdot]$-sharing over the arithmetic ring $\mathbb{Z}_{2^l}$, any arithmetic value $v$ is secret shared by three random values $v_0$, $v_1$, $v_2$, and $P_i$ holds its share $[v]_i = v_i$, such that $v = v_0 + v_1 + v_2 \bmod 2^l$.
3.1.2. $\langle\cdot\rangle$-Sharing
$\langle\cdot\rangle$-sharing is similar to $[\cdot]$-sharing, but it is the three-party 2-out-of-3 replicated secret sharing scheme.
For $\langle\cdot\rangle$-sharing over the Boolean ring $\mathbb{Z}_2$, any Boolean value $v$ is secret shared by three random values $v_0$, $v_1$, $v_2$, where $v = v_0 \oplus v_1 \oplus v_2$. $P_i$ holds $\langle v \rangle_i = (v_i, v_{i+1})$, such that any two parties can reconstruct the secret value $v$.
For $\langle\cdot\rangle$-sharing over the arithmetic ring $\mathbb{Z}_{2^l}$, any arithmetic value $v$ is secret shared by three random values $v_0$, $v_1$, $v_2$, where $v = v_0 + v_1 + v_2 \bmod 2^l$. Also, $P_i$ holds $\langle v \rangle_i = (v_i, v_{i+1})$, such that any two parties can reconstruct the secret value $v$.
Linear Operation: For secret shared Boolean values $\langle x \rangle$ and $\langle y \rangle$, the three parties can compute the value $z = c_1 x \oplus c_2 y \oplus c_3$, where $c_1, c_2, c_3$ are public constant bits. In the Boolean ring $\mathbb{Z}_2$, each party can locally compute its secret shares $\langle z \rangle_i$, such that any two parties of $P$ can recover the secret value $z$. It is similar for the arithmetic ring $\mathbb{Z}_{2^l}$: replace the XOR operation $\oplus$ with the ADD operation $+$ and take the result modulo $2^l$.
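To make the sharing semantics concrete, the following minimal single-process Python sketch simulates 2-out-of-3 replicated sharing, reconstruction from any two parties, and a local linear combination; the function names are illustrative rather than taken from the paper's implementation.

```python
# Toy simulation of 2-out-of-3 replicated secret sharing over Z_{2^l}.
import random

L = 64
RING = 1 << L  # arithmetic ring Z_{2^l}

def share(v):
    """Split v into (v0, v1, v2) with v = v0 + v1 + v2 mod 2^l;
    party P_i holds the pair (v_i, v_{i+1})."""
    v0, v1 = random.randrange(RING), random.randrange(RING)
    v2 = (v - v0 - v1) % RING
    s = [v0, v1, v2]
    return [(s[i], s[(i + 1) % 3]) for i in range(3)]

def reconstruct(pair_i, pair_j, i, j):
    """Any two parties together hold all three additive shares."""
    held = {}
    held[i], held[(i + 1) % 3] = pair_i
    held[j], held[(j + 1) % 3] = pair_j
    return (held[0] + held[1] + held[2]) % RING

def local_linear(pair_x, pair_y, c1, c2):
    """z = c1*x + c2*y, computed share-wise with no communication."""
    return tuple((c1 * a + c2 * b) % RING for a, b in zip(pair_x, pair_y))

x_sh, y_sh = share(7), share(5)
z_sh = [local_linear(x_sh[i], y_sh[i], 3, 2) for i in range(3)]
assert reconstruct(z_sh[0], z_sh[1], 0, 1) == (3 * 7 + 2 * 5) % RING
```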
AND Operation: For secret shared Boolean values $\langle x \rangle$ and $\langle y \rangle$, the three parties $P$ can compute the secret shared value $\langle z \rangle$ with $z = x \wedge y$ as follows: (a) each party $P_i$ locally computes $z_i = x_i y_i \oplus x_i y_{i+1} \oplus x_{i+1} y_i$; (b) each party $P_i$ locally computes $z_i' = z_i \oplus \alpha_i$, where $\alpha_0 \oplus \alpha_1 \oplus \alpha_2 = 0$ (for random number generation, please refer to Appendix A); (c) all three parties together perform re-sharing to obtain the sharing of $z$ by sending $z_i'$ to $P_{i-1}$ so that each $P_i$ holds $\langle z \rangle_i = (z_i', z_{i+1}')$.
Multiplication: For secret shared arithmetic values $\langle x \rangle$ and $\langle y \rangle$, the three parties $P$ can compute the secret shared value $\langle z \rangle$ with $z = x \cdot y$ as follows: (a) each party $P_i$ locally computes $z_i = x_i y_i + x_i y_{i+1} + x_{i+1} y_i$; (b) each party $P_i$ computes $z_i' = z_i + \alpha_i$, where $\alpha_0 + \alpha_1 + \alpha_2 = 0 \bmod 2^l$ (for the generation of $\alpha_i$, please refer to Appendix A); (c) all three parties together perform re-sharing so that each party $P_i$ holds its own secret share $\langle z \rangle_i = (z_i', z_{i+1}')$.
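A hedged single-process sketch of this multiplication, in which the zero-sharing $\alpha$ that Appendix A's randomness generation would supply is sampled directly; names are illustrative.

```python
# Toy simulation of the 3PC replicated-sharing multiplication above.
import random

L = 64
RING = 1 << L

def share(v):
    s = [random.randrange(RING) for _ in range(2)]
    s.append((v - sum(s)) % RING)
    return [(s[i], s[(i + 1) % 3]) for i in range(3)]  # P_i holds (v_i, v_{i+1})

def mult(x_pairs, y_pairs):
    a0, a1 = random.randrange(RING), random.randrange(RING)
    alpha = [a0, a1, (-a0 - a1) % RING]       # alpha_0 + alpha_1 + alpha_2 = 0
    # Steps (a)+(b): z_i = x_i*y_i + x_i*y_{i+1} + x_{i+1}*y_i + alpha_i, locally.
    z = [(x_pairs[i][0] * y_pairs[i][0]
          + x_pairs[i][0] * y_pairs[i][1]
          + x_pairs[i][1] * y_pairs[i][0]
          + alpha[i]) % RING for i in range(3)]
    # Step (c): re-sharing, after which P_i holds the pair (z_i, z_{i+1}).
    return [(z[i], z[(i + 1) % 3]) for i in range(3)]

z_pairs = mult(share(6), share(7))
assert sum(p[0] for p in z_pairs) % RING == 42   # z_0 + z_1 + z_2 = x * y
```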
3.2. Secret Sharing Semantics of This Work
In order to improve the computational efficiency of $[\cdot]$-sharing and $\langle\cdot\rangle$-sharing, we provide a novel and efficient three-party secret sharing scheme, denoted as $\llbracket\cdot\rrbracket$-sharing, inspired by the schemes in [9,26]. With the aid of $\llbracket\cdot\rrbracket$-sharing, we further divide the computation into online and offline phases.
For $\llbracket\cdot\rrbracket$-sharing over the Boolean ring $\mathbb{Z}_2$, any Boolean value $v$ is secret shared with a mask value $\Delta_v$ and the $\langle\cdot\rangle$-sharing of a random value $\delta_v$; $P_i$ holds $\llbracket v \rrbracket_i = (\Delta_v, \langle \delta_v \rangle_i)$, where $\Delta_v$ is known to all three parties and $\Delta_v = v \oplus \delta_v$.
For $\llbracket\cdot\rrbracket$-sharing over the arithmetic ring $\mathbb{Z}_{2^l}$, any arithmetic value $v$ is secret shared with a mask value $\Delta_v$ and the $\langle\cdot\rangle$-sharing of a random value $\delta_v$; $P_i$ holds $\llbracket v \rrbracket_i = (\Delta_v, \langle \delta_v \rangle_i)$, where $\Delta_v$ is known to all three parties and $\Delta_v = v + \delta_v \bmod 2^l$. In addition, for Boolean sharing, the complement of $v$ can be computed locally as $\llbracket \neg v \rrbracket$ by setting $\Delta_{\neg v} = \neg \Delta_v$ while keeping $\langle \delta_v \rangle$ unchanged.
Linear operation: This pattern is linear for both Boolean and arithmetic rings. For example, for Boolean sharing, assume $c_1, c_2, c_3$ are public constant bits, $\llbracket x \rrbracket, \llbracket y \rrbracket$ are two secret-shared values, and $z = c_1 x \oplus c_2 y \oplus c_3$; then each party can compute its share locally by setting $\Delta_z = c_1 \Delta_x \oplus c_2 \Delta_y \oplus c_3$ and $\langle \delta_z \rangle_i = c_1 \langle \delta_x \rangle_i \oplus c_2 \langle \delta_y \rangle_i$, for $i \in \{0, 1, 2\}$.
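The single-process Python sketch below illustrates the masked sharing pattern in the arithmetic ring, with $\Delta$ public and a plain value standing in for the $\langle\cdot\rangle$-shared mask; `mask_share` and `linear` are illustrative names, not the paper's API.

```python
# Minimal sketch of the assumed masked-sharing semantics: Delta_v is public,
# delta_v stands in for the <.>-shared mask, and Delta_v = v + delta_v mod 2^l.
# Linear operations are purely local.
import random

L = 64
RING = 1 << L

def mask_share(v):
    delta = random.randrange(RING)      # sampled offline via the randomness generator
    Delta = (v + delta) % RING          # revealed online, known to all parties
    return Delta, delta

def linear(xs, ys, c1, c2, c3):
    # z = c1*x + c2*y + c3: combine the Deltas (with the constant) and the deltas.
    (Dx, dx), (Dy, dy) = xs, ys
    return (c1 * Dx + c2 * Dy + c3) % RING, (c1 * dx + c2 * dy) % RING

Dz, dz = linear(mask_share(10), mask_share(4), 2, 3, 1)
assert (Dz - dz) % RING == 2 * 10 + 3 * 4 + 1   # v = Delta_v - delta_v
```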
Secret share operation: The secret share operation enables a private data owner (assume $P_i$ is the data owner) to generate the $\llbracket\cdot\rrbracket$-sharing of its private input $x$. In the offline phase, all parties jointly invoke $\mathcal{F}_{\mathrm{rand}}$ in Appendix A to sample the random value $\langle \delta_x \rangle$, where the data owner $P_i$ knows $\delta_x$ in the clear. In the online phase, $P_i$ computes and reveals $\Delta_x = x \oplus \delta_x$ in a Boolean ring and $\Delta_x = x + \delta_x$ in an arithmetic ring. The above process is carried out by the $\Pi_{\mathrm{Share}}$ function.
Reconstruct operation: $P_i$ can reconstruct $x$ by first invoking the reconstruction of $\langle \delta_x \rangle$ to obtain $\delta_x$, after which the parties can locally compute $x = \Delta_x \oplus \delta_x$ in a Boolean ring and $x = \Delta_x - \delta_x$ in an arithmetic ring.
AND operation: With the functionality $\Pi_{\mathrm{AND}}$, the parties AND two Boolean secret values $\llbracket x \rrbracket, \llbracket y \rrbracket$ and output $\llbracket z \rrbracket$, where $z = x \wedge y$; we have:

$$\Delta_z = z \oplus \delta_z = (\Delta_x \oplus \delta_x)(\Delta_y \oplus \delta_y) \oplus \delta_z = \Delta_x\Delta_y \oplus \Delta_x\delta_y \oplus \Delta_y\delta_x \oplus \delta_x\delta_y \oplus \delta_z. \tag{1}$$

As shown in Equation (1), all terms except for $\delta_x\delta_y$ can be computed locally, since $\Delta_x$ and $\Delta_y$ are known to all parties. Therefore, the main difficulty becomes calculating $\langle \delta_x\delta_y \rangle$ given $\langle \delta_x \rangle$ and $\langle \delta_y \rangle$. Since $\delta_x$ and $\delta_y$ are input-independent, $\langle \delta_x\delta_y \rangle$ can be computed in the offline phase using the AND operation of Section 3.1.2. In the offline phase, the parties interactively generate the randomness $\langle \delta_x \rangle$, $\langle \delta_y \rangle$, $\langle \delta_z \rangle$, and $\langle \delta_x\delta_y \rangle$ using the method in Appendix A, and in the online phase, each party $P_i$ computes its share of $\Delta_z$ and the parties reconstruct $\Delta_z$. The protocol of the function $\Pi_{\mathrm{AND}}$ is shown in Algorithm 1.
Algorithm 1 $\Pi_{\mathrm{AND}}$: two-input AND in the Boolean ring.
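The following toy Python check verifies the masking identity behind Equation (1) in cleartext; in the real protocol the $\delta$ terms are $\langle\cdot\rangle$-shared and only $\Delta_z$ is exchanged, whereas here everything runs in one process for illustration.

```python
# Cleartext check of Equation (1): all values are single bits. In the protocol,
# delta_x, delta_y, delta_x*delta_y and delta_z would be <.>-shared and only
# Delta_z reconstructed online. Variable names are illustrative.
import random

def and_gate(x, y):
    delta_x, delta_y, delta_z = (random.getrandbits(1) for _ in range(3))
    dxdy = delta_x & delta_y                 # produced offline via Pi_AND
    Delta_x, Delta_y = x ^ delta_x, y ^ delta_y
    # Online: one XOR-combination of public Deltas and (shares of) mask terms.
    Delta_z = (Delta_x & Delta_y) ^ (Delta_x & delta_y) ^ (Delta_y & delta_x) \
              ^ dxdy ^ delta_z
    return Delta_z ^ delta_z                 # reconstruction: z = Delta_z XOR delta_z

assert all(and_gate(a, b) == (a & b) for a in (0, 1) for b in (0, 1))
```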
Multi-input AND operation: For multi-input AND gates, $\Pi_{\mathrm{MAND}}$ takes the $N$ Boolean values $\llbracket x^1 \rrbracket, \ldots, \llbracket x^N \rrbracket$ as input and outputs $\llbracket z \rrbracket$ with $z = \bigwedge_{n=1}^{N} x^n$; then we have:

$$\Delta_z = z \oplus \delta_z = \bigwedge_{n=1}^{N}\left(\Delta_{x^n} \oplus \delta_{x^n}\right) \oplus \delta_z. \tag{2}$$

As in the procedure for the two-input AND gate, the parties compute the input-independent $\langle\cdot\rangle$-sharings of the products of the mask values $\delta_{x^n}$ by invoking the AND operation in a tree-like combinatorial manner in the offline phase.
Two-input multiplication: The multiplication of two numbers in the arithmetic ring $\mathbb{Z}_{2^l}$ is similar to that in Boolean sharing, simply replacing the XOR operation in Boolean sharing with addition and the AND operation with multiplication. The multiplication of two numbers under arithmetic sharing is depicted in Equation (3):

$$\Delta_z = z + \delta_z = xy + \delta_z = \Delta_x\Delta_y - \Delta_x\delta_y - \Delta_y\delta_x + \delta_x\delta_y + \delta_z. \tag{3}$$

Similarly, the shares of $\langle \delta_x\delta_y \rangle$ can be calculated using the multiplication operation in Section 3.1.2. As introduced in Section 3.3, the calculation of secure inference uses fixed-point representation in $\mathbb{Z}_{2^l}$; after precise multiplication, the decimal part of the result doubles its original size, i.e., from $d$ to $2d$ bits. Therefore, we should truncate the last $d$ bits of the product to obtain an approximate result. In a secret sharing scheme, truncation must take into account two types of probabilistic error: a small error caused by a carry-bit error and a large error caused by overflow during the multiplication calculation. Truncation schemes whose output shares satisfy $z^d \approx z/2^d$ with probability 1 are called faithful truncation. We use the faithful truncation methods from [9,26]: in the offline phase, the parties mutually generate the $\langle\cdot\rangle$-sharing of a random value $r$ and its truncation $r^d = r/2^d$. During the online phase, all parties $P_i$ locally compute their shares of $z - r$ and reveal $z - r$, and then the parties set $z^d = (z - r)/2^d + r^d$. The protocol $\Pi_{\mathrm{MultTrunc}}$ is shown in Algorithm 2.
Algorithm 2 $\Pi_{\mathrm{MultTrunc}}$: two-input multiplication in the arithmetic ring.
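Below is a hedged cleartext sketch of the multiply-then-truncate pattern under the assumed semantics $\Delta = v + \delta$; the random value $r$ is deliberately kept small so that the toy avoids the wrap cases that the faithful truncation of [9,26] is designed to handle.

```python
# Cleartext sketch of masked multiplication followed by d-bit truncation.
import random

L, D = 64, 12
RING = 1 << L

def mult_trunc(x, y):
    # Equation (3): Delta_z = Delta_x*Delta_y - Delta_x*delta_y - Delta_y*delta_x
    #                          + delta_x*delta_y + delta_z  (all mod 2^l)
    delta_x, delta_y, delta_z = (random.randrange(RING) for _ in range(3))
    Delta_x, Delta_y = (x + delta_x) % RING, (y + delta_y) % RING
    Delta_z = (Delta_x * Delta_y - Delta_x * delta_y - Delta_y * delta_x
               + delta_x * delta_y + delta_z) % RING
    z = (Delta_z - delta_z) % RING          # z = x*y, now with 2d fractional bits
    # Truncation with a correlated pair (r, r_d = r >> d) from the offline phase.
    r = random.randrange(1 << (2 * D))      # toy choice: r <= z, so no wrap occurs;
    r_d = r >> D                            # the faithful truncation handles wraps
    masked = (z - r) % RING                 # revealed online
    return (masked >> D) + r_d              # approximately z / 2^d

xf, yf = 3 << D, 5 << D                     # fixed-point encodings of 3 and 5
assert abs(mult_trunc(xf, yf) - (15 << D)) <= 1
```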
Multi-input multiplication: Similar to the multi-input AND operation under Boolean sharing, the multiplication of multiple numbers under arithmetic sharing is shown in Equation (4):

$$\Delta_z = z + \delta_z = \prod_{n=1}^{N}\left(\Delta_{x^n} - \delta_{x^n}\right) + \delta_z. \tag{4}$$
3.3. Fixed-Point Representation
Practical applications such as machine learning and mathematical statistics usually require floating-point numbers for calculations, while MPC generally computes over finite rings or fields, so it is necessary to encode floating-point numbers as fixed-point numbers [3,11,21,24,31]. The conversion relationship between floating-point and fixed-point numbers is as follows: given a floating-point number $\tilde{x}$, its corresponding fixed-point number is $x = \lfloor \tilde{x} \cdot 2^d \rceil \bmod 2^l$, where $l$ is the bit width and $d$ is the precision. We use the ranges $[0, 2^{l-1})$ and $[2^{l-1}, 2^l)$ to represent positive and negative numbers, respectively.
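A minimal sketch of this encoding, assuming $l = 64$ and $d = 12$ for illustration:

```python
# Fixed-point encoding: x_fix = round(x_float * 2^d) mod 2^l, with the upper
# half of the ring representing negatives (two's-complement convention).
L, D = 64, 12
RING = 1 << L

def encode(x: float) -> int:
    return round(x * (1 << D)) % RING

def decode(v: int) -> float:
    if v >= RING // 2:          # values in [2^{l-1}, 2^l) are negative
        v -= RING
    return v / (1 << D)

assert abs(decode(encode(-3.25)) - (-3.25)) < 2 ** -D
```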
3.4. Lookup Table
The lookup table (LUT) structure is used to pre-compute and store the inputs and corresponding results of the function. The corresponding results can be found by looking up the input in the table, without the need to calculate again.
The lookup table of a function in this paper refers to the set of all inputs and their corresponding outputs within a certain range, i.e., $T = \{(x, f(x)) : x \in \{0,1\}^{\eta},\ f(x) \in \{0,1\}^{\sigma}\}$ [32], where $\eta$ and $\sigma$ are the bit widths of the input and output, respectively. Using this representation, any function can be given a corresponding lookup table within a certain range, and the computational complexity of the lookup table is related to its size. Therefore, functions with complex logic are, in theory, particularly well suited to evaluation via lookup tables.
An instance of an LUT is demonstrated in Figure 1. In this work, the input column and the output columns of the table each have length $2^{\eta}$; for example, the two output columns in Figure 1 are encoded as the bit strings 00110011 and 00101101. In practical scenarios, the bit-width design of lookup tables (LUTs) requires a balance between computational efficiency and numerical accuracy requirements. Our lookup table selects 12-bit input/12-bit output precision, achieving a balance between the practical deployability of RNNs and model accuracy. In latency-critical scenarios like real-time ECG anomaly detection (sampling rates ≤ 250 Hz), a 12-bit input limits the LUT size to 4 KB. For applications such as speech recognition that are not sensitive to real-time efficiency, 12 bits can also ensure good accuracy. However, if users want to use a lookup table for models with larger bit widths or higher accuracy requirements, they need to optimize the lookup table structure by using an automated toolchain to reduce the depth of lookup-table circuits and achieve higher performance.
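As an illustration of how such a table could be tabulated offline, the sketch below builds a 12-bit-in/12-bit-out sigmoid table; the input grid, signed range, and fractional scaling (8 bits) are our own illustrative assumptions, not parameters fixed by the paper.

```python
# Tabulating a 12-bit-in / 12-bit-out LUT for a nonlinear function (offline).
import math

ETA, SIGMA, D = 12, 12, 8   # input bits, output bits, fractional bits (assumed)

def build_lut(f):
    table = []
    for u in range(1 << ETA):
        x = (u - (1 << (ETA - 1))) / (1 << D)      # signed fixed-point input
        y = round(f(x) * (1 << D)) % (1 << SIGMA)  # fixed-point output word
        table.append(y)
    return table  # 2^12 entries, one sigma-bit word each

sigmoid_lut = build_lut(lambda x: 1.0 / (1.0 + math.exp(-x)))
```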
4. Protocol Constructions
In this section, to provide a clearer exposition of the protocols we have constructed, we first present the system model of this paper. Then, we provide the digit decomposition, B2A, and Bit2A conversion protocols based on $\llbracket\cdot\rrbracket$-sharing. Finally, we propose a lookup table-based secure three-party protocol.
4.1. System Model
Consider a computing scenario with three mutually independent servers. We execute a secure computing protocol under a semi-honest model. As shown in Figure 2, there are three roles in the system model: model owner, data owner, and computation participants.
- (1)
Model owner: This party owns the model’s architecture and parameters, typically holding a machine learning model that has been trained or needs to be trained. In the initial stage of computation, the model owner must send the model parameters to the computation participants in the form of secret shares, but it does not receive the computation results.
- (2)
Data owner: This party holds the real data and aims to conduct joint inference without exposing data privacy. The data owner sends the secret shares of its private data to the computation participants at the beginning of the calculation, and receives the result shares sent by the computation participants after the secure computation is completed.
- (3)
Computation participants: These act as the actual “computing executors” within the secure computation protocol, typically third-party platforms, service providers, or distributed nodes with secure multi-party computation capabilities. In the context of this paper, the computation participants are three servers that initially receive the model shares from the model owner and the data shares from the data owner. After executing the secure computation protocol, the three servers each send their result shares back to the data owner.
To demonstrate this system model’s practical applicability, we take the healthcare field as an example. Some medical institutions utilize speech-recognition technology to develop electronic medical record (EMR) management platforms, which enable physicians to complete clinical documentation through voice dictation, significantly reducing documentation burdens and improving workflow efficiency. Concurrently, the technology assists clinicians in rapidly retrieving patients’ historical medical records, thereby providing robust support for accurate diagnosis. However, as patient medical information involves sensitive personal privacy, the system model in this work can be applied. In this case, the model owner in Figure 2 is the medical institution: it can train the model locally or via other methods using relevant datasets, and the model parameters are then distributed to three servers through secret sharing. The data owner is a physician (or patient), who splits their private personal data into secret shares and transmits them to the same three servers. Following the secure computation protocol, the servers return the result shares to the physician. This process effectively prevents leakage of both the medical institution’s model parameters and the patient’s private information.
4.2. Conversions of Sharing
In the lookup table protocol, the inputs are the Boolean $\llbracket\cdot\rrbracket$-shares of every $d$-bit digit. Therefore, a secure digit decomposition function $\mathcal{F}_{\mathrm{DigDec}}$ under the 3PC $\llbracket\cdot\rrbracket$-sharing scheme is required before each invocation of the lookup table protocol. In this protocol, the wrap function and the 3PC private-compare function (the same as the protocol in [31]) under the $\llbracket\cdot\rrbracket$-sharing scheme are invoked. The following describes the protocol in detail.
4.2.1. Wrap Function
The wrap function calculates the carry produced when the secret shares held by the three parties are added together, and the output may be 0, 1, or 2; i.e., assuming the three secret shares are $a_0, a_1, a_2 \in \mathbb{Z}_{2^l}$, the wrap function is:

$$\mathrm{wrap}(a_0, a_1, a_2, 2^l) = \left\lfloor \frac{a_0 + a_1 + a_2}{2^l} \right\rfloor.$$

But since the operands in the lookup table protocol are all bits, the wrap function defined here takes the original result modulo 2, i.e., $\mathrm{wrap}^{\oplus} = \mathrm{wrap} \bmod 2$. The details of the secure wrap protocol can be found in [31].
4.2.2. Private Compare
In the digit decomposition protocol, the operation must obtain the bit shares of the comparison result between a secret value $x$ and a public number $r$. The 3PC private compare function $\mathcal{F}_{\mathrm{PC}}$ can be realized as in [31].
4.2.3. Digit Decomposition
$\mathcal{F}_{\mathrm{DigDec}}$ converts secret shares in the arithmetic ring $\mathbb{Z}_{2^l}$ into shares in the Boolean ring $\mathbb{Z}_2$. Given the private input $\llbracket x \rrbracket$, it outputs $\llbracket x_{l-1} \rrbracket^B, \ldots, \llbracket x_0 \rrbracket^B$, where $x_j$ is the $j$th bit of $x$ (i.e., $x = \sum_{j=0}^{l-1} 2^j x_j$). For ease of calculation, each party locally decomposes the public value $\Delta_x$, obtaining the public bits $\Delta_{x,l-1}, \ldots, \Delta_{x,0}$, where $\Delta_x = \sum_{j} 2^j \Delta_{x,j}$. Next, the parties interactively convert the $\langle \delta_x \rangle$-shares into Boolean shares.
The challenge of the three-party decomposition protocol under $\llbracket\cdot\rrbracket$-sharing lies in how to construct the specific Boolean sharing $\llbracket x_j \rrbracket^B$ on the basis of the existing arithmetic ring $\mathbb{Z}_{2^l}$. Based on the following mathematical observation, we propose a three-party decomposition protocol under $\llbracket\cdot\rrbracket$-sharing, enabling most calculations of the decomposition to be completed offline. Below, we describe the mathematical observation and the corresponding protocol in detail. Given the secret sharing $\llbracket x \rrbracket$, where $\Delta_x = x + \delta_x \bmod 2^l$, the $\llbracket\cdot\rrbracket$-sharing follows Formula (5):

$$x = \Delta_x - \delta_x + c \cdot 2^l, \tag{5}$$

where $c$ is the carry (borrow) bit of $\Delta_x - \delta_x$, i.e., $c = 1\{\Delta_x < \delta_x\}$.
Similarly, suppose there are $\delta_x^0$, $\delta_x^1$, $\delta_x^2$ with $\delta_x = \delta_x^0 + \delta_x^1 + \delta_x^2 \bmod 2^l$. The secret sharing satisfies:

$$\delta_x = \delta_x^0 + \delta_x^1 + \delta_x^2 - c' \cdot 2^l, \tag{6}$$

where $c'$ is the carry bit of $\delta_x^0 + \delta_x^1 + \delta_x^2$, i.e., $c' = \mathrm{wrap}(\delta_x^0, \delta_x^1, \delta_x^2, 2^l)$. However, in actual computation, $\delta_x$ is encrypted, so the carry $c$ in Formula (5) cannot be directly computed. By observation, we can deduce $c = 1\{\Delta_x < \delta_x\}$. Since $\Delta_x$ is a public value, this term can be moved to the right side of the inequality, i.e., the comparison can be evaluated by the private compare function $\mathcal{F}_{\mathrm{PC}}$ on the shared $\delta_x$ and the public $\Delta_x$. Since $c'$ depends only on the offline randomness, it can be computed via $\mathcal{F}_{\mathrm{wrap}}$ without online communication. The protocol of $\mathcal{F}_{\mathrm{DigDec}}$ is described in Algorithm 3.
Algorithm 3 $\Pi_{\mathrm{DigDec}}$: 3PC digit decomposition
Input: $\llbracket\cdot\rrbracket$-shares of $x$, where $\Delta_x = x + \delta_x$. Output: $\llbracket\cdot\rrbracket^B$-shares of the bits of $x$, where $x = \sum_{j} 2^j x_j$. Offline Phase: - 1.
Each party locally converts its shares of $\delta_x$ into bits, where $\delta_x = \delta_x^0 + \delta_x^1 + \delta_x^2$; - 2.
Each party invokes the function $\mathcal{F}_{\mathrm{wrap}}$ to obtain the shares of the carry $c'$; - 3.
Each party locally computes its Boolean shares of the bits of $\delta_x$.
Online Phase: - 1.
Each party locally converts the public value $\Delta_x$ into bits $\Delta_{x,l-1}, \ldots, \Delta_{x,0}$; - 2.
Each party invokes the function $\mathcal{F}_{\mathrm{PC}}$ to obtain the shares of the borrow bit $c$; - 3.
Each party locally computes its shares of the bits of $x$. The final share after bit conversion is $\llbracket x_j \rrbracket^B$.
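The cleartext experiment below illustrates why digit decomposition cannot be done purely locally: the bits of $x = \Delta_x - \delta_x$ differ from the bitwise XOR of the operands exactly where carries propagate, which is the information $\mathcal{F}_{\mathrm{wrap}}$ and $\mathcal{F}_{\mathrm{PC}}$ supply in the protocol. The setup is an illustrative simulation.

```python
# Why carries matter in digit decomposition (cleartext illustration).
import random

L = 16
RING = 1 << L
bits = lambda v: [(v >> j) & 1 for j in range(L)]

x = random.randrange(RING)
delta = random.randrange(RING)
Delta = (x + delta) % RING           # public masked value

# Naive guess: XOR the bits of Delta with the bits of (2^l - delta).
naive = [a ^ b for a, b in zip(bits(Delta), bits((-delta) % RING))]
exact = bits((Delta - delta) % RING)
# The two differ exactly where a carry propagates in Delta + (2^l - delta):
print(sum(a != b for a, b in zip(naive, exact)), "bit positions need carries")
```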
4.2.4. B2A Conversion
The Boolean-to-arithmetic function $\mathcal{F}_{\mathrm{B2A}}$ is the inverse operation of digit decomposition. It converts the bit shares $\llbracket x_{l-1} \rrbracket^B, \ldots, \llbracket x_0 \rrbracket^B$ to the value $\llbracket x \rrbracket$, such that $x = \sum_{j=0}^{l-1} 2^j x_j$. We can first convert each bit to the arithmetic ring, and then use the linear property of $\llbracket\cdot\rrbracket$-sharing to combine them. For a secret bit $x_j$ in $\llbracket\cdot\rrbracket^B$-sharing with $x_j = \Delta_{x_j} \oplus \delta_{x_j}$, we can derive the following mathematical property over the arithmetic ring: $x_j = \Delta_{x_j} + \delta_{x_j} - 2\Delta_{x_j}\delta_{x_j}$; so we have:

$$x = \sum_{j=0}^{l-1} 2^j \left(\Delta_{x_j} + \delta_{x_j} - 2\Delta_{x_j}\delta_{x_j}\right). \tag{7}$$

We have $\delta_{x_j} = \delta_{x_j}^0 \oplus \delta_{x_j}^1 \oplus \delta_{x_j}^2$, and assume that the corresponding arithmetic values in $\mathbb{Z}_{2^l}$ are $d_j^0, d_j^1, d_j^2$, respectively. Therefore, we can expand $\delta_{x_j}$ in Equation (7) in the same way. All items except for the cross terms among $d_j^0, d_j^1, d_j^2$ can be computed locally, but we can have one of the participants $P_i$ locally compute the products of its own shares, and then securely compute the remaining cross terms using the Du-Atallah protocol [21].
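A small cleartext check of the recombination identity used here, assuming the standard XOR-to-arithmetic conversion $x_j = \Delta + \delta - 2\Delta\delta$; function names are illustrative.

```python
# Cleartext check of the bit identity behind F_B2A: the Boolean masked bit
# Delta XOR delta equals, as an integer, Delta + delta - 2*Delta*delta, which
# is linear in the shared mask once the cross term is produced offline.
def b2a_bit(Delta_j: int, delta_j: int) -> int:
    return Delta_j + delta_j - 2 * Delta_j * delta_j

assert all(b2a_bit(D, d) == (D ^ d) for D in (0, 1) for d in (0, 1))

def b2a(Deltas, deltas):
    """Recombine l arithmetic bit values into x = sum_j 2^j * x_j locally."""
    return sum((1 << j) * b2a_bit(D, d)
               for j, (D, d) in enumerate(zip(Deltas, deltas)))
```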
4.3. Lookup Table-Based 3PC Protocol
We assume that the function to be evaluated is represented as a lookup table; the LUT is public, so all parties know the input encoding and output encoding, and the parties hold Boolean secret shares of the private input. In secure two-party computation, there are several approaches to computing a lookup table. The most widely applied are the One-Time Truth Table (OTTT) [33], Online-LUT (OP-LUT) [32], and Setup-LUT (SP-LUT) [32]. But in the 3PC scenario, there are no efficient computation schemes. In order to increase the efficiency of the online phase, we propose a 3PC lookup table protocol, expanding the application scenario of FLUTE [20] from the 2PC to the 3PC setting.
This approach converts the lookup process of the lookup table into the evaluation of a function of the private input, expressed as an inner product. Previous lookup table methods enumerated all inputs [32,33] or all outputs [32] and then obtained the final lookup table output by utilizing oblivious transfer (OT). This is straightforward, but the computation and communication complexity are too high because the whole table must be sent to the other party. Therefore, we only require part of the lookup table to be evaluated rather than the entire table. There are four steps involved in this conversion, and the specific steps are shown below.
(1) The first step: compute the full disjunctive normal form of the input. When evaluating a Boolean $\eta$-to-$1$ lookup table, we only need to focus on the rows where the result is 1; then the output of the LUT can be represented as the full disjunctive normal form (DNF) over the corresponding inputs of those rows. Taking a lookup table column with output 1 as an example in Figure 1, assume there are $m$ rows whose results are 1; then, for each row $i \in [m]$, we compute the term $t_i = \tilde{x}_1^{(i)} \wedge \cdots \wedge \tilde{x}_\eta^{(i)}$, where $\tilde{x}_j^{(i)} = x_j$ if the $j$th input bit of row $i$ is 1 in the LUT and $\tilde{x}_j^{(i)} = \neg x_j$ if it is 0, for all $j \in [\eta]$; we then connect all terms using the OR operation and output $y = t_1 \vee \cdots \vee t_m$.
Using the lookup table depicted in Figure 1 as an illustration, the output of the third, fifth, and sixth rows is 1, i.e., $y = (\neg x_1 \wedge x_2 \wedge \neg x_3) \vee (x_1 \wedge \neg x_2 \wedge \neg x_3) \vee (x_1 \wedge \neg x_2 \wedge x_3)$. Assuming the input is $x = 001$, the output of the LUT is 0; evaluating the formula indeed yields exactly $y = 0$.
(2) The second step: replace the OR operation with the XOR operation. Evaluating the above DNF expression requires both AND and OR operations, which incurs high online communication. However, we can remove the OR operations thanks to the following significant property: given the input $x$, in the above DNF equation, at most one term can evaluate to 1 [20], which means it is impossible for any two different terms to both evaluate to 1. Based on this property, we obtain the following Equation (8):

$$y = t_1 \vee \cdots \vee t_m = t_1 \oplus \cdots \oplus t_m. \tag{8}$$

Still taking the lookup table in Figure 1 as an example, $y = (\neg x_1 \wedge x_2 \wedge \neg x_3) \oplus (x_1 \wedge \neg x_2 \wedge \neg x_3) \oplus (x_1 \wedge \neg x_2 \wedge x_3)$; assuming the input is $x = 001$, the output of the LUT is 0, and computing the improved formula again yields $y = 0$.
(3) The third step: replace the previous formula with an equivalent inner product computation. The purpose of this step is to transform the equation from the previous step into a more easily computable and efficient form. When $m = 1$, the equation is $y = \tilde{x}_1 \wedge \cdots \wedge \tilde{x}_\eta$, which can be seen as the multi-input AND gate of Section 3.2. When $m > 1$, the transformed equation is $y = \bigoplus_{i=1}^{m} \bigwedge_{j=1}^{\eta} \tilde{x}_j^{(i)}$, which is equivalent to a vector inner product operation over the Boolean ring. Therefore, we use the aforementioned protocols to calculate the inner product and the multi-input AND gate, inspired by [9,20]. Finally, the equation is as follows in Equation (9):

$$y = \left(t_1, \ldots, t_m\right) \odot \left(1, \ldots, 1\right) = \bigoplus_{i=1}^{m} \bigwedge_{j=1}^{\eta} \tilde{x}_j^{(i)}, \tag{9}$$

where $\odot$ denotes an inner product operation in vector form. Assuming the same table as above and the input $x = 001$, the output of the LUT is 0; computing the improved formula again obtains $y = 0$, and we can see that the result is correct.
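The cleartext sketch below walks through steps (1)-(3) for the three-input example; the set of rows with output 1 ({010, 100, 101}) is assumed from the discussion of Figure 1 above.

```python
# Cleartext LUT evaluation via DNF terms combined with XOR (steps 1-3).
ROWS_WITH_ONE = [(0, 1, 0), (1, 0, 0), (1, 0, 1)]  # assumed rows of Figure 1

def lut_eval(x):
    terms = []
    for row in ROWS_WITH_ONE:
        # Literal is x_j when the table bit is 1, NOT x_j when it is 0.
        lits = [xj if rj == 1 else 1 - xj for xj, rj in zip(x, row)]
        terms.append(lits[0] & lits[1] & lits[2])   # multi-input AND
    out = 0
    for t in terms:                                  # XOR replaces OR:
        out ^= t                                     # at most one term fires
    return out

assert lut_eval((0, 0, 1)) == 0    # the worked example in the text
assert lut_eval((1, 0, 0)) == 1    # an input whose row outputs 1
```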
4.3.1. Mathematical Expression of Multi-Input LUT
Consider a $\delta$-component input vector $\mathbf{x} = (x^1, \ldots, x^\delta)$ with a 1-bit output, where the dimension of each component $x^k$ is $d$ and $\delta \cdot d = \eta$. Based on the above conclusion, we need to calculate $y = \bigoplus_{i=1}^{m} \bigwedge_{j=1}^{\eta} \tilde{x}_j^{(i)}$. Let $X_j$ be the set of the $j$th bits of the components of $\mathbf{x}$. As mentioned in Section 3.2, the secret value $x$ is $\llbracket\cdot\rrbracket$-shared in this work, so we have:

$$\bigwedge_{j=1}^{\eta} \tilde{x}_j = \bigwedge_{j=1}^{\eta}\left(\Delta_{\tilde{x}_j} \oplus \delta_{\tilde{x}_j}\right) = \bigoplus_{S \in \mathcal{P}([\eta])} \left( \prod_{j \in S} \delta_{\tilde{x}_j} \prod_{j \in [\eta] \setminus S} \Delta_{\tilde{x}_j} \right). \tag{10}$$

In Equation (10), it is easy to derive the first steps from Section 4, while in the last step, the conjunction $\bigwedge_{j}(\Delta_{\tilde{x}_j} \oplus \delta_{\tilde{x}_j})$ is the expansion following the distributive law; its expansion has $2^{\eta}$ items in total, the power set $\mathcal{P}([\eta])$ contains all the expanded items, $S$ is one subset in $\mathcal{P}([\eta])$, and $[\eta] \setminus S$ denotes the difference set, i.e., the indices $j \in [\eta]$ with $j \notin S$. Although the expression contains $2^{\eta}$ groups of AND gates, these AND gates can be calculated locally to obtain the shares of each term, leading to low online communication, since the products $\prod_{j \in S} \delta_{\tilde{x}_j}$ can be computed by $\Pi_{\mathrm{MAND}}$ offline and the $\Delta_{\tilde{x}_j}$ are clear to all parties.
4.3.2. Lookup Table Protocol
Given an $\eta$-to-$\sigma$ LUT $T$, we calculate the lookup table results bit by bit, so the goal is to compute $y = \bigoplus_{i=1}^{m} \bigwedge_{j=1}^{\eta} \tilde{x}_j^{(i)}$ as mentioned in Equation (10), where $m$ is the number of rows whose output is 1. To filter out the rows with output 1, we can multiply this expression by the output encoding of the LUT, so the rows with output 0 will be deleted. Since the lookup table is public, the output encoding is also public, so no additional communication is required to obtain all rows with output 1, as in Equation (11):

$$y = \bigoplus_{i \in [2^{\eta}]} T(i) \wedge \bigwedge_{j=1}^{\eta} \tilde{x}_j^{(i)}. \tag{11}$$

In Equation (11), we also need to handle the mapping relationship between the input vector $\mathbf{x}$ and the literal vector $\tilde{\mathbf{x}}^{(i)}$ for each row $i$. We can see from Section 4.3 that $\tilde{x}_j^{(i)} = x_j$ if the $j$th input bit of row $i$ is 1 and $\tilde{x}_j^{(i)} = \neg x_j$ if it is 0, for all $j \in [\eta]$. Since the lookup table is public, we can also obtain the input encoding, so we can obtain $\tilde{x}_j^{(i)}$ as in Equation (12):

$$\tilde{x}_j^{(i)} = x_j \oplus \neg e_j^{(i)}, \tag{12}$$

where $e_j^{(i)}$ denotes the $j$th input bit of row $i$. Because the input encoding is public and the secret sharing scheme we use in Section 3.2 is linear, no additional communication is required for this step of processing.
Furthermore, in our secret sharing scheme, calculating the complement $\llbracket \neg z \rrbracket$ only requires computing the complement of $\Delta_z$, while $\langle \delta_z \rangle$ remains unchanged, such that $\neg z = \neg\Delta_z \oplus \delta_z$. Thus, we can observe that the mask term $\delta_{\tilde{x}_j^{(i)}} = \delta_{x_j}$ in Equation (13) is the same for every row $i$ with input vector $\mathbf{x}$, so the same random numbers can be used. Therefore, we can obtain the following derivation as Equation (13):

$$y = \bigoplus_{i \in [2^{\eta}]} T(i) \wedge \bigoplus_{S \in \mathcal{P}([\eta])} \left( \prod_{j \in S} \delta_{x_j} \prod_{j \in [\eta] \setminus S} \Delta_{\tilde{x}_j^{(i)}} \right) = \bigoplus_{S \in \mathcal{P}([\eta])} \left( \prod_{j \in S} \delta_{x_j} \right) \wedge \left( \bigoplus_{i \in [2^{\eta}]} T(i) \prod_{j \in [\eta] \setminus S} \Delta_{\tilde{x}_j^{(i)}} \right), \tag{13}$$

where $T(i)$ denotes the output encoding of the LUT for all $i \in [2^{\eta}]$, $\Delta_{\tilde{x}_j^{(i)}} = \Delta_{x_j} \oplus \neg e_j^{(i)}$ denotes the public mask of each literal obtained from Equation (12), and $\langle \prod_{j \in S} \delta_{x_j} \rangle$ are the secret shares of the mask products, in which each $\delta_{\tilde{x}_j^{(i)}}$ is replaced with $\delta_{x_j}$ and stays the same across rows. The third to fourth lines follow the distributive law, and since we observe that $T(i)$ and $\Delta_{\tilde{x}_j^{(i)}}$ are public parameters, these two items are combined for computation. Then, according to Equation (9), the final Equation (13) is obtained.
Because the mask products $\langle \prod_{j \in S} \delta_{x_j} \rangle$ are reused across all rows and all $\sigma$ output bits, in the offline phase, we just need to invoke $\Pi_{\mathrm{MAND}}$ once per subset $S$.
The protocol details are shown in Algorithm 4. In the second step of the online phase in Algorithm 4, to prevent the parties from repeatedly calculating the public term, we move this term to the fourth step for calculation.
Algorithm 4 $\Pi_{\mathrm{LUT}}$: protocol of LUT
Input: a public $\eta$-to-$\sigma$ LUT $T$ with input encoding and output encoding, and $\llbracket\cdot\rrbracket^B$-shares of the input vector $\mathbf{x}$. Output: $\llbracket y \rrbracket^B$, where $y = T(\mathbf{x})$. Offline Phase: - 1.
Each party samples random values using $\mathcal{F}_{\mathrm{rand}}$ for $\delta_y$ and the input masks. - 2.
Each party interactively generates $\langle \prod_{j \in S} \delta_{x_j} \rangle$ for $S \in \mathcal{P}([\eta])$ by invoking $\Pi_{\mathrm{MAND}}$.
Online Phase: - 1.
All parties locally set $\Delta_{\tilde{x}_j^{(i)}}$ as in Equation (12) for $i \in [2^{\eta}]$, $j \in [\eta]$. - 2.
Each party locally computes its share of $\Delta_y$ according to Equation (13). - 3.
All parties reconstruct $\Delta_y$ by exchanging their shares and computing their XOR. - 4.
All parties locally add the public term and set $\llbracket y \rrbracket^B = (\Delta_y, \langle \delta_y \rangle)$.
5. The 3PC Protocols of Secure RNNs Operators
In this section, we introduce secure matrix multiplication and the secure activation functions of nonlinear layers commonly used in RNNs, such as sigmoid and tanh, as shown in Table 2. Computers can represent only finitely many numbers, so complex mathematical functions cannot be represented exactly; therefore, floating-point numbers are usually used to approximate real values [34].
5.1. Matrix Multiplication
The linear part of the model is usually matrix computation. The protocol for matrix multiplication is similar to $\Pi_{\mathrm{MultTrunc}}$. An $m \times n$ matrix $\mathbf{X}$ is multiplied by an $n \times k$ matrix $\mathbf{Y}$ to obtain the matrix $\mathbf{Z} = \mathbf{X}\mathbf{Y}$. In the offline phase, all parties $P$ mutually generate $\langle \delta_{\mathbf{X}} \delta_{\mathbf{Y}} \rangle$ by invoking the multiplication of Section 3.1.2, and then in the online phase, all parties $P_i$ reveal $\Delta_{\mathbf{Z}}$ and compute their shares locally. The details are shown in Algorithm 5.
Algorithm 5 Matrix multiplication.
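A hedged NumPy sketch of this pattern over $\mathbb{Z}_{2^{64}}$, simulated in one process: the offline phase would supply $\delta_{\mathbf{X}}$, $\delta_{\mathbf{Y}}$, $\delta_{\mathbf{X}}\delta_{\mathbf{Y}}$, and $\delta_{\mathbf{Z}}$ as shares, and the online phase performs one local combination of the public masked matrices. `masked_matmul` is an illustrative name.

```python
# Masked matrix multiplication over Z_{2^64}; uint64 arithmetic wraps mod 2^64.
import numpy as np

rng = np.random.default_rng(0)
rand_like = lambda shape: rng.integers(0, 2**64, shape, dtype=np.uint64)

def masked_matmul(X, Y):
    dX, dY = rand_like(X.shape), rand_like(Y.shape)
    dXdY = dX @ dY                                  # would be produced offline
    dZ = rand_like((X.shape[0], Y.shape[1]))
    DX, DY = X + dX, Y + dY                         # public, revealed online
    DZ = DX @ DY - DX @ dY - dX @ DY + dXdY + dZ    # Delta_Z, computed locally
    return DZ - dZ                                  # reconstruction check: Z = XY

X = np.arange(4, dtype=np.uint64).reshape(2, 2)
Y = np.arange(4, 8, dtype=np.uint64).reshape(2, 2)
assert (masked_matmul(X, Y) == X @ Y).all()
```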
5.2. Exponential
Consider the exponential function $f(x) = e^{-x}$. Due to the properties of exponential functions, this can be equivalently computed as $e^{-x} = \prod_{i=0}^{k-1} e^{-x_i \cdot 2^{id}}$, where $x$ of length $l$ is divided into $k$ parts $x_{k-1}, \ldots, x_0$, each with a length of $d$ [13]. This can be easily computed by first invoking $\mathcal{F}_{\mathrm{DigDec}}$ from Section 4 and then invoking the lookup table protocol of Algorithm 4 separately for each part. To reduce communication and computing costs as well as memory usage, the private inputs from the larger arithmetic ring are decomposed into smaller ones. The details are in Algorithm 6.
Algorithm 6 Functionality of exponential.
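The following cleartext Python sketch illustrates the digit-splitting trick with toy parameters (4-bit digits, 8 fractional bits); the table contents and scaling are illustrative assumptions, and each table lookup would correspond to one LUT protocol invocation.

```python
# exp(-x) via per-digit tables: split x into k digits of d bits and multiply.
import math

D, K, F = 4, 4, 8            # digit width, digit count, fractional bits (toy)

# One table per digit position i: T_i[u] ~ exp(-u * 2^{i*d} / 2^f)
tables = [[math.exp(-(u << (i * D)) / (1 << F)) for u in range(1 << D)]
          for i in range(K)]

def neg_exp(x_fix):
    digits = [(x_fix >> (i * D)) & ((1 << D) - 1) for i in range(K)]
    prod = 1.0
    for i, u in enumerate(digits):
        prod *= tables[i][u]  # each lookup would be one LUT protocol call
    return prod

x = 1.5
assert abs(neg_exp(round(x * (1 << F))) - math.exp(-x)) < 1e-6
```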
5.3. Sigmoid
The sigmoid function $\mathrm{sigmoid}(x) = 1/(1 + e^{-x})$ can be viewed as the composition of an exponential function $e^{-x}$ and a reciprocal. Therefore, we sequentially calculate the exponential function and the reciprocal to obtain the result of the sigmoid function. For the exponential function $e^{-x}$, we can use the protocol of Algorithm 6 and obtain accurate approximate results (the accuracy depends on the size of the lookup table). Then, for the reciprocal, we use the Goldschmidt iteration method (similar to the method in [13]); this method's accuracy largely depends on the initial iteration value. In order to obtain a closer initial value, we construct a lookup table to obtain a more reliable approximation of the reciprocal function, and then iterate on this basis to improve accuracy. The details are in Algorithm 7.
Algorithm 7 Functionality of sigmoid.
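Below is an illustrative Python sketch of the reciprocal step: a coarse table (6 index bits, our assumption) provides the initial guess and a few Goldschmidt iterations square the error away; for negative inputs the toy uses the identity $\mathrm{sigmoid}(x) = 1 - \mathrm{sigmoid}(-x)$ to stay within the table's range.

```python
# Goldschmidt reciprocal with a LUT-provided initial value (toy parameters).
import math

INIT_BITS = 6

def recip_lut(b):
    # Coarse table over (0, 2]: index by the top bits of b (toy construction).
    idx = min(int(b / 2.0 * (1 << INIT_BITS)), (1 << INIT_BITS) - 1)
    center = (idx + 0.5) * 2.0 / (1 << INIT_BITS)
    return 1.0 / center          # tabulated offline, fetched via the LUT protocol

def reciprocal(b, iters=3):
    w = recip_lut(b)
    e = 1.0 - b * w              # relative error of the initial guess
    for _ in range(iters):       # Goldschmidt: the error squares every round
        w *= 1.0 + e
        e *= e
    return w

def sigmoid(x):
    if x < 0:
        return 1.0 - sigmoid(-x) # keeps 1 + exp(-x) within the toy table range
    return reciprocal(1.0 + math.exp(-x))

assert abs(sigmoid(0.5) - 0.6224593312018546) < 1e-9
```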
5.4. Tanh
The hyperbolic tangent (tanh) has many application scenarios in neural networks, mainly as an activation function for hidden layers. It is a variant of the sigmoid function and maps the input value to a continuous value in the range of −1 to 1. By definition, the following equation holds: $\tanh(x) = 2 \cdot \mathrm{sigmoid}(2x) - 1$; thus, it can be computed by invoking the sigmoid function.
6. Security Analysis
We prove the security of our protocols under the real-world execution and ideal-world simulation paradigm, and consider security in the semi-honest model, where all three parties follow the protocol exactly.
Proof sketch: assume $\mathcal{A}$ is a semi-honest adversary in the real world that cannot corrupt more than one party at a time, and $\mathcal{S}$ is a simulator in the ideal world. In the real world, all parties execute the protocol in the presence of the adversary $\mathcal{A}$; in the ideal world, all parties send their inputs to the simulator $\mathcal{S}$, which executes the protocol honestly. Any knowledge that the adversary $\mathcal{A}$ can obtain in the real world can also be obtained by the simulator $\mathcal{S}$ in the ideal world, so the real world and the ideal world are indistinguishable.
Security for the sharing, reconstruction, AND, multi-input AND, multiplication, truncation, and randomness-generation protocols: since we use the 3PC protocols of [26] as our secret sharing primitives, the security of these protocols in this paper is inherited from [26].
Security for $\Pi_{\mathrm{LUT}}$: let $\mathcal{A}$ corrupt $P_0$ during the protocol $\Pi_{\mathrm{LUT}}$ (the cases of $P_1$ and $P_2$ are symmetric). In the offline phase, each party invokes $\mathcal{F}_{\mathrm{rand}}$ to generate correlated randomness and invokes $\Pi_{\mathrm{MAND}}$ to obtain the multi-input mask products, so security is guaranteed by these two functionalities. In both the real and ideal protocols, the only key information obtained by the corrupt party from the honest parties is the masked bit $\Delta_y$. Whether in real or ideal execution, this bit is masked by a random number chosen by the honest parties, which is invisible to $\mathcal{A}$, making it a random bit that follows the uniform distribution in $\mathcal{A}$'s view. Since this is the only masked message that $\mathcal{A}$ can observe, from its perspective, the execution of the real world and the simulation of the ideal world are indistinguishable.
Security for $\mathcal{F}_{\mathrm{DigDec}}$: let $\mathcal{A}$ corrupt $P_0$ during the protocol (again symmetric for $P_1$ and $P_2$). Interaction between the parties occurs only when invoking the protocols $\mathcal{F}_{\mathrm{wrap}}$ and $\mathcal{F}_{\mathrm{PC}}$, and all other calculations are completed locally. Therefore, this protocol's security is guaranteed by the security of $\mathcal{F}_{\mathrm{wrap}}$ and $\mathcal{F}_{\mathrm{PC}}$.
7. Evaluation
This section provides the experimental results and the corresponding experimental settings for the proposed protocols.
Experiment setting: The experiments were run on an Intel(R) Xeon(R) Platinum 8176M CPU @ 2.10 GHz with 48 GB RAM and were conducted single-threaded in both LAN and WAN settings; we created three Docker containers representing the three servers on Ubuntu 20.04. The bandwidth in the LAN was approximately 1 GB/s with a round-trip time (RTT) of 1 ms, while the bandwidth in the WAN was 40 MB/s with an RTT of 70 ms. The code implementation is written in the C++ programming language.
As our work’s benchmark, we compared against SIRNN [13], the state of the art in RNN secure inference. In the theoretical analysis, we compared the complexity of online communication and the number of online communication rounds. Then, we compared the running times under LAN and WAN, respectively, as well as the communication volume of the key building blocks. Finally, we applied our sigmoid and tanh protocols to end-to-end RNN secure inference and compared it with SIRNN [13] under the same setting. We simulated the FastGRNN [35] model on the Speech Commands dataset [36], which identifies keywords in short speech clips (such as digits, simple commands, or directions). Its primary goal is to provide a way to build and test models that detect when a single word from a set of ten target words is spoken, and the dataset size is 8.17 GiB. The FastGRNN model contains 99 sigmoid and 99 tanh layers; each layer has 100 instances.
7.1. Comparison of Theoretical Communication Cost on Building Blocks
As shown in Table 3 and Table 4, we compared the online communication complexity and communication rounds of the basic building blocks of our 3PC work with the previous optimal RNN work [13]. Note that the sigmoid and tanh functions of SIRNN are in the 2PC setting. Moreover, we also compared with the typical three-party computation (3PC) frameworks ABY3 [24] and SecureNN [25], which are not specifically designed for RNNs, to obtain more comprehensive comparison results. For multiplication, compared to SIRNN, our method reduces the online communication complexity by an order of magnitude to $l$ bits, and the communication rounds are reduced from 4 to 1; our work reduces the online communication complexity by 10 times and 4 times compared to ABY3 and SecureNN, respectively, while also decreasing the number of communication rounds by one round compared to SecureNN and maintaining the same number of communication rounds as ABY3. For digit decomposition, the online communication complexity and communication rounds of our basic building block do not differ much from ref. [13]. For B2A conversion, SIRNN and SecureNN do not implement this building block; although the communication complexity of our work is only slightly lower than that of ABY3, the communication rounds are reduced from $\log l$ to 1. For the lookup table, the communication complexity of our work is reduced so that it is related only to the output bit width $\sigma$, and the communication rounds are reduced to only one round.
7.2. Online Cost of RNN Building Blocks and End-to-End Inference
We tested the basic building blocks of the RNN, including matrix multiplication, sigmoid, and tanh. It should be noted that our work focuses on reducing online communication overhead, so the communication cost and runtime we report are those of the online phase. Similarly, we also evaluated the online communication overhead and runtime of SIRNN. However, since SIRNN does not strictly divide the calculation into online/offline phases, all of its computations are related to the private input; therefore, we took the total communication cost and runtime of SIRNN as its online communication cost and online runtime. Compared with SIRNN [13], the online communication of the RNN key building blocks matrix multiplication, sigmoid, and tanh in our work is reduced by 27.56%, 80.39%, and 79.94%, respectively, as shown in Table 5. The online runtimes of the RNN building blocks under LAN and WAN are shown in Table 6. In the LAN setting, for the sigmoid and tanh functions, the online runtime of our work is similar to that of SIRNN. However, in the WAN setting, our work is 8% faster than SIRNN for the sigmoid function and 3% faster for the tanh function. Also, we conducted end-to-end RNN secure inference using the Google-30 dataset [36] on FastGRNN [35]. As shown in Table 7, although our work is slightly slower than SIRNN in terms of online communication time (LAN: SIRNN 10.29 s vs. ours 10.31 s; WAN: SIRNN 522.07 s vs. ours 598.10 s), our online communication overhead is reduced by 39.45% compared to SIRNN.
8. Conclusions
In this work, we introduce an innovative protocol to improve the efficiency of three-party secret sharing and secure inference in recurrent neural networks (RNNs). The experimental results show that, compared with SIRNN, the online communication of the core building blocks of the RNN model is significantly reduced (matmul: 27.56%; sigmoid: 80.39%; tanh: 79.94%), making our protocol suitable for delay-sensitive applications such as real-time healthcare diagnosis, motion detection, financial risk-control detection, and smart voice assistants.
However, it is important to acknowledge potential limitations. When our protocol is applied to RNN models with more layers or more complex structures, the total and online communication overhead may increase linearly. In addition, although our lookup table protocol requires only one round of online communication, the performance of our method in a distributed environment may be affected by network latency, which can increase the online-phase communication time in real deployments.
In summary, this research presents significant advancements in reducing communication costs for secure three-party computations in RNNs, but it also highlights areas for further improvement. Future work will focus on integrating these protocols into real-world model inference tasks, addressing scalability and latency challenges, and enhancing the privacy and security of lookup tables. Considering the balance of communication efficiency and security, our work only supports semi-honest security. We plan to improve the work further to support malicious security in the future. In addition, multi-valued logic (MVL) may work in reducing the communication overhead of the lookup table in theory, so future research will consider improvements with MVL. By exploring these methods, we aim to broaden the utility and robustness of secure computation protocols in practical applications.