An Efficient BGV-type Encryption Scheme for IoT Systems

Abstract: Internet of Things (IoT) systems usually have less storage and computing power than desktop systems. This paper proposes an efficient BGV-type homomorphic encryption scheme suited to secure computing on IoT systems. Compared with previous BGV-type cryptosystems, our scheme reduces the storage space for switch keys and the ciphertext evaluation time. Specifically, the switch key used in homomorphic computations can be a single constant key rather than one key per level. Moreover, the product of two ciphertexts can stay at the same sublayer as its inputs, and multiplications can be repeated between two sublayers. As a result, the number of multiplications is no longer limited by L in an L-level circuit, and the ciphertext evaluation time decreases significantly. We implement the scheme in the C language. The performance test shows that the improved scheme is more efficient than HElib under the same configuration.


Introduction
Internet of Things (IoT) systems are widely used in people's daily lives [1,2]. In an IoT system, users can control their remote devices [3] to complete expected tasks. This requires the devices to have secure computing power [4]. Homomorphic encryption allows a device to perform arbitrary computations on encrypted data without the user's secret key. Compared with non-homomorphic encryption, it removes the need to trust the device and makes it easy to use the device's computing power. Hence, homomorphic encryption has broad application prospects [5][6][7][8][9][10][11]. In a breakthrough work [12], Gentry demonstrated that fully homomorphic encryption is theoretically possible based on ideal lattices. In Gentry's work, the ciphertext can be regarded as the plaintext with some "noise" attached, and the noise increases with the number of evaluations. Once the noise exceeds a threshold, decryption fails; before that happens, a bootstrapping procedure can refresh the noise introduced by evaluations and make the ciphertext look like a new one generated by the encryption algorithm. However, the efficiency of the scheme is far from practical. Following Gentry's work, many researchers tried to improve the performance of homomorphic encryption. Based on the learning with errors assumption [13,14], Brakerski and Vaikuntanathan [15,16] (BV) presented a dimension-modulus reduction technique and constructed a leveled homomorphic encryption scheme that does not require a bootstrappable encryption scheme. Brakerski, Gentry, and Vaikuntanathan [17] (BGV) then dramatically improved the performance of BV-type homomorphic encryption. Moreover, Gentry, Halevi, and Smart reduced the size of the public keys and ciphertexts [18] of the BGV-type scheme at the cost of increasing the probability of a key recovery attack. Subsequently, they optimized the execution times of the fast Fourier transform (FFT) and the Chinese remainder theorem (CRT) by reducing the

Plaintext Encoding
Based on the underlying algebra, a cyclotomic polynomial factors modulo $p$ into $l$ irreducible polynomials, $\Phi_m(x) = \prod_{i=1}^{l} F_i(x) \bmod p$. Hence, the quotient ring $R_p = \mathbb{Z}_p[x]/\Phi_m(x) \cong \mathbb{Z}_p[x]/F_1(x) \otimes \cdots \otimes \mathbb{Z}_p[x]/F_l(x)$. Let $d$ be the multiplicative order of $p$ modulo $m$, that is, the smallest positive integer such that $p^d = 1 \bmod m$. We have $l = \varphi(m)/d$. Here, $\varphi(\cdot)$ is Euler's totient function, $d$ is the degree of each $F_i(x)$, and $l$ is the number of factors $F_i(x)$, which is also the number of plaintext slots. A message can then be encoded as the plaintext using the coefficients and modulus of the polynomial ring.
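The relation $l = \varphi(m)/d$ can be checked numerically. The following is a minimal sketch; the helper names `euler_phi`, `multiplicative_order`, and `plaintext_slots` are illustrative and do not come from the paper's implementation.

```python
from math import gcd

def euler_phi(m):
    """Euler's totient: number of 1 <= k <= m with gcd(k, m) == 1."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def multiplicative_order(p, m):
    """Smallest d >= 1 with p**d == 1 (mod m); needs m >= 2 and gcd(p, m) == 1."""
    if m < 2 or gcd(p, m) != 1:
        raise ValueError("need m >= 2 and p coprime to m")
    d, power = 1, p % m
    while power != 1:
        power = (power * p) % m
        d += 1
    return d

def plaintext_slots(m, p):
    """Return (d, l): degree of each factor F_i and number of plaintext slots."""
    d = multiplicative_order(p, m)
    return d, euler_phi(m) // d

print(plaintext_slots(5, 11))  # (1, 4)
print(plaintext_slots(7, 2))   # (3, 2)
```

With $m = 5$ and $p = 11$ the sketch gives $d = 1$ and $l = 4$, the values used later in the toy implementation.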

Ciphertext Encoding
Similar to the plaintext case, the ciphertext space in BGV [17] is defined by the quotient ring $R_q = \mathbb{Z}_q[x]/\Phi_m(x)$, where $q$ is far greater than $p$. This raises the question of how $p$ and $q$ are related. Reference [19] indicates that $q$ can be the product of a series of prime numbers $p_0 \cdots p_{L-1}$. Here, $L$ is the depth of the circuit in a leveled homomorphic encryption scheme, and a new ciphertext is defined with respect to $q_{L-1}$. Each time two ciphertexts are multiplied, the ciphertext of the product is mapped from $R_{q_i}$ to $R_{q_{i-1}}$, where $q_i = \prod_{j=0}^{i} p_j$ for $i = 0, \cdots, L-1$. Section 3.2 of [19] introduces the procedure for mapping an element of $R_{q_i}$ to the corresponding element of $R_{q_{i-1}}$ in the modulus-reduction process. Hence, $L$ is an upper bound on the number of multiplications and additions. As an optimization for keeping ciphertexts and key switch matrices in evaluation (CRT) representation, Ref. [19] also suggests choosing $p_i = 1 \bmod m$ for each $i = 0, \cdots, L-1$.

Security Assumption of BGV
The security of BGV-type cryptosystems is based on the ring learning with errors (RLWE) assumption [13,35]. The RLWE($\lambda, q, \chi$) assumption states that it is hard to distinguish the two distributions $(a, a\cdot s + \varepsilon)$ and $(a, b)$, where $a$, $s$, and $b$ are randomly selected from $R_q$ and $\varepsilon$ is sampled from $\chi$ according to the security parameter $\lambda$. This assumption has been proved hard over ideal lattices in [35].

Principle of BGV
Let $p$ and $q$ be two prime numbers with $p < q$, and let $R_p$ and $R_q$ be the corresponding residue rings. For any $m \in R_p$ and integer $\varepsilon$, if $0 < m + \varepsilon p < q$, we have $[(m + \varepsilon p) \bmod q] \bmod p = m$. For example, let $p = 5$, $q = 19$, and $m = 3$. If $\varepsilon = 3$, then $m + \varepsilon p = 18$ and $18 \bmod 19 \bmod 5 = 3$. Otherwise, if $\varepsilon = 4$, then $m + \varepsilon p = 23$ and $23 \bmod 19 \bmod 5 = 4$; if $\varepsilon = -1$, then $m + \varepsilon p = -2$ and $-2 \bmod 19 \bmod 5 = 2$. In both of these cases the condition is violated and the result differs from $m$. Let $m$ be the plaintext and $(m + \varepsilon p)$ be the ciphertext. If the operations on the ciphertext do not make the intermediate result exceed the modulus $q$, all the errors can be eliminated by reducing modulo $q$ and then modulo $p$, because the error term is multiplied by $p$. In our view, BGV and BGV-type schemes construct cryptosystems based on this principle and try to control the error growth during computation.
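The three numeric cases above can be reproduced with a few lines of Python. This is only an illustrative sketch; the function name `decode` is hypothetical, and Python's `%` operator is used because it returns a non-negative remainder for a positive modulus, which matches the convention of the example.

```python
def decode(c, p, q):
    """Reduce modulo q, then modulo p (non-negative remainders)."""
    return (c % q) % p

p, q, m = 5, 19, 3
for eps in (3, 4, -1):
    c = m + eps * p              # "ciphertext" m + eps * p
    in_range = 0 < c < q         # condition under which decoding returns m
    print(eps, c, decode(c, p, q), in_range)
# 3 18 3 True
# 4 23 4 False
# -1 -2 2 False
```

Only the first case satisfies $0 < m + \varepsilon p < q$, and only that case returns the original $m = 3$.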

Introduction of BGV
In the original BGV [17], the public key and switch keys are matrices. Appendix D.2 of [18] indicates that those keys can also be vectors over a polynomial ring, which reduces the key size. In this section, we describe BGV in vector form. Like other encryption schemes, BGV includes four algorithms, Setup, KeyGen, Encrypt, and Decrypt, as the basic encryption-decryption part. Moreover, it includes Add and Mul to perform homomorphic calculations. These algorithms are as follows.
Setup: given a security parameter $\lambda$ and level $L$, the setup algorithm generates $L$ large prime numbers $q_0, \cdots, q_{L-1}$, where $q_0 < \cdots < q_{L-1}$, and selects the discrete Gaussian distribution $\chi$ as the error distribution. It defines the plaintext space by selecting $m$ and $p$ and computing $l$ and $d$. The public parameters are $q_0, \cdots, q_{L-1}$, $\chi$, $p$, and $l$.
KeyGen: given the public parameters, the key generation algorithm first selects a random vector $s$ as the secret key. Then, it computes $b = -(a\cdot s + p\cdot\varepsilon) \bmod q_{L-1}$, where $a$ is a random element of $R_{q_{L-1}}$, $\varepsilon$ is sampled from $\chi$, and $p$ is the plaintext modulus, and lets $(a, b)$ be the public key. Next, it computes $b_i = -(a_i\cdot s + p\cdot\varepsilon_i - t_i\cdot s^2) \bmod t_i\cdot q_i$, where $a_i$ is a random element of $R_{q_i}$, $t_i$ is an integer, and $\varepsilon_i$ is sampled from $\chi$. Finally, it lets $(a_0, b_0, t_0, 0), \cdots, (a_{L-1}, b_{L-1}, t_{L-1}, L-1)$ be the switch key.
Encrypt: given a plaintext $m$, the encryption algorithm randomly selects a vector $v$ whose entries $v_i \in \{0, \pm 1\}$ take the value $0$ with probability $\frac{1}{2}$ and each of $\pm 1$ with probability $\frac{1}{4}$, samples $e_0$ and $e_1$ from $\chi$, and computes $c_0 = b\cdot v + p\cdot e_0 + m \bmod q_{L-1}$ and $c_1 = a\cdot v + p\cdot e_1 \bmod q_{L-1}$. The initial ciphertext is $ct = (c_0, c_1, L-1)$.
Decrypt: given a ciphertext $ct = (c_0, c_1, i)$ and the secret key $s$, the decryption algorithm computes $m \leftarrow ((c_0 + c_1\cdot s) \bmod q_i) \bmod p$.
Mul: given two ciphertexts $ct = (c_0, c_1, i)$ and $ct' = (c'_0, c'_1, i)$ under the same secret key, the multiplication algorithm first computes $d_0 = c_0\cdot c'_0$, $d_1 = c_0\cdot c'_1 + c_1\cdot c'_0$, and $d_2 = c_1\cdot c'_1$. Then it compresses $(d_0, d_1, d_2)$ to $ct^* = (c^*_0, c^*_1)$, where $c^*_0 = t_i d_0 + b_i\cdot d_2 \bmod t_i q_i$ and $c^*_1 = t_i d_1 + a_i\cdot d_2 \bmod t_i q_i$, using the switch key $(a_i, b_i, t_i, i)$. Next, $ct^* \in R_{t_i q_i}$ is first converted into an element of $R_{q_i}$ and then into the output ciphertext in $R_{q_{i-1}}$ by SwitchModulu.
Add: given two ciphertexts $ct = (c_0, c_1, i)$ and $ct' = (c'_0, c'_1, i)$, the addition algorithm computes the sum of the two ciphertexts as $ct^* = (c_0 + c'_0 \bmod q_i,\ c_1 + c'_1 \bmod q_i)$, and then converts $ct^* \in R_{q_i}$ to the output $ct \in R_{q_{i-1}}$.
SwitchModulu: given a ciphertext $ct = (c_0, c_1, i) \bmod q_i$ and two moduli $q_i$ and $q_j$ with $i > j$, the modulus switch algorithm computes the modular inverse element $rp_j = \frac{q_j}{q_i}$ modulo $q_j$ and outputs the new ciphertext $ct' = (c'_0, c'_1, j)$, where $c'_0 = c_0\cdot rp_j \bmod q_j$ and $c'_1 = c_1\cdot rp_j \bmod q_j$.
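To see why the compression step in Mul keeps the plaintext intact, one can substitute the switch key into the compressed ciphertext. The short derivation below is a sketch that uses only the definitions above, writing the two inputs as $(c_0, c_1)$ and $(c'_0, c'_1)$ and letting $\varepsilon_i$ denote the error term sampled from $\chi$ in the switch key, so that $b_i + a_i s = t_i s^2 - p\,\varepsilon_i$. It ignores the subsequent conversion from $R_{t_i q_i}$ to $R_{q_i}$.

```latex
\begin{aligned}
c^*_0 + c^*_1 s
  &= t_i d_0 + b_i d_2 + (t_i d_1 + a_i d_2)\, s \\
  &= t_i (d_0 + d_1 s) + d_2\, (b_i + a_i s) \\
  &= t_i (d_0 + d_1 s + d_2 s^2) - p\,\varepsilon_i d_2 \pmod{t_i q_i}, \\
d_0 + d_1 s + d_2 s^2
  &= (c_0 + c_1 s)(c'_0 + c'_1 s).
\end{aligned}
```

Up to the factor $t_i$ and an extra error term that is a multiple of $p$, the two-element ciphertext therefore decrypts to the product of the two input decryptions.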
Correctness. First, given $ct = (c_0, c_1, L-1)$ and the corresponding secret key $s$, we have $\langle ct, s\rangle = c_0 + c_1\cdot s = b\cdot v + p\cdot e_0 + m + (a\cdot v + p\cdot e_1)\cdot s = m + p\,(e_0 + e_1\cdot s - \varepsilon\cdot v) \bmod q_{L-1}$, which reduces to $m$ modulo $p$ as long as the error term is small.
Second, given $ct = (c_0, c_1, i-1)$ and secret key $s$, where $ct$ is the product of $ct_1 = (c_0, c_1, i)$ and $ct_2 = (c'_0, c'_1, i)$ output by Mul, let $rt_i$ be the inverse element of $t_i$ that maps elements from $R_{t_i q_i}$ to $R_{q_i}$; then $\langle ct, s\rangle = c_0 + s\cdot c_1$ decrypts to the product of the two plaintexts.
Third, given $ct = (c_0, c_1, i-1)$ and secret key $s$, where $ct$ is the sum of $ct_1 = (c_0, c_1, i)$ and $ct_2 = (c'_0, c'_1, i)$ output by Add, $\langle ct, s\rangle = c_0 + s\cdot c_1$ decrypts to the sum of the two plaintexts.
By selecting small parameters such as $v$, $\varepsilon$, $e_0$, and $e_1$, the components of a ciphertext can be kept far below the modulus. However, as the number of multiplications and additions increases, the errors will eventually exceed the modulus. The function of SwitchModulu is to continually scale down the cumulative errors. That is why the maximum number of multiplications and additions is limited by the predefined parameter $L$. In the multiplication algorithm, the direct product of two ciphertexts is a new ciphertext with three elements; the key switch recompresses it to two elements, and the errors remain the same.

Modifications on BGV-type Cryptosystems
Starting from the BGV-type cryptosystems, we make the following changes and then observe the decryption process after the modifications.
First, let $q_0 = p_0 = p$, $p_i = 1 \bmod p$, and $q_i = \prod_{j=0}^{i} p_j$ for $i = 1, \cdots, L-1$ in the setup algorithm. Second, replace $t_i$ with $p_{i+1}$ in the switch key generation and key switch algorithms, and then remove the SwitchModulu operations that map elements from $R_{p_{i+1}\cdot q_i}$ to $R_{q_i}$. That is to say, each $b_i = -(a_i\cdot s + p\cdot\varepsilon_i - p_{i+1}\cdot s^2) \bmod p_{i+1}\cdot q_i$, where $p_{i+1}\cdot q_i = q_{i+1}$. Third, move the SwitchModulu operation in Mul from the last step to the first step.
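The first modification fixes the shape of the modulus chain. The sketch below builds such a chain for a given plaintext modulus $p$. Taking the smallest primes congruent to 1 modulo $p$ is an assumption borrowed from the toy parameters of the implementation section, not a requirement stated here, and the function names are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for toy parameters."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def primes_one_mod(p, count):
    """First `count` primes that are congruent to 1 modulo p."""
    found, candidate = [], p + 1
    while len(found) < count:
        if is_prime(candidate):
            found.append(candidate)
        candidate += p           # stay in the residue class 1 mod p
    return found

def modulus_chain(p, L):
    """Return ([p_0..p_{L-1}], [q_0..q_{L-1}]) with p_0 = p and q_i = p_0 * ... * p_i."""
    primes = [p] + primes_one_mod(p, L - 1)
    chain, q = [], 1
    for p_i in primes:
        q *= p_i
        chain.append(q)
    return primes, chain

print(modulus_chain(11, 3))  # ([11, 23, 67], [11, 253, 16951])
```

Because consecutive moduli satisfy $q_{i+1} = p_{i+1}\cdot q_i$, replacing $t_i$ by $p_{i+1}$ makes the switch-key modulus $t_i q_i$ coincide with the next modulus in the chain.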
Because the encryption, decryption, and addition algorithms do not change, we focus on the multiplication algorithm. Given two ciphertexts $ct_1 = (c_0, c_1, i)$ and $ct_2 = (c'_0, c'_1, i)$ of the plaintexts $m_1$ and $m_2$, the multiplication algorithm first applies SwitchModulu to bring them to level $i-1$ and then applies SwitchKey to their product. If the error terms are less than $q_i$, we have $\langle ct, s\rangle \bmod p = m_1 m_2$. The correctness condition is therefore the same as in the BGV scheme.
Observation 1. Next, we replace the normal switch key $epk_i$ with another sublayer's switch key $epk_j$ in Mul and observe the decryption process. Because $p_i = 1 \bmod p$ and $q_i = \prod_{j=0}^{i} p_j$ for $i = 1, \cdots, L$, we have $p_i \bmod q_i \bmod p = p_j \bmod q_i \bmod p = 1$. The correctness condition is thus the same as in BGV: if the error terms are less than $q_i$, we also have $\langle ct, s\rangle \bmod p = m_1 m_2$.
As a result, we do not need to generate a switch key for each level after the modification. A single $l$-dimensional switch key is enough for multiplication. Thus, the total size of the public keys is reduced from linear in $L$ to constant.
Observation 2. Two ciphertexts and their product are at the same sublayer after the modifications. In BGV, each $b_i$ is reduced modulo $t_i\cdot q_i$ rather than $q_i$ because the SwitchKey operation in Mul lifts elements from $R_{q_i}$ to $R_{t_i\cdot q_i}$. After replacing $t_i$ with $p_{i+1}$, the elements are lifted from $R_{q_i}$ to $R_{q_{i+1}}$. We move the SwitchModulu operation in Mul from the last step to the first step to satisfy this requirement. Because all of the input elements belong to $R_{q_i}$ in Mul, the accumulated errors are far less than $q_{i+1}$. Thus, decryption can succeed and another SwitchModulu operation can be omitted.
The advantage of this change is that all multiplication operations can be carried out repeatedly between $R_{q_i}$ and $R_{q_{i+1}}$. Consequently, the number of multiplications is not limited by $L$ in an $L$-level circuit and becomes unlimited without bootstrapping. That is to say, $L$ can be very small, and we may use a small number of large primes in place of a large number of small primes. As a matter of fact, the product of several large primes is far less than the product of many small primes; for example, $100^2 \ll 2^{100}$. As a result, the ciphertext evaluation time will decrease significantly.

Scheme Description
In this section, we describe a variant BGV-type scheme under our modifications. Because the scheme aims to optimize ciphertext evaluation, the description of encoding a message into a plaintext vector is omitted. Moreover, the basic secret key generation, public key generation, encryption, and decryption algorithms are the same as in known schemes [5,19]. The details are as follows.
Setup: given a security parameter $\lambda$, the setup algorithm selects a prime number $p(\lambda)$ and a number $m(\lambda)$ and computes the corresponding parameters $d$ and $l$ to define the plaintext space. Afterwards, it selects a series of prime numbers $p_0(\lambda), \cdots, p_{L-1}(\lambda)$, where $p_0 = p$ and $p_1(\lambda) = \cdots = p_{L-1}(\lambda) = 1 \bmod p(\lambda)$, and computes $q_i = \prod_{j=0}^{i} p_j$ for $i = 0, \cdots, L-1$. All of the ciphertexts and keys are defined in $R_{q_{L-1}}$. The public parameters include all of the above parameters.
SecretKeyGen: given the public parameters, the secret key generation algorithm selects an $l$-dimensional vector $s$. Each element of $s$ is chosen from a noise distribution $\chi$. Output $SK = s$.
PublicKeyGen: given the public parameters and an $l$-dimensional vector $s$, the public key generation algorithm selects an $l$-dimensional vector $a$ from $R_{q_{L-1}}$ and an $l$-dimensional error vector $\varepsilon$ from the distribution $\chi$, and computes the $l$-dimensional vector $b = a\cdot s + p\cdot\varepsilon \bmod q_{L-1}$. Output $PK = (a, b)$.
SwitchKeyGen: given the public parameters and an $l$-dimensional vector $s$, the switch key generation algorithm selects an $l$-dimensional vector $a$ uniformly from $R_{q_{L-1}}$ and an $l$-dimensional error vector $\varepsilon$ from the distribution $\chi$, and computes the $l$-dimensional vector $b = a\cdot s + p\cdot\varepsilon - s^2\cdot(p\cdot r + 1) \bmod q_{L-1}$, where $r$ is a random number. Output $EPK = (a, b)$.
Encrypt: given the public parameters, a public key $PK = (a, b)$, and a message $m \in R_p^l$, the encryption algorithm selects an $l$-dimensional vector $v$, each element of which is chosen from $\{-1, 0, 1\}$ with probabilities $(\frac{1}{4}, \frac{1}{2}, \frac{1}{4})$. Then it selects two $l$-dimensional error vectors $e_0$ and $e_1$, whose elements are sampled from the discrete Gaussian distribution with variance $\sigma$. The ciphertext $ct$ consists of two elements $c_0, c_1 \in R_{q_{L-1}}$ and has depth $L-1$, where $c_0 = b\cdot v + p\cdot e_0 + m \bmod q_{L-1}$ and $c_1 = a\cdot v + p\cdot e_1 \bmod q_{L-1}$. The output is $ct = (c_0, c_1, L-1)$.
Decrypt: given the public parameters, a secret key $SK = s$, and a ciphertext $ct = (c_0, c_1, i)$, the decryption algorithm computes $m \leftarrow ((c_0 - s\cdot c_1) \bmod q_i) \bmod p$.
SwitchModulu: given a ciphertext $ct = (c_0, c_1, i) \bmod q_i$ and two moduli $(i, q_i)$ and $(j, q_j)$ with $i > j$, the modulus switch algorithm computes the modular inverse element $rp_j = \frac{q_j}{q_i}$ in $\mathbb{Z}_{q_j}$, that is, the inverse of $q_i/q_j$ modulo $q_j$, and outputs the new ciphertext $ct' = (c'_0, c'_1, j)$, where $c'_0 = c_0\cdot rp_j \bmod q_j$ and $c'_1 = c_1\cdot rp_j \bmod q_j$. Usually, $i = j + 1$ is enough if $L$ is just large enough for the security requirements. The modular inverse element $rp_j$ can be computed by the extended Euclidean algorithm (see ex_gcd in Appendix A.2).
Add: given two ciphertexts ct 1 = (c
Correctness. The above scheme is nearly the same as BGV, except that the switch key is fixed and the key switch algorithm does not finally switch $ct$ to a lower level. First, given a ciphertext $ct = (c_0, c_1, L-1)$ output by Encrypt and a secret key $SK = s$, we have $\langle ct, s\rangle \bmod p = (c_0 - s\cdot c_1) \bmod p = (b\cdot v + p\cdot e_0 + m - s\cdot a\cdot v - s\cdot p\cdot e_1) \bmod q_{L-1} \bmod p = (m + p\,(e_0 + \varepsilon\cdot v - s\cdot e_1)) \bmod q_{L-1} \bmod p = m$.
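The decryption identity for a fresh ciphertext can be exercised with a scalar toy model. The sketch below is an assumption-laden simplification and not the paper's implementation: every $l$-dimensional vector is replaced by a single integer, the noise is drawn uniformly from $\{-1, 0, 1\}$ instead of from $\chi$, and the toy values $p = 11$, $q_2 = 16951$ are borrowed from the implementation section. A centered reduction is used so that small negative noise does not wrap around.

```python
import random

P = 11        # plaintext modulus p
Q = 16951     # ciphertext modulus q_2 = 11 * 23 * 67

def centered(x, q):
    """Representative of x mod q in the interval (-q/2, q/2]."""
    r = x % q
    return r - q if r > q // 2 else r

def small(rng):
    """Toy noise in {-1, 0, 1}; stands in for the distribution chi."""
    return rng.choice((-1, 0, 1))

def keygen(rng):
    s = small(rng)                    # secret key
    a = rng.randrange(Q)
    eps = small(rng)
    b = (a * s + P * eps) % Q         # b = a*s + p*eps mod q
    return s, (a, b)

def encrypt(pk, m, rng):
    a, b = pk
    v = rng.choice((-1, 0, 0, 1))     # 0 w.p. 1/2, +1 and -1 w.p. 1/4 each
    e0, e1 = small(rng), small(rng)
    c0 = (b * v + P * e0 + m) % Q
    c1 = (a * v + P * e1) % Q
    return c0, c1

def decrypt(s, ct):
    c0, c1 = ct
    # c0 - s*c1 = m + p*(e0 + eps*v - s*e1) mod q, and the noise is tiny here
    return centered(c0 - s * c1, Q) % P

rng = random.Random(2024)
s, pk = keygen(rng)
print([decrypt(s, encrypt(pk, m, rng)) for m in range(P)])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

In this toy setting the noise term $m + p\,(e_0 + \varepsilon v - s e_1)$ has absolute value at most 43, far below $q_2/2$, so decryption always recovers $m$.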
Security. All of our changes are to the homomorphic computation parts, namely the switch key generation algorithm and the multiplication algorithm. The basic encryption-decryption parts, namely the secret key generation, public key generation, encryption, and decryption algorithms, are the same as in known schemes [5,19]. As introduced in the background, the security of BGV-type cryptosystems is based on the RLWE assumption. In our scheme, the public key $(a, b)$ with $b = a\cdot s + p\cdot\varepsilon \bmod q_{L-1}$ has a form similar to the distribution $(a, a\cdot s + \varepsilon)$. The difference is that $p$ and $q_{L-1}$ are exposed to the adversary in the setup algorithm. However, this does not reveal more information, because the error is provided by $\varepsilon$ and not by $p$. We may regard the distribution in the RLWE assumption as $(a, a\cdot s + p\cdot\varepsilon)$ with $p = 1$. In our scheme, $p$ is a prime number chosen according to the security parameter, so the two forms are the same. Logically, modifying only the multiplication algorithm will not affect the security of the basic encryption-decryption function. Thus, the security of our scheme does not need to be re-proved.

Implementation and Performance Analysis
In this section, we instantiate the above scheme within the range of "int", which occupies 32 bits. "int" is the most intuitive data type in the C language. It represents numbers between −2147483648 and 2147483647. We remove the negative part and keep all computation results within [0, 2147483647]. Although "int" is too small to resist attacks, it suffices to verify the correctness and performance of the above scheme. In practical applications, a large-number type, such as "bigint" in C++ or "biginteger" in Java, can easily replace it. In our implementation, the depth $L$ is set to 3, with $p = 11$, $m = 5$, $d = 1$, and $l = \varphi(5)/d = 4/1 = 4$. Moreover, $p_0 = 11$, $p_1 = 23$ $(= 2 \times 11 + 1)$, and $p_2 = 67$ $(= 6 \times 11 + 1)$. Correspondingly, $q_0 = 11$, $q_1 = 253$ $(= 11 \times 23)$, and $q_2 = 16951$ $(= 11 \times 23 \times 67)$. The modular inverse $rp_1 = 34$ is computed such that $p_2\cdot rp_1 = 67 \times 34 = 1 \bmod 253$. The detailed implementation for finding it is given in Appendix A.2.
We set $p = 11$ because the two smallest prime numbers that equal 1 modulo 11 are 23 and 67. Then $16951^2 = 287336401 < 2147483647$ ensures that no intermediate computation result exceeds the representation range of "int". If we set $p = 13$, the two smallest such primes are $p_1 = 53$ and $p_2 = 79$, which gives $q_1 = 689$ and $q_2 = 54431$. Because $54431^2 = 2962733761 > 2147483647$, decryption may eventually fail. Of course, $p$ can be set to a prime number smaller than 11 in the toy implementation. The initial ciphertext generated by the encryption algorithm, as well as the secret key, the public key, and the switch key generated by the key generation algorithms, are all at level 2. To simplify the implementation, all of the random numbers are drawn from the random number generator in the C library rather than from $\chi$.
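The two numeric claims of this section, the value of the modular inverse and the "int" range bound, can be checked with a short script. This is an independent sketch: the name `ex_gcd` is reused from the text, but the body below is a generic extended Euclidean algorithm and is not the code of Appendix A.2; `mod_inverse` and `INT_MAX` are illustrative names.

```python
INT_MAX = 2**31 - 1   # 2147483647, largest value of a 32-bit signed int

def ex_gcd(a, b):
    """Extended Euclid: return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ex_gcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(a, n):
    """Inverse of a modulo n, as a value in [0, n)."""
    g, x, _ = ex_gcd(a % n, n)
    if g != 1:
        raise ValueError("a is not invertible modulo n")
    return x % n

rp1 = mod_inverse(67, 253)            # inverse of p_2 modulo q_1
print(rp1, (67 * rp1) % 253)          # 34 1

for q2 in (16951, 54431):             # q_2 for p = 11 and for p = 13
    print(q2, q2 * q2, q2 * q2 <= INT_MAX)
# 16951 287336401 True
# 54431 2962733761 False
```

The product of two residues modulo $q_2$ is bounded by $q_2^2$, which is why the square of $q_2$ is the quantity compared with the "int" limit.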
Next, we test the performance of our scheme. The source code of our implementation is given in Appendix A. It includes three files: stdafx.h, stdafx.cpp, and unlimited homomorphic encryption.cpp. The file "stdafx.h" defines all the data structures, "stdafx.cpp" defines all of the functions, and the functions are invoked in "unlimited homomorphic encryption.cpp". Appendix B lists the source code of an implementation based on the HElib library, which uses parameters similar to ours. Figure 2 compares the homomorphic calculation times of the two implementations. Intuitively, the homomorphic calculation time grows linearly in our scheme but sub-exponentially in the HElib-based scheme. Meanwhile, the execution time of our scheme is much less than that of the HElib-based one.
We know that RSA is an encryption scheme that supports only multiplicative homomorphism, and it has been applied in many security applications. As a further comparison, we list the Mul operation time of RSA. Since the ciphertext modulus of our scheme is $q_2 = 16951$, we select the primes 127 and 131 in the toy RSA example, so that their product $n = 16637$ is very close to $q_2$. The execution time of our scheme is about 100 times that of RSA. The source code of the benchmark RSA is in Appendix C.
Finally, Figure 3 lists the storage costs of HElib and our scheme. The size of the public key in HElib grows linearly with the maximum number of multiplications, while the size of the public key in our scheme is constant.
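For readers unfamiliar with the multiplicative homomorphism of RSA, the sketch below illustrates it with the toy modulus $n = 127 \times 131 = 16637$ mentioned above. The public exponent $e = 11$ and the sample messages are assumptions made for this illustration; the paper's benchmark code in Appendix C may use different values. The call `pow(E, -1, PHI)` requires Python 3.8 or later.

```python
from math import gcd

P_RSA, Q_RSA = 127, 131
N = P_RSA * Q_RSA                    # 16637
PHI = (P_RSA - 1) * (Q_RSA - 1)      # 16380
E = 11                               # assumed public exponent
assert gcd(E, PHI) == 1              # E must be invertible modulo PHI
D = pow(E, -1, PHI)                  # private exponent

def rsa_encrypt(m):
    return pow(m, E, N)

def rsa_decrypt(c):
    return pow(c, D, N)

m1, m2 = 12, 34
c_prod = (rsa_encrypt(m1) * rsa_encrypt(m2)) % N   # multiply ciphertexts only
print(rsa_decrypt(c_prod), (m1 * m2) % N)          # 408 408
```

Multiplying two textbook RSA ciphertexts yields a ciphertext of the product of the plaintexts, but no comparable operation exists for addition, which is the limitation the comparison above refers to.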

Conclusions
In this paper, we present an efficient BGV-type scheme for IoT systems. Our scheme reduces the storage space and evaluation time, which makes it more suitable for secure IoT systems. The performance test of the proof-of-concept implementation shows that our new scheme is more practical than the current method.
number exceeds 2,147,483,647, the decryption will fail. If $p_0 = 11$, $q_2^2$ is less than 2,147,483,647. If $p_0 = 13$, $q_2^2$ will exceed 2,147,483,647. So we set the modulus of the plaintext space to 11. It can also be a smaller prime number within the int range.