New Order-Revealing Encryption with Shorter Ciphertexts

Abstract: As data outsourcing services have become common, techniques for searching over encrypted data have received a lot of attention. Order-revealing encryption (OREnc) supports range queries on encrypted data through a publicly computable function that outputs the ordering information of the underlying plaintexts. In 2016, Lewi et al. proposed an OREnc scheme that is more secure than the existing practical (stateless and non-interactive) schemes by constructing an ideally-secure OREnc scheme for small domains and a "domain-extension" scheme for obtaining the final OREnc scheme for large domains. They encoded a large message into small message blocks of equal size and applied their small-domain scheme to each block; thus, their resulting OREnc scheme reveals the index of the first differing message block. In this work, we introduce a new ideally-secure OREnc scheme for small domains with shorter ciphertexts. We also present an alternative message-block encoding technique. Combining the proposed constructions with the domain-extension scheme of Lewi et al., we obtain a new large-domain OREnc scheme with either shorter ciphertexts or different leakage information at the cost of longer ciphertexts.


Introduction
Database encryption has received increased attention recently because of the enormous amount of sensitive data stored in outsourced cloud databases. One promising way to protect the confidentiality of sensitive data is to encrypt it and evaluate queries directly over the encrypted data.
Order-Preserving Encryption. Property-preserving encryption, which preserves some property of plaintexts, enables performing query evaluation on ciphertexts. Among them, order-preserving encryption (OPEnc) [1][2][3][4][5], whose ciphertexts preserve the numerical ordering of their underlying plaintexts, has received a lot of attention as it can support efficient query operation on encrypted data such as sorting and range queries using the ordering information. In 2004, Agrawal et al. [1] first proposed the concept of OPEnc. Later, Boldyreva et al. [2] provided the security notions of OPEnc formally and showed that any immutable OPEnc schemes with ideal security must have the ciphertext length that grows exponentially in the plaintext length. Recently, some ideally-secure OPEnc schemes [3][4][5] whose ciphertexts reveal no additional information beyond the order of the underlying plaintexts have been proposed. However, these schemes are stateful and mutable, thus they require large communication and storage complexities.
Order-Revealing Encryption. Boneh et al. [6] introduced order-revealing encryption (OREnc), which can be viewed as a generalization of OPEnc. In an OREnc scheme, anyone can learn the ordering of the underlying plaintexts from ciphertexts through a publicly computable comparison function; thus, the ciphertexts are not constrained to any particular form. Their construction is the first stateless and non-interactive OREnc scheme that achieves ideal security. However, it relies on multilinear maps, which require heavy computation and strong assumptions and whose security has been called into question by cryptanalysis [7]; thus, it is not efficiently implementable. To address this problem, Chenette et al. [8] presented the first efficiently implementable OREnc scheme, built from pseudo-random functions. They also provided a novel security model of OREnc that precisely quantifies what information about the underlying plaintexts is leaked. Later, Lewi et al. [9] proposed a new OREnc scheme with reduced leakage compared with the scheme of [8]. They achieved this by constructing an ideally-secure OREnc scheme for polynomially-sized domains (OREncS) and a "domain-extension" scheme that yields an OREnc scheme for exponentially-sized domains (OREncL). They encoded a large message into small message blocks of equal size and applied their OREncS scheme to each block; thus, their resulting OREncL scheme reveals the index of the first differing message block. The authors of [10] comprehensively analyzed and compared the OP(R)Enc schemes described so far and reported their performance results.
Our Contribution. In this work, we begin by reviewing the constructions of [9] and then present a new ideally-secure OREncS scheme with shorter ciphertexts. Combining it with the domain-extension scheme of [9], we obtain a new OREncL scheme with shorter ciphertexts at the same security level. We also present an alternative message-block encoding technique. In a similar way, it yields a new OREncL scheme with different leakage, at the cost of longer ciphertexts. It is hard to claim that this second OREncL scheme is more secure than the scheme of [9]. However, these results provide a clue that more secure and efficient message-block encoding techniques might exist.

Preliminaries
We write λ for a security parameter and [n] for the set of integers {1, . . . , n}, where n is a positive integer. For any bit strings x, y ∈ {0, 1}*, x ‖ y denotes the concatenation of x and y.
We write x ← S to denote sampling a value x from the distribution S or, when S is a set, sampling x uniformly at random from S. Two distributions D_1 and D_2 are computationally indistinguishable if no efficient (polynomial-time) adversary can distinguish D_1 from D_2 except with negligible probability. Similarly, if the statistical distance between D_1 and D_2 is negligible, we say that they are statistically indistinguishable. We now review the definitions of a secure pseudo-random function F and a secure pseudo-random permutation π. A function F: K × X → Y is a secure pseudo-random function if no polynomially-bounded adversary can distinguish F(k, ·), where k ← K, from a truly random function f(·) from X to Y, except with negligible probability, on arbitrary inputs chosen by the adversary. A secure pseudo-random permutation π: K × X → X is defined similarly: no polynomially-bounded adversary can distinguish π(k, ·), where k ← K, from a truly random permutation on X. All logarithms in this paper are to the base 2.

Security of OREnc
In this section, we review the simulation-based OREnc security model of [8], which precisely quantifies what information about the plaintexts is leaked by defining a leakage function. For some q = poly(λ), we denote an adversary and a simulator by A = (A_1, . . . , A_q) and S = (S_0, . . . , S_q), respectively. Let Π = (Π.Setup, Π.Encrypt, Π.Compare) be an OREnc scheme and let L(·) denote a leakage function of Π. For a security parameter λ, the experiments REAL^Π_A(λ) and SIM^Π_{A,S,L}(λ) are defined as in [8]. The given OREnc scheme Π is secure with leakage function L(·) if, for all polynomially-bounded adversaries A, there exists a simulator S such that the two distributions REAL^Π_A(λ) and SIM^Π_{A,S,L}(λ) are computationally indistinguishable. Following this security notion, we say that Π is ideally-secure if the leakage function L(·) reveals only the relative order of the underlying plaintexts.

OREnc for Small Domains
In 2016, Lewi et al. [9] proposed a new OREnc scheme to address the problem of [8], namely "revealing the index of the first bit that differs between two underlying plaintexts". The starting point of their construction is an ideally-secure OREncS scheme, which we now briefly review. Let H, F, and π denote a hash function with output space {0, 1, 2}, a secure pseudo-random function, and a fixed random permutation, respectively. The ciphertext ctx for a given message msg consists of the following two parts: ctx_L = (F(skey, π(msg)), π(msg)) and ctx_R = (r, v_1, . . . , v_N), where r is a random λ-bit value and v_i = CMP(π^{−1}(i), msg) + H(F(skey, i), r) (mod 3) for each i ∈ [N]. Here, CMP(msg_1, msg_2) outputs −1 if msg_1 < msg_2, 0 if msg_1 = msg_2, and 1 if msg_1 > msg_2. Let ctx_L^1 = (a, b) and ctx_R^2 = (r, v_1, . . . , v_N) denote the left encryption part of msg_1 and the right encryption part of msg_2, respectively. Then, the Π.Compare algorithm obtains CMP(msg_1, msg_2) by computing v_b − H(a, r) (mod 3). The main idea of this construction is that ctx_L contains message information hidden by the permutation, and ctx_R encrypts the relative order of the message with respect to every element of the domain. Thus, the ciphertext size grows linearly with the size of the domain. More specifically, the size of each ciphertext is 2λ + log N + N log 3 for a security parameter λ and a domain [N].

Proposed OREncS Scheme
In this section, we propose a new ideally-secure OREncS scheme with shorter ciphertexts. The main idea of our construction is to reduce the ciphertext length by replacing the random value r of ctx_R with F(skey, π(msg)) of ctx_L and by eliminating the π(msg) term of ctx_L through a new ciphertext form. For a fixed security parameter λ and a message space [N], let F: {0, 1}^λ × [N] → {0, 1}^λ be a secure pseudo-random function and H: {0, 1}^λ × {0, 1}^λ → {0, 1} be a 1-bit output hash function modeled as a random oracle. In our scheme, CMP(msg_1, msg_2) outputs 1 if msg_1 ≤ msg_2, and 0 otherwise. As described in [9], the order relation can be determined by combining the two results CMP(msg_1, msg_2) and CMP(msg_2, msg_1).
The details of our proposed OREncS scheme Π are as follows. For a message msg, the encryption algorithm computes a bit v_i = CMP(π^{−1}(i), msg) ⊕ H(F(k, msg), F(k, π^{−1}(i))) for each i ∈ [N] other than π(msg), and outputs ctx_L = (F(k, msg), v_1, . . . , v_{π(msg)−1}) and ctx_R = (v_{π(msg)+1}, . . . , v_N) as the ciphertext ctx. Given two ciphertexts ctx_1 and ctx_2 whose first components are a and a', the Π.Compare algorithm recovers CMP(msg_1, msg_2) as v'_{π(msg_1)} ⊕ H(a', a), where v'_{π(msg_1)} is the component of ctx_2 at the position omitted from ctx_1. In a similar way, the result of CMP(msg_2, msg_1) can be obtained.
Proof of Theorem 1. Assume that there exists a message pair (msg_1, msg_2) with msg_1 < msg_2 such that the Π.Compare(ctx_1, ctx_2) algorithm does not output 1. For two ciphertexts ctx_1 = ((a, v_1, . . . , v_{k_1−1}), (v_{k_1+1}, . . . , v_N)) and ctx_2 = ((a', v'_1, . . . , v'_{k_2−1}), (v'_{k_2+1}, . . . , v'_N)) with k_1 = π(msg_1) and k_2 = π(msg_2), the algorithm computes v'_{k_1} ⊕ H(a', a) = CMP(msg_1, msg_2) ⊕ H(a', a) ⊕ H(a', a) = CMP(msg_1, msg_2). By definition, CMP(msg_1, msg_2) is 1 for msg_1 ≤ msg_2. In the same way, CMP(msg_2, msg_1) = 0 (msg_2 > msg_1) is also recovered correctly from the ciphertexts. Therefore, Π.Compare(ctx_1, ctx_2) must output 1, which contradicts our assumption.
Efficiency. Table 1 compares our scheme with the OREncS scheme of [9]. The ciphertext of our OREncS scheme consists of a λ-bit output of the pseudo-random function F and N bits of encrypted order information; thus, the length of the ciphertext is λ + N. Compared with [9], the ciphertext of our scheme does not need to carry the λ-bit random value and the log N-bit permuted message information.
Table 1. Comparison of our order-revealing encryption for polynomially-sized domains (OREncS) scheme defined in Section 3.1 and the existing scheme of [9]. Note that this is the result when the same 1-bit output hash function is applied to their scheme.

Theorem 2. (Security)
The proposed OREncS scheme Π defined in Section 3.1 is ideally-secure.

Proof of Theorem 2.
To show that our proposed OREncS scheme guarantees ideal security, we must show that ciphertexts indistinguishable from real ones can be simulated using only the ordering information of the underlying messages. More formally, we prove that there exists a simulator S = (S_0, . . . , S_q) such that the two distributions REAL^Π_A(λ) and SIM^Π_{A,S,L}(λ) are computationally indistinguishable for some q = poly(λ) and an adversary A = (A_1, . . . , A_q) defined in the OREnc security experiment.
Simulator Modeling. On input of a security parameter λ, the following two tables are maintained to ensure this simulation consistency throughout the proof.
• H_T maintains the simulated responses of the random oracle H.
• Fπ_T maintains the simulated outputs of the pseudo-random function and the fixed random permutation.
The initial state state_S of S_0 consists of the two empty tables (H_T, Fπ_T). For the i-th (i ∈ [q]) encryption query msg_i, ctx_j can be returned if msg_i = msg_j for some j < i. Without loss of generality, only distinct queried messages are considered in the proof. We now describe how to simulate ctx_i in response to the i-th queried message msg_i using state_S and the relative order information of (msg_1, . . . , msg_{i−1}).
Indistinguishability. To complete our security proof, we now show that the two distributions REAL^Π_A(λ) and SIM^Π_{A,S,L}(λ) are computationally indistinguishable by defining the following series of hybrid games:
• Game G_0: This game is REAL^Π_A(λ).
• Game G_1: Same as G_0, except that the pseudo-random function F is replaced by a truly random function f.
• Game G_2: Same as G_1, except that the game aborts if the adversary queries (f(msg), ·) or (·, f(msg)) to the random oracle H before the ciphertext of msg is simulated.
• Game G_3: This game is SIM^Π_{A,S,L}(λ).

Lemma 1. Games G_0 and G_1 are computationally indistinguishable if F is a secure pseudo-random function.
Proof of Lemma 1. It is trivial from the definition of the secure pseudo-random functions.

Lemma 2. Games G_1 and G_2 are statistically indistinguishable if H is a random oracle.
Proof of Lemma 2. To prove Lemma 2, we show that the abort probability of Game G_2 is negligible. All components of the previously returned ciphertexts are distributed independently of f(msg) before the message msg is issued as an encryption query. Because f(·) is a truly random function, the probability that the adversary queries (f(msg), ·) or (·, f(msg)) to the random oracle H before the ciphertext of msg is simulated is at most poly(λ)/2^λ.

Lemma 3. Games G_2 and G_3 are statistically indistinguishable if H is a random oracle.
Proof of Lemma 3. Let ((a, v_1, . . . , v_{k_1−1}), (v_{k_1+1}, . . . , v_N)) and ((a', v'_1, . . . , v'_{k_2−1}), (v'_{k_2+1}, . . . , v'_N)) be the ciphertexts from G_2 and G_3, respectively. We show that these two distributions are statistically indistinguishable and that the ciphertext under G_3 is valid. The value a is an output of a random function f and a' is sampled uniformly from {0, 1}^λ; thus, they are statistically indistinguishable by the definition of random functions. A bit v_i is computed as CMP(π^{−1}(i), msg) ⊕ H(f(msg), f(π^{−1}(i))) in Game G_2, and the output of H is uniformly random on {0, 1}; thus, each v_i is also distributed uniformly in {0, 1}. That is, v_i and v'_i are statistically indistinguishable unless H(f(msg), ·) or H(·, f(msg)) is revealed to the adversary before the ciphertext of msg is simulated, which never happens by the definition of Games G_2 and G_3. Finally, the bit positions k_1 and k_2 are also statistically indistinguishable because they are outputs of the random permutation.
We now show that the simulated ciphertext in Game G_3 is correct. Let ((a, v_1, . . . , v_{k_1−1}), (v_{k_1+1}, . . . , v_N)) of msg_1 and ((a', v'_1, . . . , v'_{k_2−1}), (v'_{k_2+1}, . . . , v'_N)) of msg_2 be two simulated ciphertexts from Game G_3. By the definition of the simulation, H(a', a) is programmed as CMP(msg_1, msg_2) ⊕ v'_{k_1} in the random oracle model; thus, the Π.Compare algorithm obtains the correct result as follows: v'_{k_1} ⊕ H(a', a) = v'_{k_1} ⊕ CMP(msg_1, msg_2) ⊕ v'_{k_1} = CMP(msg_1, msg_2).
Combining Lemmas 1-3, we conclude that our proposed OREncS scheme is ideally-secure.

Alternative Message-Block Encoding Technique
The domain-extension algorithm of [9] is quite straightforward. At a high level, when a message msg is represented as a d-ary string x_1 x_2 ··· x_n, the corresponding ciphertext is constructed as ctx_1 ctx_2 ··· ctx_n, where each ctx_i is an encryption of x_i under OREncS with domain size d. Note that a pseudo-random permutation is applied (not a fixed random permutation), and the key of the pseudo-random permutation is derived from the prefix of each block x_i so that only the index of the first block that differs between two plaintexts is revealed. In fact, the construction of [8] can be seen as taking an ideally-secure OREnc scheme for 1-bit domains and extending it to an OREnc scheme for n-bit domains. The authors of [9] applied this general extension technique to their OREncS scheme. The ciphertext consists essentially of n ciphertexts of the OREncS scheme with domain size d; thus, the total ciphertext size for a domain of size N ≤ d^n is (n + 1)λ + n(log d + d log 3). Interested readers should refer to [8,9] for more details.
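As a worked instance of the size expression (n + 1)λ + n(log d + d log 3), the short Python helper below evaluates it for concrete parameters. The function name and the parameter choice (λ = 128, d = 16, n = 8) are illustrative assumptions, and rounding each logarithm up to a whole number of bits is also an assumption, since the expression above is stated without ceilings.

```python
import math


def lewi_orencl_bits(lam: int, d: int, n: int) -> int:
    """Evaluate (n + 1)*lam + n*(log d + d*log 3), rounding each log up (assumption)."""
    log_d = math.ceil(math.log2(d))  # bits for one permuted block index
    log_3 = math.ceil(math.log2(3))  # bits for one ternary comparison value
    return (n + 1) * lam + n * (log_d + d * log_3)


# Hypothetical parameters: 128-bit security, 4-bit blocks (d = 16), 8 blocks (N <= 2^32).
print(lewi_orencl_bits(128, 16, 8))  # 9*128 + 8*(4 + 16*2) = 1440
```

With these parameters the λ-bit terms dominate (1152 of the 1440 bits); the d log 3 term takes over as the block size d grows.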

Proposed OREncL Scheme
In this section, we introduce a new message-block encoding technique to construct a new OREncL scheme from our proposed OREncS scheme. We first show how to divide an exponential-size message into polynomial-size blocks that can be used as inputs of our proposed scheme. The construction of [9] suffers from the "revealing the index of the first differing block" problem because a message is divided into small message blocks of equal size. This means that an adversary can infer the approximate distance between the two underlying messages, and a message block can be recovered if the adversary obtains d ciphertexts that have the same leakage information. To alleviate these problems, we provide an alternative message-block encoding technique. The main idea of our scheme is that a message is divided according to the position and the size of its runs of consecutive 1's. Let 1(i, j) denote a run of j consecutive 1's starting from the i-th bit position. Here, the index i is counted from the least significant bit. For example, the messages 011101, 011100, and 111111 can be represented as {1(5, 3), 1(1, 1), 1(0, 0)}, {1(5, 3), 1(0, 0), 1(0, 0)}, and {1(6, 6), 1(0, 0), 1(0, 0)}, respectively. We obtain the ordering information by comparing the components at the same level. In our example, we learn that 011101 > 011100 from (5 = 5, 3 = 3, 1 > 0). To hide the exact number of 1(i, j) elements, we pad with 1(0, 0). Note that 3 is the largest possible number of elements for a 6-bit message space. More formally, an (even) n-bit message containing {1(i_1, j_1), 1(i_2, j_2), . . . , 1(i_k, j_k)} can be encoded as {i_1, j_1, i_2, j_2, . . . , i_{n/2}, j_{n/2}}, where the elements i_{k+1}, j_{k+1}, . . . , i_{n/2}, j_{n/2} are 0. Our final ciphertext of a message encoded as {i_1, j_1, i_2, j_2, . . . , i_{n/2}, j_{n/2}} can be computed as follows:
• For a given message msg, we first encode it as {i_1, j_1, i_2, j_2, . . . , i_{n/2}, j_{n/2}} by our proposed technique.
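The run-based encoding described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `encode_runs` is hypothetical, and comparing two encodings as Python lists (lexicographic order) is used here as a stand-in for the level-by-level comparison in the text.

```python
def encode_runs(msg: int, n: int) -> list:
    """Encode an n-bit message as [i_1, j_1, ..., i_{n/2}, j_{n/2}].

    Each pair (i, j) is a run of j consecutive 1's starting at bit position i,
    with positions counted from 1 at the least significant bit. Unused pairs
    are padded with (0, 0).
    """
    out = []
    pos = n  # scan from the most significant bit down
    while pos >= 1:
        if (msg >> (pos - 1)) & 1:
            start, length = pos, 0
            while pos >= 1 and (msg >> (pos - 1)) & 1:
                length += 1
                pos -= 1
            out += [start, length]
        else:
            pos -= 1
    pairs = (n + 1) // 2  # at most n/2 runs for even n
    out += [0] * (2 * pairs - len(out))
    return out


print(encode_runs(0b011101, 6))  # [5, 3, 1, 1, 0, 0]
print(encode_runs(0b011100, 6))  # [5, 3, 0, 0, 0, 0]
print(encode_runs(0b111111, 6))  # [6, 6, 0, 0, 0, 0]

# Level-by-level comparison: the first differing component decides the order.
print(encode_runs(0b011101, 6) > encode_runs(0b011100, 6))  # True
```

The three printed encodings match the 6-bit examples in the text, and the last line reproduces the comparison (5 = 5, 3 = 3, 1 > 0).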

Analysis of Proposed OREncL
Because our full OREncL scheme is essentially identical to the OREncL scheme of [9], except for the way message blocks are generated, we omit its concrete description and the details of its security proof. The leakage information "CP of 1(i, j)'s" of our scheme is defined as the common prefix {1(i_1, j_1), . . . , 1(i_{k−1}, j_{k−1})} of the two underlying messages, where k is the index of the first 1(i, j) that differs. The equality information of i_k and j_k is also revealed. It is difficult to determine whether this leakage or "revealing the index of the first differing block" is more critical; thus, we regard our encoding as an alternative option. Furthermore, this result provides a clue that more secure and efficient message-block encoding techniques might exist.
In this section, we present the results of the efficiency analysis. The following theorem shows that our message-block encoding technique preserves the order of messages correctly.

Theorem 3. Our proposed message-block encoding technique preserves ordering information correctly.
Proof of Theorem 3. First, any two consecutive runs 1(i, j) and 1(i', j') must be separated by at least one 0 bit; thus, no n-bit message contains more than n/2 runs 1(i, j). For any two messages msg_1 encoded as {i_1, j_1, i_2, j_2, . . . , i_{n/2}, j_{n/2}} and msg_2 encoded as {i'_1, j'_1, i'_2, j'_2, . . . , i'_{n/2}, j'_{n/2}}, where msg_1 < msg_2, assume that i is the index of the first bit that differs, that is, the i-th bit of msg_1 is 0 and that of msg_2 is 1. We consider two cases according to the (i − 1)-th bit, which precedes the first differing bit and is therefore common to both messages.

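• (i − 1)-th bit is 0: Let {i_1, j_1, . . . , i_k, j_k} be the common prefix of the encodings of msg_1 and msg_2. Because the i-th bit of msg_1 is 0 and that of msg_2 is 1, a new run starts at this position in msg_2 only, and we conclude i_{k+1} < i'_{k+1}.
• (i − 1)-th bit is 1: Let {i_1, j_1, . . . , i_k, j_k} be the common prefix of the encodings of msg_1 and msg_2. Similar to the above case, because the i-th bit of msg_1 is 0 and that of msg_2 is 1, the current run ends in msg_1 but continues in msg_2, and we conclude i_{k+1} = i'_{k+1} and j_{k+1} < j'_{k+1}.
In both cases, the encoding of msg_1 is smaller than that of msg_2 at the first component where they differ, so the ordering information is preserved.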
Efficiency. Table 2 compares our OREncL schemes with the scheme of [9]. Ours I and Ours II denote the OREncL schemes built from our OREncS scheme under the normal d-size message-block encoding of [9] and under the proposed message-block encoding, respectively. Because a ciphertext of Ours I consists essentially of n ciphertexts of our OREncS scheme, whose ciphertext size is λ + d as described in Section 3.2, the size of the resulting ciphertext is n(λ + d). In the case of Ours II, the size of the resulting ciphertext is n log d (λ + n log d) because the message can be represented in n log d bits, and thus n log d ciphertexts of OREncS with domain size n log d are required. Taking log d = d/n, the ciphertexts under our proposed message-block encoding are asymptotically longer by a multiplicative factor Ω(log d) compared with the existing d-size message-block encoding of [9].
Simple Implementation. Figure 1 shows the fraction of the ciphertext that must be read before the Π.Compare algorithm produces its output, for any two ciphertexts of Ours I and Ours II on two different domain sizes. For example, 98.76% of the ciphertexts of Ours II (N = 2^16) require only 25% of their ciphertext information to determine the relative order on average. This result means that the relative order of two messages can be derived with slightly less ciphertext information when our proposed message-block encoding technique is applied. However, because the ciphertexts of Ours II are longer, this does not mean that our proposed encoding technique guarantees a more efficient search time.
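The ciphertext-size expressions n(λ + d) for Ours I and n log d (λ + n log d) for Ours II can be compared numerically with the Python sketch below. The function names and the parameters (λ = 128, d = 16, n = 8) are illustrative assumptions, and log d is rounded up to an integer number of bits.

```python
import math


def ours1_bits(lam: int, d: int, n: int) -> int:
    """Ours I: n blocks, each an OREncS ciphertext of lam + d bits."""
    return n * (lam + d)


def ours2_bits(lam: int, d: int, n: int) -> int:
    """Ours II: the message has m = n*ceil(log d) bits and is encoded into
    m components, each encrypted under OREncS with domain size m."""
    m = n * math.ceil(math.log2(d))
    return m * (lam + m)


lam, d, n = 128, 16, 8            # hypothetical parameters, N <= 2^32
print(ours1_bits(lam, d, n))      # 8 * (128 + 16) = 1152
print(ours2_bits(lam, d, n))      # 32 * (128 + 32) = 5120
```

For this parameter choice the run-based encoding costs roughly 4.4 times as many ciphertext bits as the equal-size block encoding, which matches the qualitative conclusion that Ours II trades longer ciphertexts for different leakage.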

Table 2.
OREncL | Bit Size of ctx | Leakage
[9] | n(λ + d) + λ + ⌈log d⌉ | Ideal
Ours I | n(λ + d) | First block that differs
Ours II | n⌈log d⌉(λ + n⌈log d⌉) | CP of 1(i, j)'s

Conclusions
In this work, we introduced a new ideally-secure OREncS scheme with shorter ciphertexts than the existing scheme. We also presented an alternative message-block encoding technique for extending our OREncS scheme to large domains. Combining the proposed constructions with the "domain-extension" scheme of Lewi et al., we obtained a new OREncL scheme with shorter ciphertexts whose security is the same as that of the existing scheme, and a new OREncL scheme with longer ciphertexts whose leakage is the common prefix of runs of consecutive 1's before the first differing bit. Moreover, we gave the efficiency and security analysis of our proposed schemes as well as a simple implementation result.