Unconditionally Secure Ciphers with a Short Key for a Source with Unknown Statistics

We consider the problem of constructing an unconditionally secure cipher with a short key for the case where the probability distribution of encrypted messages is unknown. Note that unconditional security means that an adversary with no computational constraints can only obtain a negligible amount of information (“leakage”) about an encrypted message (without knowing the key). Here, we consider the case of a priori (partially) unknown message source statistics. More specifically, the message source probability distribution belongs to a given family of distributions. We propose an unconditionally secure cipher for this case. As an example, one can consider constructing a single cipher for texts written in any of the languages of the European Union. That is, the message to be encrypted could be written in any of these languages.


Introduction
The concept of unconditional security is very attractive in cryptography and has found many applications since C. Shannon described it in his famous article [1]. The concept refers to secret-key cryptography involving three participants, Alice, Bob and Eve, where Alice wants to send a message to Bob in secret from Eve, who is able to read all correspondence between Alice and Bob. To do this, Alice and Bob use a cipher with a secret key k (i.e., a word over some alphabet), which is known to them in advance (but not to Eve). When Alice wants to send a message m, she first encrypts m using the key k and sends the result to Bob, who in turn decrypts the received encrypted message using k. Eve also receives the encrypted message and tries to decrypt it without knowing the key. The system is called unconditionally secure, or perfect, if Eve, with computers and other equipment of unlimited power and unlimited time, cannot obtain any information about the encrypted message.
Not only did C. Shannon provide a formal definition of perfect (or unconditional) secrecy, but he also showed that the so-called one-time pad (or Vernam cipher) is such a system. A specific property of this system is that the length of the secret key equals the length of the message (or its entropy). Moreover, C. Shannon proved that this property must hold for any perfect system. This property often limits practical application, as many modern telecommunication systems forward and store megabytes of information, and the requirement to have secret keys of the same length is quite stringent. There are, therefore, many different approaches to overcoming this obstacle. These include the ideal systems proposed by C. Shannon [1], the so-called honey encryption proposed by Juels and Ristenpart [2], the so-called entropic security proposed by Russell and Wang [3] and some others developed in recent decades [4-11].
The present work is concerned with entropically secure ciphers.
It is important to note that an entropically secure cipher is not perfect, and Eve may obtain some information about the message (this is referred to as "leakage"; see the definition below), but this leakage can be made negligible. On the other hand, an entropically secure cipher makes it possible to significantly reduce the key length (compared to the perfect cipher).
Recently, an entropically secure cipher has been proposed for the case where encrypted messages have a known distribution, and for the case where messages are generated by a Markov chain [11]. In the case of a known distribution, the length of the secret key is independent of the message length, while in the case of a Markov chain, the length of the key grows logarithmically with the message length; in both cases the length of the key depends on the amount of leakage.
In this paper we consider the situation where encrypted messages obey an unknown (or partially unknown) probability distribution. We propose an entropically secure cipher for which the key length depends on the universal code (or data compressor) used for encoding the source and on the admissible leakage of the cipher. In a sense, the problem under consideration includes as special cases the previously solved problems with a known probability distribution and with messages generated by a Markov chain. The construction of the cipher is based on entropically secure ciphers [3,5,10,11] and universal coding [12]. It is worth noting that the proposed cipher uses data compression and randomisation, both of which are quite popular in unconditional security, cf. [13-15] and [15,16], respectively.
Definitions and preliminaries

Basic concepts
We consider the problem of symmetric encryption, where Alice wants to securely transmit a message to Bob. The messages are n-letter binary words that obey a certain probability distribution p defined on the set $\{0,1\}^n$, $n \ge 1$. This distribution is only partially known, i.e., it is known that p belongs to some given set P of probability distributions on $\{0,1\}^n$. Alice and Bob have a shared secret key $K = K_1 \ldots K_k$, and Alice encrypts the message $M \in \{0,1\}^n$ using K and possibly some random bits. Then she sends the word cipher(M, K) to Bob, who decrypts the received cipher(M, K) and obtains M. The third participant is a computationally unconstrained adversary Eve, who knows cipher(M, K) and the distribution p, and wants to find some information about M without knowing K.
Russell and Wang [3] suggested a definition of entropic security, which was generalised by Dodis and Smith [5] as follows. A probabilistic map Y is said to hide all functions on $\{0,1\}^n$ with leakage ε if, for every adversary A, there exists some adversary Â (who does not know Y(M)) such that for all functions f,
$$ \left| \Pr[A(Y(M)) = f(M)] - \Pr[\hat{A}() = f(M)] \right| \le \epsilon \qquad (1) $$
(note that Â does not know Y(M) and, in fact, guesses the value of f(M)). In what follows, the probabilistic map Y will be cipher(M, K) and f is a map $f : \{0,1\}^n \to \{0,1\}^*$.
Definition 1. The map Y() is called ε-entropically secure for a family of probability distributions P if Y() hides all functions on $\{0,1\}^n$ with leakage ε whenever p ∈ P.
Note that, in a sense, Definition 1 is a generalisation of Shannon's notion of perfect security. Namely, if we take ε = 0, Y = cipher(M, K) and f(x) = x, we obtain that for any adversary A there exists an adversary Â such that
$$ \Pr[A(\mathrm{cipher}(M, K)) = M] = \Pr[\hat{A}() = M]. $$
So, A and Â obtain the same result, but A estimates the probability based on cipher(M, K), whereas Â does it without knowledge of cipher(M, K). Thus, entropic security (1) can be considered a generalisation of Shannon's perfect secrecy.
We will use another important concept, the notion of indistinguishability.
Definition 2. A randomised map $Y : \{0,1\}^n \to \{0,1\}^n$, $n \ge 1$, is ε-indistinguishable for some family of distributions P and ε > 0 if there is a probability distribution G on $\{0,1\}^n$ such that for every probability distribution p ∈ P we have
$$ SD(Y(M), G) \le \epsilon, $$
where M is distributed according to p and, for two distributions A, B,
$$ SD(A, B) = \frac{1}{2} \sum_{u} \left| \Pr[A = u] - \Pr[B = u] \right|. $$
Importantly, G is independent of Y(M). Dodis and Smith [5] showed that the concepts of ε-entropic security and ε-indistinguishability are equivalent up to small parameter changes.
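The statistical distance used in Definition 2 can be computed directly for small examples. The following is a minimal sketch, assuming distributions are given as dictionaries from words to probabilities; the function name and the two toy distributions are illustrative and do not come from the paper.

```python
from fractions import Fraction

def statistical_distance(a, b):
    """SD(A, B) = 1/2 * sum over u of |A(u) - B(u)|.

    a and b map words to probabilities; a missing word has probability 0.
    """
    support = set(a) | set(b)
    return sum(abs(a.get(u, 0) - b.get(u, 0)) for u in support) / 2

# Toy distributions on {0,1}^2 (hypothetical values, chosen for illustration).
uniform = {w: Fraction(1, 4) for w in ("00", "01", "10", "11")}
skewed = {"00": Fraction(1, 2), "01": Fraction(1, 4), "10": Fraction(1, 4)}

print(statistical_distance(uniform, skewed))  # 1/4
```

Exact fractions are used so that the comparison with ε is not affected by floating-point rounding.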

ε-entropically secure ciphers for distributions with bounded min-entropy
In 2006 [3], the first entropically secure cipher was developed for probability distributions with a bounded value of the so-called min-entropy, which is defined as
$$ h_{\min}(p) = -\log \max_{a \in \{0,1\}^n} p(a), $$
where p is a probability distribution and $\log = \log_2$. The Russell and Wang [3] cipher was generalised and developed by Dodis and Smith [5], and their result can be formulated as follows.
Theorem 1 [5]. Let p be a probability distribution on $\{0,1\}^n$, n > 0, whose min-entropy is not less than h, h ∈ [0, n]. Then there exists an ε-entropically secure cipher with a k-bit key, where
$$ k = n - h + 2\log(1/\epsilon) + 2. \qquad (3) $$
We denote this cipher by $\mathrm{cipher}_{rw-ds}$.
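As a numerical illustration of the min-entropy and of the key length in Theorem 1, the sketch below evaluates n − h + 2 log(1/ε) + 2. Rounding the result up to a whole number of bits is an assumption of this sketch, and the distribution and parameter values are hypothetical.

```python
import math

def min_entropy(p):
    """h_min(p) = -log2 of the largest probability in p."""
    return -math.log2(max(p.values()))

def rw_ds_key_length(n, h, eps):
    """Key length n - h + 2*log2(1/eps) + 2, rounded up to whole bits (assumption)."""
    return math.ceil(n - h + 2 * math.log2(1 / eps) + 2)

# Toy distribution on {0,1}^3; the most likely word has probability 1/4.
p = {"000": 0.25, "001": 0.25, "010": 0.125, "011": 0.125,
     "100": 0.0625, "101": 0.0625, "110": 0.0625, "111": 0.0625}
print(min_entropy(p))  # 2.0

# Hypothetical parameters: 1000-bit messages with min-entropy at least 990.
print(rw_ds_key_length(1000, 990, 2 ** -10))  # 32
```

In the second call the key has 32 bits instead of the 1000 bits a perfect cipher would need, because the gap n − h is only 10.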
In a sense, this cipher generalises the perfect Shannon cipher as follows. In a perfect cipher the key is a word from $\{0,1\}^n$, while in an entropically secure cipher the key belongs to a $2^k$-element subset $K \subset \{0,1\}^n$, which is a so-called small-biased set. Informally, this means that for any $m \le n$ and a uniformly chosen binary word $u \in \{0,1\}^m$, for any m positions $i_1, i_2, \ldots, i_m$, the probability that $K_{i_1} K_{i_2} \ldots K_{i_m} = u$ is close to $2^{-m}$. (This construction is based on some deep results in combinatorics [5,17,18].) Thus, the key length decreases from n to k. Note that the leakage ε, and hence the summand $2\log(1/\epsilon) + 2$, depends on the size $2^k$ of the small-biased set. (In general, larger k implies smaller ε.)

ε-entropically secure ciphers with reduced secret key
In equality (3), the linearly increasing summand n − h depends on the min-entropy h. So, it seems natural to transform the set $\{0,1\}^n$ so as to bring the min-entropy of the transformed distribution close to the word length and hence reduce the summand n − h.
In [11] this approach was realised as follows. Let P be a set of probability distributions defined on $\{0,1\}^n$, $n \ge 1$. The key part of the cipher is a randomised map $\varphi : \{0,1\}^n \to \{0,1\}^{n^*}$, $n^* \ge n$, such that there exists an inverse map $\varphi^{-1}$ (i.e., $\varphi^{-1}(\varphi(u)) = u$ for all u) and the min-entropy of the transformed probability distribution $\pi_p$ is close to $n^*$ (here $\pi_p$ is the distribution of $\varphi(u)$ when u is distributed according to p). Then $\mathrm{cipher}_{rw-ds}$ can be applied to $\varphi(m)$ with a shorter key, because the difference $n^* - h_{\min}(\pi_p)$ is less than $n - h_{\min}(p)$, see (3). Thus, the smaller $\sup_{p \in P} (n^* - h_{\min}(\pi_p))$, the shorter the secret key. The described cipher is based on data compression and randomisation and is denoted in [11] by $\mathrm{cipher}_{c\&r}$. The following theorem describes its properties.
Obviously, the key length depends on the efficiency of the compression method, or code. Thus, in the case of known statistics (i.e., known p), the key length is $\Delta + 2\log(1/\epsilon) + 2$, where Δ is 1 or 2 depending on the compression code chosen. If p is unknown, but the messages are known to be generated by a Markov chain with known memory, then $\Delta = O(\log n)$ (and the key length is $O(\log n) + 2\log(1/\epsilon)$ [11]).

Universal coding
The problem of constructing a single code for multiple probability distributions (information sources) is well known in information theory, and there are currently dozens of effective universal codes based on different ideas and approaches; these codes are the basis for so-called archivers (e.g., ZIP). The first universal code for Bernoulli and Markov processes was proposed by Fitinghof [19], and then Krichevsky found an asymptotically optimal code for these processes [12,20]. Other universal codes include the PPM universal code [21], which is used together with the arithmetic code [22], the Lempel-Ziv (LZ) codes [23], the Burrows-Wheeler transformation [24], which is used together with the book-stack code (or MTF) [25] (see also [26,27]), grammar codes [28,29] and some others [30-33].
The universal code c has to "compress" sequences $x = x_1 \ldots x_n$ that obey a distribution p ∈ P down to the Shannon entropy of p, that is, $h_{Sh}(p)$. The difference $E_p(|c(x)|) - h_{Sh}(p)$ is called the redundancy r(p) [12] (here $E_p$ is the expectation with respect to p and |u| is the length of u). In [34], an algorithm was proposed to construct a code $c_{opt}$ whose worst-case redundancy on P is minimal, that is, $r_{opt} = \inf_c \sup_{p \in P} r_c(p)$, where $r_c(p)$ is the redundancy of the code c on p. In [34] it was shown that $r_{opt}$ is equal to the capacity of a channel whose input alphabet is P, whose output alphabet is the alphabet on which the distributions from P are defined (in our case, the alphabet $\{0,1\}^n$), and whose channel-matrix rows are the probability distributions from P (see also [35] for the history of this discovery). This fact is important because it allows known methods for computing channel capacity to be used to find the optimal code.
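To make the notion of redundancy concrete, the following sketch computes $E_p(|c(x)|) - h_{Sh}(p)$ for a code described only by its codeword lengths. The distribution and the two sets of lengths are hypothetical and chosen so that the arithmetic is exact.

```python
import math

def shannon_entropy(p):
    """h_Sh(p) in bits; words with zero probability contribute nothing."""
    return -sum(pu * math.log2(pu) for pu in p.values() if pu > 0)

def redundancy(p, code_lengths):
    """r(p) = E_p(|c(x)|) - h_Sh(p) for a code given by its codeword lengths."""
    expected_length = sum(p[u] * code_lengths[u] for u in p)
    return expected_length - shannon_entropy(p)

p = {"00": 0.5, "01": 0.25, "10": 0.125, "11": 0.125}
fixed = {u: 2 for u in p}                         # no compression at all
matched = {"00": 1, "01": 2, "10": 3, "11": 3}    # lengths equal to -log2 p(u)

print(redundancy(p, fixed))    # 0.25
print(redundancy(p, matched))  # 0.0
```

A universal code cannot be matched to every p ∈ P at once, which is why its redundancy is in general positive for at least some distributions in the family.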
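In this paper, we will use the so-called Shtarkov maximum likelihood code $c_{Sht}$ [36], whose construction is much simpler and whose redundancy is often close to that of the optimal code. This code is described as follows. First define
$$ S_P = \sum_{u \in \{0,1\}^n} \sup_{p \in P} p(u), \qquad q(u) = \frac{\sup_{p \in P} p(u)}{S_P}. \qquad (5) $$
Clearly, for all p ∈ P and all u: $p(u)/q(u) \le S_P$.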
Shtarkov proposed to build a code $c_{Sht}$ for which $|c_{Sht}(u)| = \lceil -\log q(u) \rceil$. (Such a code exists, see [37].) Note that for a finite set P we have $S_P \le |P|$, since $\sup_{p \in P} p(u) \le \sum_{p \in P} p(u)$ and each p sums to 1. (In particular, this applies when P contains probability distributions corresponding to several languages.)
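For a small finite family, the quantities $S_P$ and q and the code lengths $\lceil -\log q(u) \rceil$ can be computed by direct enumeration. The sketch below assumes a finite family (so the supremum is a maximum) given as dictionaries; the two distributions are hypothetical.

```python
import math
from fractions import Fraction as F

def shtarkov(family, words):
    """Return (S_P, q) with S_P = sum_u max_p p(u) and q(u) = max_p p(u) / S_P."""
    best = {u: max(p.get(u, 0) for p in family) for u in words}
    s_p = sum(best.values())
    q = {u: best[u] / s_p for u in words}
    return s_p, q

def code_lengths(q):
    """Codeword lengths ceil(-log2 q(u)) for words with q(u) > 0."""
    return {u: math.ceil(-math.log2(float(q[u]))) for u in q if q[u] > 0}

words = ["00", "01", "10", "11"]
p1 = {"00": F(1, 2), "01": F(1, 4), "10": F(1, 8), "11": F(1, 8)}
p2 = {"00": F(1, 8), "01": F(1, 8), "10": F(1, 4), "11": F(1, 2)}

s_p, q = shtarkov([p1, p2], words)
print(s_p)              # 3/2, which is at most |P| = 2
print(code_lengths(q))  # {'00': 2, '01': 3, '10': 3, '11': 2}
```

Enumerating all $2^n$ words is feasible only for very small n; the sketch is meant to illustrate the definitions, not to suggest a practical way of computing $S_P$.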

The cipher
Now we are going to construct an ε-entropically secure cipher $\mathrm{cipher}_{c\&r}$ for the case of unknown statistics, i.e., there is some set of probability distributions P generating words from $\{0,1\}^n$, $n \ge 1$, and the constructed cipher should be applicable to messages obeying any p ∈ P with leakage no larger than ε. In short, we apply the general method from [11] to the probability distribution q from (5). In detail, Alice wants to send messages $m \in \{0,1\}^n$ to Bob, and they both know in advance that m can obey any probability distribution p from the set P. The cipher algorithm is as follows.
Constructing the cipher. We describe all calculations in the following steps:
i) Compute the distribution q according to (5) and order the set q(u), $u \in \{0,1\}^n$. (Denote the ordered probabilities by $q_1, q_2, \ldots, q_N$, $N = 2^n$, and let ν(u) = i for which $q(u) = q_i$.)
ii) Encode the "letters" 1, 2, ..., N with the distribution q by the trimmed Shannon code from [11]. Denote this code by λ and note that
$$ \forall i : |\lambda(i)| < -\log q_i + 2 \qquad (7) $$
and that λ is prefix-free, that is, for any i and j, i ≠ j, neither λ(i) is a prefix of λ(j), nor λ(j) is a prefix of λ(i) [11].
iii) Build the following randomised map φ. First, find $n^* = \max_i |\lambda(i)|$ and then define, for $u \in \{0,1\}^n$,
$$ \varphi(u) = \lambda(\nu(u)) \, r_{|\lambda(\nu(u))|+1} \, r_{|\lambda(\nu(u))|+2} \ldots r_{n^*}, $$
where $r_j$ are equiprobable independent binary digits.
iv) For the desired leakage ε, build $\mathrm{cipher}_{rw-ds}$ with secret key length
$$ k = \lceil \log S_P \rceil + 2\log(1/\epsilon) + \delta, $$
where δ = 2 for an ε-entropically secure cipher and δ = 6 for an ε-indistinguishable one.
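Steps i) to iii) can be sketched in a few lines for a toy distribution q. This is only an illustration under stated assumptions: the trimmed Shannon code of [11] is not reproduced here, and a canonical Shannon code with lengths $\lceil -\log q(u) \rceil$ is used as a stand-in (its lengths also satisfy bound (7)); q is assumed to be strictly positive on at least two words; the names `prefix_code`, `phi` and `phi_inverse` are hypothetical. Step iv) is omitted because it requires a small-biased set construction.

```python
import math
import secrets
from fractions import Fraction as F

def prefix_code(q):
    """Canonical prefix-free code with |code[u]| = ceil(-log2 q(u)).

    Stand-in for the trimmed Shannon code of [11] (assumption of this sketch).
    """
    order = sorted(q, key=lambda u: (-q[u], u))   # step i): descending q
    lengths = {u: math.ceil(-math.log2(float(q[u]))) for u in order}
    code, value, prev = {}, 0, None
    for u in order:                               # lengths are non-decreasing
        if prev is not None:
            value = (value + 1) << (lengths[u] - prev)
        code[u] = format(value, "0{}b".format(lengths[u]))
        prev = lengths[u]
    return code

def phi(u, code, n_star):
    """Step iii): codeword of u followed by uniform random bits up to length n_star."""
    pad = n_star - len(code[u])
    return code[u] + "".join(secrets.choice("01") for _ in range(pad))

def phi_inverse(v, code):
    """Recover u: exactly one codeword is a prefix of v because the code is prefix-free."""
    decode = {c: u for u, c in code.items()}
    for end in range(1, len(v) + 1):
        if v[:end] in decode:
            return decode[v[:end]]
    raise ValueError("no codeword is a prefix of the input")

# Toy q on {0,1}^2 (hypothetical values).
q = {"00": F(1, 3), "01": F(1, 6), "10": F(1, 6), "11": F(1, 3)}
code = prefix_code(q)
n_star = max(len(c) for c in code.values())
print(code)    # {'00': '00', '11': '01', '01': '100', '10': '101'}
print(n_star)  # 3
print(phi_inverse(phi("11", code, n_star), code))  # 11
```

The random padding is what makes short codewords spread over many words of length $n^*$, which is the mechanism by which the min-entropy of the transformed distribution is brought close to $n^*$.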
It is worth noting that Alice and Bob (and Eve) can do all the calculations described independently of each other.
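Use of the cipher. Suppose Alice and Bob have a randomly chosen secret key K, |K| = k, and Alice wants to send Bob a message m. To do this, she computes $\mathrm{cipher}_{c\&r}(m, K) = \mathrm{cipher}_{rw-ds}(\varphi(m), K)$, as described above, and sends it to Bob. Bob decrypts the received word with K to obtain $\varphi(m)$ and then recovers $m = \varphi^{-1}(\varphi(m))$.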
The properties of this cipher are described in the following theorem.
Theorem 3. Suppose there is a family P of probability distributions defined on $\{0,1\}^n$ and some ε > 0. If the described $\mathrm{cipher}_{c\&r}$ is applied, then
i) the $\mathrm{cipher}_{c\&r}$ is ε-entropically secure with secret key length $\lceil \log S_P \rceil + 2\log(1/\epsilon) + 2$, and
ii) the $\mathrm{cipher}_{c\&r}$ is ε-indistinguishable with secret key length $\lceil \log S_P \rceil + 2\log(1/\epsilon) + 6$.
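The key lengths in Theorem 3 are easy to evaluate numerically. The sketch below is an illustration only: it uses the bound $S_P \le |P|$ with |P| = 24 (one distribution per official language of the European Union, as in the example from the abstract), a hypothetical leakage ε = 2^{-20}, and rounds the term 2 log(1/ε) up to whole bits, which is an assumption of this sketch.

```python
import math

def key_length(s_p, eps, delta):
    """ceil(log2 S_P) + 2*log2(1/eps) + delta, middle term rounded up (assumption)."""
    return math.ceil(math.log2(s_p)) + math.ceil(2 * math.log2(1 / eps)) + delta

s_p_bound = 24   # upper bound on S_P for a family of 24 distributions
eps = 2 ** -20   # hypothetical admissible leakage

print(key_length(s_p_bound, eps, 2))  # 47 bits, entropically secure variant
print(key_length(s_p_bound, eps, 6))  # 51 bits, indistinguishable variant
```

Since $S_P$ can be smaller than |P|, these values are upper bounds for the example; in either case the result does not depend on the message length n.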

Conclusion
We described a cipher for a family of probability distributions P defined on the set $\{0,1\}^n$, $n \ge 1$, for which the length of the secret key does not depend directly on n, but depends on P. For example, if P is finite, the key length is less than $\log |P| + 2\log(1/\epsilon) + O(1)$ and hence independent of n. This example includes the case where one needs to have the same cipher for texts written in different languages. Here, the size of the set P is equal to the number of languages. Thus, in some practically interesting cases, the extra length of the secret key is quite small.