Abstract
An infinite sequence $x_1 x_2 \ldots$ of letters from some alphabet $A$, $|A| \ge 2$, is called k-distributed ($k \ge 1$) if any k-letter block of successive digits appears with the frequency $|A|^{-k}$ in the long run. The sequence is called normal (or ∞-distributed) if it is k-distributed for any $k \ge 1$. We describe two classes of low-entropy processes that with probability 1 generate either k-distributed sequences or ∞-distributed sequences. Then, we show how those processes can be used for building random number generators whose outputs are either k-distributed or ∞-distributed. Thus, these generators have statistical properties that are mathematically proven.
1. Introduction
In 1909, Borel defined k-distributed and ∞-distributed sequences as follows: a sequence $x_1 x_2 \ldots$ of digits in base $b$ is k-distributed if, for any k-letter word $w$ over the alphabet $\{0, 1, \ldots, b-1\}$,

$$\lim_{n \to \infty} \frac{\nu_w(x_1 \ldots x_n)}{n} = b^{-k}, \qquad (1)$$

where $\nu_w(x_1 \ldots x_n)$ is the number of occurrences of $w$ in the sequence $x_1 x_2 \ldots x_n$, $n \ge 1$. The sequence is normal (or ∞-distributed) if it is k-distributed for any $k \ge 1$. Borel called a real number from the interval $[0,1)$ normal to base b if its expansion in base b is a normal sequence, and showed that almost all real numbers are normal to any base (with respect to the uniform measure) [1,2]. It is interesting that the construction of an ∞-distributed sequence in an explicit form was first achieved by Champernowne in 1933 [3], who proved that the sequence

$$0.1\,2\,3\,4\,5\,6\,7\,8\,9\,10\,11\,12\,13\ldots$$

(obtained by concatenating the decimal representations of the successive positive integers)
is ∞-distributed in base 10. Later, many ∞-distributed sequences were described and investigated in numerous papers (see [4] for a review). Many researchers suppose that the fractional parts of $\pi$, $e$, $\sqrt{2}$, and some other "mathematical" constants are normal, but this has not been proven [2,5]. On the other hand, for $\pi$, empirical counting over several billion of its digits suggests that this might be true (see [5,6]).
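Returning to Champernowne's construction: the sequence is easy to generate programmatically. The following Python snippet (our illustration, not part of the original text) yields its digits one by one:

```python
from itertools import count, islice

def champernowne_digits():
    """Yield the decimal digits of Champernowne's sequence 0.123456789101112..."""
    for n in count(1):          # concatenate the decimal representations of 1, 2, 3, ...
        for d in str(n):
            yield int(d)

print(list(islice(champernowne_digits(), 20)))
# -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 0, 1, 1, 1, 2, 1, 3, 1, 4, 1]
```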
One of the reasons for the interest in k- and ∞-distributed sequences is that they are closely related to the concept of randomness. If we imagine that someone tosses a fair coin with sides marked 0 and 1, he/she obtains (almost surely) an ∞-distributed sequence [2,5]. A mathematical model of such an experiment is a sequence of independent and identically distributed (i.i.d.) symbols from $\{0,1\}$ generated with probabilities $P(0) = P(1) = 1/2$. Note that quite often this i.i.d. process and the sequences generated by it are called "truly random" [2].
Truly random sequences are very desirable in cryptography, simulation, and modeling applications. Of course, it is practically impossible to generate them by tossing a coin, and nowadays there are many so-called pseudo-random number generators (PRNGs), whose aim is, informally speaking, to calculate sequences which mimic truly random ones (see [2,7,8,9,10]). For brevity, in what follows, we consider the case when a process generates letters from the binary alphabet $\{0,1\}$, but the obtained results can be extended to the case of any alphabet.
A modern PRNG is a computer program whose input is a short word (a so-called seed), whereas its output is a long (compared to the input) word. Given that the seed is a truly random word, the PRNG can be considered an expander of randomness which stretches a short seed into a long word [2,7,10]. A "perfect" PRNG would have to produce a truly random output sequence; however, as we explain next, this is impossible.
To be more precise, we note that a mathematically correct definition of a random sequence was obtained within the framework of algorithmic information theory established by Kolmogorov (see [11,12,13,14,15]). In particular, it has been shown that no algorithm (i.e., Turing machine) can generate an (infinite) random sequence or stretch a short random sequence into a longer one. This means that perfect PRNGs do not exist. The same is true within the framework of Shannon information theory. Indeed, it is known that the Shannon entropy of the truly random process (i.e., i.i.d. with probabilities $P(0) = P(1) = 1/2$) is one bit per letter, whereas for all other processes the entropy per letter is less than one bit (see [16]). On the other hand, any PRNG stretches a short truly random sequence into a long one. The entropy of the output is not greater than the entropy of the input and, hence, the per-letter entropy of the output is strictly less than 1 bit. Therefore, the demands of true randomness and low entropy are contradictory. Thus, we see that, within the framework of algorithmic information theory, as well as within the framework of Shannon information theory, "perfect" PRNGs do not exist.
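To illustrate the entropy argument with concrete (hypothetical) numbers: if a PRNG stretches a 128-bit truly random seed into $n = 2^{20}$ output bits, the entropy of the output is at most 128 bits, so the per-letter entropy of the output is at most $128/2^{20} \approx 1.2 \times 10^{-4}$ bits, which is far below the 1 bit per letter of a truly random process.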
In such a situation, researchers suggest and investigate PRNGs which meet some "probabilistic" properties of truly random sequences [2,17]. In particular, the property that a PRNG generates ∞-distributed sequences is very desirable (see [2]).
Another important type of random number generators (RNGs) is physical random number generators, among which the so-called quantum random number generators (QRNGs) have become very popular in recent decades and are widely used in practice. By definition, physical RNGs are devices whose output is a binary sequence that must be truly random (or at least look truly random) (see [10]). According to M. Herrero-Collantes and J.C. Garcia-Escartin [10], a physical RNG can be divided into the two following blocks: the entropy source and the post-processing stage. The output of the entropy source is a bit string obtained by measuring a physical random process with subsequent quantization. The goal of post-processing is to translate this bit string into a truly random binary sequence. Nowadays, there are many methods of post-processing [10]; nevertheless, the statistical (probabilistic) properties of many physical RNGs and, in particular, QRNGs, are not proven mathematically and have to be tested experimentally [10,18,19]. Even the so-called device-independent QRNGs guarantee only the randomness of their output, but true randomness must either be verified or obtained by post-processing [10]. Thus, transformations that translate the output into a normal sequence are desirable for all types of RNGs.
Here, we describe random processes whose entropy is much less than 1 bit per letter, but for which Equation (1) is valid for the generated sequences either for all integers $k$ or for all $k$ from an interval $[1, K]$, where $K$ is an integer. This shows that there exist low-entropy PRNGs which generate sequences for which Equation (1) is valid (for $k \le K$). The description of the suggested processes shows that they can be used to develop PRNGs with the property in Equation (1). The described processes are a generalization of the so-called two-faced processes suggested in [20,21,22].
In detail, we propose the following two processes. First, we describe a k-order Markov chain, the so-called two-faced process of order k, $k \ge 1$, for which, with probability one, for any generated sequence $x_1 x_2 \ldots$ and all binary words $w \in \{0,1\}^k$, the frequency of occurrence of the word $w$ in the sequence $x_1 \ldots x_n$ goes to $2^{-k}$ as $n$ grows. Secondly, we describe so-called normal two-faced processes for which this property is true for all $k$.
We also propose the so-called two-faced transformation, which translates the trajectories of any random process into the trajectories of a two-faced process. This transformation is applicable to the creation of a PRNG with proven statistical properties.
2. K-Distributed Sequences and Two-Faced Markov Chains
First, we consider a pair of examples in order to explain the main idea of the considered Markov chains. Let the matrix of transition probabilities be as follows:

$$\begin{array}{c|cc} & 0 & 1 \\ \hline 0 & p & q \\ 1 & q & p \end{array} \qquad (2)$$

where the row is the current letter, the column is the next letter, and $p$ and $q$ are non-negative and their sum equals 1 (i.e., $p, q \ge 0$, $p + q = 1$).
For example, let $p$ be close to 1. Then, the "typical" output sequence consists of long runs of identical letters, for instance:

$$000000000000\ 11111111111111111\ 00000000\ 1111111111\ldots$$

On the one hand, this sequence is clearly not truly random. On the other hand, the frequencies of 1s and 0s go to 1/2 due to the symmetry of the matrix in Equation (2). Hence, the output is 1-distributed. Again, based on the symmetry, we can build the following matrix of the second order whose output will be 2-distributed:
$$\begin{array}{c|cc} & 0 & 1 \\ \hline 00 & p & q \\ 01 & q & p \\ 10 & q & p \\ 11 & p & q \end{array} \qquad (3)$$

(Here, the row is the current pair of letters, the column is the next letter, and $q = 1 - p$.) For $p$ close to 1, the "typical" output sequence can be as follows:

$$000000000\ \ 110110110110\ \ 00000000\ \ 011011011\ldots$$

where the gaps correspond to seldom transitions. It can easily be seen that the frequency of any two-letter word goes to 1/4.
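To make the first example concrete, the following Python sketch (our illustration; the function name and parameters are not from the paper) simulates the order-1 chain of Equation (2) and estimates the frequency of 1s, which approaches 1/2 for any $p \in (0,1)$ even though the sequence consists of long runs:

```python
import random

def simulate_order1(p, n, seed=0):
    """Simulate the order-1 chain of Equation (2): the next letter
    repeats the previous one with probability p."""
    rng = random.Random(seed)
    x = rng.randint(0, 1)                 # uniform initial letter
    ones = 0
    for _ in range(n):
        if rng.random() >= p:             # with probability 1 - p, flip
            x = 1 - x
        ones += x
    return ones / n

print(simulate_order1(p=0.99, n=10**6))   # close to 0.5 despite long runs
```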
Let us give a formal definition of two-faced Markov chains. First, we define two families of random processes, $T(p,k)$ and $\bar{T}(p,k)$, where the integer $k \ge 1$ and $p \in (0,1)$ are parameters. The processes $T(p,k)$ and $\bar{T}(p,k)$ are Markov chains of connectivity (memory) $k$ which generate letters from the binary alphabet $\{0,1\}$. We define them inductively. The matrix of $T(p,1)$ is defined as follows: $P_{T(p,1)}(0 \mid 0) = P_{T(p,1)}(1 \mid 1) = p$, $P_{T(p,1)}(1 \mid 0) = P_{T(p,1)}(0 \mid 1) = 1 - p$. The process $\bar{T}(p,1)$ is defined by $P_{\bar{T}(p,1)}(0 \mid 0) = P_{\bar{T}(p,1)}(1 \mid 1) = 1 - p$, $P_{\bar{T}(p,1)}(1 \mid 0) = P_{\bar{T}(p,1)}(0 \mid 1) = p$. Let the transition matrices of $T(p,k)$ and $\bar{T}(p,k)$ be defined; then, the matrices of $T(p,k+1)$ and $\bar{T}(p,k+1)$ are as follows:

$$P_{T(p,k+1)}(x \mid 0u) = P_{T(p,k)}(x \mid u), \qquad P_{T(p,k+1)}(x \mid 1u) = P_{\bar{T}(p,k)}(x \mid u). \qquad (4)$$

Conversely,

$$P_{\bar{T}(p,k+1)}(x \mid 0u) = P_{\bar{T}(p,k)}(x \mid u), \qquad P_{\bar{T}(p,k+1)}(x \mid 1u) = P_{T(p,k)}(x \mid u) \qquad (5)$$

for all $x \in \{0,1\}$ and $u \in \{0,1\}^k$ (here $vu$, $v \in \{0,1\}$, is the concatenation of the letter $v$ and the word $u$). We can see that

$$P_{T(p,k)}(x \mid u),\ P_{\bar{T}(p,k)}(x \mid u) \in \{p,\ 1-p\} \quad \text{for all } x \in \{0,1\},\ u \in \{0,1\}^k. \qquad (6)$$

For example,

$$P_{T(p,2)}(0 \mid 00) = p, \quad P_{T(p,2)}(0 \mid 01) = 1-p, \quad P_{T(p,2)}(0 \mid 10) = 1-p, \quad P_{T(p,2)}(0 \mid 11) = p,$$

which is the second-order matrix in Equation (3) with $q = 1 - p$.
To describe the process completely, the initial probability distribution must be defined. We say that the initial distribution of $T(p,k)$ and $\bar{T}(p,k)$ is uniform if $P(x_1 \ldots x_k = u) = 2^{-k}$ for all $u \in \{0,1\}^k$. Sometimes, we consider different initial distributions, which is why, in all cases, the initial distribution is mentioned explicitly.
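As an illustration (ours, not part of the original text), the following Python sketch generates a trajectory of $T(p,k)$ directly from the inductive definition in Equations (4) and (5); the convention that the recursion strips the oldest letter of the conditioning window is our reading of the notation:

```python
import random

def trans_prob(x, u, p, bar=False):
    """P_{T(p,k)}(x | u) for bar=False, or P_{bar T(p,k)}(x | u) for bar=True,
    computed by unfolding Equations (4) and (5) one letter of u at a time."""
    if len(u) == 1:
        if not bar:
            return p if x == u[0] else 1 - p
        return 1 - p if x == u[0] else p
    # Equations (4), (5): strip a letter of u; a 1 switches between T and bar-T.
    return trans_prob(x, u[1:], p, bar if u[0] == 0 else not bar)

def generate(p, k, n, seed=0):
    """Generate n letters of T(p, k), starting from a uniform initial k-word."""
    rng = random.Random(seed)
    u = [rng.randint(0, 1) for _ in range(k)]     # uniform initial distribution
    out = list(u)
    while len(out) < n:
        x = 1 if rng.random() < trans_prob(1, u, p) else 0
        out.append(x)
        u = u[1:] + [x]                           # slide the k-letter window
    return out

seq = generate(p=0.8, k=3, n=10**5)
print(sum(seq) / len(seq))                        # frequency of 1s: close to 1/2
```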
Let $\mu$ be a stationary process. Its conditional Shannon entropy of order $m$, $m \ge 1$, is defined as follows:

$$h_m(\mu) = -\sum_{u \in \{0,1\}^m} \mu(u) \sum_{x \in \{0,1\}} \mu(x \mid u) \log_2 \mu(x \mid u), \qquad (7)$$

and the limit entropy is as follows:

$$h_\infty(\mu) = \lim_{m \to \infty} h_m(\mu); \qquad (8)$$

see [16].
The main properties of the Markov chains $T(p,k)$ and $\bar{T}(p,k)$, $k \ge 1$, are described by the following theorem.
Theorem 1.
Let $x_1 x_2 \ldots$ be generated by $T(p,k)$ (or $\bar{T}(p,k)$), $p \in (0,1)$, and let $w \in \{0,1\}^k$. Then:
- (i) If the initial distribution is uniform over $\{0,1\}^k$, then
$$P(x_{i+1} x_{i+2} \ldots x_{i+k} = w) = 2^{-k} \qquad (9)$$
for any $i \ge 0$.
- (ii) For any initial distribution of the Markov chain $T(p,k)$ (or $\bar{T}(p,k)$),
$$\lim_{i \to \infty} P(x_{i+1} x_{i+2} \ldots x_{i+k} = w) = 2^{-k}. \qquad (10)$$
- (iii) With probability one, the Markov chains $T(p,k)$ and $\bar{T}(p,k)$ generate k-distributed sequences.
- (iv) For any $p \in (0,1)$, the order-$m$ Shannon entropy $h_m$ ($m < k$) of the processes $T(p,k)$ ($\bar{T}(p,k)$) equals 1 bit per letter, whereas the limit entropy equals $h_\infty = -(p \log_2 p + (1-p) \log_2 (1-p))$.
The proof is given in Appendix A.
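For instance (an illustrative computation, not from the original text): for $p = 0.99$, statement (iv) gives the limit entropy $h_\infty = -(0.99 \log_2 0.99 + 0.01 \log_2 0.01) \approx 0.081$ bits per letter, so $T(0.99, k)$ is a very low-entropy process, although every $k$-letter word still appears with the frequency $2^{-k}$.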
Having taken this theorem into account, we give the following definition.
Definition 1.
A stochastic process is called two-faced of order k if Equation (9) is valid for any $i \ge 0$ and any $w \in \{0,1\}^k$, and asymptotically two-faced of order k if Equation (10) is valid for any $w \in \{0,1\}^k$.
It turns out that, in a certain sense, there are many two-faced processes. More precisely, the following theorem is true.
Theorem 2.
Let $X = x_1 x_2 \ldots$ and $Y = y_1 y_2 \ldots$ be independent random processes. We define the process $Z$ by the following equations: $z_i = x_i \oplus y_i$, where $i = 1, 2, \ldots$ and $\oplus$ denotes addition modulo 2. If $X$ is a k-order (asymptotically) two-faced process, then $Z$ is also a k-order (asymptotically) two-faced process ($k \ge 1$).
The proof is given in Appendix A.
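As a hypothetical usage example of Theorem 2 (reusing the `generate` sketch from above), XORing a strongly biased source with an independent two-faced process yields a process that remains two-faced:

```python
import random

def xor_combine(xs, ys):
    """z_i = x_i XOR y_i, as in Theorem 2."""
    return [x ^ y for x, y in zip(xs, ys)]

rng = random.Random(1)
biased = [1 if rng.random() < 0.9 else 0 for _ in range(10**5)]  # biased source
twofaced = generate(p=0.8, k=3, n=10**5, seed=42)  # independent two-faced T(p,3)
z = xor_combine(biased, twofaced)
print(sum(z) / len(z))   # close to 1/2: Z inherits two-facedness of order 3
```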
3. Two-Faced Transformation
Now, we show that any stochastic process can be transformed into a two-faced one. For this purpose, we describe transformations which transfer random processes into two-faced ones. First, we define matrices $M_k$ and $\bar{M}_k$, $k \ge 1$, which are based on the matrices of the processes $T(p,k)$ and $\bar{T}(p,k)$.
Definition 2.
The matrix $M_1$ is defined by the following equation:

$$M_1(x \mid u) = x \oplus u, \quad x, u \in \{0,1\}, \qquad (11)$$

where $x$ indexes the rows and $u$ the columns. Let us define the matrix $M_{k+1}$ as follows:

$$M_{k+1}(x \mid 0u) = M_k(x \mid u), \qquad M_{k+1}(x \mid 1u) = \bar{M}_k(x \mid u) \qquad (12)$$

for any $x \in \{0,1\}$, $u \in \{0,1\}^k$, $k \ge 1$. The matrix $\bar{M}_{k+1}$ is obtained from $\bar{M}_k$ and $M_k$ analogously, starting from $\bar{M}_1(x \mid u) = x \oplus u \oplus 1$. Note that, from Equation (6), we obtain $M_k(x \mid u) = 0$ if and only if $P_{T(p,k)}(x \mid u) = p$ (and similarly for $\bar{M}_k$ and $\bar{T}(p,k)$).
Definition 3.
Let $k \ge 1$ be an integer, $v = v_1 \ldots v_k \in \{0,1\}^k$, and $x = x_1 x_2 \ldots \in \{0,1\}^\infty$. Define the letters $y_i$, $i = 1, 2, \ldots$, as follows:

$$y_i = v_i \ \text{for}\ 1 \le i \le k; \qquad y_{k+i} = M_k(x_i \mid y_i y_{i+1} \ldots y_{k+i-1}) \ \text{for}\ i \ge 1. \qquad (13)$$

The two-faced transformation $\tau_k$ maps the pair $(v, x)$ into the sequence $Y = \tau_k(v, x) = y_1 y_2 \ldots$.

Note that, from this definition and Equation (12), we can see that, for any $i \ge 1$,

$$y_{k+i} = x_i \oplus y_i \oplus y_{i+1} \oplus \cdots \oplus y_{k+i-1}. \qquad (14)$$
Theorem 3.
Let $k \ge 1$ be an integer, let $x = x_1 x_2 \ldots$ be generated by an arbitrary stochastic process, and let $\tau_k$ be the two-faced transformation. If $v$ is uniformly distributed on $\{0,1\}^k$ and independent of $x$, then, for any $r \ge 0$ and $u \in \{0,1\}^k$,

$$P(y_{r+1} y_{r+2} \ldots y_{r+k} = u) = 2^{-k}, \qquad (15)$$

i.e., $\tau_k(v, x)$ is a two-faced process of order k. The proof is given in Appendix A.
Consider now the question of the complexity of the described transformation, which allows one to transform any process into a two-faced one. When directly implementing the transformation $\tau_k$, one must store a matrix of 2 rows and $2^k$ columns, i.e., $2^{k+1}$ numbers. Storing such matrices becomes impossible when $k$ exceeds hundreds. Therefore, the question arises of constructing a simpler algorithm that does not require memory growing exponentially with $k$. It turns out that there exists an algorithm which requires $O(k)$ bits of memory and a finite number of operations per output letter.
To describe this algorithm, we first define auxiliary values $t_i$ and a transformation $\lambda_k$. Let there be an infinite word $x = x_1 x_2 \ldots$ and a finite one $v = v_1 \ldots v_k$. For any $i \ge 1$, denote by $t_i$ the parity of the current $k$-letter window of the output, $t_i = y_i \oplus y_{i+1} \oplus \cdots \oplus y_{i+k-1}$. Then, $t_1 = v_1 \oplus \cdots \oplus v_k$, and $\lambda_k(v, x)$ is the sequence $y_1 y_2 \ldots$ with $y_i = v_i$ for $1 \le i \le k$, the remaining letters being computed step by step. From those definitions and Equation (12), we can see that, for any $i \ge 1$,

$$y_{k+i} = x_i \oplus t_i, \qquad t_{i+1} = t_i \oplus y_i \oplus y_{k+i}. \qquad (16)$$
It is important to note that there exists a simple algorithm for carrying out the transformation $\lambda_k$. Indeed, it suffices to store the letters $y_i y_{i+1} \ldots y_{i+k-1}$ and the value $t_i$ in the computer's memory. Then, one reads the letter $x_i$, calculates $y_{k+i} = x_i \oplus t_i$, includes $y_{k+i}$ and excludes $y_i$, i.e., stores the new word $y_{i+1} \ldots y_{k+i}$. Then, one calculates the new value $t_{i+1} = t_i \oplus y_i \oplus y_{k+i}$, reads the new letter $x_{i+1}$, and so on.
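A minimal Python sketch of this streaming algorithm (our illustration; the deque-based window and the function name are ours) is as follows; recall that the guarantee of Theorem 3 requires the initial word $v$ to be uniformly random:

```python
from collections import deque

def two_faced_transform(v, xs):
    """Streaming two-faced transformation: O(k) memory, O(1) work per letter.
    v  -- initial k-letter word y_1 ... y_k (ideally uniformly random)
    xs -- iterable of input letters x_1 x_2 ... from an arbitrary source
    Yields y_{k+1}, y_{k+2}, ... (the first k output letters are v itself)."""
    window = deque(v)            # the last k output letters
    t = 0
    for b in v:                  # t_1: parity of the initial window
        t ^= b
    for x in xs:
        y = x ^ t                # y_{k+i} = x_i XOR t_i, cf. Equation (16)
        yield y
        old = window.popleft()   # exclude y_i ...
        window.append(y)         # ... and include y_{k+i}
        t ^= old ^ y             # t_{i+1} = t_i XOR y_i XOR y_{k+i}

# Even a constant, zero-entropy input produces a balanced-looking output;
# the k-distribution property holds when v is uniformly random (Theorem 3).
print(list(two_faced_transform([1, 0, 1], [0] * 16)))
```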
Theorem 4.
The transformation $\lambda_k$ equals $\tau_k$ and, hence, the above-described algorithm performs the transformation $\tau_k$ using $O(k)$ memory bits and a constant number of operations per output letter.
The proof is given in Appendix A.
4. ∞-Distributed Processes
A k-order two-faced process is k-distributed. Here, we describe ∞-distributed processes. We call the suggested processes normal two-faced.
Definition 4.
A stochastic process is called (asymptotically) normal two-faced if, for every integer $k \ge 1$, it is an (asymptotically) two-faced process of order k.
Now, we describe a family of such processes. Suppose that $\lambda = \lambda_1, \lambda_2, \ldots$ is an increasing sequence of integers, and $X^{(1)}, X^{(2)}, \ldots$ are independent (asymptotically) two-faced processes of orders $\lambda_1, \lambda_2, \ldots$, correspondingly. Define a process $W$ by the following equation:

$$w_i = x^{(1)}_i \oplus x^{(2)}_i \oplus x^{(3)}_i \oplus \cdots, \quad i = 1, 2, \ldots, \qquad (17)$$

and denote it as $W = \bigoplus_{j=1}^{\infty} X^{(j)}$.
Theorem 5.
Let all $X^{(j)}$, $j = 1, 2, \ldots$, be two-faced. Then, $W$ is normal two-faced. If $X^{(j)}$, $j = 1, 2, \ldots$, are asymptotically two-faced, then $W$ is asymptotically normal two-faced.
The proof is given in Appendix A. From this and Theorem 2, we can derive the following.
Corollary 1.
If $X$ and $Y$ are independent stochastic processes and $X$ is normal two-faced, then the process $Z$, $z_i = x_i \oplus y_i$, $i = 1, 2, \ldots$, is normal two-faced.
Note that the entropy of the processes $X^{(j)}$ can be small; hence, the entropy of the process $W$ can be arbitrarily small. On the other hand, the process $W$ looks truly random.
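The construction can be sketched as follows (our illustration, reusing `generate` from Section 2): we XOR, letter by letter, finitely many independent two-faced processes of increasing orders. Such a finite truncation is two-faced up to the largest order used, whereas Equation (17) employs infinitely many components:

```python
def normal_two_faced_truncation(p, orders, n, seed=0):
    """Letterwise XOR of independent two-faced processes T(p, k), k in `orders`:
    a finite truncation of the construction in Equation (17)."""
    streams = [generate(p, k, n, seed=seed + j) for j, k in enumerate(orders)]
    return [sum(bits) % 2 for bits in zip(*streams)]

w = normal_two_faced_truncation(p=0.99, orders=[1, 2, 4, 8], n=10**5)
print(sum(w) / len(w))    # close to 1/2, with low per-letter entropy
```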
5. Experiments
Here, we present some experiments with two-faced processes with different parameters. We compared the obtained sequences with truly random ones by applying the $\chi^2$ test [23]. For this purpose, an $N$-letter sequence $x_1 x_2 \ldots x_N$ was generated, whereas its initial part was uniformly distributed. The sequence was presented as $n = N/s$ non-overlapping $s$-letter words $(x_1 \ldots x_s), (x_{s+1} \ldots x_{2s}), \ldots$, and the frequency of occurrence of all words from $\{0,1\}^s$ was estimated. Then,

$$\chi^2 = \sum_{w \in \{0,1\}^s} \frac{(\nu_w - n\,2^{-s})^2}{n\,2^{-s}}$$

was calculated, where $\nu_w$ is the number of occurrences of $w$ among the $n$ words. (Note that $\chi^2$ estimates the deviation of the frequencies from the uniform distribution.) Then, $\chi^2$ was compared with the quantile $\chi^2_{1-\alpha}(2^s - 1)$ of the $\chi^2$ distribution with $2^s - 1$ degrees of freedom, where $\alpha$ is the significance level (see [23]). If $\chi^2 > \chi^2_{1-\alpha}(2^s - 1)$, we rejected the hypothesis that the sequence is uniformly distributed. Table 1 contains the results of these calculations. (The entropy reported there is the limit entropy $h_\infty = -(p \log_2 p + (1-p) \log_2 (1-p))$.)
Table 1.
Two-faced processes testing.
Thus, we can see that sequences which pass the $\chi^2$ test can be obtained from low-entropy two-faced processes.
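A Python sketch of the described testing procedure (our illustration, reusing `generate` from Section 2; the 0.95 quantile value is from standard tables):

```python
from itertools import product

def chi_square_words(seq, s):
    """Chi-square statistic of the frequencies of non-overlapping s-letter
    words of `seq` against the uniform distribution on {0,1}^s."""
    n = len(seq) // s
    counts = {}
    for j in range(n):
        w = tuple(seq[j * s:(j + 1) * s])
        counts[w] = counts.get(w, 0) + 1
    expected = n / 2 ** s
    return sum((counts.get(w, 0) - expected) ** 2 / expected
               for w in product((0, 1), repeat=s))

stat = chi_square_words(generate(p=0.99, k=8, n=10**6), s=4)
print(stat)   # compare with the 0.95 quantile of chi^2(2**4 - 1), about 25.0
```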
6. Conclusions
In this paper, we describe low-entropy processes which mimic truly random ones; that is, their output is either ∞-distributed or k-distributed for some integer k. In addition, we show how those processes can be directly used in order to construct (or "improve") PRNGs.
Funding
This work was supported by the Russian Foundation for Basic Research (grant 18-29-03005).
Conflicts of Interest
The author declares no conflict of interest.
Appendix A. Proofs of Theorems
Proof of Theorem 1.
We prove that the uniform distribution

$$P(x_{i+1} \ldots x_{i+k} = u) = 2^{-k}, \quad u \in \{0,1\}^k, \qquad (A1)$$

is a limit (or stationary) distribution for the processes $T(p,k)$ and $\bar{T}(p,k)$. For this purpose, we show that the system of equilibrium equations

$$\pi(u_2 \ldots u_k x) = \sum_{u_1 \in \{0,1\}} \pi(u_1 u_2 \ldots u_k)\, P(x \mid u_1 u_2 \ldots u_k), \quad x \in \{0,1\},\ u_1 \ldots u_k \in \{0,1\}^k,$$

has the solution $\pi(u) = 2^{-k}$, $u \in \{0,1\}^k$. Having taken into account the definitions and Equations (4) and (5), we can see that the equality

$$P_{T(p,k+1)}(x \mid 0u) + P_{T(p,k+1)}(x \mid 1u) = 1$$

is valid for all $x \in \{0,1\}$ and $u \in \{0,1\}^k$. From the law of total probability and the latter equality, we derive Equation (A1). Taking into account that the initial distribution is uniform and, hence, coincides with the limiting one, we derive the first claim, Equation (9). Any transition probability is either $p$ or $1-p$; hence, all transition probabilities are greater than 0, thus the chain is ergodic, and Equation (10) is true due to ergodicity.
Let us prove Statement (iii). All transition probabilities of $T(p,k)$ (and of $\bar{T}(p,k)$) are nonzero. Hence, this Markov chain is a stationary ergodic process under the uniform initial distribution; therefore, for any $w$, the limit $\lim_{n \to \infty} \nu_w(x_1 \ldots x_n)/n$ exists with probability 1 and equals $2^{-k}$ (see [24]). From this and Statement (ii), we obtain Statement (iii). ☐
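As a numerical sanity check of this stationarity argument (our illustration, reusing `trans_prob` from Section 2), one can verify that the uniform distribution on k-letter windows is preserved by one transition of $T(p,k)$:

```python
from itertools import product

def check_uniform_stationary(p, k, tol=1e-12):
    """Check that the uniform distribution on k-letter windows is preserved
    by one transition of T(p, k) (the equilibrium system of the proof)."""
    for w in product((0, 1), repeat=k):       # candidate next window u_2..u_k x
        mass = sum(2 ** -k * trans_prob(w[-1], [b] + list(w[:-1]), p)
                   for b in (0, 1))           # sum over the letter that leaves
        assert abs(mass - 2 ** -k) < tol
    return True

print(check_uniform_stationary(p=0.7, k=4))   # True
```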
Proof of Theorem 2.
The first claim follows from the next equations:

$$P(z_{r+1} \ldots z_{r+k} = u) = \sum_{v \in \{0,1\}^k} P(y_{r+1} \ldots y_{r+k} = v)\, P(x_{r+1} \ldots x_{r+k} = u \oplus v) = 2^{-k}, \qquad (A2)$$

where $u \oplus v$ denotes the letterwise sum modulo 2. (It follows from Equation (9) applied to $X$ and the independence of $X$ and $Y$.) Note that, by definition,

$$\lim_{r \to \infty} P(x_{r+1} \ldots x_{r+k} = w) = 2^{-k}$$

for all $w \in \{0,1\}^k$ if $X$ is asymptotically two-faced; see Equation (10). Thus, for any $\varepsilon > 0$, there is such a $J$ that

$$|P(x_{r+1} \ldots x_{r+k} = w) - 2^{-k}| < \varepsilon$$

if $r > J$. From this and Equation (A2), we can see that

$$|P(z_{r+1} \ldots z_{r+k} = u) - 2^{-k}| < \varepsilon$$

for $r > J$. From this, we obtain that:

$$\lim_{r \to \infty} P(z_{r+1} \ldots z_{r+k} = u) = 2^{-k}.$$

Thus, Equation (10) is true for $Z$. The theorem is proven. ☐
Proof of Theorem 3.
We prove Equation (15) by induction on $r$. By the condition of the theorem, $v = y_1 \ldots y_k$ obeys the uniform distribution on $\{0,1\}^k$; hence, Equation (15) is true for $r = 0$. Supposing this equation is proven for $r$, let us prove it for $r + 1$. The matrix $M_k$ has $2^k$ columns, each of which contains both 0 and 1. For any $x$, half of the corresponding elements of the row $M_k(x \mid \cdot)$ are 0, whereas the others are 1. By induction, $(y_{r+1} \ldots y_{r+k})$ obeys the uniform distribution; hence, $y_{r+k+1} = M_k(x_{r+1} \mid y_{r+1} \ldots y_{r+k})$ equals 0 and 1 with probability $1/2$ each, and $(y_{r+2} \ldots y_{r+k+1})$ is again uniformly distributed. The theorem is proven. ☐
Proof of Theorem 4.
We prove, by induction on $i$, that $t_i = y_i \oplus y_{i+1} \oplus \cdots \oplus y_{i+k-1}$ at every step of the algorithm. For $i = 1$, it is true by the definitions of the matrices $M_k$ and $\bar{M}_k$ and of $t_1$. Supposing the equation is proven for $i$, let us prove it for $i + 1$. From this and Equation (16), we obtain

$$t_{i+1} = t_i \oplus y_i \oplus y_{k+i} = y_{i+1} \oplus y_{i+2} \oplus \cdots \oplus y_{k+i}.$$

From this equation and Equation (14), we can see that $y_{k+i+1} = x_{i+1} \oplus t_{i+1}$ coincides with the letter produced by $\tau_k$; hence, $\lambda_k = \tau_k$. ☐
Proof of Theorem 5.
Suppose $k \ge 1$ and $u \in \{0,1\}^k$. There exists such an integer $J$ that $\lambda_J \ge k$; define $V = \bigoplus_{j \ne J} X^{(j)}$. (Here, $\oplus$ denotes the letterwise sum modulo 2.) Clearly, $W = X^{(J)} \oplus V$. $X^{(J)}$ is (asymptotically) $\lambda_J$-order two-faced and, from Theorem 2, we derive that $W$ is (asymptotically) $\lambda_J$-order two-faced. Taking into account that $\lambda_J \ge k$, we can see that $W$ is k-order (asymptotically) two-faced. Thus, Equation (9) (Equation (10)) is true and the theorem is proven. ☐
References
- Borel, E. Le continu mathématique et le continu physique. 1909. Available online: https://fr.wikisource.org/wiki/Le_continu_math%C3%A9matique_et_le_continu_physique (accessed on 1 August 2019).
- L'Ecuyer, P. History of uniform random number generation. In Proceedings of the WSC 2017-Winter Simulation Conference, Las Vegas, NV, USA, 3–6 December 2017.
- Champernowne, D.G. The construction of decimals normal in the scale of ten. J. Lond. Math. Soc. 1933, 1, 254–260.
- Bailey, D.H.; Crandall, R.E. Random generators and normal numbers. Exp. Math. 2002, 11, 527–546.
- Bailey, D.H.; Borwein, J.M.; Calude, C.S.; Dinneen, M.J.; Dumitrescu, M.; Yee, A. An empirical approach to the normality of π. Exp. Math. 2012, 21, 375–384.
- Bailey, D.H.; Borwein, J.M.; Brent, R.P.; Reisi, M. Reproducibility in computational science: A case study: Randomness of the digits of pi. Exp. Math. 2017, 26, 298–305.
- L'Ecuyer, P. Random Number Generation and Quasi-Monte Carlo; Wiley: Hoboken, NJ, USA, 2014.
- Marsaglia, G. Xorshift RNGs. J. Stat. Softw. 2003, 8, 1–6.
- Impagliazzo, R.; Zuckerman, D. How to recycle random bits. In Proceedings of the 30th Annual Symposium on Foundations of Computer Science, Research Triangle Park, NC, USA, 30 October–1 November 1989; pp. 248–253.
- Herrero-Collantes, M.; Garcia-Escartin, J.C. Quantum random number generators. Rev. Mod. Phys. 2017, 89, 015004.
- Li, M.; Vitányi, P. An Introduction to Kolmogorov Complexity and Its Applications; Springer-Verlag: Berlin/Heidelberg, Germany, 2008.
- Calude, C.S. Information and Randomness—An Algorithmic Perspective; Springer-Verlag: Berlin/Heidelberg, Germany, 2002.
- Downey, R.G.; Hirschfeldt, D.R. Algorithmic Randomness and Complexity; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2010.
- Downey, R.; Hirschfeldt, D.R.; Nies, A.; Terwijn, S.A. Calibrating randomness. Bull. Symb. Log. 2006, 12, 411–491.
- Merkle, W.; Miller, J.S.; Nies, A.; Reimann, J.; Stephan, F. Kolmogorov–Loveland randomness and stochasticity. Ann. Pure Appl. Log. 2006, 138, 183–210.
- Cover, T.M.; Thomas, J.A. Elements of Information Theory; Wiley-Interscience: New York, NY, USA, 2006.
- L'Ecuyer, P.; Simard, R. TestU01: A C library for empirical testing of random number generators. ACM Trans. Math. Softw. 2007, 33, 22.
- Ma, X.; Yuan, X.; Cao, Z.; Qi, B.; Zhang, Z. Quantum random number generation. NPJ Quantum Inf. 2016, 2, 16021.
- Tamura, K.; Shikano, Y. Quantum Random Numbers generated by the Cloud Superconducting Quantum Computer. arXiv 2019, arXiv:1906.04410.
- Ryabko, B.; Suzuki, J.; Topsoe, F. Hausdorff dimension as a new dimension in source coding and predicting. In Proceedings of the 1999 IEEE Information Theory and Communications Workshop, Kruger National Park, South Africa, 25 June 1999; pp. 66–68.
- Ryabko, B.; Monarev, V. Using information theory approach to randomness testing. J. Stat. Plan. Inference 2005, 133, 95–110.
- Ryabko, B.; Fionov, A. Basics of Contemporary Cryptography for IT Practitioners; World Scientific Publishing Co.: Singapore, 2005.
- Kendall, M.; Stuart, A. The Advanced Theory of Statistics; Vol. 2: Inference and Relationship; Hafner Publishing Company: New York, NY, USA, 1961.
- Billingsley, P. Ergodic Theory and Information; John Wiley & Sons: Hoboken, NJ, USA, 1965.