Article

Time-Adaptive Statistical Test for Random Number Generators

1
Institute of Computational Technologies of the Siberian Branch of the Russian Academy of Sciences, 630090 Novosibirsk, Russia
2
Department of Information Technologies, Novosibirsk State University, 630090 Novosibirsk, Russia
Entropy 2020, 22(6), 630; https://doi.org/10.3390/e22060630
Submission received: 8 May 2020 / Revised: 3 June 2020 / Accepted: 3 June 2020 / Published: 7 June 2020
(This article belongs to the Special Issue Information Theory, Forecasting, and Hypothesis Testing)

Abstract

The problem of constructing effective statistical tests for random number generators (RNGs) is considered. Currently, there are hundreds of RNG statistical tests, often combined into so-called batteries, each containing from a dozen to more than one hundred tests. When a battery is used, it is applied to a sequence generated by the RNG, and the calculation time is determined by the length of the sequence and the number of tests. Generally speaking, the longer the sequence, the smaller the deviations from randomness that a specific test can find. Thus, when a battery is applied, on the one hand, the "better" the tests in the battery, the greater the chance of rejecting a "bad" RNG. On the other hand, the larger the battery, the less time can be spent on each test and, therefore, the shorter the test sequence. In turn, this reduces the ability to find small deviations from randomness. To reduce this trade-off, we propose an adaptive way to use batteries (and other sets) of tests, which requires less time but, in a certain sense, preserves the power of the original battery. We call this method a time-adaptive battery of tests. The suggested method is based on a theorem which describes asymptotic properties of the so-called p-values of tests. Namely, the theorem claims that, if the RNG can be modeled by a stationary ergodic source, the value −log π(x₁x₂…xₙ)/n goes to 1 − h as n grows, where x₁x₂… is the sequence, π(·) is the p-value of the most powerful test, and h is the limit Shannon entropy of the source.

1. Introduction

Randomness has many applications in cryptography, statistical sampling, computer modeling, and numerical Monte Carlo methods, as well as in games, gambling, and other fields. In practice, random numbers are produced by devices that generate a sequence of numbers or characters. These devices are called random number generators (RNGs) and pseudorandom number generators (PRNGs). RNGs are based on physical sources, while pseudorandom numbers are generated by computer programs. The goal of RNGs and PRNGs is to generate sequences of binary digits that are distributed as the results of tossing an "honest" coin or, more precisely, obey the Bernoulli distribution with parameters (1/2, 1/2). As a rule, for practically used RNGs and PRNGs, this property is verified experimentally using statistical tests developed for this purpose. Currently, there are more than one hundred applicable statistical tests, as well as dozens of RNGs based on different physical processes, and an even greater number of PRNGs based on different mathematical algorithms (see for review [1,2,3]).
Informally, an ideal RNG should generate sequences that pass all tests. In practice, especially in cryptographic applications, this requirement is formulated as follows: an RNG must pass a so-called battery of statistical tests, that is, some fixed set of tests. When a battery is applied, each test in it is applied separately to the RNG. Among these batteries, we mention Marsaglia's Diehard battery, which contains 16 tests [4], the National Institute of Standards and Technology (NIST) battery of 15 tests [5], several batteries proposed by L'Ecuyer and Simard [2], which contain from 10 to 106 tests, and many others (see for review [1,2,6]). In addition, these batteries contain many tests that can be used with different values of their parameters, potentially increasing the total number of tests in the battery. Note that a practically used RNG should be tested from time to time, as with any physical equipment, and therefore these test batteries are applied continuously.
How should large batteries of tests be evaluated? On the one hand, the larger the test battery, the more likely it is to find flaws in the tested RNG. On the other hand, the larger the battery, the more time is required for testing. (Thus, L'Ecuyer and Simard [2] remarked on the need for small batteries to increase computational efficiency.) Another view is as follows: in reality, the time available to study any RNG is limited. Given a certain time budget, one can either use more tests on relatively short sequences generated by the RNG, or use fewer tests but longer sequences, which, in turn, gives more chances to find deviations from randomness in the considered RNG.
To reduce this trade-off, we propose a time-adaptive testing of RNGs in which, informally speaking, first all the tests are executed on relatively short sequences generated by the RNG, and then a few "promising" tests are applied for the final testing. Of course, the key question here is which tests are promising. For example, if a battery of two tests is applied to (relatively short) sequences of the same length, it can be assumed that the smaller the p-value, the more promising the test. However, a more complicated situation arises when we have to compare two tests that were applied to sequences of different lengths (for example, the first test was applied to a sequence of length l₁, and the second to a sequence of length l₂, l₁ ≠ l₂). We show that, if our goal is to choose the most powerful test, then a good strategy is to choose the test i for which the ratio −log(p-valueᵢ)/lᵢ is maximal. This recommendation is based on the following theorem: if an RNG can be modeled by a stationary ergodic source, the value −log π(x₁x₂…xₙ)/n goes to 1 − h as n grows, where x₁x₂… is a generated sequence, π(·) is the p-value of the most powerful test, and h is the limit Shannon entropy of the stationary ergodic source μ. (Here, the Shannon entropy of order m, m = 1, 2, …, is defined by h_m = −∑_{u∈{0,1}^{m−1}} μ(u) ∑_{v∈{0,1}} μ(v|u) log μ(v|u), and the limit Shannon entropy is defined by h = lim_{m→∞} h_m; see [7].) This theorem plays an important role in the suggested time-adaptive scheme and is described in the first part of the paper, whereas the time-adaptive testing is described afterwards. The description is illustrated by experiments with the battery Rabbit from [2].
As far as we know, the proposed approach to testing RNGs is new, but the idea of finding the best test among many by trying tests step by step in an increasing sequence is widely used in algorithmic information theory, where the notion of a random sequence is formally investigated and discussed [8,9,10].

2. Hypothesis Testing and Properties of P-Values

2.1. Notation

We consider an RNG which generates a sequence of letters x = x₁x₂…xₙ, n ≥ 1, from the binary alphabet {0,1}. Two statistical hypotheses are considered: H₀ = {x obeys the uniform distribution (μ_U) on {0,1}ⁿ}, and the alternative hypothesis H₁, which is the negation of H₀. This is a particular case of the so-called goodness-of-fit problem, and any test for it is called a test of fit; see [11]. Let t be a test. Then, by definition, the significance level α equals the probability of the Type I error, α ∈ (0,1). Denote the critical region of the test t for the significance level α by C_t(α) and let C̄_t(α) = {0,1}ⁿ \ C_t(α). (Recall that a Type I error occurs if H₀ is true but rejected, and a Type II error occurs if H₁ is true but H₀ is accepted. Besides, for a given x = x₁x₂…xₙ, H₀ is rejected if and only if x ∈ C_t(α).)
Suppose that H₁ is true, and the investigated sequence x = x₁x₂…xₙ is generated by an (unknown) source ν. By definition, a test t is consistent if, for any significance level α ∈ (0,1), the probability of the Type II error goes to 0, that is,
lim_{n→∞} ν(C̄_t(α)) = 0.
Suppose that H₁ is true and the sequences x ∈ {0,1}ⁿ obey a certain distribution ν. It is well known in mathematical statistics that the optimal test (the Neyman–Pearson, or NP, test) is described by the Neyman–Pearson lemma, and the critical region of this test is defined as follows:
C_NP(α) = {x : μ_U(x)/ν(x) ≤ λ_α},  (1)
where α ∈ (0,1) is the significance level and the constant λ_α is chosen in such a way that μ_U(C_NP(α)) = α (see [11]). (We did not take into account that the set {0,1}ⁿ is finite. Strictly speaking, in such a case a randomized test should be used, but in what follows we consider the asymptotic behavior of tests for large n, and this effect is negligible.) Note that, by definition, μ_U(x) = 2⁻ⁿ for any x ∈ {0,1}ⁿ.

2.2. The P-Value and Its Properties

The notion of the critical region is connected with the so-called p-value, which we define for the NP-test by the following equation:
π_NP(x) = μ_U{y : ν(y) > ν(x)} = |{y : ν(y) > ν(x)}| / 2ⁿ.  (2)
Informally, π_NP(x) is the probability of meeting a random point y which is "worse" than the observed one under the null hypothesis.
The NP-test is optimal in the sense that its probability of a Type II error is minimal, but when testing an RNG the alternative distribution is unknown and, hence, different tests are necessary. Let us consider a certain statistic τ (that is, a function on {0,1}ⁿ) and define the p-value for this τ and x as follows:
π_τ(x) = μ_U{y : τ(y) > τ(x)} = |{y : τ(y) > τ(x)}| / 2ⁿ.  (3)
(Note that the definition of π_NP in Equation (2) corresponds to this equation if the value ν(x) is considered as a statistic, i.e., τ(x) = ν(x).)
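For small n, the p-value in Equation (3) can be computed directly by enumerating all 2ⁿ words. The following sketch does so; the "number of ones" statistic and the length n = 10 are our illustrative choices, not from the paper:

```python
from itertools import product

def p_value(tau, x, n):
    """pi_tau(x) = |{y : tau(y) > tau(x)}| / 2^n, by exhaustive enumeration."""
    t_x = tau(x)
    worse = sum(1 for y in product((0, 1), repeat=n) if tau(y) > t_x)
    return worse / 2 ** n

def tau(x):
    """Illustrative statistic: deviation of the number of ones from n/2."""
    return abs(sum(x) - len(x) / 2)

x = (1,) * 8 + (0,) * 2            # a visibly biased 10-bit word
print(p_value(tau, x, 10))         # 22/1024: only 22 words are "worse"
```

Real tests replace the enumeration with the known null distribution of the statistic; brute force is only feasible for toy n.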

2.3. The P-Value and Shannon Entropy

It turns out that there exist tests whose asymptotic behavior is close to that of the NP-test for any (unknown) stationary ergodic source ν (see [12]). Those tests are based on so-called universal codes (or data compressors) and are described in [13,14], where it is shown that they are consistent. We describe those tests in Appendix A and show that they are asymptotically optimal. The following theorem describes the asymptotic behavior of p-values for stationary ergodic sources for the NP-test and the above-mentioned tests based on universal codes (see Appendix A). We use this theorem as the theoretical basis for the adaptive statistical testing developed in this paper.
Theorem 1. 
(i) If ν is a stationary ergodic measure, then, with probability 1,
lim_{n→∞} −(1/n) log π_NP(x) = 1 − h(ν),  (4)
where h(ν) is the Shannon entropy of ν (see [7] for the definition).
(ii) There exists a statistic τ such that, for any stationary ergodic measure ν, with probability 1,
lim_{n→∞} −(1/n) log π_τ(x) = 1 − h(ν),  (5)
where the p-values π_NP and π_τ are defined in Equations (2) and (3), correspondingly.
The statistic τ and the corresponding test of fit are described in Appendix A, and the proof of the theorem is given in Appendix B; here we note that this theorem gives some idea of the relation between the Shannon entropy of the (unknown) process ν and the required sample size. Indeed, suppose that the NP test is used and the desired significance level is α. Then, we can see that (asymptotically) α should be larger than π_NP(x) and, from Equation (4), we obtain n > −log α / (1 − h(ν)) (for the most powerful test). It is known that the Shannon entropy equals 1 if and only if ν is the uniform measure μ_U. Therefore, in a certain sense, the difference 1 − h(ν) estimates the distance between the distributions, and the last inequality shows that the required sample size goes to infinity as ν approaches the uniform distribution.
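The sample-size bound n > −log α / (1 − h(ν)) can be made concrete with a small computation (our own numerical illustration for a slightly biased i.i.d. source):

```python
from math import log2, ceil

def entropy_bits(p):
    """Shannon entropy (bits per symbol) of an i.i.d. binary source, P(0) = p."""
    return -(p * log2(p) + (1 - p) * log2(1 - p))

alpha = 0.001
gap = 1 - entropy_bits(0.501)        # 1 - h(nu) for a barely biased coin
n_min = ceil(-log2(alpha) / gap)     # bits needed by the most powerful test
print(gap, n_min)                    # gap is about 2.9e-6, n_min is in the millions
```

Even this tiny bias requires a sequence of several million bits, which is why short sequences cannot expose near-uniform generators.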
The next simple example illustrates this theorem. Let there be a statistic τ and a generator (a measure ν) creating sequences of binary digits which are independent and, say, ν(0) = 0.501, ν(1) = 0.499. Suppose lim_{n→∞} −(1/n) log π_τ(x) = c, where c is a positive constant. Let us consider the following "decimation test" τ_{1/2}: an input sequence x₁x₂…xₙ is transformed into x₁x₃x₅…x_{2⌊n/2⌋−1}, and then the test τ is applied to this transformed sequence. Obviously, for this test, lim_{n→∞} −(1/⌊n/2⌋) log π_{τ_{1/2}}(x) = c and, hence, lim_{n→∞} −(1/n) log π_{τ_{1/2}}(x) = c/2. Thus, the value −(1/n) log π_τ(x₁…xₙ) seems to be a reasonable estimate of the power of the test for large n.

3. Time-Adaptive Testing and Its Experimental Investigation

3.1. Batteries of Tests

Let us consider a situation where the randomness testing is performed by conducting a battery of statistical tests for randomness. Suppose that the battery contains s tests and αᵢ is the significance level of the ith test, i = 1, …, s. If the battery is applied in such a way that the hypothesis H₀ is rejected when at least one test in the battery rejects it, then the significance level α of the battery satisfies the following inequality:
α ≤ ∑_{i=1}^{s} αᵢ.  (6)
If all the tests in the battery are independent, then the following equation is valid: α = 1 − ∏_{i=1}^{s}(1 − αᵢ). Clearly, the upper bound in Equation (6) holds in this case, and 1 − ∏_{i=1}^{s}(1 − αᵢ) is close to ∑_{i=1}^{s} αᵢ if each αᵢ is much smaller than 1/s. That is why we use the estimate in Equation (6) below.
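The bound in Equation (6) and the exact level for independent tests are easy to compare numerically (a small sketch; the battery size and the evenly split level are our illustrative choices):

```python
from math import prod

s = 25                          # illustrative battery size
alpha_i = [0.001 / s] * s       # split an overall level of 0.001 evenly

upper = sum(alpha_i)                        # the bound of Equation (6)
exact = 1 - prod(1 - a for a in alpha_i)    # exact level if tests are independent
print(upper, exact)             # the two values differ by less than 1e-6
```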
We considered a scenario in which a test is applied to a single sequence generated by an RNG, and then the researcher makes a decision on the RNG based on the test results. Another possibility that has been considered by several authors (e.g., [2,5]) is to use the following two-step procedure for testing RNGs. The idea is to generate r sequences x 1 , x 2 , , x r and apply one test (say, τ ) to each of them independently. Then, apply another test to the received data τ ( x 1 ) , τ ( x 2 ) , , τ ( x r ) (as a rule, those values are converted into a sequence of corresponding p-values, and then the hypothesis of the uniform distribution of those p-values is tested). Then, this procedure is repeated for the second test in the battery, and so on. The final decision is made on the basis of the results obtained. We do not consider this two-step procedure in detail, but note that time-adaptive testing can be applied in this situation, too.

3.2. The Scheme of the Time-Adaptive Testing

Let there be an RNG which generates binary sequences, and a battery of s tests with statistics τ₁, τ₂, …, τₛ. In addition, suppose that the total available testing time is limited to a certain amount T and the level of significance is α ∈ (0,1).
When the time-adaptive testing is applied, all the calculations are separated into a preliminary stage and a final one. The result of the preliminary stage is the list of values
γᵢ = −log π_{τᵢ}(x^i_1 x^i_2 … x^i_{nᵢ}) / nᵢ,  i = 1, …, s.  (7)
Then, taking into account the values in Equation (7), it is possible to choose some tests from the battery, apply them to longer sequences, calculate new values γ, and so on. When the preliminary stage is finished, several tests from the battery are chosen for the final stage.
The final stage is as follows. First, we divide the significance level α into α₁, α₂, …, αₖ in such a way that ∑_{i=1}^{k} αᵢ = α. Then, we obtain new sequence(s) y^1_1 y^1_2 … y^1_{m₁}, …, y^k_1 y^k_2 … y^k_{mₖ}, which may have common parts but are independent of x^1_1 x^1_2 … x^1_{n₁}, …, x^s_1 x^s_2 … x^s_{nₛ}, and calculate
π_{τ_{i₁}}(y^1_1 y^1_2 … y^1_{m₁}), …, π_{τ_{iₖ}}(y^k_1 y^k_2 … y^k_{mₖ}).  (8)
The hypothesis H₀ is accepted if π_{τ_{iⱼ}}(y^j_1 y^j_2 … y^j_{mⱼ}) > αⱼ for all j = 1, …, k. Otherwise, H₀ is rejected. The parameters of the test should be chosen in such a way that the total calculation time is not greater than the given limit T.
Note that, during the preliminary stage, the sequences x^1_1 x^1_2 … x^1_{n₁}, …, x^s_1 x^s_2 … x^s_{nₛ} may have common parts (for example, the first sequence may be a prefix of the second, etc.). The point is that the final stage and the preliminary stage are statistically independent and, therefore, the use of common parts is quite correct. On the other hand, this can affect the calculation time and, indirectly, the test result.

Claim 1

The significance level of the described time-adaptive test is not larger than α .
Indeed, the sequences y^1_1 … y^1_{m₁}, …, y^k_1 … y^k_{mₖ} and x^1_1 … x^1_{n₁}, …, x^s_1 … x^s_{nₛ} are independent and, hence, the results of the final stage do not depend on the preliminary one. When the battery τ_{i₁}, τ_{i₂}, …, τ_{iₖ} is applied, the significance level of τ_{iⱼ} equals αⱼ, and the significance level of the battery is at most ∑_{i=1}^{k} αᵢ = α. From Equation (6), we can see that the significance level of the battery (and, hence, of the described testing) is not greater than α.
Comment. The length of the sequences may depend on the speed of the tests. For example, it can be done as follows: let vᵢ be the speed per bit of the test τᵢ, i = 1, …, s. One possible way to take the speed difference into account is to calculate
γ̂ᵢ = −log π_{τᵢ}(x^i_1 x^i_2 … x^i_{nᵢ}) / (nᵢ / vᵢ),  i = 1, …, s,
instead of Equation (7) and similar expressions.
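The selection rule of the preliminary stage can be sketched as follows; the test names and p-values here are hypothetical, γᵢ is computed per Equation (7), and passing a speed gives the speed-adjusted variant γ̂ᵢ:

```python
from math import log2

def gamma(p_value, length, speed=None):
    """gamma_i = -log2(p-value)/length (Equation (7)); if `speed` is given,
    normalize by the running time length/speed instead (the gamma-hat variant)."""
    denom = length if speed is None else length / speed
    return -log2(p_value) / denom

# Hypothetical preliminary-stage results: (test, p-value, length in bytes).
results = [("t3", 0.028, 2_000_000),
           ("t8", 0.0037, 6_000_000),
           ("t13", 0.021, 2_000_000)]

# The most "promising" test maximizes gamma, cf. Equation (10).
best = max(results, key=lambda r: gamma(r[1], r[2]))
print(best[0])   # t13
```

Note that the ranking is comparable across different sequence lengths, which is exactly what the theorem of Section 2 justifies.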

3.3. The Experiments

We carried out some experiments intended to assess the potential of time-adaptive testing rather than to find the optimal design of the test.
For the experiments, we took the batteries Rabbit and Alphabit from [2], while the RNGs were specially prepared. The point is that nowadays there are many "bad" PRNGs and "good" ones. In other words, the output sequences of some known PRNGs have deviations from randomness which are quite easy to detect with many known tests, while other PRNGs do not have deviations that can be detected by known tests [2]. Thus, we needed families of RNGs with deviations from randomness that can be detected only on quite long output sequences. To do this, we took a good generator, MRG32k3a, and a bad one, LCG (with parameters m = 2,147,483,647, a = 16,807, b = 0, c = 12,345), from [2], generated sequences g₁g₂… and b₁b₂… by these two generators, and then prepared a "mixed" sequence m₁m₂… in such a way that
mᵢ = gᵢ if i mod D ≠ 0, and mᵢ = bᵢ if i mod D = 0,  (9)
where D is a parameter.
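Equation (9) is straightforward to implement. The sketch below is a Python illustration; taking the low bit of each LCG output is our own choice, not specified in the paper:

```python
def lcg_bits(seed, a=16807, c=12345, m=2_147_483_647):
    """Low bits of the 'bad' LCG with the parameters quoted above
    (extracting the low bit is our illustrative choice)."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x & 1

def mixed(good, bad, D):
    """m_i = g_i if i mod D != 0, else b_i, per Equation (9); i starts at 1."""
    return [b if i % D == 0 else g
            for i, (g, b) in enumerate(zip(good, bad), start=1)]

# Symbolic illustration with D = 2: every second symbol is from the bad stream.
print(mixed(["g1", "g2", "g3", "g4"], ["b1", "b2", "b3", "b4"], 2))
# -> ['g1', 'b2', 'g3', 'b4']
```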
The time-adaptive testing was organized as follows. During the preliminary stage, we first generated a file m₁m₂…m_{l₁} with l₁ = 2,000,000 bytes, tested it by 25 tests from the Rabbit battery, and calculated the values in Equation (7) with log = log₂ (see Table 1). (This battery contains 26 tests, but one of them cannot be applied to such a short sequence.) Then, we chose the five tests with the biggest values −log π_{tᵢ}(m₁…m_{l₁})/l₁ (let them be t_{i₁}, …, t_{i₅}), generated a sequence m₁…m_{l₂} with l₂ = 6,000,000 bytes, and applied the tests t_{i₁}, …, t_{i₅} to this sequence (see Table 2). After that, we found a test t_f for which
−log π_{t_f}/l_f = max_{r=1,…,25; j=i₁,…,i₅} {−log π_r(m₁…m_{l₁})/l₁, −log π_j(m₁…m_{l₂})/l₂}.  (10)
In other words, for t_f the value −log π(m₁…m_{l_k})/l_k is maximal over k = 1, 2 and all the applied tests (see Tables 1 and 2). The preliminary stage was then finished. During the second stage, we generated a 40,000,000-byte sequence and applied the test t_f to it. If the obtained p-value was less than 0.001, the hypothesis H₀ was rejected. (Note that the sequence lengths l₁ = 2,000,000 and l₂ = 6,000,000 are 5% and 15% of the final length of 40,000,000 bytes. Thus, the total length of the sequences tested during the preliminary stage is 25 × 0.05 + 5 × 0.15 = 2 times the final length, i.e., 2 × 40,000,000. If we take the second stage into account, the total length is 3 × 40,000,000. On the other hand, if one applies the battery Rabbit to a sequence of the same length, the total length of the investigated sequences is 25 × 40,000,000, i.e., 8.33 times more.)
Let us consider one example in detail, taking D = 2 in Equation (9).
Table 1 contains the calculation results where 25 tests from the Rabbit battery were applied to a sequence of 2,000,000 bytes. Then, the five tests with the smallest p-values were applied to the sequence of 6,000,000 bytes (see Table 2). After that, we calculated Equation (10) and found that the value −log₂ π/l is maximal for the test t13. The preliminary stage was finished. Then, at the final stage, we applied the test t13 to the new 40,000,000-byte sequence. It turned out that π_{t13} = 2.9 × 10⁻²⁶ and, hence, H₀ is rejected. Besides, we estimated the time of all calculations (during both stages).
After that, we conducted an additional experiment to get the full picture. Namely, we calculated the p-values of all tests on the same 40,000,000-byte sequence and then estimated the total time of calculations. It turned out that the p-values of two tests were less than 0.001, namely π_{t13} = 2.9 × 10⁻²⁶ and π_{t22} = 1.1 × 10⁻⁶. (Note that the p-value of this test is 0.12 for the 6,000,000-byte file.) Besides, we estimated the time of calculations for all experiments. Thus, the described time-adaptive testing revealed one of the two most powerful tests, while requiring about eight times less computation.
Table 3 describes an experiment with the battery Alphabit. The parameter D in Equation (9) was 4 and the length of the sequence was 60,000,000 bytes. During the preliminary stage, a 3,000,000-byte sequence was generated and tested by all tests. Then, the four tests with the smallest p-values were applied to a sequence of 9,000,000 bytes; after that we calculated Equation (10) and found that the value −log₂ π/l is maximal for the test t15. This test was then applied to the 60,000,000-byte sequence. As in the previous example, we also calculated the p-values of all tests on the same 60,000,000-byte sequence. All p-values are presented in Table 3.
We carried out similar experiments for different D = 2, 3, 4 (in Equation (9)) with different good and bad generators from [2] for the batteries Rabbit and Alphabit. It turned out that in all cases either both the battery and the time-adaptive testing rejected H₀, or H₀ was rejected by neither.

4. Conclusions

In this article, we show that the proposed time-adaptive testing is promising for RNG testing. On the other hand, the proposed time-adaptive testing does not offer exact values of numerous parameters for all possible batteries. Among these parameters, we note the number of steps at the preliminary stage (in the considered example, there were two such steps: selecting five tests and then one), the number of tests compared in one step, the length of the tested sequences, the rule for choosing tests at different stages, etc. The problem of parameter selection can be considered a multidimensional optimization problem. There are many methods available for solving such problems (for example, neural networks and other AI algorithms), and some of them can be used along with the time-adaptive testing.
We believe that the proposed approach makes it possible to investigate and optimize time-adaptive testing.

Funding

This research was funded by the Russian Foundation for Basic Research, grant number 18-29-03005.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Consistent Tests Based on Universal Codes

The considered tests are based on so-called universal codes, which is why we first briefly describe them. For any integer m, a code φ is a map from the set of m-letter words to the set of all binary words such that, for any m-letter words u and v, u ≠ v implies φ(u) ≠ φ(v). This property makes unique decoding possible. (More formally, φ is an injective mapping from {0,1}^m to {0,1}*, where {0,1}* = ∪_{i=1}^∞ {0,1}^i.) We consider so-called universal codes, which have the following two properties:
∀m > 0: ∑_{u∈{0,1}^m} 2^{−|φ(u)|} ≤ 1  (A1)
and, for any stationary ergodic ν defined on the set of all infinite binary words x = x₁x₂…, with probability one,
lim_{n→∞} |φ(x₁x₂…xₙ)| / n = h(ν),  (A2)
where h(ν) is the Shannon entropy of ν. Such codes exist (see [7]). Note that the goal of codes is to "compress" sequences, i.e., to make the average length of the codeword φ(x₁x₂…xₙ) as small as possible. The second property, Equation (A2), shows that universal codes are asymptotically optimal, because the Shannon entropy is a lower bound on the length of the compressed sequence (per letter) (see [7]).
Let us return to the considered problem of hypothesis testing. Suppose it is known that a sample sequence x = x₁x₂… was generated by a stationary ergodic source and, as before, we consider the same H₀ against the same H₁. Let φ be a universal code. The following test is suggested in [13]:
If the length |φ(x₁…xₙ)| ≤ n + log₂ α, then H₀ is rejected; otherwise, it is accepted. Here, as above, α is the significance level and |φ(x₁…xₙ)| is the length of the encoded ("compressed") sequence. We denote this test by T_φ and its statistic by τ_φ, i.e.,
τ_φ(x₁…xₙ) = n − |φ(x₁…xₙ)|.  (A3)
The following theorem is proven in [13,14]:
Theorem A1.
For each stationary ergodic ν, each α ∈ (0,1), and a universal code φ, with probability 1, the Type I error of the described test is not larger than α and the Type II error goes to 0 as n → ∞.

Appendix B. Proofs

Proof of Theorem 1. 
The well-known Shannon–McMillan–Breiman (SMB) theorem claims that, for a stationary ergodic source ν and any ε > 0, δ > 0, there exists n′ such that
ν{x : x ∈ {0,1}ⁿ & h(ν) − ε < −(1/n) log ν(x) < h(ν) + ε} > 1 − δ  (A4)
for n > n′ (see [7]). From this, we obtain
ν{x : x ∈ {0,1}ⁿ & 2^{−n(h(ν)−ε)} > ν(x) > 2^{−n(h(ν)+ε)}} > 1 − δ  (A5)
for n > n′.
for n > n . It is convenient to define
Φ ϵ , δ , n = { x : x { 0 , 1 } n & h ( ν ) ϵ < 1 n log ν ( x ) < h ( ν ) + ϵ }
From this definition and Equation (A5), we obtain
( 1 δ ) 2 n ( h ( ν ) ϵ ) | Φ ϵ , δ , n | 2 n ( h ( ν ) + ϵ ) .
For any x ∈ Φ_{ε,δ,n}, define
Λ_x = {y : ν(y) > ν(x)} ∩ Φ_{ε,δ,n}.  (A8)
Note that, by definition, |Λ_x| ≤ |Φ_{ε,δ,n}| and, from Equation (A7), we obtain
|Λ_x| ≤ 2^{n(h(ν)+ε)}.  (A9)
For any ρ ∈ (0,1), we define Ψ_ρ ⊂ Φ_{ε,δ,n} such that
ν(Ψ_ρ) = ρ & ∀u ∈ Ψ_ρ ∀v ∈ (Φ_{ε,δ,n} \ Ψ_ρ): ν(u) ≥ ν(v).  (A10)
(That is, Ψ_ρ contains the most probable words, whose total probability equals ρ.) Let us consider any x ∈ (Φ_{ε,δ,n} \ Ψ_ρ). Taking into account the definitions in Equations (A10) and (A7), we can see that, for this x,
|Λ_x| ≥ ρ |Φ_{ε,δ,n}| ≥ ρ (1 − δ) 2^{n(h(ν)−ε)}.  (A11)
Thus, from this inequality and Equation (A9), we obtain
ρ (1 − δ) 2^{n(h(ν)−ε)} ≤ |Λ_x| ≤ 2^{n(h(ν)+ε)}.  (A12)
From Equations (A5), (A6), and (A10), we can see that ν(Φ_{ε,δ,n} \ Ψ_ρ) ≥ (1 − δ)(1 − ρ). Taking into account Equation (A12) and this inequality, we can see that
ν{x : x ∈ {0,1}ⁿ & h(ν) − ε + log(ρ(1 − δ))/n ≤ log |Λ_x| / n ≤ h(ν) + ε} ≥ (1 − δ)(1 − ρ).  (A13)
From the definition in Equation (2) of π_NP(x) and the definition in Equation (A8) of Λ_x, we can see that π_NP(x) = |Λ_x| / 2ⁿ. Taking into account this equation and Equation (A13), we obtain the following:
ν{x : x ∈ {0,1}ⁿ & 1 − (h(ν) − ε + log(ρ(1 − δ))/n) ≥ −log π_NP(x) / n ≥ 1 − (h(ν) + ε)} ≥ (1 − δ)(1 − ρ).  (A14)
Since this inequality is valid for all positive ε, δ, and ρ, we obtain the first statement of the theorem.
The proof of the second statement of the theorem is close to the previous one. First, for any ε > 0, δ > 0, we define
Φ̂_{ε,δ,n} = {x : h(ν) − ε < |φ(x₁…xₙ)| / n < h(ν) + ε}.  (A15)
Note that, from Equation (A2), we can see that there exists n″ such that, for n > n″,
ν(Φ̂_{ε,δ,n}) > 1 − δ.  (A16)
We also use the set Φ_{ε,δ,n} (see Equation (A6)). Taking into account the SMB theorem in Equation (A4) together with Equation (A16), we can see that
ν(Φ̂_{ε,δ,n} ∩ Φ_{ε,δ,n}) > 1 − 2δ
if n > max(n′, n″).
From this moment, the proof repeats the proof of the first statement if we use the set (Φ̂_{ε,δ,n} ∩ Φ_{ε,δ,n}) instead of Φ_{ε,δ,n}. The only difference is in the definitions in Equations (A8) and (A10), which should be changed as follows:
Λ_x = {y : |φ(y)| < |φ(x)|} ∩ (Φ̂_{ε,δ,n} ∩ Φ_{ε,δ,n}),
and Ψ_ρ is such a subset of (Φ̂_{ε,δ,n} ∩ Φ_{ε,δ,n}) that
ν(Ψ_ρ) = ρ & ∀u ∈ Ψ_ρ ∀v ∈ ((Φ̂_{ε,δ,n} ∩ Φ_{ε,δ,n}) \ Ψ_ρ): |φ(u)| ≤ |φ(v)|.
If we replace π N P with π τ ϕ and δ with 2 δ , we obtain the proof of the second statement. The theorem is proven. □

References

1. L’Ecuyer, P. History of uniform random number generation. In Proceedings of the WSC 2017 Winter Simulation Conference, Las Vegas, NV, USA, 3–6 December 2017.
2. L’Ecuyer, P.; Simard, R. TestU01: A C library for empirical testing of random number generators. ACM Trans. Math. Softw. 2007, 33, 22. Available online: http://simul.iro.umontreal.ca/testu01/tu01.html (accessed on 6 June 2020).
3. Herrero-Collantes, M.; Garcia-Escartin, J.C. Quantum random number generators. Rev. Mod. Phys. 2017, 89, 015004.
4. Marsaglia, G. Xorshift RNGs. J. Stat. Softw. 2003, 8, 1–6.
5. Rukhin, A.; Soto, J.; Nechvatal, J.; Smid, M.; Barker, E.; Leigh, S.; Levenson, M.; Vangel, M.; Banks, D.; Heckert, A.; et al. A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2010.
6. Demirhan, H.; Bitirim, N. Statistical Testing of Cryptographic Randomness. J. Stat. Stat. Actuar. Sci. IDIA 2016, 9, 1–11.
7. Cover, T.M.; Thomas, J.A. Elements of Information Theory; Wiley-Interscience: New York, NY, USA, 2006.
8. Li, M.; Vitányi, P. An Introduction to Kolmogorov Complexity and Its Applications; Springer: New York, NY, USA, 2008.
9. Calude, C.S. Information and Randomness—An Algorithmic Perspective; Springer: Berlin/Heidelberg, Germany, 2002.
10. Downey, R.; Hirschfeldt, D.R.; Nies, A.; Terwijn, S.A. Calibrating randomness. Bull. Symb. Log. 2006, 12, 411–491.
11. Kendall, M.; Stuart, A. The Advanced Theory of Statistics, Volume 2: Inference and Relationship; Hafner Publishing Company: New York, NY, USA, 1961.
12. Ryabko, B. On asymptotically optimal tests for random number generators. arXiv 2019, arXiv:1912.06542.
13. Ryabko, B.; Astola, J. Universal Codes as a Basis for Time Series Testing. Stat. Methodol. 2006, 3, 375–397.
14. Ryabko, B.; Astola, J.; Malyutov, M. Compression-Based Methods of Statistical Analysis and Prediction of Time Series; Springer: Cham, Switzerland, 2016.
Table 1. Time-adaptive testing. Preliminary stage.
Test | Length l (bytes) | p-value (π) | −log₂ π / l
t1 | 2 × 10⁶ | 0.42 | 6.3 × 10⁻⁷
t2 | 2 × 10⁶ | 0.37 | 7.2 × 10⁻⁷
t3 | 2 × 10⁶ | 0.028 | 26 × 10⁻⁷
t4 | 2 × 10⁶ | 0.78 | 1.8 × 10⁻⁷
t5 | 2 × 10⁶ | 0.4 | 6.6 × 10⁻⁷
t6 | 2 × 10⁶ | 0.37 | 7.2 × 10⁻⁷
t7 | 2 × 10⁶ | 0.059 | 20 × 10⁻⁷
t8 | 2 × 10⁶ | 0.026 | 26 × 10⁻⁷
t9 | 2 × 10⁶ | 0.72 | 2.4 × 10⁻⁷
t10 | 2 × 10⁶ | 0.72 | 2.4 × 10⁻⁷
t11 | 2 × 10⁶ | 0.63 | 3.3 × 10⁻⁷
t12 | 2 × 10⁶ | 0.74 | 2.2 × 10⁻⁷
t13 | 2 × 10⁶ | 0.021 | 28 × 10⁻⁷
t14 | 2 × 10⁶ | 0.42 | 6.2 × 10⁻⁷
t15 | 2 × 10⁶ | 0.9 | 0.76 × 10⁻⁷
t16 | 2 × 10⁶ | 0.087 | 18 × 10⁻⁷
t17 | 2 × 10⁶ | 0.72 | 2.3 × 10⁻⁷
t18 | 2 × 10⁶ | 0.58 | 3.9 × 10⁻⁷
t19 | 2 × 10⁶ | 0.89 | 0.84 × 10⁻⁷
t20 | 2 × 10⁶ | 0.51 | 4.9 × 10⁻⁷
t21 | 2 × 10⁶ | 0.047 | 22 × 10⁻⁷
t22 | 2 × 10⁶ | 0.47 | 0.47 × 10⁻⁷
t23 | 2 × 10⁶ | 0.18 | 12 × 10⁻⁷
t24 | 2 × 10⁶ | 0.14 | 14 × 10⁻⁷
t25 | 2 × 10⁶ | 0.024 | 27 × 10⁻⁷
Table 2. Time-adaptive testing. Preliminary stage.
Test | Length l (bytes) | p-value (π) | −log₂ π / l
t3 | 6 × 10⁶ | 0.23 | 3.5 × 10⁻⁷
t8 | 6 × 10⁶ | 0.0037 | 13 × 10⁻⁷
t13 | 6 × 10⁶ | 0.0028 | 14 × 10⁻⁷
t21 | 6 × 10⁶ | 0.73 | 0.76 × 10⁻⁷
t25 | 6 × 10⁶ | 0.05 | 7.2 × 10⁻⁷
Table 3. p-values for different tests from Alphabit.
Test | Step 1 | Step 2 | Step 3 | Alphabit
1 | 0.4 | - | - | 0.37
2 | 0.27 | - | - | 0.47
3 | 0.28 | - | - | 0.047
4 | 0.056 | 0.057 | - | 2.9 × 10⁻⁷
5 | 0.064 | - | - | 0.021
5 | 0.38 | - | - | 0.0027
6 | 0.28 | - | - | 0.22
7 | 0.21 | - | - | 0.18
8 | 0.36 | - | - | 0.29
9 | 0.31 | - | - | 0.16
10 | 0.091 | - | - | 0.15
11 | 0.032 | 0.087 | - | 0.33
12 | 0.12 | - | - | 0.032
13 | 0.19 | - | - | 0.19
14 | 0.055 | 0.073 | - | 0.12
15 | 0.045 | 0.042 | 3.6 × 10⁻⁶ | 3.6 × 10⁻⁶
16 | 0.091 | - | - | 0.26
17 | 0.24 | - | - | 0.31

Ryabko, B. Time-Adaptive Statistical Test for Random Number Generators. Entropy 2020, 22, 630. https://doi.org/10.3390/e22060630
