Randomness Analysis for the Generalized Self-Shrinking Sequences

Abstract: In cryptography, the randomness of pseudo-random generators is essential to avoid any pattern in the output sequences and to provide security against attacks, privacy and anonymity. In this article, the randomness of the family of sequences obtained from the generalized self-shrinking generator is analyzed. Moreover, the characteristics, generalities and relationship between the t-modified self-shrinking generator and the generalized self-shrinking generator are presented. We find that the t-modified self-shrunken sequences can be generated from a generalized self-shrinking generator. Then, an in-depth randomness analysis of the generalized sequences is carried out by means of complete and powerful batteries of statistical tests and graphical tools, providing a useful view of the behaviour of these sequences and showing that they are suitable for use in cryptography.


Introduction
In cryptography, randomness plays an important role in multiple and diverse applications. Random numbers are employed to generate cryptographic keys, challenges, nonces, to encrypt messages and at different steps of cryptographic algorithms and protocols [1][2][3][4].
A pseudo-random number generator is an algorithm for creating a sequence of numbers that is supposed to be indistinguishable from a uniformly chosen random sequence. The sequence is not really random, since it is completely determined by a small set of initial values, called the seed. However, in cryptography, where the security of many schemes relies on the quality of the pseudo-random generators, the sequences must meet the following requirements: (1) the generated sequence must be indistinguishable from a truly random sequence; (2) the sequence must be unpredictable; (3) the period of the sequence must be very large; (4) the key space must be large enough to make a brute-force (exhaustive search) attack infeasible; (5) the design of the generator should resist the specialized attacks reported in the literature.
There is no mathematical proof that ensures the randomness of a bit sequence; however, there exists a large number of empirical tests to determine whether a sequence is random enough and secure enough to be used in cryptography [5]. If the sequences of a generator pass the statistical tests, the generator can be accepted as a generator of random sequences. Otherwise, if several tests fail, the generator is not good and must be rejected. Choosing the right number of tests is a very difficult task, since there is no way to establish how many tests are sufficient to consider a sequence random. We have chosen some of those considered the most complete randomness tests, namely the FIPS (Federal Information Processing Standard) 140-2 tests [6], the Diehard battery of tests [7], the NIST SP 800-22 test battery [8] and other tests from chaos theory that were presented in References [9,10]. The Generalized Self-Shrinking Generator (GSSG) [11] is fast, easy to implement and generates good cryptographic sequences, so it seems suitable for lightweight cryptography and, in general, for low-cost applications. However, the randomness of these sequences has never been analysed with such a complete battery of tests.
In this article, the randomness of the family of sequences obtained from the generalized self-shrinking generator is analyzed. First, the characteristics and generalities of this family of pseudorandom generators have been considered in detail. Then, an in-depth analysis of randomness focused on the generalized sequences by means of complete batteries of statistical tests was done. Tables, figures and graphical representations illustrate the obtained results.

Related Work
One of the most widely accepted designs of Pseudo-Random Number Generators (PRNGs) is based on Linear Feedback Shift Registers (LFSRs), because LFSR sequences can have good statistical properties and LFSRs are efficient in hardware. LFSRs have been used as the basic component of such PRNGs, but such linear designs have been successfully cryptanalyzed by means of different attacks, such as algebraic and correlation attacks, to name a few. Their main weakness is their linearity, which allows an attacker to build a system of equations whose solution reveals the parameters used in the design [12].
To avoid these cryptanalytic attacks, new designs use non-linear operations, such as non-linear filtering and sequence decimation. The shrinking generator and the self-shrinking generator are good examples of how to convert a linearly generated sequence into a non-linear one. To do that, different rules that decimate the LFSR-produced sequence in an irregular way are used. The Shrinking Generator (SG) was first proposed in 1993 by Coppersmith, Krawczyk and Mansour [13] and the Self-Shrinking Generator (SSG) in 1994 by Meier and Staffelbach [14].
In Reference [15], a novel generator based on the generalized self-shrinking stream sequence generator (called F-GSS) was proposed. The sequences generated by the F-GSS were analyzed using the NIST statistical test suite, showing that they have good pseudo-random properties.
The Modified Self-Shrinking Generator (MSSG) was proposed by Kanso in Reference [16]. The randomness of this generator was studied with the NIST statistical test suite, and it was shown that the sequences of the MSSG have better randomness properties than those of the SSG. In Reference [17], the authors present a new non-periodic random number generator based on the shrinking generator. The randomness of the sequences of the new generator was analyzed by means of the Diehard battery of tests, verifying that this new design performs well in this statistical battery.
Tasheva et al. in Reference [18] proposed a variant of the SSG called the p-ary Generalized Self-Shrinking Generator (pGSSG) and studied its randomness using the NIST statistical test suite. Later, in Reference [19], the balance property of the pGSSG was studied and it was shown that the generated sequences can be considered balanced. Erkek and Tuncer in Reference [20] implemented the SG and the Alternating Step Generator on an Altera Cyclone IV FPGA board. The numbers generated in real time were tested using the NIST statistical test suite, and the results showed that both generators have good statistical properties. In Reference [21], the authors studied the randomness of the self-shrinking generator by means of the d-Monomial test. They found statistical dependencies between certain randomness properties of the generalized SSG and the polynomial used in its design. For this reason, they recommend taking special care when choosing the polynomial for the SSG so that the generator is cryptographically secure. In Reference [22], the authors analyzed the keystream produced by the Generalized Shrinking Multiplexing Generator controlled by Ternary m-sequences (GSMG-3m). For the randomness analysis they used the NIST statistical test suite, the spectral test and the approximate entropy test. The authors also presented some cryptanalytic work on the proposed generator showing that GSMG-3m is more secure than the Shrinking Generator.
As can be seen, there are few works that have deeply studied the randomness of the sequences generated by the different families of shrinking generators through several statistical test batteries such as those presented in this paper.

Preliminaries
To make this work self-contained, some basic concepts concerning binary sequences as well as sequence generators based on irregular decimation are introduced. All of them will be used throughout the paper.
As mentioned previously, the security of many cryptographic algorithms relies on well-designed random and pseudo-random generators. It is worth mentioning that the design of reliable and secure pseudo-random number generators is an open problem and an intensive field of research in cryptography nowadays [23][24][25][26][27][28]. The family of shrinking generators is one of the most analyzed families of PRNGs in the literature due to its performance and security when it is well designed [4,21,22,29,30,31].

PN-Sequences
Let $\mathbb{F}_2 = \{0, 1\}$ be the Galois field of two elements. Consider a binary sequence $\{a_i\}_{i \geq 0} = \{a_0, a_1, a_2, \ldots\}$ with $a_i \in \mathbb{F}_2$ for $i = 0, 1, 2, \ldots$ We say the sequence $\{a_i\}_{i \geq 0}$ is periodic if there exists an integer $T$, called the period, such that $a_{i+T} = a_i$ for all $i \geq 0$. In the sequel, all the sequences considered will be binary sequences and the symbol $+$ will denote the Exclusive-OR (XOR) logic operation.
Let $r$ be a positive integer and let $d_1, d_2, d_3, \ldots, d_r$ be constant coefficients with $d_j \in \mathbb{F}_2$. A binary sequence $\{a_i\}_{i \geq 0}$ satisfying the relation:
$$a_{i+r} = d_1 a_{i+r-1} + d_2 a_{i+r-2} + \cdots + d_r a_i, \quad i \geq 0,$$
is called an ($r$-th order) linear recurring sequence (LRS) in $\mathbb{F}_2$. The terms $\{a_0, a_1, \ldots, a_{r-1}\}$ are referred to as the initial terms and determine the sequence uniquely. The monic polynomial:
$$p(x) = x^r + d_1 x^{r-1} + d_2 x^{r-2} + \cdots + d_{r-1} x + d_r$$
is called the characteristic polynomial of the linear recurring sequence and $\{a_i\}_{i \geq 0}$ is said to be generated by $p(x)$. Linear recurring sequences can be generated using Linear Feedback Shift Registers (LFSRs) [5,12,32]. In fact, an LFSR can be defined as an electronic device with $r$ memory cells (stages) with binary content. At every clock pulse, the binary element of each stage is shifted to the adjacent stage and a new element is computed through the linear feedback to fill the empty stage (see Figure 1). The LFSR has maximal length if its characteristic polynomial is primitive. In that case, its output sequence is called a PN-sequence (Pseudo-Noise sequence) and has period $T = 2^r - 1$, see Reference [32]. The linear complexity, $LC$, of a sequence $\{a_i\}_{i \geq 0}$ is defined as the length of the shortest LFSR that generates such a sequence or, equivalently, as the order of the lowest-order linear recurrence relation that generates it.
In cryptographic terms, the linear complexity must be as large as possible as LC defines the minimum piece of the sequence needed to get the whole sequence.
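As an illustration of how an LFSR produces a PN-sequence, the following minimal Python sketch generates a linear recurring sequence and measures its period. The function names and the way the polynomial is encoded (a list of exponents) are assumptions made for this example only; the polynomial $x^5 + x^2 + 1$ is the degree-5 primitive polynomial that appears later in the paper.

```python
def lfsr_sequence(exponents, degree, state, n):
    """Return the first n bits of a binary linear recurring sequence.

    Assumed convention: the characteristic polynomial is
    p(x) = x^degree + sum(x^e for e in exponents), which corresponds to the
    recurrence a[i + degree] = XOR of a[i + e] for e in exponents.
    `state` holds the initial terms a[0], ..., a[degree - 1].
    """
    a = list(state)
    while len(a) < n:
        i = len(a) - degree
        a.append(sum(a[i + e] for e in exponents) % 2)
    return a[:n]


def minimal_period(bits):
    """Smallest p such that bits[i] == bits[i + p] over the whole window."""
    n = len(bits)
    for p in range(1, n):
        if all(bits[i] == bits[i + p] for i in range(n - p)):
            return p
    return n


# p(x) = x^5 + x^2 + 1 is primitive, so the period should be 2^5 - 1 = 31.
bits = lfsr_sequence([0, 2], 5, [1, 0, 0, 0, 0], 62)
print(minimal_period(bits))  # 31
print(sum(bits[:31]))        # 16 ones in one period
```

The window passed to `minimal_period` should span at least two periods, otherwise a shorter apparent period could be reported.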
A simple result that will be useful in the next section is introduced below.
Lemma 1. Let $\{a_i\}_{i \geq 0}$ be a PN-sequence with period $T$. Then, the sequence $\{u_i\}$ such that $u_i = \sum_{k=0}^{t-2} a_{t \cdot i + k}$ is again a PN-sequence with the same period $T$ iff $\gcd(T, t) = 1$.
Proof. The sequence $\{a_{t \cdot i}\}$ is a PN-sequence iff $\gcd(T, t) = 1$, see Reference [32] (p. 78). The sequences $\{a_{t \cdot i + k}\}$ for $k = 0, \ldots, t - 2$ are shifted versions of $\{a_{t \cdot i}\}$ with different starting points. If we XOR a PN-sequence with a shifted version of itself, then we obtain the same PN-sequence but starting at a different bit [32] (Theorems 4.3-4.5). Thus, $\{u_i\}$ is the same sequence as $\{a_{t \cdot i}\}$ except for the starting point, that is, $u_i = a_{t \cdot i + D}$ for some integer $D$.

Modified Self-Shrinking Generator (MSSG)
Decimation is a very common technique to produce pseudo-random sequences with cryptographic applications [33,34]. In practice, the underlying idea in this kind of generator is the irregular decimation of a PN-sequence according to the bits of another.
The Modified Self-Shrinking Generator (MSSG) introduced by Kanso in Reference [16] is a modification of the well-known Self-Shrinking Generator (SSG) [14]. Indeed, in the MSSG the PN-sequence $\{a_i\}_{i \geq 0}$ generated by a maximal-length LFSR is self-decimated. The decimation rule is very simple and can be described as follows: given three consecutive bits $\{a_{3i}, a_{3i+1}, a_{3i+2}\}$, $i = 0, 1, 2, \ldots$, the output sequence $\{s_j\}_{j \geq 0}$ is computed as
If $a_{3i} + a_{3i+1} = 1$ then $s_j = a_{3i+2}$,
If $a_{3i} + a_{3i+1} = 0$ then $a_{3i+2}$ is discarded.
The output sequence $\{s_j\}_{j \geq 0}$ is known as the Modified Self-Shrunken sequence (MSS-sequence). If $L$ is the length of the maximal-length LFSR that generates $\{a_i\}_{i \geq 0}$, then bounds on the linear complexity $LC$ of the corresponding MSS-sequence and, when $L$ is odd, on its period $T$ are proved in Reference [16]. As usual, the key of this generator is the initial state of the LFSR that generates $\{a_i\}_{i \geq 0}$. The characteristic polynomial of such a register is also recommended to be part of the key. In the accompanying example, the obtained MSS-sequence $\{s_j\} = \{0\ 0\ 1\ 0 \ldots\}$ (encircled bits) has period $T = 4$ and it can be checked that its characteristic polynomial is $1 + x^4 = (1 + x)^4$. Thus, the linear complexity of this MSS-sequence is $LC = 4$.
In Reference [30], the authors showed that the sequences produced by this generator are contained in the family of sequences generated by the generalized self-shrinking generator.
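The MSSG decimation rule can be sketched in a few lines of Python. This is only an illustration on an arbitrary list of bits; the function name `mssg` and the toy input are assumptions of the example and do not correspond to any LFSR used in the paper.

```python
def mssg(a):
    """Apply the modified self-shrinking rule to a list of bits.

    For each complete triple (a[3i], a[3i+1], a[3i+2]) the bit a[3i+2] is
    output when a[3i] XOR a[3i+1] == 1 and discarded otherwise.
    Trailing bits that do not form a full triple are ignored.
    """
    out = []
    for i in range(len(a) // 3):
        if a[3 * i] ^ a[3 * i + 1]:
            out.append(a[3 * i + 2])
    return out


# Toy input: only the first and third triples have a[3i] XOR a[3i+1] == 1.
triples = [1, 0, 1,   0, 0, 1,   0, 1, 0,   1, 1, 1]
print(mssg(triples))  # [1, 0]
```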

The Generalized Self-Shrinking Generator (GSSG)
In this subsection, we introduce the most representative generator in this family of decimation-based sequence generators, that is, the Generalized Self-Shrinking Generator (GSSG) [11]. In fact, the sequences produced by this generator include the sequences produced by the generators previously described.
Let $\{a_i\}_{i \geq 0}$ be a PN-sequence produced by a maximal-length LFSR with $L$ stages. Let $G = [g_0, g_1, g_2, \ldots, g_{L-1}] \in \mathbb{F}_2^L$ be an $L$-dimensional binary vector and $\{v_i\}_{i \geq 0}$ a sequence defined as: $v_i = g_0 a_i + g_1 a_{i-1} + g_2 a_{i-2} + \cdots + g_{L-1} a_{i-L+1}$. For $i \geq 0$, the decimation rule is defined as follows: if $a_i = 1$ then $s_j = v_i$; if $a_i = 0$ then $v_i$ is discarded. The output sequence $\{s_j\}_{j \geq 0}$ generated in this way, associated with $G$ and denoted by $s(G)$, is called the Generalized Self-Shrunken sequence (GSS-sequence).
When $G$ ranges over $\mathbb{F}_2^L \setminus \{\mathbf{0}\}$, $\{v_i\}$ runs through the $2^L - 1$ possible shifts of $\{a_i\}$, that is, each sequence $\{v_i\}$ is a shifted version of the PN-sequence $\{a_i\}$. Moreover, we obtain the family of generalized self-shrunken sequences based on the PN-sequence $\{a_i\}_{i \geq 0}$, given by the set of sequences $S(a) = \{s(G) \mid G \in \mathbb{F}_2^L\}$. In Table 1, the algorithm to compute these sequences is shown (Algorithm 1). Table 1. Algorithm to compute the GSS-sequences.
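A minimal Python sketch of the GSSG rule is given next: the bit $v_i$ is kept whenever $a_i = 1$. The helper names are hypothetical, the example polynomial $x^5 + x^2 + 1$ is chosen only for illustration, and indices $a_{i-k}$ with $i < k$ are taken cyclically over one period of the PN-sequence, which is an assumption of this sketch.

```python
def pn_period(exponents, degree, state):
    """One period (2^degree - 1 bits) of the PN-sequence generated by
    p(x) = x^degree + sum(x^e for e in exponents); assumes p is primitive."""
    a = list(state)
    while len(a) < 2 ** degree - 1:
        i = len(a) - degree
        a.append(sum(a[i + e] for e in exponents) % 2)
    return a


def gss_sequence(a, G):
    """GSS-sequence s(G): keep v_i = sum_k G[k] * a[i - k] whenever a[i] == 1.

    Indices are reduced modulo the period (assumption of this sketch).
    """
    T = len(a)
    out = []
    for i in range(T):
        if a[i] == 1:
            v = 0
            for k, g in enumerate(G):
                v ^= g & a[(i - k) % T]
            out.append(v)
    return out


a = pn_period([0, 2], 5, [1, 0, 0, 0, 0])
s = gss_sequence(a, [0, 1, 0, 0, 0])
print(len(s), sum(s))  # 16 8
```

For this choice of $G$ the output has $2^{L-1} = 16$ bits per period, half of them ones. Since every $s(G)$ is read at the same positions, the family is closed under bitwise XOR: $s(G_1) + s(G_2) = s(G_1 + G_2)$.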

The t-Modified Self-Shrinking Generator
A generalization of the MSSG, the t-Modified Self-Shrinking Generator (t-MSSG), was introduced by Cardell et al. in Reference [31] and can be described as follows. Consider a maximal-length LFSR with $L$ stages that generates the PN-sequence $\{a_i\}_{i \geq 0}$. The t-modified self-shrinking generator, with $t = 2, 3, \ldots, 2^L - 2$, can be constructed making use of a very simple decimation rule.
Given $t$ consecutive bits $\{a_{t \cdot i}, a_{t \cdot i + 1}, a_{t \cdot i + 2}, \ldots, a_{t \cdot i + (t-1)}\}$ of the PN-sequence, the output sequence of this generator $\{s_j\}_{j \geq 0}$ is known as the t-Modified Self-Shrunken sequence (t-MSS-sequence) and is computed as follows:
If $\sum_{k=0}^{t-2} a_{t \cdot i + k} = 1$ then $s_j = a_{t \cdot i + (t-1)}$,
If $\sum_{k=0}^{t-2} a_{t \cdot i + k} = 0$ then $a_{t \cdot i + (t-1)}$ is discarded.
Notice that the value $t = 2$ gives rise to the self-shrinking generator [14] while the value $t = 3$ defines the modified self-shrinking generator. In Table 3 the algorithm to compute this sequence is presented (Algorithm 2). Characteristics and generalities of the t-MSS-sequences can be found in Reference [31]. Table 3. Algorithm to compute the t-MSS-sequence.

Input: primitive polynomial $p(x)$, initial state $\mathbf{a}$ and $t$
Compute the PN-sequence $\{a_i\}$.
Set $T = 2^L - 1$, the period of the PN-sequence.
Initialize the sequence $\{s_j\}$.
for $k = 0$ to $T - 1$ do
    if $\sum_{j=0}^{t-2} a_{t \cdot k + j} = 1$ then
        Add $a_{t \cdot k + (t-1)}$ as a new bit of the sequence $\{s_j\}$.
    end if
end for
Output: $\{s_j\}$, the t-MSS-sequence.
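The algorithm above can be sketched in Python as follows. The function names are hypothetical, the polynomial $x^5 + x^2 + 1$ is only an example, and indices are reduced modulo the period $T$ so that a single stored period of the PN-sequence is enough (an assumption of this sketch).

```python
def pn_period(exponents, degree, state):
    """One period (2^degree - 1 bits) of the PN-sequence generated by
    p(x) = x^degree + sum(x^e for e in exponents); assumes p is primitive."""
    a = list(state)
    while len(a) < 2 ** degree - 1:
        i = len(a) - degree
        a.append(sum(a[i + e] for e in exponents) % 2)
    return a


def t_mss_sequence(a, t):
    """t-MSS-sequence from one period `a` of a PN-sequence.

    For k = 0, ..., T - 1 the block a[t*k], ..., a[t*k + t - 1] is read
    (indices modulo T); its last bit is kept when the XOR of the first
    t - 1 bits equals 1.
    """
    T = len(a)
    s = []
    for k in range(T):
        block = [a[(t * k + j) % T] for j in range(t)]
        if sum(block[:-1]) % 2 == 1:
            s.append(block[-1])
    return s


a = pn_period([0, 2], 5, [1, 0, 0, 0, 0])
print(len(t_mss_sequence(a, 2)))  # 16 (t = 2: self-shrinking generator)
print(len(t_mss_sequence(a, 3)))  # 16 (t = 3: modified self-shrinking generator)
```

Both outputs have $2^{L-1} = 16$ bits because, when $\gcd(T, t) = 1$, the controlling bits form a PN-sequence with 16 ones per period.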

Relationship between t-Modified Self-Shrunken Sequences and Generalized Self-Shrunken Sequences (GSS-Sequences)
Now, we analyse the close relationship between t-Modified Self-Shrunken sequences (t-MSS-sequences) and Generalized Self-Shrunken sequences (GSS-sequences).
In Theorem 1 of Reference [30], the authors analyse the relationship between modified self-shrunken sequences and generalized self-shrunken sequences with a result similar to the following:
Theorem 1. The t-MSS-sequence obtained by self-decimating a PN-sequence with characteristic polynomial $q(x)$ of degree $L$ and $\gcd(T, t) = 1$ can be generated by a generalized self-shrinking generator with a primitive polynomial $p(x)$ of the same degree $L$.
Proof. Let $\{a_i\}$ be a PN-sequence with characteristic polynomial $q(x)$ of degree $L$ which is self-decimated. In order to generate the t-MSS-sequence, sets of $t$ bits $\{a_{t \cdot i}, a_{t \cdot i + 1}, a_{t \cdot i + 2}, \ldots, a_{t \cdot i + (t-1)}\}$, $i \geq 0$, have to be taken. Applying the t-MSSG decimation rule, if $\sum_{k=0}^{t-2} a_{t \cdot i + k} = 1$, the bit $a_{t \cdot i + (t-1)}$ is kept. Otherwise, it is discarded. According to Lemma 1, the sequence $\{u_i\}$ defined as $u_i = \sum_{k=0}^{t-2} a_{t \cdot i + k} = a_{t \cdot i + D}$ is obtained by decimating the sequence $\{a_i\}$ by distance $t$. Since $\gcd(T, t) = 1$, according to Reference [32], we have that $\{u_i\}$ is a PN-sequence generated by a primitive polynomial $p(x)$ of the same degree $L$.
Also, if the sequence $\{v_i\}$ is taken, with $v_i = a_{t \cdot i + (t-1)}$, this means that the sequence $\{a_i\}$ is being decimated again by the distance $t$. As before, we have that $\{v_i\}$ is also a PN-sequence with primitive polynomial $p(x)$ [32].
In order to obtain the t-MSS-sequence, the t-MSSG decimation rule is applied to the sequences $\{u_i\}$ and $\{v_i\}$. As both sequences are shifted versions of the same PN-sequence with characteristic polynomial $p(x)$, such a t-MSS-sequence can be generated by a GSSG with characteristic polynomial $p(x)$.
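The key step of this proof, namely that the controlling sequence $u_i = \sum_{k=0}^{t-2} a_{t \cdot i + k}$ is a shifted version of the decimated sequence $\{a_{t \cdot i}\}$, can be checked numerically on a small case. The sketch below is only a sanity check under stated assumptions: the helper names are hypothetical and the polynomial $x^5 + x^2 + 1$ (period 31, so $\gcd(T, t) = 1$ for every $t$ considered) is chosen for illustration.

```python
def pn_period(exponents, degree, state):
    """One period (2^degree - 1 bits) of the PN-sequence generated by
    p(x) = x^degree + sum(x^e for e in exponents); assumes p is primitive."""
    a = list(state)
    while len(a) < 2 ** degree - 1:
        i = len(a) - degree
        a.append(sum(a[i + e] for e in exponents) % 2)
    return a


def decimate(a, t):
    """One period of the decimated sequence a[t*i] (indices modulo T)."""
    T = len(a)
    return [a[(t * i) % T] for i in range(T)]


def control_sequence(a, t):
    """u_i = XOR of a[t*i], ..., a[t*i + t - 2] (indices modulo T)."""
    T = len(a)
    return [sum(a[(t * i + k) % T] for k in range(t - 1)) % 2 for i in range(T)]


def is_rotation(u, d):
    """True when u equals some cyclic shift of d."""
    return len(u) == len(d) and any(u == d[s:] + d[:s] for s in range(len(d)))


a = pn_period([0, 2], 5, [1, 0, 0, 0, 0])
for t in (2, 3, 5, 7):
    print(t, is_rotation(control_sequence(a, t), decimate(a, t)))
```

Each line should report `True`, in agreement with Lemma 1.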
As a result of the previous theorem, we have that:
Corollary 1. If $t = 2, 4, \ldots, 2^{L-1}$, then the t-MSS-sequence is generated as a generalized sequence with the same primitive polynomial $q(x)$.
The next theorem gives the primitive polynomial $p(x)$ needed in Theorem 1 so that the GSSG generates the t-MSS-sequence obtained with a characteristic polynomial $q(x)$.

Theorem 2.
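When $\gcd(T, t) = 1$, the primitive polynomial $p(x)$ in Theorem 1 is:
$$p(x) = (x + \alpha^{t})(x + \alpha^{2t})(x + \alpha^{2^2 t}) \cdots (x + \alpha^{2^{L-1} t}),$$
with $\alpha$ a root of $q(x)$.
Proof. The primitive polynomial $q(x)$ can be expressed as:
$$q(x) = (x + \alpha)(x + \alpha^{2})(x + \alpha^{2^2}) \cdots (x + \alpha^{2^{L-1}}),$$
where $\alpha \in \mathbb{F}_{2^L}$ is a primitive element in such a field as well as a root of $q(x)$. Furthermore, any element of the PN-sequence $\{a_i\}$ is obtained as:
$$a_i = \sum_{j=0}^{L-1} \left(A_0 \alpha^{i}\right)^{2^j}, \quad A_0 \in \mathbb{F}_{2^L}.$$
When $A_0 = 1$, it is said that the PN-sequence is in its characteristic phase.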
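The following sequence is obtained: $\{a_{t \cdot i}\}_{i \geq 0} = \{a_0, a_t, a_{2t}, \ldots\}$. That is a PN-sequence (since $\gcd(T, t) = 1$) and each one of its bits can be computed as:
$$a_{t \cdot i} = \sum_{j=0}^{L-1} \left(A_0 \alpha^{t \cdot i}\right)^{2^j}.$$
If $u_i = a_{t \cdot i}$ and $\beta = \alpha^t$, then any element of the PN-sequence $\{u_i\}$ can be computed as follows:
$$u_i = \sum_{j=0}^{L-1} \left(A_0 \beta^{i}\right)^{2^j}.$$
Therefore, the characteristic polynomial of the PN-sequence $\{u_i\}$ is
$$p(x) = (x + \beta)(x + \beta^{2})(x + \beta^{2^2}) \cdots (x + \beta^{2^{L-1}}).$$
Proof. According to Reference [32]
Theorem 3. Given a PN-sequence with prime period $T = 2^L - 1$ and characteristic polynomial $q(x)$ of degree $L$, the t-MSS-sequence obtained for any $t$ is a generalized sequence generated with a primitive polynomial of degree $L$.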
Proof. The proof follows the same reasoning used in Theorem 1 and Lemma 2.
Consider, for example, the primitive polynomial $q(x) = 1 + x^2 + x^5$, whose PN-sequence has period $T = 2^5 - 1 = 31$, which is a prime number. Table 4 shows all the t-MSS-sequences generated with this polynomial. All of them are generalized sequences obtained from a primitive polynomial of degree 5. It is important to mention that some generalized sequences can be generated using different primitive polynomials. For example, the generalized sequence {101010101010101} can be obtained using any primitive polynomial of degree 5.
Next, the relationship between t-MSS-sequences and GSS-sequences is analyzed from another point of view, using the cyclotomic cosets given in Reference [32].
Recall that the cyclotomic coset of an integer $i$ modulo $2^L - 1$ is the equivalence class $\{i, 2i, 2^2 i, \ldots\} \bmod (2^L - 1)$. The smallest integer $i$ in any equivalence class is defined as the leader of the coset, and the coset is denoted by $C_i$. The cardinal of a coset is $L$ or a proper divisor of $L$. The characteristic polynomial of a cyclotomic coset $C_i$ is the polynomial $P_{C_i}(x) = (x + \alpha^{i})(x + \alpha^{2i}) \cdots (x + \alpha^{2^{r-1} i})$, where the degree $r$ ($r \leq L$) equals the cardinal of the coset $C_i$ and $\alpha$ is a root of the LFSR characteristic polynomial.
Following [32] (Chapter 4), $C_i$ is a proper coset if $\gcd(2^L - 1, i) = 1$. In this case, $P_{C_i}(x)$ is a primitive polynomial, which is a remarkable property: when the characteristic polynomial is primitive, the period of the sequence generated by the LFSR is as large as possible.
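The coset structure described above is easy to enumerate. The following Python sketch (function names are assumptions of this example) lists the cyclotomic cosets modulo $2^L - 1$ and marks each one as proper or improper according to the gcd condition.

```python
from math import gcd


def cyclotomic_cosets(L):
    """Map each coset leader to its cyclotomic coset modulo 2^L - 1."""
    n = 2 ** L - 1
    cosets, seen = {}, set()
    for leader in range(1, n):
        if leader in seen:
            continue
        coset, x = [], leader
        while x not in coset:      # multiply by 2 until the cycle closes
            coset.append(x)
            x = (2 * x) % n
        seen.update(coset)
        cosets[leader] = coset
    return cosets


def is_proper(leader, L):
    """A coset C_i is proper when gcd(2^L - 1, i) = 1."""
    return gcd(2 ** L - 1, leader) == 1


for L in (4, 5):
    for leader, coset in cyclotomic_cosets(L).items():
        kind = "proper" if is_proper(leader, L) else "improper"
        print(L, leader, coset, kind)
```

For $L = 5$ the modulus 31 is prime, so every coset is proper; for $L = 4$ the modulus 15 is composite and improper cosets appear.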

Example 4.
Consider the set $\mathbb{Z}^*_{2^5}$. Notice that $2^5 - 1 = 31$ is a prime integer. There are six cyclotomic cosets given by: $C_1 = \{1, 2, 4, 8, 16\}$, $C_3 = \{3, 6, 12, 24, 17\}$, $C_5 = \{5, 10, 20, 9, 18\}$, $C_7 = \{7, 14, 28, 25, 19\}$, $C_{11} = \{11, 22, 13, 26, 21\}$, $C_{15} = \{15, 30, 29, 27, 23\}$. In this case, all cosets are proper cosets and have cardinal 5. If $q(x) = 1 + x^2 + x^5$ is taken as the characteristic polynomial of the LFSR, then the corresponding characteristic polynomials of the cosets are given in Table 5. Since all cosets are proper, all the characteristic polynomials are primitive of degree 5. Table 5. Characteristic polynomial of cyclotomic cosets.
Theorem 4. Consider a PN-sequence of prime period $T = 2^L - 1$ and its characteristic polynomial $q(x)$ of degree $L$. Then the t-MSS-sequences obtained for any $t_1$ and $t_2$ are generalized sequences produced by the same polynomial of degree $L$ iff $t_1$ and $t_2$ belong to the same coset.
Proof. According to the proof of Theorem 1, a t-MSS-sequence is obtained by decimating the sequence $\{a_{t \cdot i}\}$ with a shifted version of itself, that is, as a generalized sequence. According to [32] (Theorem 5.5), $\{a_{t_1 \cdot i}\}$ and $\{a_{t_2 \cdot i}\}$ are shifted versions of the same PN-sequence iff $t_1$ and $t_2$ belong to the same coset. Thus, the decimation rule is applied to two shifted versions of the same PN-sequence and, consequently, a generalized sequence is generated.
As already mentioned, Table 4 shows all the t-MSS-sequences generated by $q(x) = 1 + x^2 + x^5$ and $t = 2, 3, \ldots, 30$. Notice that when $t_1$ and $t_2$ are in the same coset, the corresponding t-MSS-sequences are GSS-sequences produced by the same polynomial (the characteristic polynomial of the LFSR).
Furthermore, reciprocal polynomials sometimes generate the same sequences with different starting points. For example, the generalized sequence produced with $t = 29$ can also be generated as a generalized sequence using $q(x) = 1 + x^2 + x^5$.
In the following example (Example 5), notice that when $2^L - 1$ is not prime, different types of cyclotomic cosets can be obtained [31].

Example 5.
Consider the set $\mathbb{Z}^*_{2^4}$. Notice that $2^4 - 1 = 15$ is not a prime number. There are 4 cyclotomic cosets given by: $C_1 = \{1, 2, 4, 8\}$, $C_3 = \{3, 6, 12, 9\}$, $C_5 = \{5, 10\}$, $C_7 = \{7, 14, 13, 11\}$. In this case, $C_1$ and $C_7$ are proper cosets and $C_3$ and $C_5$ are improper cosets. Therefore, we know that $P_{C_1}(x)$ and $P_{C_7}(x)$ are primitive polynomials. Consider $q(x) = 1 + x + x^4$ as the characteristic polynomial of the LFSR. Then, the characteristic polynomials of the cosets are given in Table 6. We can check that $P_{C_1}(x)$ and $P_{C_7}(x)$ are primitive polynomials of degree 4 and $P_{C_3}(x)$ is an irreducible polynomial of degree 4. The polynomial $P_{C_5}(x)$ is a primitive polynomial of degree 2. Table 6. Characteristic polynomial of cyclotomic cosets.

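$C_i$: $P_{C_i}(x)$
$C_1 = \{1, 2, 4, 8\}$: $1 + x + x^4$
$C_3 = \{3, 6, 12, 9\}$: $1 + x + x^2 + x^3 + x^4$
$C_5 = \{5, 10\}$: $1 + x + x^2$
$C_7 = \{7, 14, 13, 11\}$: $1 + x^3 + x^4$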
Theorem 5. Given a PN-sequence of period $T = 2^L - 1$ and characteristic polynomial $q(x)$ of degree $L$, the t-MSS-sequences obtained for any $t_1$ and $t_2$ are GSS-sequences generated by the same primitive polynomial of degree $L$ iff $t_1$ and $t_2$ belong to the same proper coset.
Proof. If the coset $C_i$ such that $t_1, t_2 \in C_i$ is proper, then $\gcd(t_1, T) = \gcd(t_2, T) = 1$. The rest follows from previous results.

Remark 1.
When $\gcd(t, T) \neq 1$, the corresponding t-MSS-sequence is a generalized sequence iff $P_{C_t}(x)$ is a primitive polynomial of degree equal to $|C_t|$ (the cardinal of $C_t$).
Since, under not very restrictive conditions, the GSS-sequences include the other sequences produced by decimation-based generators, our randomness analysis focuses on this class of binary sequences.
In Table 7, we summarize the three most popular decimation-based sequence generators discussed in this work, with the bounds for their periods and their linear complexities. Table 7. Summary of the main characteristics of the three decimation-based generators discussed in this work.
Generalized self-shrinking (GSSG), [11] Let $\{a_i\}_{i \geq 0}$ be a PN-sequence generated by a maximal-length LFSR with $L$ stages. Let $G$ be an $L$-dimensional binary vector $G = [g_0, g_1, g_2, \ldots, g_{L-1}] \in \mathbb{F}_2^L$ and $\{v_i\}_{i \geq 0}$ a sequence defined as: $v_i = g_0 a_i + g_1 a_{i-1} + g_2 a_{i-2} + \cdots + g_{L-1} a_{i-L+1}$. For $i \geq 0$, the decimation rule is: If $a_i = 1$ then $s_j = v_i$. If $a_i = 0$ then $v_i$ is discarded.
Other cases are not cryptographically relevant.
If $\gcd(2^L - 1, t) = 1$ or $P_{C_t}$ is primitive with degree $|C_t|$: Other cases are not cryptographically relevant.

Statistical Randomness Analysis
In this section, an exhaustive randomness analysis of the GSS-sequences is presented, using different batteries of statistical tests to study their behaviour. Some graphical tools from chaos theory have also been used [9,10], for example, return maps, the chaos game, the Lyapunov exponent, and so forth. The generator and the battery of tests were implemented with Matlab 9.1 (2017) in a Windows 10 environment on a 64-bit PC with an Intel Core i7-870 CPU at 2.93 GHz.
For our study, GSS-sequences $s(G)$ are generated from PN-sequences coming from maximal-length LFSRs with characteristic polynomials of degree less than or equal to 27. Every one of these sequences has passed the Diehard battery of tests, considered one of the most important and powerful tools for the study of randomness.
Furthermore, the family of GSS-sequences is analysed with the family of statistical tests FIPS 140-2, provided by the National Institute of Standards and Technology (NIST), as well as with the Lempel-Ziv Compression Test. In both cases the sequences have passed the tests.

Graphical Testing
In this section, the main graphical tests used in Reference [9] are applied to the GSS-sequences in order to analyze their cryptographic properties.
The results obtained for GSS-sequences $s(G)$ of length $2^{23}$ bits are presented. These sequences are generated by the GSSG from a maximal-length LFSR with the 24-degree characteristic polynomial $p(x) = x^{24} + x^{20} + x^{17} + x^{13} + x^{10} + x^{7} + x^{4} + x^{2} + 1$, whose initial state is the all-ones vector of length 24.
The tests were performed with $2^{23}$-bit sequences. Most of the tests work by grouping every eight bits into an octet, obtaining sequences of $2^{20}$ samples of 8 bits; the exceptions are the linear complexity test, which works bit by bit, and the chaos game, which groups the bits two by two.
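The grouping of bits into octets can be sketched as follows. The function name and the bit order inside each octet (most significant bit first) are assumptions of this example, since the text does not specify them.

```python
def bits_to_octets(bits):
    """Group a list of bits into 8-bit samples (MSB first, by assumption).

    Trailing bits that do not fill a complete octet are dropped.
    """
    usable = len(bits) - len(bits) % 8
    octets = []
    for i in range(0, usable, 8):
        value = 0
        for b in bits[i:i + 8]:
            value = (value << 1) | b
        octets.append(value)
    return octets


print(bits_to_octets([0, 0, 0, 0, 0, 0, 0, 1,  1, 0, 0, 0, 0, 0, 0, 0,  1]))  # [1, 128]
print(2 ** 23 // 8 == 2 ** 20)  # True: 2^23 bits give 2^20 octets
```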
Next, the results of the graphical tests used to study the randomness of our sequences are shown.

Return map
The return map [10] tries to measure visually the entropy of the sequence, that is, it allows the detection of useful information about the parameters used in the design of pseudo-random generators [36]. This test, which is customarily used in the theory of dynamical systems, is also a powerful tool in cryptanalysis.
Basically, it consists of a plot of the points of the sequence $x_t$ as a function of $x_{t-1}$ and, under certain conditions, it allows the parameter values of a pseudo-random sequence to be recovered, defeating the security of the cryptosystem under analysis. The result should be a distribution of points in which no trends, figures, lines, symmetries or patterns can be discerned. Figure 2 shows the return map of our GSS-sequence as a disordered cloud, which does not provide any useful information for its cryptanalysis. Figure 3a,b shows the return maps of two imperfect generators where the lack of randomness can be clearly observed. Indeed, these maps present clear patterns that make it possible to determine the generator function and the parameter values.
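The points of a return map are simply the consecutive pairs $(x_{t-1}, x_t)$. The sketch below computes them and, as a crude numerical companion to the plot, counts how many distinct cells are hit. Both function names and the deliberately weak linear congruential generator used for contrast are hypothetical illustrations, not the generators of Figure 3.

```python
def return_map(samples):
    """Pairs (x_{t-1}, x_t) to be drawn as a scatter plot."""
    return list(zip(samples[:-1], samples[1:]))


def occupied_cells(points):
    """Number of distinct (x_{t-1}, x_t) cells hit at least once."""
    return len(set(points))


# A deliberately weak generator (hypothetical, for contrast only):
# x_t = (5 * x_{t-1} + 1) mod 256, so x_t is a fixed function of x_{t-1}.
weak = [0]
for _ in range(999):
    weak.append((5 * weak[-1] + 1) % 256)

print(return_map([3, 1, 4, 1]))           # [(3, 1), (1, 4), (4, 1)]
print(occupied_cells(return_map(weak)))   # 256 out of 65536 possible cells
```

For the weak generator every point lies on a handful of straight lines, which is exactly the kind of pattern a return map is meant to reveal; a good generator should spread its points over the whole $256 \times 256$ grid.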

Linear Complexity
The linear complexity (LC) is considered a measure of the unpredictability of a pseudo-random sequence and is a widely used metric of the security of a keystream sequence [37]. We have used the Berlekamp-Massey algorithm [38] to compute this parameter. If the characteristic polynomial of the LFSR is primitive [32], then the register is known as a maximal-length LFSR; moreover, its output sequence has period $T = 2^L - 1$, where $L$ is the degree of the characteristic polynomial.
LC must be as large as possible, that is, its value has to be very close to half the period [39], $LC \simeq T/2$. From Figure 4a, it can be deduced that the linear complexity of the first 20,000 bits of the sequence is just half its length, 10,000, and from Figure 4b it can be observed that LC stays irregularly close to the $l/2$ line, where $l$ is the length of the sequence.
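For reference, a compact Python sketch of the binary Berlekamp-Massey algorithm is shown below. It is a textbook formulation written for this illustration, not the authors' Matlab implementation; the function name is an assumption.

```python
def berlekamp_massey(s):
    """Linear complexity of the binary sequence s (list of 0/1 values)."""
    n = len(s)
    c = [0] * n   # current connection polynomial C(x)
    b = [0] * n   # copy of C(x) before the last length change
    if n:
        c[0] = b[0] = 1
    L, m = 0, -1
    for i in range(n):
        # discrepancy between s[i] and the prediction of the current LFSR
        d = s[i]
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            t = c[:]
            for j in range(n - i + m):
                c[i - m + j] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L


print(berlekamp_massey([0, 0, 1, 0] * 4))  # 4
```

The printed value agrees with the period-4 sequence {0 0 1 0 ...}, whose characteristic polynomial $1 + x^4$ has degree 4.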

Shannon Entropy and Min-Entropy
The entropy of a sequence is defined as a measure of the amount of information of a process measured in bits or as a measure of the uncertainty of a random variable. From these two possible interpretations, the quality of the output sequence or of the input of a random number generator can be described, respectively.
Shannon's entropy is measured based on the average probability of all the values that the variable can take. A formal definition can be presented as follows.
Definition 2. Let $X$ be a random variable that takes on the values $x_1, x_2, \ldots, x_n$. Then the Shannon entropy is defined as
$$H(X) = -\sum_{i=1}^{n} \Pr(x_i) \log_2 \Pr(x_i),$$
where $\Pr(\cdot)$ represents probability.
If the process is a perfectly random sequence of integers modulo $m = 2^n$, then its entropy is equal to $n$ bits per sample. In the case at hand $n = 8$, so the entropy of a random sequence must be close to 8 bits per octet.
The min-entropy is measured based only on the probability of the most frequent value of the variable, $H_{\infty}(X) = -\log_2 \max_i \Pr(x_i)$. It is recommended by the NIST SP 800-90B standard for True Random Number Generators (TRNGs).
In order to determine whether the proposed generator can be considered perfect according to these entropy values, Reference [40] states that, for a sequence of $2^{20}$ octets, the Shannon entropy must be greater than or equal to 7.976 bits per octet and the min-entropy greater than or equal to 7.91 bits per octet. In this case the following values are obtained: Shannon entropy (measured) = 7.9999 bits per octet; min-entropy (measured) = 7.9457 bits per octet. Hence, the generator can be considered correct with respect to both entropies. Note that the measured Shannon entropy of 7.9999 bits per octet is very close to the theoretical optimum of 8 bits per octet.
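Both entropies can be estimated from the empirical frequencies of the octets. The sketch below is a plain frequency-based estimate with hypothetical function names; it is not the estimation procedure of NIST SP 800-90B, which involves additional estimators.

```python
from collections import Counter
from math import log2


def shannon_entropy(samples):
    """Empirical Shannon entropy in bits per sample."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())


def min_entropy(samples):
    """Empirical min-entropy: -log2 of the relative frequency of the mode."""
    n = len(samples)
    return -log2(max(Counter(samples).values()) / n)


uniform = list(range(256)) * 4            # every octet value equally frequent
print(shannon_entropy(uniform))           # 8.0
print(min_entropy(uniform))               # 8.0
print(round(shannon_entropy([0, 0, 0, 1]), 4))  # 0.8113
```

The min-entropy never exceeds the Shannon entropy, which is why its acceptance threshold in the text (7.91) is lower than the one for the Shannon entropy (7.976).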

Lyapunov exponent
The Lyapunov exponent measures the rate of divergence of nearby trajectories, which is a key component of chaotic dynamics. It is used as a quantitative measure of the sensitive dependence on initial conditions. It is desirable that two very close initial conditions (for instance, seeds or keys) provide very different trajectories (sequences). If the Lyapunov exponent is greater than zero, the distance between two close initial conditions increases rapidly over time, which means that there is an exponential divergence of the trajectories of a chaotic system. This value gives an idea of how different the sequences generated by similar seeds are, a very important feature to avoid attacks on the key of the generator. So, the Lyapunov exponent is, in this case, a useful tool to evaluate the key space.
Next, a formal definition of Lyapunov exponent [41] is given.

Definition 3.
Let $d_0$ be the initial distance between two sequences and $d_t$ the distance between the same sequences after $t$ iterations. We define the Lyapunov exponent as:
$$LE = \lim_{t \to \infty} \frac{1}{t} \ln \frac{d_t}{d_0}.$$
If $LE \leq 0$, the distance between the sequences does not grow; for negative values the sequences approach each other and tend to merge into one. The system converges and is not random at all. If $LE > 0$, the distance increases, there is sensitive dependence on initial conditions and an exponential divergence of the orbits, and randomness grows with the value of $LE$.
Note that the Lyapunov exponent uses the natural logarithm of the Euclidean distance. In information theory, however, other types of distance are used to compare two sequences, for example the Hamming distance, which is the number of bit positions in which the two sequences differ.
If the Lyapunov exponent is modified simply by using the Hamming distance instead of the logarithm of the Euclidean distance, it is called the Lyapunov Hamming exponent (LHE). If two numbers are identical, their LHE is 0. If all the bits of the two numbers differ, their LHE is $\log_2 m = \log_2 2^n = n$, where n is the number of bits with which the numbers are encoded.
The Lyapunov Hamming exponent of the chosen sequence is obtained by averaging the LHE over every pair of consecutive numbers of the sequence. The best value is n/2, since each bit of two independent, uniformly distributed numbers differs with probability 1/2.
For this case, the best value is 4. The values obtained for the particular sequence analysed are: Lyapunov Hamming exponent, ideal = 4; Lyapunov Hamming exponent, real = 4; absolute deviation from the ideal = $-1.0014 \times 10^{-5}$. Hence, the proposed generator passes this test with a negligible deviation from the ideal value.
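The averaging step can be sketched as follows. This is a minimal illustration, assuming that the LHE between two n-bit samples is simply their Hamming distance and that the sequence is available as a list of integers; the function names are hypothetical and are not taken from the original work.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def lyapunov_hamming_exponent(samples: list[int]) -> float:
    """Average Hamming distance between consecutive samples.

    Assumption: this average is what the text calls the LHE of a sequence.
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are needed")
    distances = [hamming_distance(a, b) for a, b in zip(samples, samples[1:])]
    return sum(distances) / len(distances)


# Toy 8-bit input (not GSS output): every consecutive pair differs in all 8 bits.
print(lyapunov_hamming_exponent([0x00, 0xFF, 0x00]))  # 8.0
```

For 8-bit samples from a good generator, the returned value would be expected to lie close to 4.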

Samples in increasing order
The 8-bit samples are sorted in increasing order and plotted. They should give a continuous straight line (red) with an inclination of 45 degrees, which must cover the blue reference line.
A continuous line means that all the numbers are generated, and an inclination of 45 degrees means that their density is uniform. In Figure 5a, we observe that the samples form a continuous straight line with the expected inclination of 45 degrees.
Figure 5b shows the deviation between the sorted samples; only the values −1, 0 and 1 are obtained.

Chaos game
The chaos game is a method that converts a one-dimensional sequence into a two-dimensional one, providing a very suggestive visual representation that reveals some of the statistical properties of the sequence under study. With this graphical technique it is easy to look visually for patterns in the sequences generated by a random number generator. Furthermore, it allows us to find non-randomness within pseudo-random sequences.
The chaos game can be described mathematically by an Iterated Function System (IFS) [10,42,43], through which the transition to chaos associated with fractals can be studied. The result of the chaos game is called the attractor; it is not always a fractal and may be any compact set. If the output is a graph with fractals or patterns, then the sequence cannot be considered random.
In Figure 6, no pattern or fractal can be observed; the plot is a disordered cloud of points that does not provide any useful information for analysis, which implies good randomness. To better understand this graphical test, Figure 7a,b presents two chaos game representations, taken from Reference [10], of generators that are not cryptographically secure. Their graphs are fractal, which indicates that the design depends on a pattern (denoting a lack of randomness); it is also worth mentioning that this pattern could be used to obtain important information for cryptanalysis.
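A chaos game representation can be sketched as follows. The mapping used for the figures is not stated in the text, so this example assumes the common four-corner variant: each sample is read two bits at a time, every 2-bit symbol selects a corner of the unit square, and the current point moves halfway towards that corner. The function name and the corner assignment are illustrative choices.

```python
def chaos_game_points(samples, bits_per_sample=8):
    """Map a sequence of integers to points in the unit square.

    Assumption: four-corner chaos game driven by 2-bit symbols;
    bits_per_sample must be even.
    """
    corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for sample in samples:
        # Read the sample two bits at a time, most significant pair first.
        for shift in range(bits_per_sample - 2, -1, -2):
            cx, cy = corners[(sample >> shift) & 0b11]
            x, y = (x + cx) / 2, (y + cy) / 2
            points.append((x, y))
    return points


# Toy input: a constant sequence collapses towards a single corner,
# the opposite of the structureless cloud expected from a good generator.
pts = chaos_game_points([0xFF] * 4)
print(len(pts), pts[-1])
```

Plotting the returned points as a scatter plot gives the kind of picture discussed above: a uniform cloud for a random-looking input, a visible structure otherwise.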

Autocorrelation
Autocorrelation analysis is a mathematical tool for finding repeating patterns: it compares different sections of a message to find similarities. The autocorrelation function is defined as the cross-correlation of the sequence with itself and measures the linear relationship between the random variables of a process separated by a certain distance. It is very useful for finding periodic patterns within a signal.
Figure 8 represents the autocorrelation index of our GSS-sequence for all the samples available. It can be seen that the sequence has a very long period, larger than the size of the sequence analysed, since no repetition is reached in the graph. The first autocorrelation coefficient is always equal to 1, while the other coefficients, up to the lag at which the sequence begins to repeat itself, must have the smallest possible amplitude for the sequence to be considered random. In the case at hand, values close to 0 are obtained, which means that the proposed sequence can be considered random for this study.
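The coefficients discussed above can be computed with a short routine such as the following. It is a sketch assuming the usual normalised sample autocorrelation (mean removed, divided by the lag-0 value); the exact estimator behind Figure 8 is not specified in the text, and the function name is illustrative.

```python
def autocorrelation(samples, max_lag):
    """Normalised autocorrelation coefficients r[0..max_lag].

    r[0] is 1 by construction; for a random sequence the remaining
    coefficients should stay close to 0.
    """
    n = len(samples)
    mean = sum(samples) / n
    centred = [s - mean for s in samples]
    energy = sum(c * c for c in centred)
    if energy == 0:
        raise ValueError("constant sequence: autocorrelation is undefined")
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]


# Toy periodic input: strong coefficients at every lag reveal the period 2.
r = autocorrelation([0, 1] * 50, max_lag=2)
print(r)
```

A periodic input such as the toy sequence above yields coefficients close to ±1, whereas a sequence with no short period should give values near 0 for every lag other than 0.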

Fast Fourier Transform
The Fast Fourier Transform test examines the peak heights in the discrete Fourier transform of the sequence. It aims to detect repetitive patterns in the sequence analysed, which would indicate a deviation from the assumption of randomness [8].
If the sequence is random, all the maximum harmonics of the Fast Fourier Transform lie at approximately the same horizontal level, without an upward or downward trend. Figure 9 shows that all amplitude values fall within the same range, which means that the test is passed.
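The idea behind this test can be sketched as follows. The example assumes the formulation of the NIST spectral test, in which the bits are mapped to ±1 and the peak heights of the first n/2 DFT coefficients are compared with the threshold sqrt(ln(1/0.05)·n); for a random sequence about 95% of the peaks are expected to fall below it. A direct O(n²) DFT is used only to keep the example free of dependencies, and the function names are illustrative.

```python
import cmath
import math


def dft_peak_heights(bits):
    """Magnitudes of the first n/2 DFT coefficients of a +1/-1 sequence."""
    n = len(bits)
    x = [1 if b else -1 for b in bits]  # map 0 -> -1, 1 -> +1
    return [
        abs(sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n)))
        for j in range(n // 2)
    ]


def fraction_below_threshold(bits):
    """Share of peaks below the threshold sqrt(ln(1/0.05) * n)."""
    n = len(bits)
    threshold = math.sqrt(math.log(1 / 0.05) * n)
    peaks = dft_peak_heights(bits)
    return sum(p < threshold for p in peaks) / len(peaks)


# Toy periodic input (period 4): a single dominant peak appears at j = n/4.
bits = [1, 1, 0, 0] * 16
peaks = dft_peak_heights(bits)
print(peaks.index(max(peaks)), fraction_below_threshold(bits))
```

In practice an FFT routine would replace the direct sum; the point here is only that a periodic input concentrates its energy in a few tall peaks, whereas a random-looking input spreads it evenly.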

Distribution of identical samples
In this subsection, the distance of occurrence between samples of equal value is studied, because this measure is an important property of random sequences. The most probable distance between two identical samples of a perfect sequence is zero. If this distance increases, then the probability of coincidence between the two identical samples decreases following a Poisson distribution. Figure 10 shows that the distribution of samples of the proposed sequence is close to the ideal.
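The distances studied here can be collected with a routine like the one below. It is a sketch under the assumption that the distance is the number of other samples lying between two successive occurrences of the same value, so that adjacent repetitions have distance 0; the function name is illustrative. For independent, uniformly distributed samples over m values, the probability of a distance d is (1/m)(1 − 1/m)^d, which is largest at d = 0 and decays as d grows.

```python
from collections import Counter


def gap_histogram(samples):
    """Histogram of the distances between consecutive equal samples.

    Assumption: the distance is the number of other samples lying between
    two successive occurrences of the same value (0 if they are adjacent).
    """
    last_seen = {}
    histogram = Counter()
    for position, value in enumerate(samples):
        if value in last_seen:
            histogram[position - last_seen[value] - 1] += 1
        last_seen[value] = position
    return histogram


# Toy input: one adjacent repetition (distance 0) and two at distance 1.
print(gap_histogram([5, 5, 7, 5, 7]))
```

Normalising the histogram by the total number of repetitions gives an empirical distribution that can be compared with the ideal curve.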

Collisions of the sequence
Collisions are an intrinsic property of random sequences. In a sequence of integers modulo m, there are m different possible values. When a value appears again, we say that a collision has occurred. In Reference [44] an analysis of the collision problem is presented, based on the birthday paradox, which states that in a group of k people chosen at random, at least one pair will share a birthday with probability
$$p(k) = 1 - \frac{m!}{m^{k}\,(m-k)!},$$
where m is the number of days in the year and k is the number of people in the room. This paradox can be applied to hash functions. One of the desirable properties of cryptographic hash functions is that it is computationally infeasible to find a collision, that is, two different inputs for which the hash function produces the same output.
Suppose that we have a hash function of n bits, so that there are $m = 2^n$ possible output values. From this idea an inequality, Equation (3), can be deduced that provides an estimate of the number k of draws from a random sequence needed for the probability of a first collision to be greater than or equal to 0.5. From Equation (3), the first-collision probability density distribution $Dp_k$ can be deduced as a function of k, Equation (4). Figure 11 shows, as a red line, the first-collision probability density distribution function for a sequence of octets, that is, n = 8 and m = 256. It can be seen that the mode of the distribution is $k = 17 = 1 + \sqrt{m}$ and that for $k = 4\sqrt{m} = 64$ draws the collision probability density is practically zero.
Any perfectly random sequence must fit the first-collision probability density distribution function corresponding to Equation (4).
Figure 11 also shows a bar graph, with one bar for each value of k, for a GSS-sequence of $2^{20}$ octets. The bars fit the expected theoretical distribution very closely.
Figure 11. Distribution of the first collisions (blue bars) and collision probability density distribution function (red line).
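For orientation, the standard birthday-problem relations can be written as below. This is a sketch under the usual assumption of independent, uniformly distributed draws over m values; the symbol $P_k$ and the approximations are introduced here for illustration, and the exact form of Equations (3) and (4) may differ.

```latex
% Probability of at least one collision within k draws (exact, then approximated)
P_k = 1 - \prod_{i=1}^{k-1}\left(1 - \frac{i}{m}\right) \approx 1 - e^{-k(k-1)/(2m)}

% Number of draws needed for P_k >= 0.5 (from the approximation)
k \ge \frac{1 + \sqrt{1 + 8\,m\,\ln 2}}{2} \approx 1.18\,\sqrt{m}

% Probability that the first collision occurs exactly at draw k
Dp_k = P_k - P_{k-1} = \frac{k-1}{m}\prod_{i=1}^{k-2}\left(1 - \frac{i}{m}\right)
     \approx \frac{k-1}{m}\, e^{-(k-1)^2/(2m)}
```

The last approximation is maximal at $k - 1 = \sqrt{m}$, in agreement with the mode $k = 1 + \sqrt{m} = 17$ quoted for m = 256.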
As a curiosity, the first-collision probability density distribution function coincides with a Weibull distribution function for the variable k, that is, the distribution most often used to model data on reliability against catastrophes; in the present case, it models the number of random draws needed for a first collision to appear, which is also a catastrophe for a hash function.
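The theoretical curve for octets can be reproduced numerically. The following sketch assumes independent, uniformly distributed draws and computes the exact probability that the first repeated value appears at draw k; the function name is illustrative.

```python
def first_collision_pmf(m, k_max):
    """Exact probability that the first repeated value appears at draw k.

    Assumes independent draws, uniform over m values. Returns a dict
    {k: probability} for k = 2..k_max.
    """
    pmf = {}
    no_collision = 1.0  # probability that the first k-1 draws are all distinct
    for k in range(2, k_max + 1):
        pmf[k] = no_collision * (k - 1) / m
        no_collision *= 1 - (k - 1) / m
    return pmf


pmf = first_collision_pmf(m=256, k_max=257)
mode = max(pmf, key=pmf.get)

# Smallest k whose cumulative collision probability reaches 0.5.
cumulative, median = 0.0, None
for k, p in pmf.items():
    cumulative += p
    if cumulative >= 0.5:
        median = k
        break

print(mode, median)  # 17 20
```

For m = 256 the mode is k = 17, and a collision probability of at least 0.5 is reached after 20 draws. An empirical histogram of first collisions, obtained by restarting the count after each collision, can be compared bar by bar with these values.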

Diehard Battery of Tests
The Diehard battery of tests [7] is a reliable standard and a powerful instrument for the practical evaluation of the randomness of sequences produced by pseudo-random number generators. This tool is the first step in the evaluation process of cryptographic primitives. It cannot guarantee that a generator is perfectly random, but a generator that does not pass the test suite is not suitable for cryptographic applications.
The Diehard battery consists of 15 different independent statistical tests, some of them repeated with different parameters. The Diehard tests employ the chi-squared goodness-of-fit technique to calculate a p-value, which should be uniform on [0, 1) if the input file contains truly independent random bits. A bit stream is considered to fail when it yields p-values of 0 or 1 to six or more decimal places.
The GSS-sequences with characteristic polynomial of degree ≤ 27 have passed all the tests in the Diehard battery. Table 8 shows the results obtained with the Diehard battery for a sequence s(G) with characteristic polynomial $p(x) = x^{27} + x^{23} + x^{22} + x^{17} + 1$.

FIPS Test 140-2. Security Requirements for Cryptographic Modules
FIPS (Federal Information Processing Standard) Publication 140-2 is a U.S. government computer security standard [6] used to approve cryptographic modules. The National Institute of Standards and Technology (NIST) issued the FIPS 140-2 publication series to coordinate the requirements and standards for cryptographic modules that include both hardware and software components (last updated 2002).
FIPS 140-2 specifies four statistical tests for random number generators: the Monobit Test, the Poker Test, the Runs Test and the Long Runs Test. The proposed GSS-sequences with characteristic polynomials of degree ≤ 27 pass all these tests. The results are detailed below.
Figure 12. Run test (runs frequency) for a GSS-sequence with characteristic polynomials of degree ≤ 27. Observe that the test is passed both for the runs of zeros (red line) and for the runs of ones (blue line), since they all fall within the corresponding range specified by the green line.
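Three of these tests are simple enough to sketch directly. The code below assumes a block of 20,000 bits given as a list of 0/1 integers and uses the acceptance intervals as commonly quoted from FIPS 140-2; these bounds should be checked against the standard before any real use. The Runs Test is omitted for brevity, and the function names are illustrative.

```python
BLOCK_BITS = 20000  # block size assumed for all tests in this sketch


def _check_length(bits):
    if len(bits) != BLOCK_BITS:
        raise ValueError("the tests are defined on blocks of 20,000 bits")


def fips_monobit(bits):
    """Monobit test: the number of ones must lie strictly between the bounds."""
    _check_length(bits)
    return 9725 < sum(bits) < 10275


def fips_poker(bits):
    """Poker test on 5000 non-overlapping 4-bit blocks."""
    _check_length(bits)
    counts = [0] * 16
    for i in range(0, BLOCK_BITS, 4):
        b = bits[i:i + 4]
        counts[b[0] * 8 + b[1] * 4 + b[2] * 2 + b[3]] += 1
    x = 16 / 5000 * sum(c * c for c in counts) - 5000
    return 2.16 < x < 46.17


def fips_long_run(bits):
    """Long runs test: no run of 26 or more identical bits."""
    _check_length(bits)
    best = current = 1
    for prev, cur in zip(bits, bits[1:]):
        current = current + 1 if cur == prev else 1
        best = max(best, current)
    return best < 26


# Toy input: an alternating pattern is balanced but far from random.
alternating = [0, 1] * 10000
print(fips_monobit(alternating), fips_poker(alternating), fips_long_run(alternating))
```

The alternating pattern passes the monobit and long runs checks but fails the poker test, which illustrates why the four tests are applied together.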

Lempel-Ziv Compression Test
This test examines the number of cumulatively distinct patterns in the sequence. It consists of determining how far the analysed sequence can be compressed. If the sequence can be significantly compressed, it is considered to be non-random. The proposed GSS-sequences with characteristic polynomials of degree ≤ 27 pass this test with perfect results.
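Counting cumulatively distinct patterns can be sketched with an LZ78-style parse, in which the sequence is read from left to right and a new phrase is closed as soon as the current prefix has not been seen before. The exact variant and decision thresholds used for the results above are not stated in the text, so the function below is only an illustration with a hypothetical name.

```python
def lz78_phrase_count(bits):
    """Number of distinct phrases found by an LZ78-style left-to-right parse."""
    seen = set()
    phrase = ""
    for b in bits:
        phrase += str(b)
        if phrase not in seen:
            seen.add(phrase)  # shortest prefix not seen before: new phrase
            phrase = ""
    return len(seen)


# Toy inputs: highly regular sequences produce few distinct phrases.
print(lz78_phrase_count([0] * 10), lz78_phrase_count([0, 1] * 5))  # 4 5
```

A sequence that yields markedly fewer phrases than expected for random data of the same length is compressible and therefore suspect.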
As can be seen throughout this section, the analyzed generator meets all the requirements needed to be used in the field of cryptography, according to points 1-4 mentioned in Section 1. Further work would be to study the resistance of this generator against the cryptographic attacks reported in the literature (Section 1, point 5).

Conclusions
In this article, we have found a relationship between two families of binary sequences belonging to the class of decimation-based sequence generators: the t-modified self-shrunken sequences can be generated from a generalized self-shrinking generator. We have analysed this relationship from two different points of view, one treating them as binary sequences and the other using cyclotomic cosets. Furthermore, we have applied one of the most complete statistical test batteries to study the randomness of the sequences generated by the GSSG. In addition, we have reviewed some important graphical tests, together with basic and recent individual randomness tests found in the cryptographic literature. From the study in the last section, we can conclude that our random number generator (GSSG) produces good pseudo-random sequences, since the whole family of sequences generated with characteristic polynomials of degree less than or equal to 27 satisfactorily passes the most important batteries of tests. The results obtained confirm the potential use of the generalized self-shrunken sequences for cryptographic purposes.
With regard to future work on this subject, the concatenation of GSS sequences from different primitive polynomials of different degrees could be analysed and studied, as well as the resistance of this generator against cryptographic attacks reported in the literature. Another important future work would be to do a comparative study of our generator with other well-known generators used in cryptographic applications nowadays.
Author Contributions: All the authors have equally contributed to the reported research in conceptualization, methodology, software and manuscript revision.
Funding: This research received no external funding.