Gaussian Pseudorandom Number Generator Using Linear Feedback Shift Registers in Extended Fields

Abstract: A new proposal to generate pseudorandom numbers with Gaussian distribution is presented. The generator is a generalization to the extended field GF($2^n$) of the one using cyclic rotations of linear feedback shift registers (LFSRs) originally defined in GF(2). The rotations applied to LFSRs in the binary case are no longer needed in the extended field due to the implicit rotations found in the binary equivalent model of LFSRs in GF($2^n$). The new proposal is aligned with the current trend in cryptography of using extended fields as a way to speed up the bitrate of the pseudorandom generators. This proposal allows the use of LFSRs in cryptography to be taken further, from the generation of the classical uniformly distributed sequences to other areas, such as quantum key distribution schemes, in which sequences with Gaussian distribution are needed. The paper contains the statistical analysis of the numbers produced and a comparison with other Gaussian generators.


Introduction
Random number generators are of vital importance in many areas and, particularly, in cryptography. Most cryptographic algorithms and protocols make use of random or pseudorandom numbers. Encryption and authentication schemes in wireless and mobile communications, such as Bluetooth [1], IEEE 802.15.4, IEEE 802.11 WLAN [2], GSM [3] or LTE [4], employ pseudorandom numbers; radio frequency identification [5] standards define and recommend the utilization of true random numbers [6]. A large part of the pseudorandom number generators (PRNGs) used in cryptography are based on linear feedback shift registers (LFSRs), mainly due to their simplicity, low cost of implementation, good statistical behavior and the possibility of using a mathematical model that allows the generator to be designed for an optimal performance [7].
In fact, the maximal sequence length generated by an LFSR of m cells is $2^m − 1$. However, those sequences are highly predictable: the whole sequence can be reproduced if an eavesdropper gains access to 2m consecutive bits. Despite that, the LFSR is still an important part of cryptographic generators because its sequences are used to derive more robust ones while keeping the original statistical properties. Two main methods are applied to fix that weakness: nonlinear combination and nonlinear filtering. The former is based on several LFSRs, usually with different numbers of cells [3], and the latter on a single LFSR whose sequence is processed (filtered) by a nonlinear function [4].
Another advantage of using LFSRs in cryptography is that the sequences generated have a uniform statistical distribution. For all these reasons, there are many published works related to the LFSR, but only a few regarding its utilization to produce numbers with Gaussian distribution.
More precisely, in 2010, Kang [8] proposed a Gaussian PRNG, using an LFSR of length N = 4M bits, to generate pseudorandom numbers of (M + 4) bits. The numbers were produced by means of an accumulator applied to decimated M-bit numbers, producing a sequence of length $(2^N − 1)/(8N)$. In order to fix this LFSR oversizing, Condo et al. [9] proposed, in 2015, a Gaussian PRNG using permutations over the successive states of a single LFSR, thus reducing the implementation cost. The main drawback is that not all the possible permutations yield numbers with the required Gaussian distribution and, consequently, a high computational effort is needed to find the suitable permutations. Later, in 2020, Cotrina et al. [10] presented an improvement of Condo's PRNG focusing on a particular type of permutation of the LFSR state, the cyclic rotations. The authors concluded that more than 90% of the cyclic rotations are usable for the PRNG and produce Gaussian distributed numbers.
These proposals are based on the central limit theorem (CLT) [11], trying to obtain a Gaussian distribution using sequences of uniformly distributed numbers. In this sense, the first proposal [12] employed several LFSRs to generate independent sequences of numbers. However, most of the proposals use a unique LFSR to generate all the sequences, such as those of Kang [8], Condo [9] and Cotrina [10].
Following the same approach, the present paper describes a Gaussian PRNG based on an LFSR operated and defined in an extension field GF($2^n$) instead of using the binary field GF(2). An LFSR in GF($2^n$) can be represented as a combination of n LFSRs in GF(2). This fact, as described later, allows a particular implementation of the CLT. Furthermore, as can be deduced from the equivalent model, the cyclic rotations proposed by Cotrina [10], as a particular case of the permutations proposed by Condo [9], are implicitly included in the operations of an LFSR in GF($2^n$). This proposal is also in line with the trend of using cryptographic algorithms and protocols in extended fields to take advantage of the word length of processors [13].
On the other hand, the proposed generator is a way to keep using the LFSR as a basic element to generate pseudorandom numbers in cryptographic areas where a distribution other than uniform is required. An example of this is quantum key distribution (QKD) schemes [14]. This type of scheme, designed to establish keys between two endpoints, can be considered the most mature application of quantum communications. Currently, all developed countries have deployed QKD schemes, some experimentally (in controlled environments) and others in their current transit networks [15]. The first QKD schemes were based on the transmission of polarized photons using non-orthogonal states. These schemes, named discrete-variable QKD (DV-QKD) [16,17], require the utilization of specific components for single-photon detection. A different type of QKD scheme has been developed to carry information on some continuous properties of the light, such as the values of the quadrature components of a coherent state. The so-called continuous-variable QKD (CV-QKD) [18][19][20][21], currently deployed in several countries, e.g., China, Japan, Spain and Italy [15], presents a lower implementation cost due to the utilization of standard communications components. It uses coherent detection techniques usually employed in classical optical communications. Furthermore, it employs Gaussian modulation to send random amplitude and phase values that must be generated following a Gaussian distribution [10,14,22].
Section 2 introduces the fundamentals of the LFSR in the binary and extended Galois fields. Section 3 describes the binary equivalent model of an LFSR defined in GF($2^n$) and the proposal of a Gaussian PRNG based on this model. The statistical analysis of the numbers produced by the proposed PRNG is presented in Section 4 and compared to the Box-Muller [23,24] algorithm, a well-known algorithm not based on the CLT. Finally, conclusions are presented in Section 5.

LFSR Fundamentals
In this section, the basic properties of the linear feedback shift register (LFSR) (see Figure 1) and of its generated sequences are described.
A linear feedback shift register (LFSR) is a shift register whose input is a linear function of its previous state.
Definition 1 (cf. [25]). An LFSR of length m consists of m stages numbered 0, 1, 2, · · · , m − 1, each capable of storing one bit and having one input and one output, and a clock which controls the movement of data. During each unit of time the following operations are performed:
• The content of stage 0 is output and forms part of the output sequence.
• The content of stage i is moved to stage i − 1 for each i, 1 ≤ i ≤ m − 1.
• The new content of stage m − 1 is the feedback bit $a_j$, which is calculated by adding together modulo 2 the previous contents of a fixed subset of stages 0, 1, · · · , m − 1.
The values $q_i$ are either 0 or 1 and the feedback bit $a_j$ is the modulo 2 sum of the contents of those stages i, 0 ≤ i ≤ m − 1, for which $q_{m−i} = 1$. As a consequence, the output sequence of the LFSR is $A = (a_0, a_1, a_2, \cdots)$ and is uniquely determined by the following recursion:
$$a_j = (q_1 a_{j-1} + q_2 a_{j-2} + \cdots + q_m a_{j-m}) \bmod 2, \qquad j \ge m. \qquad (1)$$
The evolution of the LFSR and its generated sequences can be described by means of a polynomial whose coefficients are the values $q_i$ that represent the stages used to compute the feedback bit $a_j$. For this reason, the LFSR is denoted ⟨m, p(x)⟩, where $p(x) = 1 + q_1 x + q_2 x^2 + \cdots + q_m x^m$ is the connection polynomial.
The LFSR is said to be nonsingular if the degree of p(x) is m (that is, $q_m = 1$). If the initial content of stage i is $a_i ∈ \{0, 1\}$ for each i, 0 ≤ i ≤ m − 1, then $[a_{m−1}, \cdots, a_1, a_0]$ is called the initial state or seed of the LFSR.
On the other hand, the state of the LFSR at time t is denoted as
$$s^{(t)} = [a_{m-1+t}, \ldots, a_{1+t}, a_t],$$
which corresponds to the application of the recursion in Equation (1) t consecutive times starting with the seed $s^{(0)} = [a_{m−1}, \ldots, a_1, a_0]$.
Example 1. Consider the LFSR ⟨4, $1 + x + x^4$⟩. If the initial state of the LFSR is $s^{(0)} = [0, 0, 0, 0]$, the output sequence is the zero sequence A = (0, 0, · · · ). For the initial state $s^{(0)} = [0, 1, 1, 0]$, the sequence has a period of 15. Table 1 shows the successive states $s^{(t)}$. Note that the rightmost bit of each state constitutes the output sequence A = (0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, · · · ).
Definition 2 (cf. [25]). An output sequence $A = (a_0, a_1, \cdots)$ generated by an LFSR ⟨m, p(x)⟩ is said to be periodic if there exists $j_0 ∈ \mathbb{N}$ such that $a_i = a_{i+j_0}$ for all $i ∈ \mathbb{N}$. Such $j_0$ is called a period of the sequence.
From this definition, the sequence of the example is a periodic sequence with period L = 15.
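Example 1 can be reproduced with a minimal sketch of the recursion in Equation (1). The function name `lfsr_states` and the list encoding of the coefficients $q_1, \ldots, q_m$ are illustrative choices, not notation from the paper; the state is stored as $[a_{m−1+t}, \ldots, a_t]$, so the rightmost entry is the output bit.

```python
def lfsr_states(q, seed, steps):
    """Binary LFSR <m, p(x)> with p(x) = 1 + q[0] x + ... + q[m-1] x^m.

    q    : list [q_1, ..., q_m] of 0/1 coefficients (illustrative encoding)
    seed : initial state [a_{m-1}, ..., a_1, a_0]
    Returns the visited states and the output bits.
    """
    state = list(seed)
    states, output = [], []
    for _ in range(steps):
        states.append(tuple(state))
        output.append(state[-1])            # stage 0 is the rightmost entry
        # Feedback a_j = q_1 a_{j-1} + ... + q_m a_{j-m} (mod 2):
        # state[0] holds a_{j-1}, ..., state[m-1] holds a_{j-m}.
        fb = sum(qk * a for qk, a in zip(q, state)) % 2
        state = [fb] + state[:-1]           # shift right and insert the feedback bit
    return states, output


# Example 1: <4, 1 + x + x^4>, i.e. q_1 = 1 and q_4 = 1.
states, out = lfsr_states([1, 0, 0, 1], [0, 1, 1, 0], 16)
period = next(t for t in range(1, 16) if states[t] == states[0])
print(out)
print("period:", period)
```

The first 15 states are all distinct and nonzero, which is what a period of $2^4 − 1$ requires.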
One of the advantages of the LFSR is the mathematical model that allows one to predict the length of the sequences generated. The following definition and theorem state how and when the maximal length is reached by the sequences.
Definition 3 (cf. [25]). If $p(x) ∈ \mathbb{Z}_2[x]$ is a connection polynomial of degree m, then ⟨m, p(x)⟩ is called a maximum-length LFSR if the output sequence, with nonzero initial state, has period $2^m − 1$. This sequence is called an m-sequence.
Theorem 1 (cf. [7]). An output sequence A generated by an LFSR ⟨m, p(x)⟩ is an m-sequence if and only if the connection polynomial p(x) is a primitive polynomial. The sequence length is independent of the (nonzero) initial state.
Consequently, a primitive polynomial of degree m will generate a sequence of length $2^m − 1$ and the LFSR will run through $2^m − 1$ different nonzero states, that is, all possible nonzero states. Hence, if we consider each state as an m-bit pseudorandom number, we can say that LFSRs produce numbers with uniform distribution.
Besides their maximal length, m-sequences have many desirable statistical properties that can be summarized in Golomb's three randomness postulates [7], complemented by the ideal k-tuple distribution. Given a periodic binary sequence $A = (a_i)_{i∈\mathbb{N}}$ with period length $L = 2^m − 1$, it is said to be pseudorandom if the following properties hold.
1. Balance property. In every period, the number of zeros is nearly equal to the number of ones (the disparity does not exceed 1, or $\left|\sum_{i=0}^{L-1} (-1)^{a_i}\right| \le 1$).
2. Run property. In every period, half of the runs have length 1, one fourth have length 2, one eighth have length 3, and so on. For each of these lengths there are the same number of runs of 0's and runs of 1's.
3. Two-level autocorrelation. The autocorrelation function c(τ) is two-valued, given by
$$c(\tau) = \sum_{i=0}^{L-1} (-1)^{a_i + a_{i+\tau}} = \begin{cases} L & \text{if } \tau \equiv 0 \pmod{L}, \\ k & \text{otherwise,} \end{cases}$$
where k is a constant. If k = −1 for L odd, or k = 0 for L even, we say that the sequence has the ideal two-level autocorrelation function.
4. The ideal k-tuple distribution. In every period of A, if each nonzero k-tuple $(a_1, a_2, a_3, \cdots, a_k)$, 1 ≤ k ≤ m, occurs $2^{m−k}$ times and the zero k-tuple occurs $2^{m−k} − 1$ times, then we say that the sequence satisfies the ideal k-tuple distribution.
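These properties can be checked numerically on the m-sequence of Example 1. The sketch below is only illustrative: the helper names (`m_sequence`, `balance`, `autocorrelation`, `run_lengths`) are hypothetical, and the sequence is regenerated directly from the recursion $a_j = a_{j−1} + a_{j−4} \pmod 2$.

```python
def m_sequence():
    """One period of the m-sequence of <4, 1 + x + x^4> with seed [0, 1, 1, 0]."""
    a = [0, 1, 1, 0]                        # a_0, a_1, a_2, a_3
    while len(a) < 15:
        a.append((a[-1] + a[-4]) % 2)       # a_j = a_{j-1} + a_{j-4} (mod 2)
    return a


def balance(a):
    """Absolute difference between the number of zeros and ones in one period."""
    return abs(sum((-1) ** x for x in a))


def autocorrelation(a, tau):
    """Periodic autocorrelation c(tau) of the +/-1 version of the sequence."""
    L = len(a)
    return sum((-1) ** (a[i] ^ a[(i + tau) % L]) for i in range(L))


def run_lengths(a):
    """Lengths of the cyclic runs of equal symbols in one period."""
    L = len(a)
    start = next(i for i in range(L) if a[i] != a[i - 1])   # a run boundary
    b = a[start:] + a[:start]
    runs, count = [], 1
    for prev, cur in zip(b, b[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs


a = m_sequence()
print("balance:", balance(a))
print("c(tau), tau = 0..14:", [autocorrelation(a, t) for t in range(15)])
print("run lengths:", sorted(run_lengths(a)))
```

For this period-15 sequence there are 8 runs in total: four of length 1, two of length 2, one of length 3 and one of length 4, in line with the run property.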
LFSR over GF($2^n$)
Definition 5. Let (F, +, ∗) be a field with operations + and ∗. We say that F is a Galois field if the cardinality of F is finite. If the cardinality of the finite field F is p, then F is represented as GF(p).
It is possible to extend the prime field GF(p) to a field of $p^m$ elements, represented by GF($p^m$), which is called an extension field of GF(p). In our case we shall work with p = 2.
To build the extension field GF($2^n$) of GF(2), let p(x) be a primitive polynomial over GF(2) of degree n and let α be a root of this polynomial, so that p(α) = 0. Consider GF($2^n$) = $\{0, 1, α, α^2, \cdots, α^{2^n−2}\}$. It can be proven that the nonzero elements of GF($2^n$) can be represented by $2^n − 1$ distinct nonzero polynomials in α over GF(2) of degree n − 1 or less. The element 0 ∈ GF($2^n$) can be represented as the zero polynomial. Then the set GF($2^n$) with the usual operations is a Galois field with $2^n$ elements.

Example 2.
Let us build the finite field GF($2^4$) generated by the primitive polynomial $p(x) = 1 + x + x^4$. First of all, we consider α to be a root of the primitive polynomial p(x); then we determine all multiplicative powers of α, $α^0 = 1, α^1, \cdots$, until we obtain again the multiplicative identity 1 of this field. For any Galois field we will have three representations, as shown in Table 2.
Table 2 shows how all the elements of the extended field are obtained. The equivalence has been obtained using the vector notation as a function of one of the roots of the primitive polynomial that generates the extended field. In the same way, it can be observed that the field is generated cyclically and that no value is repeated until the cycle is complete and all its elements have been obtained.
Table 2. Representation of the Galois field GF($2^4$) over GF(2) with primitive polynomial $p(x) = 1 + x + x^4$ over GF(2).

Power Representation Polynomial Representation Vector Representation
According to Example 2, from this point on, we shall represent the elements of the extended fields using the vector notation, where for example the element $1 + α^2 + α^3$ will be represented as (1, 0, 1, 1).
To generate an LFSR over GF($2^n$), we start by determining a primitive polynomial p over GF(2) that will be used to generate the extended field. Once this polynomial and this field have been set, we choose a primitive polynomial g over GF($2^n$) and a seed or nonzero initial state value $β_0$ on which the primitive polynomial g is applied. In this way, a value $β_1$ is obtained by applying g to $β_0$; g is then applied to $β_1$, and the process is repeated.
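The three representations of Example 2 can be generated mechanically. In the following sketch (function names and the integer bit-mask encoding are illustrative assumptions), an element $c_0 + c_1 α + c_2 α^2 + c_3 α^3$ is stored as an integer whose bit i is $c_i$, and multiplication by α is a left shift followed by a reduction with $α^4 = 1 + α$.

```python
def gf_table(n, poly):
    """Successive powers of alpha in GF(2^n).

    poly is an integer bit mask of the primitive polynomial: bit i is the
    coefficient of x^i, so 1 + x + x^4 is 0b10011.
    """
    table = []
    elem = 1                                # alpha^0
    for _ in range(2 ** n - 1):
        table.append(elem)
        elem <<= 1                          # multiply by alpha
        if elem >> n & 1:                   # degree n reached: use p(alpha) = 0
            elem ^= poly
    return table


def as_vector(elem, n):
    """Vector (c_0, c_1, ..., c_{n-1}) of c_0 + c_1 alpha + ... + c_{n-1} alpha^(n-1)."""
    return tuple(elem >> i & 1 for i in range(n))


table = gf_table(4, 0b10011)
for power, elem in enumerate(table):
    print(f"alpha^{power:<2}", as_vector(elem, 4))
```

With this encoding, the element $1 + α^2 + α^3$ used in the text, with vector (1, 0, 1, 1), appears as $α^{13}$.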

Example 3.
We consider the field GF($2^6$) built using the primitive polynomial $q(x) = 1 + x + x^6$. Let us consider the primitive polynomial p(x) over GF($2^6$) whose vector representation is … Under these conditions, the output sequence will be …
As in GF(2), the LFSR in GF($2^n$) produces an m-sequence if and only if the connection polynomial is primitive. The m-sequence has a period of $2^{m·n} − 1$, which corresponds to all nonzero states, where m is the degree of the connection polynomial.
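The period claim can be illustrated on a deliberately small case. The sketch below is not the polynomial of Example 3: it assumes GF($2^2$) built from $1 + x + x^2$ and an LFSR with m = 2 cells, and it finds suitable feedback coefficients by brute force. The recurrence convention $a_{t+m} = \sum_i c_i a_{t+i}$ and all function names are assumptions made for illustration.

```python
from itertools import product


def gf_mul(a, b, n, poly):
    """Multiply two GF(2^n) elements (integers, bit i = coefficient of alpha^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n & 1:                      # reduce modulo the field polynomial
            a ^= poly
    return r


def lfsr_step(state, coeffs, n, poly):
    """state = [a_t, ..., a_{t+m-1}]; appends a_{t+m} = sum_i coeffs[i] * a_{t+i}."""
    new = 0
    for c, a in zip(coeffs, state):
        new ^= gf_mul(c, a, n, poly)        # addition in GF(2^n) is XOR
    return state[1:] + [new]


def period(coeffs, n, poly):
    """Steps needed to return to the seed [1, 0, ..., 0], or None if it never does."""
    m = len(coeffs)
    seed = [1] + [0] * (m - 1)
    state = seed
    for steps in range(1, 2 ** (m * n)):
        state = lfsr_step(state, coeffs, n, poly)
        if state == seed:
            return steps
    return None


def find_max_period(m, n, poly):
    """Exhaustive search for coefficients reaching the period 2^(m n) - 1."""
    for coeffs in product(range(2 ** n), repeat=m):
        if period(list(coeffs), n, poly) == 2 ** (m * n) - 1:
            return list(coeffs)
    return None


coeffs = find_max_period(2, 2, 0b111)       # GF(2^2) from 1 + x + x^2, m = 2
print(coeffs, period(coeffs, 2, 0b111))
```

Exhaustive search is only practical for toy sizes; for the field sizes discussed in the paper, primitive connection polynomials would have to be obtained by other means.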

Gaussian Number Generator over GF($2^n$)
To apply the CLT, Gaussian generators based on a single LFSR [8][9][10] try to obtain different sequences of pseudorandom numbers with uniform distribution from the same sequence generated by the LFSR. To do this, they apply permutations (generic in some cases, rotations in others) on each state of the LFSR. In this way, with each number generated by the LFSR, other numbers are also generated (as many as permutations are applied), and therefore different uniform sequences, suitable to be combined into a single sequence of numbers, are constructed simultaneously. LFSRs defined on an extended field can be analyzed as the combination of a series of binary LFSRs. For this reason, each state of an LFSR in an extended field is related to the states of a series of binary LFSRs. This relation is the one used to propose a PRNG with Gaussian distribution. Next, we describe the equivalent model of an LFSR defined in an extended field, which allows the definition of a Gaussian PRNG with excellent performance.

Binary Equivalent Model of an LFSR in GF($2^n$)
As is known [26], there is a relationship between the m-sequences generated in GF($2^n$) and those generated in GF(2), in such a way that the former can be obtained from the latter. The relationship is established from the feedback polynomials that define each LFSR. Specifically, if q(x) is the primitive polynomial of degree m that defines the LFSR in GF($2^n$), and therefore generates an m-sequence in GF($2^n$), the decimated sequences obtained by taking the j-th bit of each element in the m-sequence are generated by a binary LFSR with primitive feedback polynomial h(x) of degree m · n, where q(x) divides h(x) over GF($2^n$). This allows an equivalent model of the LFSR defined in GF($2^n$) to be developed using n LFSRs in GF(2), as shown in Figure 2, where the j-th bit of each element is generated by the j-th binary LFSR. Note that all the binary LFSRs used in the model have the same polynomial h(x), although they are initialized with different seeds. Therefore, the sequences generated by the binary LFSRs are m-sequences in GF(2) and, consequently, shifted versions of the same sequence. In this way, the generation of an m-sequence in GF($2^n$) implies the generation of n binary m-sequences, one for each bit, which can be used as sources with a uniform distribution to later apply the CLT.
Definition 6. Let ⟨m, q(x)⟩ be an LFSR over GF($2^n$). Let $s(t) = [a_{m−1+t}, a_{m−2+t}, \cdots, a_t]$ be the state of the LFSR at time t, with $a_i$ ∈ GF($2^n$), each of which can be represented as the vector $[a_{i,n−1}, \ldots, a_{i,1}, a_{i,0}]$ with $a_{i,j}$ ∈ GF(2) for all 0 ≤ j ≤ n − 1. Then, $Π_k$ is defined as the projection over the k-th component of an element. Hence, $Π_k(a_i) = a_{i,k}$ and $Π_k(s(t)) = s(t)_k = [a_{m−1+t,k}, a_{m−2+t,k}, \cdots, a_{t,k}]$.
This definition allows us to represent the decimated sequences $Π_k(A)$ of a sequence $A = (a_0, a_1, a_2, \cdots)$ as:
$$\Pi_k(A) = (a_{0,k}, a_{1,k}, a_{2,k}, \cdots).$$
Theorem 2. Let ⟨m, q(x)⟩ be an LFSR over GF($2^n$) and let ⟨$r_k$, $f_k(x)$⟩, k ∈ {1, 2, · · · , n}, be the binary LFSRs that generate its decimated sequences. Then q(x) is primitive over GF($2^n$) if and only if $f_k$ is primitive over GF(2) for all k ∈ {1, 2, · · · , n}. If these conditions hold, $f_j(x) = f_k(x) = h(x)$ for all j, k ∈ {1, 2, · · · , n} and $m \mid r_k$.
Therefore, if we have an LFSR ⟨m, q(x)⟩ over GF($2^n$) where q(x) is primitive, then we can consider that we have n m-sequences over GF(2).
In other words, this equivalent model represents the interleaving process that generates the sequence in GF($2^n$); that is, the sequence generated by the LFSR in GF($2^n$) can be expressed as an interleaved sequence, in the sense described by Gong in [27], composed of n component sequences, corresponding to the decimated sequences. More precisely, it is a primitive interleaved sequence, as all the component sequences are generated by the same primitive polynomial in GF(2).
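The equivalent model can be observed on a small case. The sketch below assumes GF($2^4$) built from $1 + x + x^4$ and an LFSR with m = 2 cells whose feedback coefficients are found by brute force (an illustrative shortcut, not the selection method of the paper). It then extracts the n = 4 decimated sequences $Π_k(A)$ and checks that they are cyclic shifts of one binary sequence of period $2^{m·n} − 1 = 255$. All names are hypothetical.

```python
from itertools import product

N, M, POLY = 4, 2, 0b10011                  # GF(2^4) from 1 + x + x^4, m = 2 cells
FULL = 2 ** (M * N) - 1                     # maximal period, 255


def gf_mul(a, b):
    """Multiply two GF(2^N) elements (integers, bit i = coefficient of alpha^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N & 1:
            a ^= POLY
    return r


def sequence(coeffs, length):
    """Elements a_0, a_1, ... of the LFSR over GF(2^N) for the seed [1, 0, ..., 0]."""
    state, out = [1] + [0] * (M - 1), []
    for _ in range(length):
        out.append(state[0])
        new = 0
        for c, a in zip(coeffs, state):
            new ^= gf_mul(c, a)
        state = state[1:] + [new]
    return out


def has_max_period(coeffs):
    """True if the state visits all FULL nonzero states before repeating."""
    seq = sequence(coeffs, FULL + M)
    states = [tuple(seq[t:t + M]) for t in range(FULL + 1)]
    return states[FULL] == states[0] and len(set(states[:FULL])) == FULL


def is_cyclic_shift(u, v):
    doubled = u + u
    return any(doubled[s:s + len(v)] == v for s in range(len(u)))


coeffs = next(c for c in product(range(2 ** N), repeat=M) if has_max_period(c))
seq = sequence(coeffs, FULL)
components = [[a >> k & 1 for a in seq] for k in range(N)]   # Pi_k(A), k = 0..3
print([is_cyclic_shift(components[0], c) for c in components])
```

Each component also contains 128 ones and 127 zeros per period, as expected from the balance property of a binary m-sequence of period 255.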
On the other hand, pseudorandom sequences must be difficult to reproduce. The linear complexity (or linear span) of a sequence is defined as the degree of the minimal polynomial that generates it or, equivalently, the length of the shortest LFSR that generates it. Consequently, the linear complexity of a sequence generated by a primitive LFSR is the length of that LFSR. Hence, considering the sequence produced by an LFSR of m cells in GF($2^n$) as an interleaved sequence, its linear complexity $LC_{ext}$ is n times the linear complexity $LC_{bin}$ of its primitive components; that is,
$$LC_{ext} = n \cdot LC_{bin}.$$

The Proposed Generator
Bearing in mind that one of the potential applications of a Gaussian PRNG based on LFSRs could be a QKD scheme, it is important to note the following requirements:
• The PRNG should allow a discrete set of values to be generated that is large enough to approximate the continuous probability distribution.
• The set of values generated must have a Gaussian probability distribution.
• The security of the system must allow the generation of a set of values with a sufficiently large cardinality.
• The generation of values should be as fast as possible and at the lowest implementation cost. In addition, for the system to be effective, the possibility of a hardware implementation must be considered.
• The system must allow the generation of the pseudorandom values with Gaussian distribution to be different for each of its executions.
In order to fulfil all the requirements, we propose a PRNG based on an LFSR in GF($2^n$). The PRNG is composed of two units: the control unit and the processing unit (see Figure 3). The control unit consists of an LFSR with m cells, defined by a primitive polynomial q(x) over GF($2^n$). This unit is responsible for the generation of the basic m-sequence from which all the sequences with uniform distribution are obtained for the subsequent application of the CLT. As described in the previous sections, if q(x) is primitive the LFSR generates an m-sequence, i.e., a sequence of maximal period $2^{n·m} − 1$. The processing unit has been designed with a two-fold objective. On the one hand, like many other cryptosystems, it applies a nonlinear filtering to the sequence produced by the LFSR in the control unit to make it harder for an eavesdropper to reproduce the whole sequence. On the other hand, this unit implements the operations that transform the statistical distribution from uniform to Gaussian, i.e., it implements the CLT. To do that, the operator $Π_k$ is applied on each LFSR state, thus producing n strings of m bits that correspond to segments of the n binary sequences in the equivalent model described in Section 3.1. Hence, applying the CLT, the pseudorandom number B is obtained by means of the integer sum of those n bitstrings as:
$$B = \sum_{k=0}^{n-1} D\big(\Pi_k(s(t))\big),$$
where D is the function that maps an m-bit vector $x = (x_0, x_1, \cdots, x_{m−1})$ into a decimal value,
$$D(x) = \sum_{i=0}^{m-1} x_i\, 2^i.$$
The period, range and accuracy of the proposed PRNG can be configured using two main parameters: m and n. The sequence of pseudorandom numbers has a period of $2^{mn} − 1$ because the feedback polynomial in the LFSR is a degree-m primitive polynomial in GF($2^n$). The range of the numbers is mainly determined by m, since each state of the LFSR contains n · m bits that are split into n strings of m bits, which are later summed to produce the pseudorandom number. The parameter n slightly increases the number of bits of the generated numbers, up to $m + \log_2(n)$, due to the carry of the integer sum. Thus, for m = 5 and n = 8, the PRNG generates values with a range of 5 + 3 = 8 bits.
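The bit-width figure can be verified with a short worked bound, assuming only that each of the n summands is an m-bit integer between 0 and $2^m − 1$:

```latex
\[
0 \;\le\; B \;\le\; n\,(2^m - 1) \;<\; n \cdot 2^m \;=\; 2^{\,m + \log_2 n},
\qquad\text{so } B \text{ fits in } \lceil m + \log_2 n \rceil \text{ bits.}
\]
\[
m = 5,\; n = 8:\qquad B_{\max} = 8 \cdot 31 = 248 \;<\; 2^{8} = 256 .
\]
```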
Finally, when the CLT is applied to obtain a Gaussian distribution, its accuracy is related to the number of uniformly distributed values that are summed. In this case, each pseudorandom number is obtained by summing n values. Hence, generating 16-bit pseudorandom numbers requires the utilization of an LFSR with 16 cells. If the LFSR operates in GF($2^4$), the PRNG will present a period of $2^{4·16} − 1 = 2^{64} − 1$, and its numbers are obtained from the addition of 4 uniformly distributed sequences. The accuracy may be increased by using 8 values instead of 4. In that case, the LFSR would work in GF($2^8$), which would also increase the period to $2^{16·8} − 1 = 2^{128} − 1$.
Regarding the speed of generation, although, generally, the use of LFSRs in GF(2 n ) is motivated by the speed increase that is achieved by generating n bits instead of 1 in each iteration, in this case, the use of the LFSR in GF(2 n ) pursues a second objective: to take advantage of the implicit relationship with n different binary sequences. This makes it easy to apply the CLT by using a single LFSR that is equivalent to n binary LFSRs. Therefore, the final rate of generating numbers with Gaussian distribution is the same as the rate at which a binary LFSR generates a bit. However, taking into account that the generated numbers are m bits long, the generation speed in bits per second is m times higher.
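The CLT stage of the generator can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: it uses GF($2^4$) from $1 + x + x^4$ with only m = 2 cells, finds maximal-period feedback coefficients by brute force, takes $x_0$ as the least significant bit in D, and omits the nonlinear filtering of the processing unit entirely. All names are hypothetical.

```python
from itertools import product

N, M, POLY = 4, 2, 0b10011                  # toy parameters: GF(2^4), m = 2 cells
FULL = 2 ** (M * N) - 1                     # period of the control LFSR, 255


def gf_mul(a, b):
    """Multiply two GF(2^N) elements (integers, bit i = coefficient of alpha^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N & 1:
            a ^= POLY
    return r


def step(state, coeffs):
    """Advance the LFSR over GF(2^N) by one clock."""
    new = 0
    for c, a in zip(coeffs, state):
        new ^= gf_mul(c, a)
    return state[1:] + [new]


def find_coeffs():
    """Brute-force search for feedback coefficients with period FULL."""
    seed = [1] + [0] * (M - 1)
    for coeffs in product(range(2 ** N), repeat=M):
        state, steps = step(seed, coeffs), 1
        while state != seed and steps < FULL:
            state, steps = step(state, coeffs), steps + 1
        if state == seed and steps == FULL:
            return coeffs
    return None


def D(bits):
    """Map a bit vector (x_0, ..., x_{m-1}) to the integer sum_i x_i 2^i."""
    return sum(x << i for i, x in enumerate(bits))


def gaussian_numbers(coeffs, seed, count):
    """B = sum_k D(Pi_k(s(t))) for `count` successive LFSR states."""
    state, out = list(seed), []
    for _ in range(count):
        out.append(sum(D([a >> k & 1 for a in state]) for k in range(N)))
        state = step(state, coeffs)
    return out


coeffs = find_coeffs()
values = gaussian_numbers(coeffs, [1, 0], FULL)
histogram = [values.count(v) for v in range(N * (2 ** M - 1) + 1)]
print(histogram)
```

With such small parameters the histogram is only roughly bell-shaped; the point of the sketch is the data flow (state, projections $Π_k$, map D, integer sum), not the statistical quality reported in the paper.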

Statistical Analysis
In this section, the distribution of the numbers generated by the proposed PRNG is analyzed. Several normality tests have been applied to identify the configurations that generate numbers with Gaussian distribution.

Distribution Fit Test
Goodness-of-fit tests are used to evaluate how well a proposed model fits or predicts a particular data set. Usually, test statistics compute deviations between the observed data and predictions from the model. The value of a test statistic is said to be statistically significant if it is found to be within the rejection area of the distribution of the test statistic under the assumption that the model is true. The rejection area is often the upper tail of that distribution, whose probability is the significance level α, typically 5% or 10%. In our work we have set this level to 10%. Therefore, a distribution fit test performs a goodness-of-fit hypothesis test with null hypothesis $H_0$ that the data were drawn from a population with a specific distribution, in this case the normal distribution, and the alternative hypothesis that they were not.
Usually, a statistical hypothesis test returns a value called the p-value. This value is used to reject or fail to reject the null hypothesis. This is done by comparing the p-value to a threshold value chosen beforehand, called the significance level α. When the p-value is less than α, the null hypothesis can be rejected. In the same way, the confidence level of the test is 1 − α. If we set the significance level to 5% and the p-value is greater than 5%, the null hypothesis stating that the data are distributed according to the normal distribution is not rejected at the 5 percent significance level. In the present context, the higher the p-value, the better the data fit the normal distribution.
According to the CLT [21], let $\{X_1, \ldots, X_n\}$ be a random sample of size n, that is, a sequence of independent and identically distributed random variables drawn from a distribution with expected value µ and finite variance $σ^2$. Suppose we are interested in the sample average $S_n = (X_1 + \cdots + X_n)/n$ of these random variables. By the law of large numbers, the sample averages converge in probability and almost surely to the expected value µ as n → ∞. The classical central limit theorem describes the size and the distributional form of the stochastic fluctuations around the deterministic number µ during this convergence. More precisely, it states that as n gets larger, the distribution of the difference between the sample average $S_n$ and its limit µ, when multiplied by the factor $\sqrt{n}$ (that is, $\sqrt{n}(S_n − µ)$), approximates the normal distribution with mean 0 and variance $σ^2$. For large n, the distribution of $S_n$ is close to the normal distribution with mean µ and variance $σ^2/n$. The usefulness of the theorem is that the distribution of $\sqrt{n}(S_n − µ)$ approaches normality regardless of the shape of the distribution of the individual $X_i$.
There exist different methods to distinguish whether or not the range of values in a distribution follows a Normal distribution. Due to the large number of values that we have obtained in all our tests, we have decided to use the Chi Square Test that will be described in the next Section 4.1.1.

Chi Square Test
The chi-square test [22,28] is used to test if a sample of data came from a population with a specific distribution. In this case we shall focus on this test to check if the distribution of numbers fits the normal distribution.
The χ 2 goodness-of-fit test examines the discrepancy between observed values and the values expected under some particular distribution of a random variable A.
The null hypothesis H 0 : The random variable A follows the normal distribution. The alternative hypothesis H 1 : The random variable A does not follow the normal distribution.
Let $a_1, a_2, \cdots, a_n$ be the observed values of a variable A. We proceed as follows:
• Categorize the observations into k categories.
• Calculate the frequencies $f_i$, where i ∈ {1, 2, · · · , k} and $f_i$ is the observed frequency of category i.
• Let $p_i$ be the probability that, under the null hypothesis, the random variable A belongs to category i. Calculate the expected frequencies $E_i = n p_i$ of the observations in category i.
• Under the null hypothesis, the random variables $f_1, f_2, \cdots, f_k$ follow a multinomial distribution with parameters $n, p_1, p_2, \cdots, p_k$.
• Compute the test statistic
$$\chi^2_g = \sum_{i=1}^{k} \frac{(f_i - E_i)^2}{E_i}.$$
• If n is large, then under the null hypothesis, the test statistic $χ^2_g$ approximately follows a $χ^2$ distribution with k − 1 − e degrees of freedom, where e is the number of estimated parameters.
• The expected value of the test statistic, under the null hypothesis, is k − 1 − e.
• Large and small values of the test statistic (compared to the expected value) suggest that the null hypothesis $H_0$ does not hold.
• If the p-value is small enough, the null hypothesis $H_0$ is rejected.
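The steps above can be sketched with the standard library only. The code below is an illustration under assumptions, not the procedure used for the reported results: the category boundaries are arbitrary, the mean and variance are estimated from the sample (so e = 2), and the data are drawn with Python's `random.gauss` rather than produced by the proposed generator. It returns the statistic and the degrees of freedom; turning them into a p-value needs the $χ^2$ distribution function, which is left to a statistics library.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the normal law N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def chi_square_normal(samples, edges):
    """Chi-square statistic against a normal law with estimated mean and variance.

    edges: increasing cut points defining k = len(edges) + 1 categories
           (-inf, e_1], (e_1, e_2], ..., (e_last, +inf).
    Returns (statistic, degrees of freedom).
    """
    n = len(samples)
    mu = sum(samples) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in samples) / n)
    bounds = [-math.inf] + list(edges) + [math.inf]
    k = len(bounds) - 1
    observed = [0] * k
    for x in samples:
        observed[sum(1 for e in edges if x > e)] += 1      # category containing x
    expected = [
        n * (normal_cdf(bounds[i + 1], mu, sigma) - normal_cdf(bounds[i], mu, sigma))
        for i in range(k)
    ]
    stat = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return stat, k - 1 - 2                                 # e = 2 estimated parameters


random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(10000)]
stat, dof = chi_square_normal(data, [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5])
print(f"statistic = {stat:.2f}, degrees of freedom = {dof}")
```

For normally distributed data the statistic should be of the same order as the degrees of freedom; values far above it point to a poor fit.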

Measures of Central Tendency, Dispersion, Kurtosis and Skewness
To check how well the generated values fit a Gaussian distribution, we have normalized the results by applying the elementary transformation
$$Z = \frac{X - \mu}{\sigma}. \qquad (11)$$
From here, we have determined the measures of central tendency of the variable, the measures of dispersion and the measures of skewness and kurtosis. The first thing to verify is that, if the degree of the feedback polynomial and the cardinality of the field are increased, the values obtained fit the normal distribution better, getting closer in each case to the standard values of the normal distribution.
If the numerical data have been normalized, in the sense that we have applied the typification (11), then we can set the various control parameters to verify whether or not the data follow a normal distribution.
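As a rough guide for the values of µ and σ to expect, one can use an idealized model that is an assumption of this note rather than a result of the paper: B is treated as the sum of n independent values, each uniformly distributed on $\{0, \ldots, 2^m − 1\}$. The real component sequences are shifted versions of one m-sequence and never produce the all-zero state, so the figures are approximations only.

```latex
\[
\mu_B = n\,\frac{2^m - 1}{2}, \qquad
\sigma_B^2 = n\,\frac{2^{2m} - 1}{12}.
\]
\[
m = 5,\; n = 8:\qquad
\mu_B = 8 \cdot \frac{31}{2} = 124, \qquad
\sigma_B^2 = 8 \cdot \frac{1023}{12} = 682, \qquad
\sigma_B \approx 26.1 .
\]
```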
The expected values for the normal distribution are as follows:
1. About 95% of the observations are within 2 times the standard deviation of the mean.
2. 95% of the values will be within 1.96 times the standard deviation from the mean (between −1.96 and 1.96). Approximately 68% of the observations are within one standard deviation of the mean (−1 to +1), and around 99.7% of the observations would be within three standard deviations of the mean (−3 to 3).
3. The skewness for a normal distribution is zero, and any symmetric data should have a skewness near zero.
4. Negative values for the skewness indicate data that are skewed left and positive values for the skewness indicate data that are skewed right. By skewed left, we mean that the left tail is long relative to the right tail. Similarly, skewed right means that the right tail is long relative to the left tail. If the data are multi-modal, then this may affect the sign of the skewness.
5. The kurtosis for a standard normal distribution is 3.
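These control parameters are straightforward to compute from a list of generated values. The helper below is a minimal sketch (the name `summary` and the returned dictionary layout are illustrative); it standardizes the data as in (11) and uses the population formulas, so the kurtosis is the non-excess one, equal to 3 for a normal law.

```python
import math


def summary(values):
    """Mean, standard deviation, skewness, kurtosis and 1/2/3-sigma coverage."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    z = [(x - mu) / sigma for x in values]            # standardized values, Eq. (11)
    skewness = sum(v ** 3 for v in z) / n             # 0 for symmetric data
    kurtosis = sum(v ** 4 for v in z) / n             # 3 for a normal distribution
    coverage = {c: sum(1 for v in z if abs(v) <= c) / n for c in (1, 2, 3)}
    return {
        "mean": mu,
        "std": sigma,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "coverage": coverage,                         # compare with 68%, 95%, 99.7%
    }


s = summary([-1, 0, 1])
print(s)
```

Applied to the output of a generator, the coverage entries can be compared directly with the 68%, 95% and 99.7% figures listed above.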

Results
In this section we will go on to show the results that have been obtained for the generation of numbers with a Gaussian distribution.
To illustrate the results obtained in the generation of numbers with a Gaussian distribution, a set of polynomials $\{q_i\}_{i∈\mathbb{N}}$ has been used for the generation of the base fields and a second set of polynomials has been used as connection polynomials. To generate all the extended fields, different primitive polynomials have been used. The degrees of these polynomials range from 4 to 16. The primitive polynomials that have been used are listed in Table 3.
Table 3. List of primitive polynomials used for the extended field generation.
Next, we describe the primitive polynomials that we have used as the connection polynomial for each extended field. Note that, due to the great length of the terms of each polynomial, they are given in hexadecimal.
First, we determine the arithmetic mean, the standard deviation and the quartiles of the numerical values obtained. As Table 5 shows, all the values are within the expected range, so they fit those of a Gaussian distribution. In the same way, we have generated all the quantiles and compared them with the theoretical quantiles of the Gaussian distribution; in every case the values fit the Gaussian model. Figure 4 plots the obtained quantiles against the quantiles of a normal distribution for some cases. To further support these results, we have computed the cumulative distribution function (CDF) of the values obtained and compared it with the CDF expected for the normal distribution. Figure 5 compares the CDF of the normal distribution with those of the results obtained.
The histograms of the results obtained have also been analyzed and compared with the corresponding histograms of the normal distribution. A continuity correction has been applied because a discrete set of values is being used, and we have again verified that the data fit a normal distribution. Figure 6 shows how the histograms tend to the normal distribution curve.
Although different normality tests were originally used to check whether the data fit the normal distribution, as Table 6 shows, the large number of data has led us to opt for the chi-square test, which allows us to check whether or not the data follow an approximately normal distribution. A minimum confidence level of 90% has been set, so the significance level is 10%, and the sequences obtained have been screened according to that confidence level. That is, given a set of values, the normality tests have been applied to verify whether or not the data follow a normal distribution. The output of these tests is a p-value. If the p-value obtained is greater than 90%, the sequence is considered valid; otherwise, it is discarded.
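As an illustration of this kind of test, the sketch below computes a chi-square goodness-of-fit statistic and its p-value against the standard normal distribution. It is not the Mathematica procedure used in this paper, and it rests on several assumptions: the function name `chi_square_normal_fit` is hypothetical, the data are taken to be already standardized, equiprobable bins are used, and the number of bins is odd so that the $k - 1$ degrees of freedom are even and the p-value has an exact closed form.

```python
import bisect
import math
import random
from statistics import NormalDist


def chi_square_normal_fit(z_values, bins=11):
    """Chi-square goodness-of-fit of standardized data against N(0, 1).

    Returns (statistic, p_value). The bins are equiprobable under N(0, 1),
    so every bin has the same expected count.
    """
    if bins < 3 or bins % 2 == 0:
        raise ValueError("bins must be odd and >= 3 so that bins - 1 is even")

    std_normal = NormalDist()
    # Interior bin edges: the i/bins quantiles of the standard normal law.
    edges = [std_normal.inv_cdf(i / bins) for i in range(1, bins)]

    counts = [0] * bins
    for z in z_values:
        counts[bisect.bisect_right(edges, z)] += 1

    expected = len(z_values) / bins
    statistic = sum((c - expected) ** 2 / expected for c in counts)

    # Survival function of the chi-square law with an even number of
    # degrees of freedom df = 2k: exp(-x/2) * sum_{i<k} (x/2)**i / i!
    half = statistic / 2.0
    k = (bins - 1) // 2
    tail = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return statistic, min(1.0, tail)


# Stand-in data (assumption): Python's own Gaussian generator.
rng = random.Random(2024)
sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
statistic, p_value = chi_square_normal_fit(sample)
print(statistic, p_value)
```

If the mean and the standard deviation were estimated from the same data instead of being fixed, the degrees of freedom would have to be reduced accordingly, and this closed form would no longer apply directly.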
The proposed generator has been tested for all possible polynomials: all combinations of primitive polynomials have been tested in the Mathematica environment, using the chi-square, Anderson-Darling and Shapiro tests. The results are exemplified in Table 6.
The proposed PRNG [24] has also been compared with the Box-Muller algorithm, a pseudo-random number sampling method that generates pairs of independent, standard, normally distributed (zero expectation, unit variance) random numbers from a source of uniformly distributed random numbers. If $U_1$ and $U_2$ are independent samples chosen from the uniform distribution on the unit interval (0, 1), then the variables defined as

$$Z_0 = \sqrt{-2 \ln U_1}\,\cos(2\pi U_2), \qquad Z_1 = \sqrt{-2 \ln U_1}\,\sin(2\pi U_2)$$

are independent random variables with a standard normal distribution. After executing the Box-Muller algorithm, we have found the following disadvantages.

• According to the results presented in Table 7, the p-values are better for our LFSR model than for the Box-Muller algorithm.
• The computational cost required to implement the Box-Muller algorithm is much higher.

Another way to improve the accuracy of the Gaussian distribution is to modify the way the m-bit strings are generated. If all the states of the LFSR are used, each cell is used in the generation of n consecutive pseudo-random numbers. Although the fit tests reveal very good results (see Section 4.2), reducing, or eliminating altogether, the number of values affected by the same cell would help to improve the accuracy. Therefore, we propose, as an alternative, to use one out of every m states. More formally, we propose to decimate the LFSR output by m. In this way, the period would be $(2^{mn} - 1)/\gcd(2^{mn} - 1, m)$, so $m$ and $n$ should be selected such that $\gcd(2^{mn} - 1, m) = 1$ in order to reach the same period $2^{mn} - 1$.
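The effect of decimation on the period can be checked with a few lines of Python. This is a minimal sketch of the formula above; the function names `decimated_period` and `keeps_full_period` are hypothetical.

```python
from math import gcd


def decimated_period(m, n):
    """Period of a sequence of period 2**(m*n) - 1 after decimation by m."""
    full_period = 2 ** (m * n) - 1
    return full_period // gcd(full_period, m)


def keeps_full_period(m, n):
    """True when decimating by m preserves the full period 2**(m*n) - 1."""
    return gcd(2 ** (m * n) - 1, m) == 1


for m, n in [(4, 4), (3, 2), (5, 4)]:
    print(m, n, decimated_period(m, n), keeps_full_period(m, n))
```

Since $2^{mn} - 1$ is always odd, any $m$ that is a power of two satisfies the condition for every $n$. By contrast, for $m = 5$ and $n = 4$ the period drops by a factor of 5, because 5 divides $2^{20} - 1$.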
In any case, it is important to note that the period is much greater than the range, i.e., $2^{mn} - 1 \gg 2^{m + \log_2(n)}$, which gives a probability of about 0.005 of generating the same number in 10,000 generated values.
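For reference, the Box-Muller baseline used in the comparison earlier in this section can be sketched as follows. This is the standard textbook transform, not the code used for the experiments; Python's `random` module stands in for the uniform source, and the helper name `box_muller_samples` is hypothetical.

```python
import math
import random


def box_muller(u1, u2):
    """Map two uniform samples (u1 in (0, 1], u2 in [0, 1)) to two N(0, 1) samples."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def box_muller_samples(count, rng):
    """Generate `count` standard normal samples from the uniform source `rng`."""
    samples = []
    while len(samples) < count:
        u1 = 1.0 - rng.random()  # shifts [0, 1) to (0, 1], so log(u1) is defined
        u2 = rng.random()
        samples.extend(box_muller(u1, u2))
    return samples[:count]


rng = random.Random(99)
values = box_muller_samples(20_000, rng)
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / len(values)
print(mean, variance)
```

Each pair of outputs requires a logarithm, a square root and two trigonometric evaluations, which is the kind of cost that an LFSR-based generator avoids.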

Conclusions
A new Gaussian PRNG has been proposed in this paper. It is based on a single LFSR and follows the same approach as the previous proposals [8][9][10]: generating a certain number of sequences of uniformly distributed numbers, as needed to apply the CLT. Unlike the previous proposals, no explicit permutations or rotations are applied to the successive LFSR states. Instead, the PRNG is operated in $GF(2^n)$ to take advantage of the relationship between the states and sequences generated in $GF(2^n)$ and $GF(2)$, which allows the m-sequences in $GF(2^n)$ to be represented as primitive interleaved sequences composed of m-sequences in $GF(2)$.
The PRNG, presented in Section 3.2, can be configured by means of two main parameters, $m$ (the number of cells in the LFSR) and $n$ (the dimension of $GF(2^n)$), which determine the period, $2^{m \cdot n} - 1$, and the range, $2^{m + \log_2(n)}$. The statistical analysis reveals an excellent behaviour when the fit tests are applied.
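The role of $m$ and $n$ in the range can be illustrated with a generic CLT construction. The sketch below is not the generator of Section 3.2: it assumes that each Gaussian value is obtained by adding $n$ uniformly distributed $m$-bit integers, which is consistent with the stated range $2^{m + \log_2(n)} = n \cdot 2^m$, and it uses Python's `random` module as a stand-in for the LFSR. The function name `clt_gaussian` is hypothetical.

```python
import math
import random


def clt_gaussian(m, n, rng):
    """Add n uniform m-bit integers; return the raw sum and its standardized value."""
    total = sum(rng.getrandbits(m) for _ in range(n))
    # Mean and variance of a sum of n discrete uniforms on {0, ..., 2**m - 1}.
    mean = n * (2 ** m - 1) / 2
    variance = n * (2 ** (2 * m) - 1) / 12
    return total, (total - mean) / math.sqrt(variance)


m, n = 8, 16
rng = random.Random(314)
pairs = [clt_gaussian(m, n, rng) for _ in range(20_000)]
totals = [t for t, _ in pairs]
z_values = [z for _, z in pairs]
print(min(totals), max(totals), n * 2 ** m)
```

Every raw sum lies between 0 and $n(2^m - 1)$, so it fits in $m + \log_2(n)$ bits when $n$ is a power of two, and the standardized values have approximately zero mean and unit variance.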
Finally, this PRNG is a way to keep using LFSRs in cryptographic applications where a uniform distribution is not required. As in other applications, the use of LFSRs in $GF(2^n)$ is motivated by the speed increase achieved by generating $n$ bits instead of 1 in each iteration. In this case, it also pursues a second objective: to take advantage of the implicit relationship with $n$ different binary sequences in order to implement the CLT more easily. As a result, the final rate of generating numbers with Gaussian distribution is the same as the rate at which a binary LFSR generates a bit. However, taking into account that the generated numbers are $m + \log_2(n)$ bits long, the generation speed in bits per second is $m + \log_2(n)$ times higher.