Pseudo Random Binary Sequence Based on Cyclic Difference Set

With the increasing reliance on technology, it has become crucial to secure every aspect of online information, and pseudo random binary sequences (PRBS) play an important role in this effort. PRBS underlie the mathematics behind the security of many protocols and cryptographic applications. This paper proposes a new PRBS, namely the MK (Mamun, Kumu) sequence, for security applications. The proposed sequence is generated from a primitive polynomial and a cyclic difference set over the elements of the field, and is binarized by quadratic residue (QR) and quadratic nonresidue (QNR). The cyclic difference set contributes to the randomness of the proposed sequence, while the QR/QNR-based binarization ensures uniformity of zeros and ones in the sequence. In addition, the proposed sequence has maximum cycle length and high linear complexity, both of which are required of sequences used in security applications. Several experiments are conducted to verify randomness, and the results are presented in support of the robustness of the proposed MK sequence. The randomness of the proposed sequence is evaluated with a popular statistical test suite, the NIST STS 800-22 package. The test results confirm that the proposed sequence is not affected by approximations of any kind and passes all statistical tests defined in the NIST STS 800-22 suite. Finally, the efficiency of the proposed MK sequence is verified by comparing it with some popular sequences in terms of uniformity of bit pattern distribution and linear complexity for sequences of different lengths. The experimental results indicate that the proposed sequence has better cryptographic properties than the existing ones.


Introduction
Pseudo random binary sequences (PRBS) are widely used in many applications such as wireless communications and cryptography [1][2][3][4]. In cryptography, many security protocols such as SSL/TLS and HTTP are developed based on pseudo random sequences. The randomness of a sequence indicates the degree of difficulty of predicting the next bit in that sequence, whether by physical or statistical analysis. Ideal random sequences can easily be produced from natural sources, for example, atmospheric noise, radioactive decay and other natural phenomena. However, such sequences cannot be reproduced mathematically because of variation in the natural sources [4,5]. Due to this disadvantage, sources of true random sequences are unreliable for practical computer applications. In contrast, pseudo random sequences are derived using mathematical formulas but have some of the standard properties observed in true random sequences. They can be regenerated using deterministic mathematics, and a long sequence can be produced in a short time from a small random seed. Reproducibility and features like true random sequence make pseudo random

Notation and Convention
Throughout this paper, we use the following notation to present definitions, properties and terms related to pseudo random sequences:

Primitive Polynomial
In field theory, a primitive polynomial is a minimal polynomial whose root is a primitive element determining the extension field. The multiplicative group $\mathbb{F}_{p^m}^*$ of a finite field $\mathbb{F}_{p^m}$ is cyclic and consists of the $p^m - 1$ non-zero elements. Every finite field has a generator, and every non-zero element can be represented as a power of the generator.

Definition 1.
A generator of a finite field $\mathbb{F}_{p^m}^*$ is an element of order $p^m - 1$, and the powers of the generator run through all elements of $\mathbb{F}_{p^m}^*$.
Let $g$ be a generator of $\mathbb{F}_{p^m}^*$. Any non-zero element is derived from a power of $g$, i.e., $g^i$ for $i = 0, 1, \cdots, p^m - 2$. The element $g^i$ is said to be primitive if and only if $\gcd(i, p^m - 1) = 1$. In particular, there are a total of $\phi(p^m - 1)$ different primitive elements [18] of $\mathbb{F}_{p^m}^*$, where $\phi(\cdot)$ denotes the Euler totient function [19].
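The counting statement above can be checked numerically in the simplest case $m = 1$, i.e., in the prime field. The following is a minimal sketch, not part of the original work; the function names `multiplicative_order`, `generators` and `euler_phi` are hypothetical helpers introduced only for illustration, and the brute-force approach is practical only for small primes.

```python
from math import gcd


def multiplicative_order(a, p):
    """Smallest k >= 1 with a^k = 1 (mod p), for a not divisible by the prime p."""
    order, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        order += 1
    return order


def generators(p):
    """All generators of the multiplicative group of F_p (brute force)."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]


def euler_phi(n):
    """Euler totient function by direct counting (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


if __name__ == "__main__":
    p = 7
    gens = generators(p)
    print(gens, euler_phi(p - 1))
    # Powers g^i with gcd(i, p - 1) = 1 are exactly the primitive elements.
    g = gens[0]
    print(sorted(pow(g, i, p) for i in range(p - 1) if gcd(i, p - 1) == 1))
```

For $p = 7$ the group has order 6, so the sketch should report $\phi(6) = 2$ generators.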

Definition 2.
A polynomial $f(x)$ is said to be primitive if and only if its root $\omega$ generates the cyclic group consisting of all elements of $\mathbb{F}_{p^m}^*$.
The following two conditions hold for $f(x)$ to be a primitive polynomial:
It is well known that the number of primitive polynomials of degree $m$ over $\mathbb{F}_p$ is $\phi(p^m - 1)/m$.

Theorem 1.
For a generator $g$ of $\mathbb{F}_{p^m}^*$, the non-zero element $g^{(p^m-1)/(p-1)}$ lies in the prime field $\mathbb{F}_p$ and is a generator of $\mathbb{F}_p^*$ as well.
Proof. Let $g$ be a generator of $\mathbb{F}_{p^m}^*$, whose order is $p^m - 1$. Then, for a non-zero element $g^i$, the order is
$$\mathrm{ord}(g^i) = \frac{p^m - 1}{\gcd(i,\, p^m - 1)}.$$
Therefore, for $g^{(p^m-1)/(p-1)}$, the order is $p - 1$. This implies that $g^{(p^m-1)/(p-1)}$ is a generator of $\mathbb{F}_p^*$.

Theorem 2.
A polynomial of degree n over a finite field has at most n roots.
Proof. We prove the claim by induction on $n$. The result is clearly true for $n = 0$ and $n = 1$. Assume that every polynomial of degree $k < n$ has at most $k$ roots. If $a$ is a root of $f(x)$, a polynomial of degree $n$ over a field, then $f(x) = (x - a)q(x)$ where $q(x)$ has degree $n - 1$. If $f(x)$ has no root other than $a$, we are done. On the other hand, if $f(b) = 0$ then $(b - a)q(b) = 0$, so either $a = b$ or $q(b) = 0$. Since $q(x)$ has at most $n - 1$ roots by the induction hypothesis, it follows that $f(x)$ has at most $n$ roots.

Theorem 3.
For every element $a \in \mathbb{F}_{p^m}^*$, $a^{p^m - 1} = 1$.
Proof. Let $t$ be the order of $a$ in $\mathbb{F}_{p^m}^*$, i.e., the least positive integer for which $a^t = 1$. Then $F_{sub} := \{1, a, a^2, \cdots, a^{t-1}\}$ is a subgroup of $\mathbb{F}_{p^m}^*$. Since $t$ divides the group order $p^m - 1$, we have
$$a^{p^m - 1} = (a^t)^{(p^m - 1)/t} = 1.$$

Quadratic Residue and Quadratic Nonresidue
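An element $a$ in the finite field $\mathbb{F}_p^*$ is a quadratic residue modulo $p$ if it is congruent to a perfect square in $\mathbb{F}_p$, i.e., there exists an element $x$ such that
$$x^2 \equiv a \pmod{p}.$$
If there is no such $x$, then $a$ is called a quadratic nonresidue modulo $p$. In this work, we utilize quadratic residues in the extension field $\mathbb{F}_{p^m}^*$ for odd $p$.
Proof. Let $a$ be an element of $\mathbb{F}_{p^m}^*$. By Theorem 3, $a^{p^m-1} = 1$. It follows that every element $a \in \mathbb{F}_{p^m}^*$ is a root of the polynomial $x^{p^m-1} - 1$. On the other hand, by Theorem 2 this polynomial can have at most $p^m - 1$ roots in $\mathbb{F}_{p^m}^*$. From both facts, it can be concluded that $x^{p^m-1} - 1 = 0$ has exactly $p^m - 1$ roots. Consequently, since
$$x^{p^m-1} - 1 = \left(x^{(p^m-1)/2} - 1\right)\left(x^{(p^m-1)/2} + 1\right),$$
and every quadratic residue $a = b^2$ satisfies $a^{(p^m-1)/2} = b^{p^m-1} = 1$, the quadratic residues are roots of the first factor, which has at most $(p^m-1)/2$ roots by Theorem 2. Therefore, $|QR| \le (p^m-1)/2$. On the other hand, by Theorem 2 the polynomial $x^2 - a$ has at most two roots for any quadratic residue $a$, so the $p^m - 1$ non-zero elements have at least $(p^m-1)/2$ distinct squares. Therefore, $|QR| \ge (p^m-1)/2$. We conclude that $|QR| = (p^m-1)/2$, that QR is equal to the set of roots of $x^{(p^m-1)/2} - 1$, and that QNR is equal to the set of roots of $x^{(p^m-1)/2} + 1$.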
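The counting argument can be illustrated in the prime field ($m = 1$), where squaring and exponentiation are plain modular arithmetic. The sketch below is an illustration only; `quadratic_residues` and `is_qr_euler` are hypothetical names, and the test based on $a^{(p-1)/2}$ is the standard Euler criterion for an odd prime $p$.

```python
def quadratic_residues(p):
    """Set of non-zero squares modulo an odd prime p, found by brute force."""
    return sorted({(x * x) % p for x in range(1, p)})


def is_qr_euler(a, p):
    """Euler's criterion: a is a quadratic residue iff a^((p-1)/2) = 1 (mod p)."""
    return pow(a, (p - 1) // 2, p) == 1


if __name__ == "__main__":
    p = 467  # prime used in the NIST experiment of this paper
    qr = set(quadratic_residues(p))
    print(len(qr), (p - 1) // 2)
    # Both characterisations of QR should agree on every non-zero element.
    print(all(is_qr_euler(a, p) == (a in qr) for a in range(1, p)))
```

Exactly half of the non-zero elements should be reported as quadratic residues, which is the property the binarization relies on for balance.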

Linear Complexity
Linear complexity is a measure of the unpredictability of a sequence. A sequence of low linear complexity can be easily determined if a number of consecutive terms of the sequence are known. Only $2l$ consecutive terms are required to recover a sequence of linear complexity $l$. Therefore, high linear complexity is a fundamental requirement for sequences used in security applications.

Definition 4.
Linear complexity is defined as the length of the shortest linear feedback shift register (LFSR) that can generate the sequence. Linear complexity is considered to be zero for a sequence of length zero.
Let $S$ be a sequence of period $\lambda$. The linear complexity $LC(S)$ is given by
$$LC(S) = \lambda - \deg\left(\gcd\left(x^{\lambda} - 1,\ h_S(x)\right)\right), \qquad (4)$$
where $h_S(x)$ is called the generating polynomial. For a sequence $S_\lambda = \{s_i\}$ where $i = 0, 1, \cdots, \lambda - 1$, the generating polynomial is defined as
$$h_S(x) = \sum_{i=0}^{\lambda-1} s_i x^i.$$
This work focuses on binary sequences. Therefore, the gcd in (4) is computed over $\mathbb{F}_2$. A popular algorithm, the Berlekamp-Massey algorithm [20], can find the linear complexity in $\mathbb{F}_{p^m}$.
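For a binary sequence, the gcd-based expression can be evaluated directly. The following sketch assumes the standard identity $LC(S) = \lambda - \deg(\gcd(x^\lambda - 1, h_S(x)))$ with $h_S(x) = \sum_i s_i x^i$ over $\mathbb{F}_2$; polynomials are encoded as Python integers (bit $i$ is the coefficient of $x^i$), and the helper names are hypothetical.

```python
def gf2_mod(a, b):
    """Remainder of a divided by b (b != 0) for polynomials over F_2 stored as ints."""
    deg_b = b.bit_length()
    while a.bit_length() >= deg_b:
        # Cancel the leading term of a with a shifted copy of b (XOR = addition in F_2).
        a ^= b << (a.bit_length() - deg_b)
    return a


def gf2_gcd(a, b):
    """Euclidean algorithm for polynomials over F_2."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a


def linear_complexity_periodic(bits):
    """Linear complexity of the periodic binary sequence whose one period is `bits`."""
    lam = len(bits)
    h = sum(bit << i for i, bit in enumerate(bits))  # generating polynomial h_S(x)
    x_lam_minus_1 = (1 << lam) | 1                   # x^lam - 1 equals x^lam + 1 over F_2
    g = gf2_gcd(x_lam_minus_1, h)
    return lam - (g.bit_length() - 1)                # lam - deg(gcd)


if __name__ == "__main__":
    # One period of the m-sequence defined by s[n+3] = s[n+1] + s[n] (mod 2).
    print(linear_complexity_periodic([1, 0, 0, 1, 0, 1, 1]))
```

The m-sequence in the usage example is generated by a 3-stage LFSR, so the reported linear complexity should be 3.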

Cyclic Difference Set
In this section, we introduce a cyclic difference set that differs from the ones proposed in [21][22][23][24].
In this work, we use differences between elements of the extension field to change the order of the elements, and we refer to the result as a cyclic difference set. For a given prime $p$ and a positive integer $m$, any extension field element $X_i \in \mathbb{F}_{p^m}^*$ can be presented as:
The construction of the cyclic difference set from the elements $X_i$ for $i = 0, 1, 2, \cdots, (p^m - 2)$ is given below:
The cyclic difference set randomizes the sequence bits by changing the order of the extension field elements, which are converted to sequence bits later.

Generation Algorithm
For a given prime $p$ and a positive integer $m$, the generation of the proposed pseudo random binary sequence $S = \{s_0, s_1, s_2, \cdots, s_{p^m-2}\}$ of length $\lambda = p^m - 1$ is presented here. The procedure consists of four phases, described below (Algorithm 1):
(1) Primitive polynomial and primitive element: Generate a primitive polynomial $f(x)$ of degree $m$ over $\mathbb{F}_p$ as defined in Section 2.2. Let $\omega$ be a primitive root of $f(x)$, with each power represented as $\omega^i = \sum_{j=0}^{m-1} c_j x^j$. A primitive root is a reduced residue of order $p^m - 1$.
(2) Generation of all elements in $\mathbb{F}_{p^m}^*$: Every element in $\mathbb{F}_{p^m}^*$ is congruent to some power $\omega^i \bmod f(x)$ of $\omega$, and $i$ can be reduced mod $p^m - 1$. Therefore, any element $X_i$ in $\mathbb{F}_{p^m}^*$ can be generated as follows:
$$X_i \equiv \omega^i \pmod{f(x)}, \quad i = 0, 1, \cdots, p^m - 2.$$
(3) Generation of cyclic difference set: Now, generate the cyclic difference set from the elements $X_i$ for $i = 0, 1, 2, \cdots, (p^m - 2)$ as described in Section 3.1:
(4) Binary sequence using quadratic residue: For any element $a$ of the cyclic difference set, the sequence element $s_i$ of the proposed sequence $S = \{s_0, s_1, \cdots, s_{p^m-2}\}$ of length $\lambda = p^m - 1$ is generated using the quadratic residue criterion, i.e., $a^{(p^m-1)/2} = 1$, as follows:
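The first, second and fourth phases can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: field elements are tuples of coefficients (lowest degree first), the function names are hypothetical, the mapping "QR gives 0, QNR gives 1" is an assumed convention, and the cyclic difference set phase is deliberately omitted because its construction is not reproduced in this text. The example uses the primitive polynomial $x^3 + 3x + 2$ with $p = 7$, $m = 3$ that appears in the experiments.

```python
def mul_mod(a, b, f, p):
    """Product of a and b in F_p[x]/(f); f is monic, coefficients lowest degree first."""
    m = len(f) - 1
    prod = [0] * (2 * m - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Reduce high-degree terms using x^m = -(f_0 + f_1 x + ... + f_{m-1} x^{m-1}).
    for d in range(2 * m - 2, m - 1, -1):
        c = prod[d]
        if c:
            prod[d] = 0
            for k in range(m):
                prod[d - m + k] = (prod[d - m + k] - c * f[k]) % p
    return tuple(prod[:m])


def power(a, e, f, p):
    """a^e in F_p[x]/(f) by square-and-multiply."""
    m = len(f) - 1
    result = (1,) + (0,) * (m - 1)
    while e:
        if e & 1:
            result = mul_mod(result, a, f, p)
        a = mul_mod(a, a, f, p)
        e >>= 1
    return result


def field_elements(f, p):
    """Phase 2: X_i = omega^i mod f(x) for i = 0 .. p^m - 2, with omega = x (assumes m >= 2)."""
    m = len(f) - 1
    omega = (0, 1) + (0,) * (m - 2)
    x_i = (1,) + (0,) * (m - 1)
    elements = []
    for _ in range(p ** m - 1):
        elements.append(x_i)
        x_i = mul_mod(x_i, omega, f, p)
    return elements


def qr_bit(a, f, p):
    """Phase 4 (assumed convention): 0 if a is a quadratic residue, 1 otherwise."""
    m = len(f) - 1
    one = (1,) + (0,) * (m - 1)
    return 0 if power(a, (p ** m - 1) // 2, f, p) == one else 1


if __name__ == "__main__":
    P, F = 7, (2, 3, 0, 1)  # f(x) = x^3 + 3x + 2 over F_7
    elems = field_elements(F, P)
    bits = [qr_bit(a, F, P) for a in elems]  # difference-set reordering NOT applied
    print(len(set(elems)), sum(bits), bits[:8])
```

Without the reordering phase, the bits simply alternate, because $\omega^i$ is a quadratic residue exactly when $i$ is even. This is consistent with the paper's remark that the cyclic difference set is what changes the order of the elements before binarization.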

Experimental Results
In this section, we evaluate our proposed MK sequence. First, the effectiveness is verified using NIST STS [16,17]. Then, experimental results of linear complexity and uniformity are presented. Finally, a comparison with existing NTU sequence is presented.

Randomness Analysis
In cryptography, PRBS are adopted in many applications as the primary security component. Therefore, the quality of a PRBS must be verified with standard statistical measures before practical use. Several statistical test suites, such as NIST, DIEHARD, Gustafson, the CryptXS suite, and the tests of Donald Knuth [25][26][27][28], are available to verify the randomness of a sequence. However, NIST is regarded as the most complete test suite for this purpose. NIST is composed of 15 statistical tests that measure different behaviors of binary sequences to verify their randomness. The tests are independent and reveal various deviations from random behavior. Each test computes a probability value, called the p value, from the given binary sequence. The p value falls within the range [0, 1]. A p value equal to 1 means that the sequence is random, while a p value equal to 0 means that the sequence is not random. When the p value is greater than a given significance level $\alpha \in (0, 1)$, the sequence is considered random with a confidence of $1 - \alpha$. Otherwise, it is not considered random.
This work uses $\alpha = 0.01$, as suggested in studies [29][30][31]. A value of $\alpha = 0.01$ indicates that one sequence out of a hundred is expected to be rejected. A sequence is random with a confidence of 99% when the p value is higher than 0.01. Similarly, it is not random with a confidence of 99% when the p value is less than 0.01. The range of acceptable proportions is determined by the following expression:
$$(1 - \alpha) \pm 3\sqrt{\frac{\alpha(1 - \alpha)}{n}},$$
where $n$ is the sample size. For sample size $n = 1000$, the acceptable interval is [0.98056, 0.99943]. Any proportion outside of this interval is regarded as non-random [25]. The randomness of the proposed MK sequence is verified using the NIST 800-22 test suite. A sequence of at least $10^6$ bits is applied as input to the NIST test suite. The experimental result is listed in Table 1. The experiment is conducted using the primitive polynomial $x^3 + x + 3$, $p = 467$ and $m = 3$. In the NIST test suite, some tests, such as the random excursions variant, random excursions, and non-overlapping template tests, consist of several subtests. Therefore, the minimum and maximum results for those tests are listed in Table 1. The results in Table 1 demonstrate that the MK sequence passes all tests defined in the NIST suite.
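As an illustration of how a single test turns a bit sequence into a p value, the sketch below implements the frequency (monobit) test, the first of the 15 NIST tests. It is not the NIST reference code; the function names are hypothetical, and the decision rule "p value at least $\alpha$" follows the convention of the NIST documentation.

```python
import math


def monobit_p_value(bits):
    """Frequency (monobit) test: p value for the balance of 0's and 1's."""
    n = len(bits)
    s_n = sum(2 * b - 1 for b in bits)   # map 0 -> -1 and 1 -> +1, then sum
    s_obs = abs(s_n) / math.sqrt(n)      # normalised test statistic
    return math.erfc(s_obs / math.sqrt(2))


def is_random(p_value, alpha=0.01):
    """Accept the sequence as random at significance level alpha."""
    return p_value >= alpha


if __name__ == "__main__":
    sample = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
    p = monobit_p_value(sample)
    print(round(p, 6), is_random(p))
```

A perfectly balanced sequence yields a p value of 1 for this test, and a heavily biased one yields a p value close to 0.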
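The acceptable interval quoted above can be reproduced with a few lines. The sketch assumes the proportion rule given in the NIST SP 800-22 documentation, $(1 - \alpha) \pm 3\sqrt{\alpha(1 - \alpha)/n}$; the function name is hypothetical.

```python
import math


def acceptable_proportion_interval(n, alpha=0.01):
    """Interval of acceptable pass proportions for n tested sequences."""
    p_hat = 1.0 - alpha
    half_width = 3.0 * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    low, high = acceptable_proportion_interval(1000)
    print(f"{low:.5f} {high:.5f}")
```

For $n = 1000$ and $\alpha = 0.01$, the computed bounds agree with the interval stated in the text up to rounding in the last digit.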

Linear Complexity Analysis
In this experiment, the linear complexity and linear complexity profile of different sequences are analyzed to understand the statistical behavior of the proposed sequence. As measures of unpredictability, these properties are extensively studied in cryptography. The linear complexity is the length of the shortest linear feedback shift register (LFSR) that generates the sequence [6]. Similarly, the $n$-th linear complexity is the length of the shortest LFSR that can produce the first $n$ elements of the sequence. The series of $n$-th linear complexities is called the linear complexity profile. In this work, the Berlekamp-Massey algorithm [20] is used to derive both the linear complexity and the linear complexity profile of the MK sequence. It should be noted that the linear complexity is expected to be $n/2$ for a sequence of length $n$ [32][33][34]. Table 2 summarizes the linear complexity analysis of the MK sequence for different sets of $p$ and $m$. The numerical results in Table 2 demonstrate that, for length $n$, the proposed sequence has a linear complexity of $n/2$, which is equal to the ideal value. Figure 1 shows the linear complexity profile of the MK sequence for the primitive polynomial $x^3 + 3x + 2$, $p = 7$ and $m = 3$ in the extension field $\mathbb{F}_{7^3}$. In Figure 1, the green line represents the linear complexity profile of an ideal random sequence, and the red line represents that of the proposed MK sequence. The linear complexity profile curve in Figure 1 closely approximates the ideal $n/2$ line over the length of the sequence. The experimental results indicate that the proposed sequence has the linear complexity expected of an ideal sequence.
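A linear complexity profile of the kind plotted in Figure 1 can be obtained with the Berlekamp-Massey algorithm over $\mathbb{F}_2$. The following is a textbook-style sketch, not the code used for the reported experiments; the function name is hypothetical.

```python
def linear_complexity_profile(bits):
    """Berlekamp-Massey over F_2; returns [L_1, ..., L_n] for the prefixes of `bits`."""
    n = len(bits)
    c = [0] * (n + 1)  # current connection polynomial C(x)
    b = [0] * (n + 1)  # copy of C(x) from before the last length change
    c[0] = b[0] = 1
    length, last = 0, -1
    profile = []
    for i in range(n):
        # Discrepancy between bits[i] and the prediction of the current LFSR.
        d = bits[i]
        for j in range(1, length + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - last
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]       # C(x) <- C(x) + x^shift * B(x)
            if 2 * length <= i:
                length = i + 1 - length    # the LFSR must grow
                last = i
                b = previous
        profile.append(length)
    return profile


if __name__ == "__main__":
    # Two periods of the m-sequence with s[n+3] = s[n+1] + s[n] (mod 2).
    seq = [1, 0, 0, 1, 0, 1, 1] * 2
    print(linear_complexity_profile(seq))
```

The last entry of the profile is the linear complexity of the whole input. Plotting the profile against the prefix length and comparing it with the line $n/2$ gives a figure of the same type as Figure 1.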

Result of Uniformity
Evaluating the randomness of a PRBS is a challenging task. Some important measures are introduced in Golomb's postulates, which form the basic properties that a pseudorandom sequence must have in order to look random. One important measure in Golomb's postulates [35][36][37] is the uniformity of bits in a sequence, which is determined by the number of 0's and 1's in it. A random sequence of $n$ bits is expected to have approximately $n/2$ 0's and $n/2$ 1's. Inspired by this postulate, we study the bit pattern distribution of the proposed MK sequence to evaluate its uniformity.
The experimental results of the bit pattern distribution of the proposed sequence for different sets of $p$ and $m$ are presented in Tables 3 and 4. The experiments are conducted using the primitive polynomials $x^5 + 4x + 2$ and $x^3 + x + 3$, respectively. For any bit pattern, the number of 0's is almost equal to the number of 1's. The experimental results indicate that QR/QNR-based binarization generates a uniform sequence with an equal number of 0's and 1's. This uniform behavior is consistent and continues even when the considered pattern length is increased.
Table 3. Result of uniformity test for p = 5, m = 5.

Evaluation by Comparison
This section evaluates the proposed sequence by comparing it with another sequence generated from a primitive polynomial. We consider the NTU sequence [9] for this purpose because it is derived from a primitive polynomial, the trace function and the Legendre symbol, which makes its construction close to ours and the comparison fair. The linear complexity and the uniformity of bit pattern distribution are taken into consideration when comparing the two sequences. For linear complexity, we derived the linear complexity of the proposed MK sequence and the NTU sequence for different lengths. Table 5 shows the comparison results. For an $n$-bit sequence, the linear complexity of the proposed sequence is $n/2$, which matches the ideal value. On the other hand, the linear complexity of the NTU sequence is lower than that of the proposed sequence, i.e., lower than the ideal value. Then, we investigated the linear complexity profile of both sequences, and the result is shown in Figure 2 for $p = 5$ and $m = 3$. The result indicates that the linear complexity profile of the proposed sequence is almost identical to the ideal one. For the NTU sequence, on the other hand, the profile saturates at a lower point and lags far behind the proposed sequence.
For the comparison of uniformity of bit pattern distribution, we consider 2-bit patterns, i.e., (00, 01, 10, 11), and 3-bit patterns, i.e., (000, 001, 010, 011, 100, 101, 110, 111), and compare their numbers of appearances in the sequence. It should be noted that, for an ideal sequence, every bit pattern of a given length should appear equally often. The comparison result for 2-bit and 3-bit patterns for $p = 5$ and $m = 5$ is shown in Figure 3. In Figure 3, the dotted horizontal red line indicates the ideal value for each bit pattern. The results indicate that bit patterns are equally distributed in the proposed MK sequence. For the NTU sequence, however, the pattern distribution is irregular and varies more as the pattern length increases.
It should be noted that the randomness of the proposed MK sequence is evaluated by experimental results. However, theoretical analysis of the computational complexity of different properties of the sequence is still worthy of further investigation in the future.
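Counting the appearances of every $k$-bit pattern is straightforward to sketch. The version below uses an overlapping, non-cyclic sliding window, which is an assumption; the text does not state whether the windows overlap or wrap around the period. The function name is hypothetical.

```python
from collections import Counter
from itertools import product


def pattern_counts(bits, k):
    """Number of appearances of every k-bit pattern (overlapping sliding window)."""
    windows = Counter(tuple(bits[i:i + k]) for i in range(len(bits) - k + 1))
    # Report all 2^k patterns, including those that never appear.
    return {"".join(map(str, pat)): windows.get(pat, 0)
            for pat in product((0, 1), repeat=k)}


if __name__ == "__main__":
    sample = [0, 0, 1, 1, 0]
    print(pattern_counts(sample, 2))
```

For an ideal sequence of length $n$, each of the $2^k$ patterns would be expected to appear about $(n - k + 1)/2^k$ times under this windowing.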

Conclusions
In this work, we proposed a new pseudo random binary sequence, the MK sequence, intended for use in security applications. The proposed sequence is derived from a primitive polynomial in an extension field and a cyclic difference set, and is finally binarized using quadratic residue and quadratic nonresidue. The proposed sequence is uniform in terms of zeros and ones and has maximum cycle length and high linear complexity, which are prerequisites for any security application. Numerical results are presented for sequences of different lengths in support of this claim. Our method was verified with a statistical randomness test suite, the NIST STS 800-22 package, where the proposed MK sequence passed all statistical randomness tests. The results confirmed that the proposed sequence has a high degree of randomness, statistical characteristics conforming to those of an ideal sequence, and a uniform bit distribution. In the future, we will consider the security level as a function of the parameters of the proposed algorithm, e.g., $p$ and $m$, for specific applications, and we would like to derive theoretical proofs of the properties presented in this paper. In addition, we want to apply the proposed sequence in practical cryptographic applications such as stream ciphers and steganography, investigate its suitability for security applications, and compare it with other cryptographically secure pseudo random sequence generators such as AES-128-CTR, ChaCha20 and SHAKE-128 [38][39][40].