3.1. Theoretical Concept
When binary sequences are analyzed by their gaps, a burst is defined in terms of the distribution of the zero and non-zero elements within the sequence: it is a pattern that begins with a non-zero element and ends with the next non-zero element that is separated from its predecessor by at least a zero elements. The burst terminates with this last non-zero element, which is also the starting point of the next burst. The parameter a is called the distance parameter (gap) between two non-zero elements. If the gap after a non-zero element is greater than or equal to a, the burst is regarded as terminated.
Figure 8 highlights the burst definition with
. The burst ends when the gap after a non-zero element is greater than or equal to the distance parameter a (in the example,
). The proportion of these gaps is characterized by the parameter
, referred to as the gap-distribution function
. This definition states that a burst can contain more than one non-zero element.
When analyzing such patterns, the weights—defined as the number of non-zero or “1” elements—and the lengths of the sequences play a critical role.
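The burst definition can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the source: the function name `split_into_bursts` is hypothetical, a gap is counted as the number of zeros between two successive non-zero elements, and the burst length is measured from the first to the last non-zero element of the burst (the closing gap is not included). The original length convention may differ.

```python
def split_into_bursts(bits, a):
    """Split a binary sequence into bursts for distance parameter a.

    Returns a list of (weight, length) tuples, one per burst.
    Assumption of this sketch: the length spans the first to the last
    non-zero element of a burst and excludes the closing gap.
    """
    positions = [i for i, b in enumerate(bits) if b != 0]
    if not positions:
        return []

    bursts = []
    start = prev = positions[0]
    weight = 1
    for pos in positions[1:]:
        gap = pos - prev - 1          # zeros between two non-zero elements
        if gap >= a:                  # gap >= a closes the current burst
            bursts.append((weight, prev - start + 1))
            start = pos
            weight = 1
        else:                         # gap < a: element joins the burst
            weight += 1
        prev = pos
    bursts.append((weight, prev - start + 1))  # close the final burst
    return bursts


example = [1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1]
print(split_into_bursts(example, 2))
```

With a = 0 every gap closes a burst, so each non-zero element forms its own burst of weight 1.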
Figure 9 illustrates the generation of bursts. The Markov chain [32] is started from state
, meaning that the burst is assumed to have already begun with a non-zero element. The Markov chain remains in state
as long as the next non-zero element, which belongs to burst i, has a gap to the previous non-zero element that is shorter than a. When moving to the next non-zero element, the burst is finished when the gap k is greater than or equal to a, i.e., k ≥ a. The Markov chain is then in state
, indicating that the next burst has started.
The number of bursts
with the distance parameter
a in a given sequence with
non-zero elements results in
Each non-zero element can be a burst start. The burst ends when the gap k to the following non-zero element is greater than or equal to the distance parameter a. The associated probability is , When non-zero elements populate the random sequence, only bursts can occur. With the distance parameter and , every non-zero element represents a burst’s starting point. Then, holds.
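The relation described in this paragraph can be written compactly as follows. The symbols are assumptions introduced only for this sketch, since the original notation is not reproduced here: n_e denotes the number of non-zero elements, u(a) the probability that a gap is greater than or equal to a, and n_B(a) the number of bursts for the distance parameter a.

```latex
n_B(a) = n_e \, u(a), \qquad u(a) = P(k \geq a), \qquad
u(0) = 1 \;\Rightarrow\; n_B(0) = n_e .
```

Under this reading, the average number of non-zero elements per burst is the ratio n_e / n_B(a) = 1 / u(a).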
In (24), the average number of non-zero elements g within a burst is calculated as
This value provides information about how strongly the non-zero elements are concentrated in the bursts, which is especially interesting for sequences where the IID-DU assumption is violated.
The number of non-zero elements per burst can be calculated by the weight distribution
as a function of the distance parameter
a, resulting in
The explanation makes use of Figure 10. A burst starts with a non-zero element and ends when the gap toward the following non-zero element is greater than or equal to the distance parameter a. Thus, the burst has weight g = 1 if the gap at the beginning of the burst is equal to or greater than a. The probability of such an event is given by
. The burst has weight
g = 2 if, at the beginning of the burst, a gap smaller than a occurs first (with probability
), followed by a gap greater than or equal to a.
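The argument above (one closing gap after g − 1 short gaps) leads to a geometric weight distribution if successive gaps are assumed to be independent. The following sketch illustrates this under that assumption. The names `burst_weight_pmf`, `mean_burst_weight`, and the argument `u_a` (probability that a gap is greater than or equal to a) are hypothetical and not taken from the source.

```python
def burst_weight_pmf(g, u_a):
    """Probability that a burst contains exactly g non-zero elements.

    Assumes independent gaps: g - 1 gaps shorter than a (each with
    probability 1 - u_a) are followed by one gap >= a (probability u_a).
    """
    if g < 1:
        raise ValueError("a burst contains at least one non-zero element")
    if not 0.0 < u_a <= 1.0:
        raise ValueError("u_a must lie in (0, 1]")
    return (1.0 - u_a) ** (g - 1) * u_a


def mean_burst_weight(u_a):
    """Mean of the geometric weight distribution, i.e. 1 / u_a."""
    return 1.0 / u_a


# Usage: weight probabilities for a hypothetical value u_a = 0.25
for g in range(1, 6):
    print(g, burst_weight_pmf(g, 0.25))
print("mean weight:", mean_burst_weight(0.25))
```

A smaller `u_a` (which corresponds to a larger distance parameter) lowers the probability of small weights and raises the probability of large weights, in line with the behavior described for the figures.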
Figure 11 and Figure 12 show the distribution of the non-zero elements within the bursts. For small weights g, the probability
decreases as the distance parameter
a increases (Figure 11). However, this behavior is reversed for larger weights g (Figure 12). This is because, as the distance parameter a increases, the number of bursts decreases, i.e., the number of non-zero elements per burst rises. It should be noted that in Figure 12, as in Figure 11, only discrete weight values g appear on the horizontal axis; the continuous line is included to make the behavior, in particular the crossover point, more visible.
3.2. Verification
Assuming that the non-zero elements of the sequence are independently distributed (also referred to as non-bursty non-zero elements), the gap-distribution function
can be derived as a function of the BOP [3]. With the BOP
, the probability that a single element in the sequence is zero is given by
. Therefore, the probability
that
neighboring elements in the data stream are zero is given in (4).
The following sections illustrate the various approaches using a representative example. For this, the gap-distribution function (4) for independent non-zero elements serves as the starting point. The associated bit sequence is generated using the inversion transform method [1, 18]. The exemplary bit sequence has a length of
bits, containing
non-zero elements (corresponding to the sequence’s Hamming weight). This leads to
; thus, the sequence is nearly balanced.
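A sequence with independent non-zero elements can be generated by drawing the gaps with the inversion transform method. The sketch below is one possible way to do this and is not necessarily the procedure used in [1, 18]. It assumes that the probability of a gap of at least k zeros is (1 − p)^k, where `p` stands for the probability of a non-zero element. The function name and arguments are hypothetical.

```python
import math
import random


def generate_sequence(n_nonzero, p, seed=0):
    """Generate a binary sequence with n_nonzero non-zero elements.

    Each gap k is drawn by inverting P(gap >= k) = (1 - p)**k:
    for U uniform on (0, 1], k = floor(ln U / ln(1 - p)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    rng = random.Random(seed)
    bits = []
    for _ in range(n_nonzero):
        u = 1.0 - rng.random()                      # uniform on (0, 1]
        gap = int(math.log(u) / math.log(1.0 - p))  # non-negative, so int() floors
        bits.extend([0] * gap)                      # gap zeros ...
        bits.append(1)                              # ... followed by a non-zero element
    return bits


bits = generate_sequence(5000, 0.5, seed=1)
print(len(bits), sum(bits), sum(bits) / len(bits))
```

The ratio of the Hamming weight to the sequence length should then be close to `p`, which gives a quick plausibility check of the generated sequence.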
Table 2 shows the number of measured bursts (compared with theory) for different values of a. As the distance parameter a increases, the number of bursts decreases; therefore, the number of non-zero elements per burst increases.
The simulation results align well with the theoretical values.
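A comparison of this kind can be sketched in a few lines of Python. The sketch assumes independent non-zero elements with probability `p`, so that the expected number of bursts is the number of non-zero elements multiplied by (1 − p)^a. The helper names are hypothetical, and the generated numbers are not those of Table 2.

```python
import random


def count_bursts(bits, a):
    """Count bursts: one initial burst plus one per gap of length >= a."""
    positions = [i for i, b in enumerate(bits) if b]
    if not positions:
        return 0
    gaps = [q - p - 1 for p, q in zip(positions, positions[1:])]
    return 1 + sum(1 for k in gaps if k >= a)


def expected_bursts(n_nonzero, p, a):
    """Expected burst count for independent non-zero elements (assumption)."""
    return n_nonzero * (1.0 - p) ** a


rng = random.Random(42)
p = 0.5
bits = [1 if rng.random() < p else 0 for _ in range(10_000)]
n_nonzero = sum(bits)

for a in range(5):
    measured = count_bursts(bits, a)
    theory = expected_bursts(n_nonzero, p, a)
    print(f"a={a}: measured={measured}, theory={theory:.1f}")
```

For a = 0 the measured count equals the number of non-zero elements exactly, since every gap then closes a burst.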
Furthermore, the weight distribution, defined as the number of non-zero elements per burst, was analyzed.
Table 3 presents the simulated probabilities for different weights g.
The theoretical values, as derived from (4), are shown in Table 4.
The results demonstrate good agreement between the simulated and theoretical values.
In addition, the relative frequencies of the bursts’ length–weight distribution were analyzed, and the results are shown in Table 5. When evaluating bursts with a specific weight, e.g.,
, the (absolute) frequencies shown in
Table 3 appear. If all weights are considered, the number of bursts
, as shown in the last row of
Table 3, is obtained. Summing up yields
results in the considered example (see
Table 5, last row, last column). Furthermore, Table 5 provides information about the bursts’ length distribution (see the last row). When all burst lengths are again taken into account, the number of bursts
is obtained.
The frequencies are normalized with respect to the non-zero elements, resulting in the burst weight–length density function . To complete the numerical example, the length–weight distribution is analyzed for .
Given a known density function, it is then checked whether the generated bit sequence agrees with the ideal characteristic given in (5).
As a quality metric for assessing the approximation between the obtained gap-distribution function
and a reference distribution
, the mean square error
is used and minimized [33]. The parameter
denotes the maximum gap length incorporated into the analysis and optimization (see [16]).
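The fitting step can be illustrated as follows. This is only a sketch under stated assumptions: the error is taken as the mean of the squared differences over the gap lengths 0 to `k_max` (the exact normalization used in the source is not visible here), the reference is the exponential characteristic (1 − p)^k, and the minimization is a plain grid search. All function names are hypothetical.

```python
def empirical_gap_distribution(gaps, k_max):
    """Fraction of gaps with length >= k, for k = 0 .. k_max."""
    n = len(gaps)
    return [sum(1 for g in gaps if g >= k) / n for k in range(k_max + 1)]


def mean_square_error(u_measured, u_reference):
    """Mean of squared differences between two gap distributions."""
    if len(u_measured) != len(u_reference):
        raise ValueError("both distributions must cover the same gap lengths")
    return sum((m - r) ** 2 for m, r in zip(u_measured, u_reference)) / len(u_measured)


def fit_exponential_reference(u_measured, candidates):
    """Grid search for the p that minimizes the error against (1 - p)**k."""
    def error(p):
        reference = [(1.0 - p) ** k for k in range(len(u_measured))]
        return mean_square_error(u_measured, reference)
    return min(candidates, key=error)


# Usage with a synthetic, exactly exponential gap distribution
u = [0.5 ** k for k in range(11)]
grid = [i / 100 for i in range(1, 100)]
print(fit_exponential_reference(u, grid))
```

A grid search is used here only because it is easy to follow; any one-dimensional minimizer could take its place, and multi-parameter reference distributions would need a correspondingly larger search.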
Table 6 shows the optimal parameters for the considered distributions and the associated approximation errors. The best results were obtained for the ideal distribution (5), since for
, the Weibull distribution reduces to the exponential distribution. This also holds for the Wilhelm distribution at
, highlighting that the ideal characteristic corresponds to exponential decay.
Figure 13 shows the length–weight distribution obtained for the ideal random sequence (i.e., a random sequence with an exponential gap distribution) as defined by (5), with the distance parameter
. The three-dimensional plot shows the probability of bursts as a function of both their length ℓ and weight g.
Designating a non-zero element in the random sequence as L, the probability that two successive elements of the random sequence are non-zero is given by
where
is the probability that an arbitrary element of the sequence is non-zero and
is the conditional probability that a non-zero element is immediately followed (i.e., with gap length k = 0) by another non-zero element. In this case, a gap of length k = 0 occurs, and
holds, since
represents exactly the probability that a non-zero element is immediately followed by another non-zero element.
To evaluate whether the random variables are independent and identically distributed and follow a discrete uniform distribution, in other words, whether the IID-DU assumption is satisfied, it is important to determine —the probability that one non-zero element immediately follows another in the random sequence. If the IID-DU assumption holds, then .
This probability can be computed by analyzing the gap distribution using (3). For
,
and for
, it follows that
Here, denotes the probability that a gap of length occurs, corresponding to the number of bursts defined using the distance parameter .
In the sequence under investigation, there are non-zero elements and 2481 gaps with (corresponding to the number of bursts for ). This yields and . Therefore, the IID-DU requirement is approximately satisfied.
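The check described above can be sketched in Python. The sketch estimates the conditional probability as the fraction of gaps of length 0 and compares it with the overall fraction of non-zero elements. Normalizing by the number of gaps (one fewer than the number of non-zero elements) is an assumption of this sketch, as are the function names. The test sequence is generated here and is not the sequence analyzed in the text.

```python
import random


def nonzero_probability(bits):
    """Fraction of non-zero elements in the sequence."""
    return sum(1 for b in bits if b) / len(bits)


def conditional_nonzero_probability(bits):
    """Fraction of non-zero elements immediately followed by another one.

    Estimated as the share of gaps with length 0 among all gaps.
    """
    positions = [i for i, b in enumerate(bits) if b]
    gaps = [q - p - 1 for p, q in zip(positions, positions[1:])]
    if not gaps:
        raise ValueError("at least two non-zero elements are required")
    return sum(1 for k in gaps if k == 0) / len(gaps)


rng = random.Random(7)
bits = [1 if rng.random() < 0.5 else 0 for _ in range(10_000)]
print("P(non-zero):            ", nonzero_probability(bits))
print("P(non-zero | non-zero): ", conditional_nonzero_probability(bits))
```

For independent elements the two printed values should be close to each other; a clear difference would indicate that the non-zero elements cluster or repel.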
Furthermore, the gap-distribution function
can be easily determined by calculating the number of bursts with a given distance parameter a, according to
Table 7 shows the gap distribution determined from the burst distribution for the exemplary random sequence using (31). The value
, obtained by calculating the difference between
and
, confirms that the IID-DU assumption is satisfied.
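The step from burst counts to the gap distribution can be sketched as follows. The sketch assumes that the gap-distribution value for a distance parameter a is the number of bursts for that a divided by the number of non-zero elements, and that the probability of a gap of exactly k zeros is the difference of two neighboring values. The function names and the numbers in the usage example are hypothetical and are not taken from Table 7.

```python
def gap_distribution_from_bursts(burst_counts, n_nonzero):
    """Gap distribution from burst counts.

    burst_counts[a] is the number of bursts for distance parameter a;
    the result at index a estimates the probability of a gap >= a.
    """
    return [n / n_nonzero for n in burst_counts]


def gap_probabilities(u):
    """Probability of a gap of exactly k zeros: u[k] - u[k + 1]."""
    return [u[k] - u[k + 1] for k in range(len(u) - 1)]


# Hypothetical burst counts for a = 0, 1, 2 and 1000 non-zero elements
u = gap_distribution_from_bursts([1000, 500, 250], 1000)
print(u)
print(gap_probabilities(u))
```

The first entry of `gap_probabilities` is the estimate of the probability that one non-zero element immediately follows another.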
If the random sequence exhibits good random behavior, the bursts are spread over many different numbers of non-zero elements. In contrast, when the bursts are limited to only a few combinations, the sequence is not very random and should be regarded as a poor random sequence.
So far, the analysis has focused on the distance parameter . In order to broaden the scope of the results, another numerical value for the distance parameter is investigated ().
Figure 14 shows the length–weight distribution obtained for an ideal random sequence, derived from (5) with
. The three-dimensional plot shows the probability of bursts as a function of both their length ℓ and weight g. Compared with the results for
, the probabilities for
shift toward smaller values.