Gaps are always even numbers and, since there exist arbitrarily large gaps, we consider the gap sequence modulo 6 in order to make it stationary. Note that the choice of the modulus is not arbitrary: the residue classes classify gaps into three big families, namely 2 mod 6 (twin-like), 4 mod 6 (cousin-like) and 0 mod 6 (sexy-like).
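As an illustration, the gap residue sequence can be generated with a few lines of Python. This is a minimal sketch, not the procedure used for the figures: the helper names `primes_up_to` and `gap_residues` are hypothetical, and we assume that the prime 2 is discarded so that every gap is even.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Cross out every multiple of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def gap_residues(primes, modulus=6):
    """Residues of consecutive prime gaps, skipping the prime 2 (assumption)."""
    odd_primes = [p for p in primes if p > 2]
    return [(b - a) % modulus for a, b in zip(odd_primes, odd_primes[1:])]


if __name__ == "__main__":
    # Symbols 2, 4 and 0 label twin-like, cousin-like and sexy-like gaps.
    print(gap_residues(primes_up_to(100)))
```

Each symbol of the resulting sequence belongs to the alphabet {0, 2, 4}, which is the symbolic sequence analysed in the rest of this section.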
3.4.1. Entropic Analysis
In principle, a null model for this sequence should be of the type I, yielding for the uniform case
. For the more realistic case where the empirical frequencies of the residue classes are considered, we have observed that for the sequence of the first
gaps,
For these frequencies, we find
. The null model predicts
, so the first partial entropy should be enough to extrapolate the asymptotic value. Using a Cramér null model we find the same result. However, the actual spectrum is shown in
Figure 10, which reveals two unexpected features: (i) the Rényi entropies converge to finite, positive values smaller than those predicted by the null models. This happens even for
, which points to the presence of forbidden patterns in this sequence: blocks of
m symbols that appear in the null models but are forbidden in the prime gaps. And (ii) there seems to be a non-trivial, monotonic dependence on
which is not found in the null models. In what follows we discuss both results.
Explaining forbidden patterns. Some observed low-order forbidden patterns are enumerated in
Table 1. The first forbidden pattern is for a block of size
and consists of
. A forbidden pattern in the gaps residue sequence such as
actually relates to an infinite set of forbidden patterns in the prime sequence. First, all gap pairs
with
are congruent to
, that is, consecutive gaps such as
,
,
,
, etc. are all forbidden. In the prime sequence, each of these forbidden gap pairs is associated with a forbidden prime triple of the form
, with
q prime.
From a dynamical systems viewpoint, these results suggest that the gap residue sequence manifests chaotic behaviour (as the number of admissible blocks grows exponentially with
m and thus the KS entropy is positive), but of weaker intensity as
. According to the Pesin identity [
26], for symbolic dynamics with positive KS entropy, this quantity is equal to the Lyapunov exponent of the underlying dynamical system, the latter being responsible for the exponentially fast separation of initially close trajectories. In a system with
p symbols, information could be erased at least as fast as
per time step (in Information Theory both the topological and KS entropies are usually defined in base 2 rather than in base
e, and in this setting information can be erased as fast as
bit per iteration). In any case, for the gaps residue sequence modulo 6 we have
, so the dynamics is chaotic, although information is lost at a slower pace, as
.
Why is this the case? Why is the dynamics underlying the prime gap residue sequence ‘less chaotic’ than a shift of finite type or a purely random process? In other words: what is the nature of these forbidden blocks? Consider for instance the block $(2,2)$, which is not forbidden as it appears at least once in the prime progression $(3,5,7)$. It is easy to show that this progression is actually the only possibility. The proof consists in studying the divisibility properties of $q$, $q+2$ and $q+4$. It turns out that in this progression there is always an element which is not prime because it is a multiple of three. Assume for a contradiction that this is not the case, and consider the remainder of the integer division $q/3$. If q is not a multiple of three, the remainder should either be 1 or 2. If the remainder is 1, then $q+2$ is a multiple of three. If the remainder is 2, then $q+4$ is a multiple of three. The only case where there is no contradiction is when $q=3$ (a prime which is a multiple of three), which completes the proof.
Note that this proof also certifies that no progression other than 3, 5, 7 is possible, i.e., this holds not just for a twin triplet but for any twin-like triplet of the type $(q, q+g_1, q+g_1+g_2)$ where $g_1 \equiv g_2 \equiv 2 \pmod 6$.
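The divisibility argument above can be checked numerically. The following is a minimal sketch under our own naming (the functions `has_multiple_of_three` and `twin_like_triplets` are hypothetical helpers, not part of the original analysis): it first verifies that, for every residue of q modulo 3, a twin-like triplet contains a multiple of three, and then scans the primes below a bound for consecutive triplets whose two gaps are both congruent to 2 mod 6.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def has_multiple_of_three(q, g1, g2):
    """True if q, q+g1 or q+g1+g2 is divisible by 3."""
    return any(x % 3 == 0 for x in (q, q + g1, q + g1 + g2))


def twin_like_triplets(limit):
    """Consecutive prime triples below `limit` whose gaps are both 2 mod 6."""
    ps = primes_up_to(limit)
    return [
        (a, b, c)
        for a, b, c in zip(ps, ps[1:], ps[2:])
        if (b - a) % 6 == 2 and (c - b) % 6 == 2
    ]


if __name__ == "__main__":
    # Every residue class of q mod 3 yields a multiple of three in the triplet.
    gaps = (2, 8, 14)
    print(all(has_multiple_of_three(q, g1, g2)
              for q in range(1, 100) for g1 in gaps for g2 in gaps))
    # The only triplet that survives is the one starting at q = 3.
    print(twin_like_triplets(10_000))
```

The scan is of course no substitute for the proof; it only confirms that, within the chosen bound, the progression 3, 5, 7 is the single occurrence.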
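The origin of forbidden blocks can thus be directly linked to the divisibility properties of the integers. Actually, it is well known that a block of m consecutive gaps $(g_1, g_2, \dots, g_m)$, which gives rise to a prime block of the type $(q, q+g_1, q+g_1+g_2, \dots, q+g_1+\dots+g_m)$, is forbidden if and only if one can find a prime r for which the partial sums $0, g_1, g_1+g_2, \dots, g_1+\dots+g_m$ cover every residue in $\{0, 1, \dots, r-1\}$ modulo r. For instance, in the case above of a gap duple $(2,2)$ that generates a prime sequence of the form $(q, q+2, q+4)$, the partial sums satisfy $0 \equiv 0$, $2 \equiv 2$ and $4 \equiv 1 \pmod 3$, so all residues modulo $r=3$ are ticked.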
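The covering criterion just stated lends itself to a short computational sketch. The code below is an illustration under stated assumptions rather than the enumeration used in the paper: the names `covers_all_residues` and `blocks_passing_mod3_test` are hypothetical, and only the modulus r = 3 is tested, on the assumption that this is the relevant obstruction for residue blocks modulo 6. Note that this test also rejects the block (2, 2), even though it is realised exactly once by 3, 5, 7.

```python
from itertools import accumulate, product


def covers_all_residues(gaps, r):
    """True if the partial sums 0, g1, g1+g2, ... hit every residue mod r."""
    offsets = [0, *accumulate(gaps)]
    return len({s % r for s in offsets}) == r


def blocks_passing_mod3_test(m):
    """Residue blocks of size m over {0, 2, 4} not ruled out modulo 3."""
    return [
        block
        for block in product((0, 2, 4), repeat=m)
        if not covers_all_residues(block, 3)
    ]


if __name__ == "__main__":
    print(covers_all_residues((2, 2), 3))   # offsets 0, 2, 4 cover 0, 2, 1
    print(covers_all_residues((2, 4), 3))   # offsets 0, 2, 6 miss residue 1
    for m in range(1, 5):
        print(m, len(blocks_passing_mod3_test(m)))
```

Because a residue class modulo 6 fixes the gap modulo 3, the residues themselves can be used in place of the actual gaps when testing r = 3.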
For
again all residues for
are ticked, thus
is forbidden. Using this principle one can therefore systematically enumerate these forbidden patterns. For a generic
m one finds
from which we derive the partial entropies
for the topological entropy:
for
and
. This monotonic decay is in good agreement with the experiment (see the left panel of
Figure 10). Interestingly, the number of admissible blocks of size
m -which is here related to the divisibility properties of the integers- precisely coincides with the number of admissible blocks in the type II null model, see Equation (
5). Altogether,
Monotonic dependence on the order of the Rényi entropies and the distribution of blocks. We would like to elucidate whether the Rényi spectrum is indeed non-trivial or whether, on the contrary, the observed dependence is just a finite size effect which, while not present in the null models, might be present for finite size statistics of gap residue sequences but vanish asymptotically. To that end we would need an analytical expression for the entropies, and hence for the frequencies of each admissible block.
Let us start by considering a Cramér null model. According to this model any large integer q has roughly a probability $1/\log q$ of being a prime; assuming independence, the probability that an m-tuple of integers of size close to q is prime is then simply $1/\log^m q$. However, the Cramér model predicts the same probability for every m-tuple, that is, in a Cramér random model every gap block of size m would be equiprobable and therefore the frequency of each admissible block of size m is simply the inverse of the number of admissible blocks of that size. In this situation it is easy to show that the Rényi entropies take the same value for every order: a Cramér null model does not explain the dependence on the order found in the gap residue sequence.
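The statement that equiprobable blocks give an order-independent spectrum can be illustrated numerically. The sketch below computes the Rényi entropy of a single probability distribution (not the per-symbol block entropies of the text); the function name `renyi_entropy` and the example distributions are our own assumptions, chosen only for illustration.

```python
import math


def renyi_entropy(probs, order):
    """Renyi entropy (natural log) of a distribution; order 1 is Shannon."""
    support = [p for p in probs if p > 0]  # zero-probability events are ignored
    if order == 1:
        return -sum(p * math.log(p) for p in support)
    return math.log(sum(p ** order for p in support)) / (1 - order)


if __name__ == "__main__":
    uniform = [1 / 7] * 7            # equiprobable blocks
    skewed = [0.5, 0.25, 0.25]       # hypothetical uneven block frequencies
    for order in (0, 0.5, 1, 2, 5):
        print(order, renyi_entropy(uniform, order), renyi_entropy(skewed, order))
```

For the uniform distribution every order returns the logarithm of the number of blocks, whereas for the uneven distribution the entropy decreases with the order. This is why a monotonic dependence on the order signals a non-uniform distribution of blocks.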
Fortunately, a well-known conjecture in number theory comes to the rescue. First, let us define a prime m-tuple as a sequence of consecutive primes of the form $(q, q+g_1, q+g_1+g_2, \dots, q+g_1+\dots+g_{m-1})$. Such a prime m-tuple is trivially associated with a gap block $(g_1, \dots, g_{m-1})$ (of course, in our case many different gap blocks have the same associated residue block). The diameter of a prime m-tuple is the difference between its largest and smallest elements, i.e., $g_1+\dots+g_{m-1}$. For a fixed m, an m-tuple with the smallest diameter is called a prime constellation. For instance, for a prime duple one can find twin primes $(q, q+2)$, cousin primes $(q, q+4)$, sexy primes $(q, q+6)$ and so on, each associated with a gap block consisting of a single gap. The smallest diameter is attained for $g_1=2$ and therefore for $m=2$ the only prime constellation consists of twin primes. For prime triples ($m=3$), the smallest diameter is 6, and there are two possible constellations, associated with the gap blocks $(2,4)$ and $(4,2)$. An example of the former is the prime triple $(5,7,11)$, whereas for the latter the smallest constellation is $(7,11,13)$.
It is clear that prime constellations generate only a subset of our gap residue blocks, but still an infinite subset, and more importantly, for a fixed
m there is in general more than one constellation. Now, the celebrated
m-tuple conjecture by Hardy and Littlewood [
25] states that the frequencies of these prime constellations can be computed explicitly:
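Prime $m$-tuple conjecture. The number of prime constellations with gap block $(g_1, \dots, g_{m-1})$ whose smallest element is at most $x$ is given asymptotically by
$$\pi_{g_1,\dots,g_{m-1}}(x) \sim C(g_1,\dots,g_{m-1}) \int_2^x \frac{dt}{\log^m t}, \qquad C(g_1,\dots,g_{m-1}) = 2^{m-1} \prod_{q} \frac{1 - w(q; g_1,\dots,g_{m-1})/q}{(1 - 1/q)^m},$$
where $C(g_1,\dots,g_{m-1})$ are the so-called Hardy–Littlewood constants, the product runs over odd primes $q$ and the function $w(q; g_1,\dots,g_{m-1})$ denotes the number of distinct residues in $0, g_1, g_1+g_2, \dots, g_1+\dots+g_{m-1}$ mod $q$.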
Incidentally, note that for readability we have used the symbol m to be consistent with the previous exposition, although this conjecture is usually stated as the k-tuple conjecture. Remarkably, these probabilities are proportional to those of the Cramér model but, contrary to the Cramér model, do depend on the particular block type via the Hardy–Littlewood constants. This enables us to explore the uneven distribution of blocks via this conjecture:
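For $m=2$, we are considering pairs of primes. The prime constellation in this case, as we know, is for $g_1=2$, therefore the conjecture predicts that the number of twin primes below x is
$$\pi_2(x) \sim 2 C_2 \int_2^x \frac{dt}{\log^2 t},$$
where $\pi_g(x)$ denotes the number of primes $q \leq x$ such that $q+g$ is also prime, and
$$C_2 = \prod_{q \geq 3} \frac{q(q-2)}{(q-1)^2} \approx 0.66016$$
is the twin prime constant. The formula given by Hardy and Littlewood can actually be used to estimate the number of any prime m-tuples, not necessarily only those with the smallest diameter. For instance, still for $m=2$ we can work out the cases $g_1=4$ (cousin primes) and $g_1=6$ (sexy primes), finding
$$\pi_4(x) \sim 2 C_2 \int_2^x \frac{dt}{\log^2 t}, \qquad \pi_6(x) \sim 4 C_2 \int_2^x \frac{dt}{\log^2 t}.$$
From this first analysis it follows that sexy pairs are, asymptotically, twice as frequent as twins and cousins, which already suggests that for $m=2$ we have a non-uniform distribution of blocks.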
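The ratio between sexy, twin and cousin pairs can be reproduced by evaluating the Hardy–Littlewood product numerically. The following is a minimal sketch assuming the standard form of the constant, $2^{m-1}\prod_q (1 - w(q)/q)/(1-1/q)^m$ over odd primes q, truncated at a finite bound; the function name `hardy_littlewood_constant` and the cutoff `prime_bound` are our own choices, and the truncation makes the values approximate.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def hardy_littlewood_constant(offsets, prime_bound=100_000):
    """Truncated Hardy-Littlewood constant for even offsets such as (0, 2)."""
    m = len(offsets)
    constant = 2.0 ** (m - 1)          # contribution of the prime 2
    for q in primes_up_to(prime_bound):
        if q == 2:
            continue
        w = len({b % q for b in offsets})   # distinct residues mod q
        constant *= (1 - w / q) / (1 - 1 / q) ** m
    return constant


if __name__ == "__main__":
    twin = hardy_littlewood_constant((0, 2))
    cousin = hardy_littlewood_constant((0, 4))
    sexy = hardy_littlewood_constant((0, 6))
    print(twin, cousin / twin, sexy / twin)
```

The only factor that differs between the sexy and twin cases is the one at q = 3, where the offsets 0 and 6 occupy a single residue class; this is what produces the factor of two.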
The uneven distribution of blocks can be further assessed by enumerating for a given size
m all the
Hardy–Littlewood constants
. This is in principle possible but is a formidable exercise in practice. An additional technical issue to bear in mind is that here we deal with blocks of prime gap residues, so for a given
m we need to sum over all indices congruent with a given residue block. Let us consider again the case
. The number of gaps congruent to 0 mod 6 (sexy primes) is obtained by taking into account gaps 6, 12, 18, etc., so
etc. Accordingly, this number is associated with
. Formally, if we define the normalization factor
as:
the density of each gap residue is therefore:
Truncating these series at order
i accounts for the gaps of size at most
. For instance, at third order (accounting for gaps up to size 18) we have the approximation:
which is still reasonably far away from the empirical frequencies (Equation (
6)) (at order four
so convergence is slow). Still, this analytical approximation already certifies that the three possible gap residue blocks of size
are not uniformly represented. Formally, assuming the Hardy–Littlewood conjecture, one can express the asymptotic probability of an admissible gap residue block of size
m , where
as
where the summation in
runs between
and
∞ when
and
and between
and
∞ otherwise, and
is the normalization factor. Equation (
8), together with Equations (
3) and (
4), constitutes the formal solution to the problem.
Let us consider the case
, for which there are 8 admissible blocks. To prove non-uniformity for
we only need to find that two different blocks have different frequencies: for convenience we will concentrate on the first two, namely
in
(which gathers all prime triples whose gaps are a multiple of 6) and
. Equation (
8) reduces in these cases to:
where
is the normalization sum. At second order in
,
and
where
is the normalization sum truncated at order two. At this point we can make use of the Hardy–Littlewood conjecture one more time. If we label
and
then after a lengthy but trivial computation we get
,
,
,
hence
whereas
,
,
,
thus
which is enough to conclude non-uniformity in the approximation of the frequencies of blocks of size
. This clearly supports the apparent monotonic dependence on
of the Rényi entropies. A full proof would require performing the infinite summations in each case, which appears to be quite a challenging endeavor and is left as an open problem.