Combinatorial Models of the Distribution of Prime Numbers

Barbarani, Vito

doi:10.3390/math9111224


by

Vito Barbarani

European Physical Society, via Cancherini 85, 51039 Quarrata, Italy

Mathematics 2021, 9(11), 1224; https://doi.org/10.3390/math9111224

Submission received: 31 March 2021 / Revised: 21 May 2021 / Accepted: 24 May 2021 / Published: 27 May 2021


Abstract


This work is divided into two parts. In the first one, the combinatorics of a new class of randomly generated objects, exhibiting the same properties as the distribution of prime numbers, is solved and the probability distribution of the combinatorial counterpart of the n-th prime number is derived together with an estimate of the prime-counting function

π (x)

. A proposition equivalent to the Prime Number Theorem (PNT) is proved to hold, while the equivalent of the Riemann Hypothesis (RH) is proved to be false with probability 1 (w.p. 1) for this model. Many identities involving Stirling numbers of the second kind and harmonic numbers are found, some of which appear to be new. The second part is dedicated to generalizing the model to investigate the conditions enabling both PNT and RH. A model representing a general class of random integer sequences is found, for which RH holds w.p. 1. The prediction of the number of consecutive prime pairs as a function of the gap d is derived from this class of models, and the results agree with empirical data for large gaps. A heuristic version of the model, directly related to the sequence of primes, is discussed, and new integral lower and upper bounds of

π (x)

are found.

Keywords:

set partitions; Stirling numbers of the second kind; harmonic numbers; prime number distribution; Riemann Hypothesis; Gumbel distribution

1. Introduction: The Bingo Bag of Primes

This work has two main objectives: to investigate what random means in the case of the distribution of prime numbers, and to identify the general conditions ensuring that what we know as the Prime Number Theorem and the Riemann Hypothesis also hold for a generic random sequence of natural numbers. To achieve these goals, we will make use of models based on combinatorics and probability theory. Of course, this is not the first time that models have been used to derive new conjectures and theorems about the prime sequence. In the mid-1930s, Cramér [1,2] proposed the first stochastic model of the sequence of primes as a way to formulate conjectures about their distribution. This model can be briefly summarized as follows.

Denote by

{X_{n}, n = 1, 2, 3 \dots}

a sequence of independent random variables

X_{n}

, taking values 0 and 1 according to

\mathrm{Prob} {X_{n} = 1} = \frac{1}{ln (n)}, n > 2

(the probability may be arbitrarily chosen for

n = 1, 2

). Then, the random sequence of natural numbers

S = {n : X_{n} = 1, n > 1}

is a stochastic model of the sequence of primes. This means that if

Ω = {ω}

is the set of all possible sequences

ω

realizations of S, then the prime sequence, say

ω_{P}

, is obviously an element of

Ω

. We can therefore study the properties of the particular sequence

ω_{P}

, through the methods of probability theory applied to the sample space

Ω

, and conjecture with Cramér that if a certain property holds in

Ω

with probability 1 (w.p. 1 in what follows), that is, for almost all the sequences, the same property holds for the prime sequence

ω_{P}

. Obviously, the results of these conjectures rely heavily on the assumption about the probability of the event

{X_{n} = 1}

, which connects the model with the distribution of prime numbers as stated by the Prime Number Theorem (PNT)

π (x) \sim \frac{x}{ln (x)} \sim \mathrm{Li} (x) = \int_{2}^{x} \frac{d t}{ln (t)}

(1)

with

π (x)

being the prime-counting function. We see from (1) that

\frac{1}{ln (n)}

can be interpreted as the expected density of primes around n, or the probability that a given integer n should be a prime.

As reported by Granville [3] in his review article on Cramér’s work in this area, Cramér imagined building the set of random sequences S through a game of chance consisting of a series of repeated draws (independent trials) from an ordered sequence of urns (

U_{1}, U_{2}, U_{3}, \dots

) such that the n-th urn contains

ln (n)

black balls and 1 white ball, starting from

n = 3

forward. In this game, the event

{X_{n} = 1}

is then realized by drawing the white ball from the urn number n.
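As an illustrative aside, one realization of Cramér’s random sequence can be generated directly from the definition above. The following is a minimal sketch, not part of the original analysis: the function names `cramer_sequence` and `expected_count` are hypothetical, the draws start at n = 3 (the arbitrary probabilities for n = 1, 2 are simply skipped), and a fixed seed is used only to make the run repeatable.

```python
import math
import random

def cramer_sequence(limit, seed=0):
    """Return one realization of S = {n : X_n = 1}, restricted to 3 <= n <= limit."""
    rng = random.Random(seed)
    # X_n = 1 with probability 1/ln(n), independently for each n > 2
    return [n for n in range(3, limit + 1) if rng.random() < 1.0 / math.log(n)]

def expected_count(limit):
    """Expected size of a realization: the sum of 1/ln(n) over 3 <= n <= limit."""
    return sum(1.0 / math.log(n) for n in range(3, limit + 1))

sample = cramer_sequence(10_000)
print(len(sample), round(expected_count(10_000), 1))
```

The size of a realization fluctuates around the sum of the probabilities 1/ln(n), which is the discrete analogue of the integral in (1).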

The heuristic method based on Cramér’s model has played an essential role in formulating conjectures about primes from the mid-1930s to the present day. The first serious discrepancy between the predictions of the model and the actual behavior of the distribution of primes was found in 1985 [4]. As pointed out in [3], this discrepancy arises from two assumptions made by heuristic procedures based on the probabilistic model: the hypothesis, when applying the ‘Sieve of Eratosthenes’, that sieving by the different primes

\leq \sqrt{n}

are independent events and Gauss’s conjecture about the density of primes around n.

The problem was known well before Cramér’s work. In their 1923 paper [5], Hardy and Littlewood, while discussing some asymptotic formulas of Sylvester and Brun containing erroneous constant factors and functions of the number

e^{- γ}

(

γ

is the Euler–Mascheroni constant), observe that “any formula in the theory of primes, deduced from considerations of probability, is likely to be erroneous just in this way”. They explain this error arises because of the different answers to the question about the chance that a large number n should be prime. Following

PNT

and Gauss’s conjecture, this chance is approximately

1 / ln (n)

, while if we consider the chance n should not be divisible by any prime

\leq \sqrt{n}

, assuming independent events for any prime (Hardy and Littlewood do not mention this condition, but it is implicit), this chance is asymptotically equivalent to

\prod_{p \leq \sqrt{n}} (1 - \frac{1}{p}) \sim 2 e^{- γ} \frac{1}{ln (n)}, p prime .

Hence, they conclude that any inference based on the previous formula is “incorrect to the extent of a factor

2 e^{- γ} = 1.123 \dots

”. More recently, new contradictions have been found, between the predictions of the model and the actual distribution of primes, that no Cramér-type model, assuming the hypothesis of independent random variables, can overcome [6].
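The factor quoted by Hardy and Littlewood can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `primes_up_to` and `sieve_ratio` are hypothetical, and the Euler–Mascheroni constant is hard-coded to double precision. It multiplies ln(n) by the product of (1 - 1/p) over the primes not exceeding the square root of n, which should approach 2e^{-γ} rather than 1.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant, hard-coded

def primes_up_to(m):
    """Sieve of Eratosthenes: all primes p <= m."""
    if m < 2:
        return []
    is_prime = bytearray([1]) * (m + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(m) + 1):
        if is_prime[p]:
            # strike out the multiples of p, starting from p * p
            is_prime[p * p :: p] = bytearray(len(range(p * p, m + 1, p)))
    return [p for p in range(2, m + 1) if is_prime[p]]

def sieve_ratio(n):
    """ln(n) times the product of (1 - 1/p) over primes p <= sqrt(n)."""
    product = 1.0
    for p in primes_up_to(math.isqrt(n)):
        product *= 1.0 - 1.0 / p
    return product * math.log(n)

print(sieve_ratio(10**8), 2 * math.exp(-EULER_GAMMA))
```

If the two chances discussed above agreed, the ratio would tend to 1; the convergence to 2e^{-γ} is slow, so only rough agreement should be expected at moderate n.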

Cramér’s model bases its connection with the distribution of prime numbers on Gauss’s conjecture; hence, it assumes the Prime Number Theorem as an a priori hypothesis. In this work, a new model is proposed, based on a class of combinatorial objects I call First Occurrence Sequences (FOS), which reproduces the stochastic structure of the prime sequence as a result of the straightforward random structure of the model: a sequence of independent, equally probable trials. The Prime Number Theorem emerges without any a priori ad hoc assumption about the probability of events. Perhaps the best way to summarize the aim of this work is that it tries to answer the question Granville cites at the beginning of his paper [3]: “It is evident that the primes are randomly distributed but, unfortunately, we do not know what ‘random’ means” (the quote is attributed to Prof. R. C. Vaughan).

Following Cramér’s method, we can describe the main points of our model as a game of chance. Suppose we have an urn with

n = 2

balls of different colors, say black (B) and white (W), and let us consider the following sampling with replacement game: draw a ball, say B, and replace it; repeat drawing until the white ball W occurs for the first time. In general, the game ends when you draw a ball with a color other than that of the first ball drawn. The sequences of B and W balls we get from this game are FOS of different lengths k. Some examples:

B B W

,

k = 3

;

B W

,

k = 2

;

W W W B

,

k = 4

;

W B

,

k = 2

.

Assuming each ball has the same probability of being drawn, it is easy to see that the average number of steps we need to get both colors for the first time, that is, the average length of FOS, when

n = 2

, is equal to 3. If we repeat this simple game with different values of n (number of colors) and the rule that the game ends when all colors appear in the sequence for the first time and compare the average length of FOS of order n (say

L_{n / n}^{'}

) with the prime number

p_{n}

of the same order, we would find the results reported in Table 1 for

n = 2, 3, 4, 5

, and 6.
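The game just described is easy to simulate. The following is a minimal sketch under the stated assumption of equally likely colors; the function names `draw_fos` and `mean_length`, the labeling of colors by the integers 0, ..., n - 1, and the number of trials are all choices made here for illustration.

```python
import random

def draw_fos(n, rng):
    """Draw with replacement from n equally likely colors until all n have appeared."""
    seen = set()
    sequence = []
    while len(seen) < n:
        color = rng.randrange(n)  # colors are labeled 0, 1, ..., n - 1
        sequence.append(color)
        seen.add(color)
    return sequence

def mean_length(n, trials=20_000, seed=0):
    """Monte Carlo estimate of the average length of FOS of order n."""
    rng = random.Random(seed)
    return sum(len(draw_fos(n, rng)) for _ in range(trials)) / trials

for n in range(2, 7):
    print(n, round(mean_length(n), 2))
```

By construction, the last symbol of each simulated sequence is the only occurrence of the color that completes the set, which is the defining feature of a FOS.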

The correlation between FOS average lengths and the sequence of primes is a general property of the model. Indeed, assuming each colored ball has the same probability of being drawn, and every sampling step with replacement is independent of each other, we can easily find

L_{n / n}^{'} = n \sum_{k = 1}^{n} \frac{1}{k} \sim n ln n .

(2)

The above equation shows the close connection between the combinatorial objects I have called FOS and the primes. In particular, the average length of the n-th order FOS, obtained through the sampling with replacement game from an urn with n colored balls, is asymptotic to the expression for the n-th prime that follows from PNT. In this sense, the n-th order FOS can be defined as the combinatorial counterpart of the n-th prime number.
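Equation (2) can be evaluated exactly for small n. The sketch below is illustrative only: the name `average_fos_length` is hypothetical, and the primes p_2, ..., p_6 are typed in by hand for the comparison.

```python
from fractions import Fraction
import math

def average_fos_length(n):
    """Exact value of L'_{n/n} = n * H_n, with H_n the n-th harmonic number."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))

# First primes p_2, ..., p_6, listed by hand for the comparison.
PRIMES = {2: 3, 3: 5, 4: 7, 5: 11, 6: 13}

for n, p_n in PRIMES.items():
    exact = average_fos_length(n)
    # columns: n, n * H_n, n * ln(n), p_n
    print(n, float(exact), round(n * math.log(n), 2), p_n)
```

Exact rational arithmetic is used so that the value for n = 2 is exactly 3, as stated in the text, without any rounding.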

The following sections are devoted to developing the theory of FOS and their analogies with the distribution of primes. Despite their simplicity, FOS may serve as a general model of random integer sequences which can be applied in different contexts, from combinatorics to physical models (I give a brief hint about this in the last section), showing the features common to all these fields, such as what is known as PNT. The method I followed in this work is to give a complete account of these combinatorial objects independently of the definition of other similar objects, starting from proper probability spaces and the general (recursive) probability relations, and then to develop the full theory with particular attention to showing the connections with other research fields. A generalization is proposed in the last part of the work, starting from a continuous version of the original model.

In Section 2, a formal definition of this kind of objects is given together with the probability spaces we need to define probability relations and probability recursive formulas. In Section 3, the combinatorics of FOS is completely solved, demonstrating its close connections with set partitions and Stirling numbers of the second kind. In Section 4, the probability equations derived from the combinatorial analysis lead to new identities for Stirling numbers of the second kind and for harmonic numbers. The combinatorial model of primes based on FOS is defined together with a discrete estimate

\hat{π} (x)

of the prime-counting function

π (x)

, which is studied and numerically tested. We prove PNT holds for this model while the Riemann Hypothesis does not. In Section 5, the model is generalized to investigate the conditions enabling both PNT and the Riemann Hypothesis. We find a general model of random sequences for which the Riemann Hypothesis holds w.p. 1 and apply this model to the problem of counting consecutive prime pairs as a function of the gap between successive primes. The theoretical results obtained from the model agree with the empirical data for large gaps and show the importance of the correlation between primes in the case of small gaps. Finally, this model is directly related to the sequence of primes, leading to new integral lower and upper bounds of the prime-counting function

π (x)

. Lastly, Section 6 summarizes this work’s general significance and hints at its links with other research areas, particularly with research on physical models.

2. First Occurrence Sequences: A New Class of Combinatorial Objects

In this section, we will be concerned with defining probability spaces and First Occurrence Sequences as events, and with deriving some fundamental relations among the probabilities of certain events, while the evaluation of these probabilities will be treated in Section 3.

2.1. Definitions and Probability Spaces

Each FOS can be seen as a particular result of a sampling with replacement random experiment, consisting of a repeated sequence of independent trials with more than one outcome. The proper probability space for such an experiment can be defined as follows (see (Lect. 1, 2, [7]) and (Chap. 10, 13, [8])). Let

A_{n}

be a collection of n distinct symbols

A_{n} = {a_{1}, a_{2}, \dots, a_{n}}, n \geq 2

and suppose our experiment of random draws from

A_{n}

is repeated k times. Then, an outcome of the compound experiment is a k-length sequence of elements of

A_{n}

ω_{n} (k) = {(x_{1}, x_{2}, \dots, x_{k}) : x_{j} \in A_{n}, j = 1, 2, \dots, k}, k \geq 1 .

The probability space of the experiment is the triplet

(Ω_{n}^{k}, F_{n}^{k}, P_{n}^{k})

, where

Ω_{n}^{k} = {ω_{n} (k)}

is the sample space of all possible

n^{k}

outcome sequences given by the k-fold Cartesian product of

A_{n}

with itself

Ω_{n}^{k} = A_{n} \times A_{n} \times \dots \times A_{n}, k times;

F_{n}^{k}

is the

σ

-algebra of subsets of

Ω_{n}^{k}

generated by finite-dimensional cylinder sets

C_{n}

of the form

C_{n} (B_{1}, B_{2}, \dots, B_{k}) = {ω_{n} (k) : x_{1} \in B_{1}, x_{2} \in B_{2}, \dots, x_{k} \in B_{k}}

(3)

with

B_{i} \in B, i = 1, 2, \dots, k

,

B

a

σ

-algebra on

A_{n}

of subsets of

A_{n}

.

To define the probability measure

P_{n}^{k} : F_{n}^{k} \to [0, 1]

, let us consider a collection of numbers

{p_{i}, i = 1, 2, \dots, n}

such that

p_{i} \geq 0, i = 1, 2, \dots, n

and

\sum_{i = 1}^{n} p_{i} = 1 .

The collection of numbers

p_{i}

defines the probability P of any outcome

a_{i}

of the single trial experiment and of any subset

B \in B

through the equations

P [a_{i}] = p_{i}

(4)

P [B] = \sum_{a_{i} \in B} p_{i} .

(5)

Assuming each trial is independent, the probability of an outcome

ω_{n} (k)

of the compound experiment is given by

P [ω_{n} (k)] = \prod_{i = 1}^{k} P [x_{i}] .

(6)

The above equation defines a probability distribution on the sample space

Ω_{n}^{k}

, since it is (par. 2.3, [7])

P [Ω_{n}^{k}] = \sum_{ω_{n} (k)} P [ω_{n} (k)] = 1

and the probability measure

P_{n}^{k}

on the

σ

-algebra

F_{n}^{k}

can be generated by P through the formula

P_{n}^{k} [E] = \sum_{ω_{n} (k) \in E} P [ω_{n} (k)], for every event E \in F_{n}^{k} .

(7)

I recall here two properties of

P_{n}^{k}

, which will be used throughout this paper.

Remark 1.

[σ-additivity] Given

C_{i} \in F_{n}^{k}

,

i = 1, 2, \dots

, and

C_{i} \cap C_{j} = \emptyset

for

i \neq j

, then

P_{n}^{k} (⋃_{i = 1}^{\infty} C_{i}) = \sum_{i = 1}^{\infty} P_{n}^{k} (C_{i}) .

Remark 2.

Given any finite-dimensional cylinder set (3) the probability measure

P_{n}^{k}

has the property (Corollary 2.1, [7])

P_{n}^{k} [C_{n} (B_{1}, B_{2}, \dots, B_{k})] = \prod_{i = 1}^{k} P [B_{i}] .

(8)

where P is the probability distribution defined by (4)–(6) (obviously

A_{n} \in B

and if

B_{i} = A_{n}

for some i, it is

P [A_{n}] = 1

).
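The product property (8) can be verified on a small case by brute force. This is a sketch under stated assumptions: the three-symbol alphabet, the single-trial probabilities and the sets B_1, B_2, B_3 below are hypothetical values chosen for illustration, and the function names are not taken from the references.

```python
from fractions import Fraction
from itertools import product
from math import prod

def outcome_probability(outcome, p):
    """Equation (6): product of single-trial probabilities along the sequence."""
    return prod(p[x] for x in outcome)

def cylinder_probability(blocks, p):
    """Equation (7) applied to the cylinder set (3), by summing over its outcomes."""
    return sum(outcome_probability(w, p) for w in product(*blocks))

# Hypothetical single-trial distribution on A_3 = {a, b, c}
p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
blocks = [{"a", "b"}, {"c"}, {"a", "b", "c"}]  # B_1, B_2, B_3

lhs = cylinder_probability(blocks, p)
rhs = prod(sum(p[x] for x in B) for B in blocks)  # right-hand side of (8)
print(lhs, rhs)
```

Note that the third factor is P[A_3] = 1, in agreement with the closing observation of Remark 2.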

Definition 1.

A collection

O_{i} = {a_{j_{1}}, a_{j_{2}}, \dots, a_{j_{i}}} \subset A_{n}

is an ordered choice of i distinct symbols from

A_{n}

,

(1 \leq i \leq n)

, if

a_{j_{l}} \neq a_{j_{m}}

for

l \neq m

,

l = 1, 2, \dots, i

,

m = 1, 2, \dots, i

.

Among the events of the

σ

-algebra

F_{n}^{k}

we are interested in the following ones.

Definition 2.

We denote with

S_{i / n} (k)

, k-length sequences with

i / n

distinct symbols

(1 \leq i \leq n

,

k \geq i)

, the following events

S_{i / n} (k) = {ω_{n} (k) = (x_{1}, x_{2}, \dots, x_{k}) : ω_{n} (k) \subset O_{i} and O_{i} \subset ω_{n} (k)}

for some i-distinct ordered choice

O_{i}

.

Definition 3.

We denote with

S_{i / n}^{'} (k)

, k-length First Occurrence Sequences (FOS) with

i / n

distinct symbols

(1 \leq i \leq n

,

k \geq i)

, the events

S_{i / n} (k)

such that

x_{j} \neq x_{k}

,

j = 1, 2, \dots, (k - 1)

.

Hence if we choose among the elementary outcomes in

Ω_{n}^{k}

, those with only i distinct symbols and denote them with

ω_{i / n} (k)

, the event

S_{i / n} (k)

is the subset

S_{i / n} (k) = {ω_{i / n} (k)}

. If we denote further with

ω_{i / n}^{'} (k)

those outcomes in the above subset, where the i-th distinct symbol is unique (occurs once only) and occupies the last k-th position in the sequence, then the event

FOS

is the subset

S_{i / n}^{'} (k) = {ω_{i / n}^{'} (k)}

. The following remarks follow obviously from the above definitions.
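For small parameters, both events can be listed explicitly. The sketch below enumerates them by brute force over the whole sample space; the function names `event_S` and `event_S_prime` and the use of the integers 1, ..., n as symbols are assumptions made for illustration.

```python
from itertools import product

def event_S(i, n, k):
    """All k-length sequences over {1, ..., n} containing exactly i distinct symbols."""
    return [w for w in product(range(1, n + 1), repeat=k) if len(set(w)) == i]

def event_S_prime(i, n, k):
    """FOS: members of S_{i/n}(k) whose last symbol does not occur earlier."""
    return [w for w in event_S(i, n, k) if w[-1] not in w[:-1]]

print(len(event_S(2, 3, 3)), len(event_S_prime(2, 3, 3)))
print(event_S_prime(2, 3, 3))
```

The enumeration is exponential in k, so it is only meant as a way to inspect the definitions on toy cases.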

Remark 3.

For every elementary event

ω_{i / n} (k)

and

ω_{i / n}^{'} (k)

, there is one and only one ordered choice

O_{i}

replicating the order of the first occurrence of the i distinct symbols in the elementary sequence.

Remark 4.

In general it is

S_{i / n}^{'} (k) \subset S_{i / n} (k)

, the two events coincide when

i = k = 1

and

i = k = n

. In these cases:

S_{1 / n}^{'} (1) = S_{1 / n} (1) = {a_{i}, i = 1, 2, \dots, n}

S_{n / n}^{'} (n) = S_{n / n} (n) = {ω_{n / n} (n)}

where

{ω_{n / n} (n)}

is the group of permutations of the set

A_{n}

.

Remark 5.

S_{1 / n}^{'} (k) = \emptyset, if k \geq 2 .

Let us now derive the general probability relation between events

S_{i / n}^{'} (k)

,

S_{i / n} (k)

and elementary outcome sequences.

Definition 4.

Given a single k-length outcome with

i / n

distinct symbols

ω_{i / n} (k)

(1 \leq i \leq n

,

k \geq i)

, let us denote with

Y (ω_{i / n} (k))

the subset of i symbols occurring in

ω_{i / n} (k)

Y (ω_{i / n} (k)) = {a_{j} \in A_{n} : a_{j} \in ω_{i / n} (k)}

and with

Y^{c} (ω_{i / n} (k))

the complementary set of the remaining

(n - i)

symbols

Y^{c} (ω_{i / n} (k)) = A_{n} - Y (ω_{i / n} (k)) .

An analogous definition applies to the symbol set

Y (ω_{i / n}^{'} (k))

and the complementary symbol set

Y^{c} (ω_{i / n}^{'} (k))

of an elementary outcome

ω_{i / n}^{'} (k)

. Obviously both Y and

Y^{c}

are subsets of a σ-algebra

B

on

A_{n}

of subsets of

A_{n}

.

Theorem 1.

The probability of the events

S_{i / n}^{'} (k)

and

S_{i / n} (k)

as a function of elementary outcomes of the compound experiment and their symbol sets and complementary symbol sets, are given by

P_{n}^{k} [S_{i / n}^{'} (k)] = \sum_{ω_{i - 1 / n} (k - 1)} P [ω_{i - 1 / n} (k - 1)] P [Y^{c} (ω_{i - 1 / n} (k - 1))]

(9)

P_{n}^{k} [S_{i / n} (k)] = \sum_{j = i}^{k} \sum_{ω_{i / n}^{'} (j)} P [ω_{i / n}^{'} (j)] {(P [Y (ω_{i / n}^{'} (j))])}^{k - j} .

(10)

Proof.

For every sequence

ω_{i - 1 / n} (k - 1)

, the cylinder set

C_{n} [ω_{i - 1 / n} (k - 1), Y^{c} (ω_{i - 1 / n} (k - 1))] \subset S_{i / n}^{'} (k)

contains

(n - i + 1)

FOS

ω_{i / n}^{'} (k) \in S_{i / n}^{'} (k)

, and we get the whole event as the union of the disjoint cylinder sets

S_{i / n}^{'} (k) = ⋃_{ω_{i - 1 / n} (k - 1)} C_{n} [ω_{i - 1 / n} (k - 1), Y^{c} (ω_{i - 1 / n} (k - 1))] .

From Equation (8), the probability of the cylinder set is given by

P_{n}^{k} [C_{n} (ω_{i - 1 / n} (k - 1), Y^{c} (ω_{i - 1 / n} (k - 1)))] = P [ω_{i - 1 / n} (k - 1)] P [Y^{c} (ω_{i - 1 / n} (k - 1))]

and, hence, Equation (9) follows from the

σ

-additivity property of the probability measure.

For every sequence

ω_{i / n}^{'} (j)

(i \leq j \leq k)

, the cylinder set

C_{n} [ω_{i / n}^{'} (j), \underset{(k - j) times}{\underset{⏟}{Y (ω_{i / n}^{'} (j)), Y (ω_{i / n}^{'} (j)), \dots, Y (ω_{i / n}^{'} (j))}}] \subset S_{i / n} (k)

(when

j = k

the cylinder coincides with

ω_{i / n}^{'} (k)

), contains

i^{k - j}

sequences

ω_{i / n} (k) \in S_{i / n} (k)

and the whole event is the union of the disjoint cylinder sets

S_{i / n} (k) = ⋃_{j = i}^{k} ⋃_{ω_{i / n}^{'} (j)} C_{n} [ω_{i / n}^{'} (j), \underset{(k - j) times}{\underset{⏟}{Y (ω_{i / n}^{'} (j)), Y (ω_{i / n}^{'} (j)), \dots, Y (ω_{i / n}^{'} (j))}}] .

From Equation (8) the probability of the cylinder set is

P_{n}^{k} [C_{n} [ω_{i / n}^{'} (j), \underset{(k - j) times}{\underset{⏟}{Y (ω_{i / n}^{'} (j)), Y (ω_{i / n}^{'} (j)), \dots, Y (ω_{i / n}^{'} (j))}}]] = P [ω_{i / n}^{'} (j)] {(P [Y (ω_{i / n}^{'} (j))])}^{k - j} .

Equation (10) follows from the

σ

-additivity of the probability measure

P_{n}^{k}

. □

Corollary 1.

Given

i = n

(the number of distinct symbols is equal to the total number of symbols in

A_{n}

) and

k \geq n

then

P_{n}^{k} [S_{n / n} (k)] = \sum_{j = n}^{k} P_{n}^{j} [S_{n / n}^{'} (j)] .

Proof.

When

i = n

the symbol set of FOS

ω_{n / n}^{'} (j)

,

j \geq i

, becomes

Y (ω_{n / n}^{'} (j)) = A_{n}

and remembering

P [A_{n}] = 1

, the thesis follows immediately through simple manipulations of (10), after noting that the measure

P_{n}^{k}

of cylinder sets

C_{n} [S_{n / n}^{'} (j), \underset{(k - j) times}{\underset{⏟}{A_{n}, A_{n}, \dots, A_{n}}}]

is equal to

P_{n}^{j}

measure of

S_{n / n}^{'} (j)

. □
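Corollary 1 does not rely on equally probable symbols, and this can be checked on a small case with exact arithmetic. The following is a sketch only: the non-uniform distribution `p`, the values n = 3 and k = 6, and the helper names are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product
from math import prod

def event_probability(n_symbols, k, predicate, p):
    """Equation (7): sum P[w] over the k-length outcomes w satisfying predicate."""
    return sum(
        prod(p[x] for x in w)
        for w in product(range(n_symbols), repeat=k)
        if predicate(w)
    )

def all_symbols(n):
    # membership test for S_{n/n}(k): every symbol occurs
    return lambda w: len(set(w)) == n

def all_symbols_fos(n):
    # membership test for S'_{n/n}(k): every symbol occurs, the last one only once
    return lambda w: len(set(w)) == n and w[-1] not in w[:-1]

p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # hypothetical, non-uniform
n, k = 3, 6
lhs = event_probability(n, k, all_symbols(n), p)
rhs = sum(event_probability(n, j, all_symbols_fos(n), p) for j in range(n, k + 1))
print(lhs, rhs)
```

The right-hand side adds up the probabilities that the last missing symbol first appears at step j, for j from n to k, which is the probabilistic content of the corollary.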

Dealing with FOS in general and problems such as the average length of sequences will require considering an infinite number of independent trials or repetitions of the random sampling with replacement experiment. The probability space

(Ω_{n}^{k}, F_{n}^{k}, P_{n}^{k})

we have defined above for finite values of k, can be generalized to consider sampling sequences of infinite length from the set

A_{n}

, so that the set of all possible outcomes is

Ω_{n}^{\infty}

, the Cartesian product of countably many copies of

A_{n}

. The product measure (Theorem 6.3, p. 141, [8]) allows us to define an extended probability space

(Ω_{n}^{\infty}, F_{n}^{\infty}, P_{n}^{\infty})

where sequences

S_{i / n} (k)

, for a fixed k, are subsets of

Ω_{n}^{\infty}

defined as

S_{i / n} (k) \times A_{n} \times A_{n} \times \dots \in F_{n}^{\infty}

and the probability measure and its properties remain unchanged (Par. 13.1, [8])

P_{n}^{\infty} [S_{i / n} (k) \times A_{n} \times A_{n} \times \dots] = P_{n}^{k} [S_{i / n} (k)] = \sum_{ω_{i / n} (k)} P [ω_{i / n} (k)]

where

P [ω_{i / n} (k)]

is defined in (6). The same obviously applies to

S_{i / n}^{'} (k)

events.

If we consider only

i / n

FOS as outcomes of our experiment, we may define an even more straightforward probability space (Example 2, p. 270, [8]), say

(Ω_{i / n}^{'}, F_{i / n}^{'}, P_{i / n}^{'})

, where the sample space is the countable set

Ω_{i / n}^{'} = {ω_{i / n}^{'} (k), k = i, i + 1, i + 2, \dots}

and the probability of each elementary sample

ω_{i / n}^{'} (k)

is defined as in (6), while for every event

E \in F_{i / n}^{'}

it is

P_{i / n}^{'} [E] = \sum_{ω_{i / n}^{'} (k \geq i) \in E} P [ω_{i / n}^{'} (k)] = \sum_{k = i}^{\infty} \sum_{ω_{i / n}^{'} (k) \in E} P [ω_{i / n}^{'} (k)] .

Hence, the probability of the event

S_{i / n}^{'} (k)

is still given by

P_{i / n}^{'} [S_{i / n}^{'} (k)] = P_{n}^{k} [S_{i / n}^{'} (k)]

.

Remembering the definition of partition of a sample space (p. 4, [7]), the following remarks follow immediately from the definitions above.

Remark 6.

Given

k \geq n

, the collection of events

{S_{i / n} (k), i = 1, 2, \dots, n}

is a partition of

Ω_{n}^{k}

. Hence

S_{i / n} (k) \cap S_{j / n} (k) = \emptyset for i \neq j, k \geq n

Ω_{n}^{k} = ⋃_{i = 1}^{n} S_{i / n} (k) .

Remark 7.

The collection of events

{S_{i / n}^{'} (k), k = i, i + 1, i + 2, \dots}

is a partition of

Ω_{i / n}^{'}

. Hence

S_{i / n}^{'} (k) \cap S_{i / n}^{'} (m) = \emptyset for k \neq m, k \geq i, m \geq i

Ω_{i / n}^{'} = ⋃_{k = i}^{\infty} S_{i / n}^{'} (k) .

2.2. Probability Relations

In order to simplify our expressions, let us adopt the following symbols for the probability of events

S_{i / n} (k)

and

S_{i / n}^{'} (k)

,

1 \leq i \leq n

,

k \geq i

Notation.

P_{n}^{k} [S_{i / n} (k)] = ϕ_{i / n} (k)

P_{n}^{k} [S_{i / n}^{'} (k)] = ϕ_{i / n}^{'} (k) .

This notation will be used throughout the paper.

In the following we will derive some relations which are valid under the assumption of the general probability distribution (4)–(6).

Note that from Remarks 4 and 5 it follows

ϕ_{1 / n} (1) = ϕ_{1 / n}^{'} (1) = 1

(11)

ϕ_{n / n} (n) = ϕ_{n / n}^{'} (n)

(12)

ϕ_{1 / n}^{'} (k) = 0, k \geq 2,

(13)

while from Remarks 6 and 7

\sum_{i = 1}^{n} ϕ_{i / n} (k) = 1, k \geq n

(14)

\sum_{k = i}^{\infty} ϕ_{i / n}^{'} (k) = 1 .

(15)

Assuming

k \geq n

, from (14) and Corollary 1, which can be written using the new notation

ϕ_{n / n} (k) = \sum_{j = n}^{k} ϕ_{n / n}^{'} (j),

(16)

it follows

\sum_{i = 1}^{n - 1} ϕ_{i / n} (k) + \sum_{j = n}^{k} ϕ_{n / n}^{'} (j) = 1 .

(17)

Since of course

\sum_{i = 1}^{n - 1} ϕ_{i / n} (k) \to 0

as

k \to \infty

, it is interesting to note that the above equation is a heuristic proof of

\sum_{k = n}^{\infty} ϕ_{n / n}^{'} (k) = 1 .

From (14) with

k = n

, (12) and (15) with

i = n

, we can write the following equality

\sum_{i = 1}^{n - 1} ϕ_{i / n} (n) = \sum_{k = n + 1}^{\infty} ϕ_{n / n}^{'} (k) .

(18)

From here on throughout this paper, we will assume that the symbols in the set

A_{n}

are equally probable; that is, with reference to the probability distribution (4)–(6), we will make the following assumption.

Assumption 1.

(Equally probable symbols)

P [a_{i}] = \frac{1}{n}, i = 1, 2, \dots, n

Under the above assumption, the probability (6) of any sequence

ω_{n} (k) \in Ω_{n}^{k}

is given by

P [ω_{n} (k)] = \frac{1}{n^{k}}

while for the probability of FOS and related events it holds

ϕ_{i / n} (k) = \frac{# S_{i / n} (k)}{n^{k}}

(19)

ϕ_{i / n}^{'} (k) = \frac{# S_{i / n}^{'} (k)}{n^{k}},

(20)

where the symbol “#

S

” denotes the cardinality of the set

S

. Hence Equation (12) can be completed in this case as

ϕ_{n / n} (n) = ϕ_{n / n}^{'} (n) = \frac{n!}{n^{n}} .

(21)

while, since #

S_{1 / n} (k) = n

, when

k \geq 1

, it is

ϕ_{1 / n} (k) = \frac{1}{n^{k - 1}} .

(22)

The trivial case

i = 1

is completely solved by Equations (11), (13) and (22), so in the following we shall assume

i \geq 2

.

The following lemma follows directly from the definitions.

Lemma 1.

Under Assumption 1, the probability of the symbol set and its complementary (see Definition 4) of any outcome sequence

ω_{i / n} (k)

is independent of the sequence and is given by

P [Y (ω_{i / n} (k))] = \frac{i}{n}

P [Y^{c} (ω_{i / n} (k))] = \frac{n - i}{n}

Corollary 2.

Under Assumption 1, from Theorem 1 it follows (

2 \leq i \leq n

,

k \geq i

)

ϕ_{i / n}^{'} (k) = ϕ_{i - 1 / n} (k - 1) (\frac{n - i + 1}{n})

(23)

ϕ_{i / n} (k) = \sum_{j = i}^{k} ϕ_{i / n}^{'} (j) {(\frac{i}{n})}^{k - j}

(24)

ϕ_{i / n}^{'} (k) = \sum_{j = i - 1}^{k - 1} ϕ_{i - 1 / n}^{'} (j) {(\frac{i - 1}{n})}^{k - j - 1} (\frac{n - i + 1}{n})

(25)

Proof.

From Lemma 1, we get

P [Y^{c} (ω_{i - 1 / n} (k - 1))] = \frac{n - i + 1}{n}

P [Y (ω_{i / n}^{'} (j))] = \frac{i}{n} .

By substituting these in (9) and (10) respectively, and remembering the definition (7) of the probability measure, Equations (23) and (24) follow.

Equation (25) is obtained simply by putting together the previous results. □

Note that the last Equation (25) provides us with a method to calculate the sequence

(ϕ_{i / n}^{'} (k), i = 2, 3, \dots, n)

, starting from (11) and (13), recursively. In particular, for

i = 2

we get

ϕ_{2 / n}^{'} (k) = \frac{n - 1}{n^{k - 1}}, k \geq 2 .

(26)
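The recursion (25) translates directly into a short memoized function. This is a minimal sketch under Assumption 1, with exact rational arithmetic; the name `phi_prime` and the argument order (i, n, k) are choices made here, not notation from the paper.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def phi_prime(i, n, k):
    """phi'_{i/n}(k) via the recursion (25), started from (11) and (13)."""
    if k < i or i < 1 or i > n:
        return Fraction(0)
    if i == 1:
        return Fraction(1) if k == 1 else Fraction(0)
    step = Fraction(n - i + 1, n)  # probability of drawing a new symbol
    stay = Fraction(i - 1, n)      # probability of repeating a symbol already seen
    return step * sum(
        phi_prime(i - 1, n, j) * stay ** (k - j - 1) for j in range(i - 1, k)
    )

print(phi_prime(2, 5, 4), Fraction(5 - 1, 5 ** 3))            # compare with (26)
print(phi_prime(4, 4, 4), Fraction(factorial(4), 4 ** 4))     # compare with (21)
print(float(sum(phi_prime(3, 4, k) for k in range(3, 200))))  # compare with (15)
```

The three printed lines compare the recursion with the closed forms (26) and (21) and with the normalization (15), truncated at a finite k.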

3. Combinatorics of FOS

In the previous section, we have obtained the probability of FOS through the general recursive Formula (25) and through closed-form solutions only in a few special cases. The aim of this section is the development of a combinatorial theory of FOS in order to investigate their connections with other topics of combinatorics, such as set partitions and Stirling numbers of the second kind (Ch. 9, [9]), then to derive the closed-form probability functions in the next section.

The following simple equation relates the cardinality of the sets

S_{i / n} (k)

and

S_{i / n}^{'} (k)

defined in Definitions 2 and 3

# S_{i + 1 / n}^{'} (k + 1) = # S_{i / n} (k) (n - i) .

(27)

The combinatorics of objects such as sequences

ω_{i / n} (k) \in S_{i / n} (k)

is well known and treated within the more general problem of counting functions between two finite sets in enumerative combinatorics (1.9 p. 71, [10]), with applications to set partitions, words, and random allocation (II.3 p. 106, [11]). The problem of finding

# S_{i / n} (k)

is the same as finding the number of words (or sequences) of length k, over an alphabet of cardinality n, containing i letters (or symbols). As reported in (Equation (5), [12]), the total number

n^{k}

of possible sequences can be obtained through the following sum, involving the Stirling numbers of the second kind

S (k, i)

(see equation (1.96) p. 75 for details, [10]). (I keep here the same symbol with two different spellings,

S

and S, both to denote the set of events (sequences) and the Stirling numbers of the second kind. The meaning is always clear from the context, in any case when

S

denotes a set, it is always followed by a subscript. The adoption of the same symbol is justified by the close connection existing between the cardinality of the set of sequences and the Stirling numbers of the second kind.)

n^{k} = \sum_{i = 1}^{k} (\begin{matrix} n \\ i \end{matrix}) i! S (k, i) .

(28)

The above equation decomposes the total number of sequences as the sum over i of the number of such sequences that contain exactly i distinct symbols. The same identity, obtained through classical algebraic methods, is reported in (Equation (29), [13]). On the other hand, remembering Remark 6, we can also write

n^{k} = \sum_{i = 1}^{n} # S_{i / n} (k),

(29)

hence, by comparing (28) and (29), it follows

# S_{i / n} (k) = (\begin{matrix} n \\ i \end{matrix}) i! S (k, i) .

(30)

The same solution is found in II.6 p. 113 [11] under the hypothesis of random allocation of symbols, which is the same as Assumption 1. From (30) and (27), we can finally also derive an explicit expression for the cardinality of FOS.
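Identities (28) and (30) are easy to test numerically. In the sketch below the Stirling numbers of the second kind are computed with their standard recurrence S(k, i) = i S(k - 1, i) + S(k - 1, i - 1); the function names are hypothetical and the brute-force count is only feasible for small n and k.

```python
from functools import lru_cache
from itertools import product
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(k, i):
    """Stirling number of the second kind S(k, i), by the standard recurrence."""
    if k == i:
        return 1
    if i == 0 or i > k:
        return 0
    return i * stirling2(k - 1, i) + stirling2(k - 1, i - 1)

def count_S(i, n, k):
    """Right-hand side of (30)."""
    return comb(n, i) * factorial(i) * stirling2(k, i)

def brute_force_S(i, n, k):
    """Direct count of k-length sequences over n symbols with exactly i distinct ones."""
    return sum(1 for w in product(range(n), repeat=k) if len(set(w)) == i)

n, k = 4, 5
print(sum(count_S(i, n, k) for i in range(1, k + 1)), n ** k)  # both sides of (28)
print(count_S(2, 3, 3), brute_force_S(2, 3, 3))                # both sides of (30)
```

Terms with i > n vanish in (28) because the binomial coefficient is zero, so summing up to k or up to n gives the same total.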

In the rest of this section, we prefer to present an alternative procedure that allows us to derive the number of FOS sequences directly in order both to find new identities involving the Stirling numbers of the second kind and to show a new combinatorial meaning of these numbers, strictly related to the sequence of primes, which is the main focus of this paper.

3.1. Ordered FOS

Remembering Definition 1 and Remark 3, let us introduce a new kind of combinatorial object.

Definition 5.

Let

O_{i}

be an ordered subset of i distinct elements from

A_{n}

, we denote with

S_{i / n}^{'} (k / O_{i})

, k-length ordered First Occurrence Sequences (oFOS) with

i / n

distinct symbols on the ordered choice

O_{i}

,

(1 \leq i \leq n

,

k \geq i)

, the events

S_{i / n}^{'} (k / O_{i}) = {ω_{i / n}^{'} (k / O_{i})}

where

ω_{i / n}^{'} (k / O_{i})

is an elementary FOS outcome having

O_{i}

as i-distinct ordered choice, replicating the order of the first occurrence of its distinct i symbols.

The following example can help to clarify the above definition.

Example 1.

Suppose

n = 4

,

i = 3

and

k = 4

, with

A_{4} = {1, 2, 3, 4}

,

O_{3} = {3, 1, 4}

; then the event

S_{3 / 4}^{'} (4 / O_{3})

is made up of the following elementary outcomes:

S_{3 / 4}^{'} (4 / O_{3}) = {ω_{3 / 4}^{'} (4 / O_{3})} = {(3, 3, 1, 4), (3, 1, 3, 4), (3, 1, 1, 4)} .
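The three outcomes of Example 1 can be recovered by filtering all sequences over the symbols of the ordered choice. This is a brute-force sketch for illustration; the names `first_occurrence_order` and `ordered_fos` are hypothetical.

```python
from itertools import product

def first_occurrence_order(w):
    """Distinct symbols of w in the order in which they first appear."""
    return tuple(dict.fromkeys(w))

def ordered_fos(k, order):
    """All k-length FOS whose first-occurrence order equals the tuple `order`."""
    return [
        w for w in product(order, repeat=k)
        if first_occurrence_order(w) == tuple(order) and w[-1] not in w[:-1]
    ]

example = ordered_fos(4, (3, 1, 4))
print(example)
```

Replacing (3, 1, 4) with any other ordered choice of three symbols changes the listed sequences but not their number.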

Remark 8.

The cardinality of oFOS events depends on parameters i and k only, not on the particular choice of symbols in the ordered set

O_{i}

.

The following theorem establishes the relation between FOS and oFOS.

Theorem 2.

The cardinality of the set of FOS

S_{i / n}^{'} (k)

is given by (

1 \leq i \leq n

,

k \geq i

):

# S_{i / n}^{'} (k) = \frac{n!}{(n - i)!} g (k, i)

(31)

where

g (k, i)

is the number of k-length oFOS on an ordered subset of i symbols

O_{i}

# S_{i / n}^{'} (k / O_{i}) = g (k, i) .

(32)

Proof.

Given the ordered choice of symbols

O_{i} = {a_{j_{1}}, a_{j_{2}}, \dots, a_{j_{i}}}

, any finite subset of i indices

M_{i} = {m_{1}, m_{2}, \dots, m_{i}} \subset I_{k} = {1, 2, \dots, k}

subject to the constraints

1 = m_{1} < m_{2} < m_{3} < \dots < m_{i} = k,

defines a finite-dimensional cylinder set on

O_{i}

C_{M_{i}} (B_{1}, B_{2}, \dots, B_{k} / O_{i}) = {ω_{n} (k) = (x_{1}, x_{2}, \dots, x_{k}) : x_{m} \in B_{m}, m = 1, 2, \dots, k}

through the following assignments:

B_{m} = {a_{j_{1}}} for 1 = m_{1} \leq m < m_{2}

B_{m} = {a_{j_{2}}} for m = m_{2}

B_{m} = {a_{j_{1}}, a_{j_{2}}} for m_{2} < m < m_{3}

B_{m} = {a_{j_{3}}} for m = m_{3}

B_{m} = {a_{j_{1}}, a_{j_{2}}, a_{j_{3}}} for m_{3} < m < m_{4}

\dots \dots \dots

B_{m} = {a_{j_{i - 1}}} for m = m_{i - 1}

B_{m} = {a_{j_{1}}, a_{j_{2}}, \dots, a_{j_{i - 1}}} for m_{i - 1} < m < m_{i}

B_{m} = {a_{j_{i}}} for m = m_{i} = k .

Therefore, the oFOS event is equal to

S_{i / n}^{'} (k / O_{i}) = ⋃_{M_{i}} C_{M_{i}} (B_{1}, B_{2}, \dots, B_{k} / O_{i})

and the FOS event

S_{i / n}^{'} (k) = ⋃_{O_{i}} S_{i / n}^{'} (k / O_{i}) = ⋃_{O_{i}} ⋃_{M_{i}} C_{M_{i}} (B_{1}, B_{2}, \dots, B_{k} / O_{i}) .

As for the cardinality of the sets in the above equation, we get

# S_{i / n}^{'} (k) = # {O_{i}} g (k, i) .

Since

# {O_{i}} = C (n, i) P (i) = (\begin{matrix} n \\ i \end{matrix}) i!

with

C (n, i)

i-combinations of n elements, and

P (i)

number of permutations of i elements, Equation (31) follows. □
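Theorem 2 and Example 1 can be checked by brute force for small parameters. The sketch below assumes the reading of a FOS used in the proof above: a k-length sequence over A_n with exactly i distinct symbols, whose i-th distinct symbol appears for the first time in the last position. The function names are illustrative and not part of the paper.

```python
from itertools import product


def is_fos(seq, i):
    """True if seq has exactly i distinct symbols and the i-th distinct
    symbol appears for the first time in the last position."""
    return len(set(seq)) == i and len(set(seq[:-1])) == i - 1


def first_occurrence_order(seq):
    """Distinct symbols of seq, in order of first appearance."""
    return tuple(dict.fromkeys(seq))


def enumerate_ofos(n, k, ordered_choice):
    """All k-length oFOS over A_n = {1, ..., n} on the given ordered choice."""
    target = tuple(ordered_choice)
    return [seq for seq in product(range(1, n + 1), repeat=k)
            if is_fos(seq, len(target))
            and first_occurrence_order(seq) == target]


def count_fos(n, i, k):
    """Brute-force cardinality of the FOS set S'_{i/n}(k)."""
    return sum(1 for seq in product(range(1, n + 1), repeat=k)
               if is_fos(seq, i))
```

For n = 4, i = 3, k = 4, `enumerate_ofos(4, 4, (3, 1, 4))` returns the three outcomes of Example 1, and `count_fos(4, 3, 4)` can be compared with the right-hand side of (31), that is, 4!/1! times g(4, 3) = 3.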

3.2. Counting Ordered FOS

The problem of finding the number of FOS is thus reduced to that of finding the numbers

g (k, i)

. When

i = 1

,

k \geq i

we immediately get

g (1, 1) = 1, g (k, 1) = 0, k > 1;

(33)

when

i = 2

, it is

g (k, 2) = 1, k \geq 2 .

(34)

Indeed, for

i = 2

, there is only one oFOS, whatever the value of

k \geq 2

. Note that (33) is a particular case of the general property

g (i, i) = 1, i \geq 1 .

(35)

Example 2.

i = 2

,

O_{2} = {1, 2}

g (2, 2) = 1, the only oFOS is (1, 2)

g (3, 2) = 1, the only oFOS is (1, 1, 2)

\dots \dots \dots

The calculations are more complex when

i \geq 3

.

Example 3.

i = 3

,

O_{3} = {1, 2, 3}

, through elementary enumerative combinatorics we find

g (3, 3) = 1, the only oFOS is

(1, 2, 3)

g (4, 3) = 3, the oFOS are

(1, 1, 2, 3), (1, 2, 1, 3), (1, 2, 2, 3) .

g (5, 3) = 7, the oFOS are

(1, 1, 1, 2, 3), (1, 1, 2, 1, 3), (1, 1, 2, 2, 3), (1, 2, 1, 1, 3),

(1, 2, 1, 2, 3), (1, 2, 2, 1, 3), (1, 2, 2, 2, 3) .

\dots

Example 4.

i = 4

,

O_{4} = {1, 2, 3, 4}

, through elementary enumerative combinatorics we find

g (4, 4) = 1, the only oFOS is

(1, 2, 3, 4)

g (5, 4) = 6, the oFOS are

(1, 1, 2, 3, 4), (1, 2, 1, 3, 4), (1, 2, 3, 1, 4),

(1, 2, 2, 3, 4), (1, 2, 3, 2, 4), (1, 2, 3, 3, 4) .

g (6, 4) = 25, the oFOS are

(1, 1, 1, 2, 3, 4), (1, 1, 2, 1, 3, 4), (1, 1, 2, 2, 3, 4), (1, 1, 2, 3, 1, 4), (1, 1, 2, 3, 2, 4),

(1, 1, 2, 3, 3, 4), (1, 2, 1, 1, 3, 4), (1, 2, 1, 2, 3, 4), (1, 2, 2, 1, 3, 4), (1, 2, 2, 2, 3, 4),

(1, 2, 1, 3, 1, 4), (1, 2, 1, 3, 2, 4), (1, 2, 2, 3, 1, 4), (1, 2, 2, 3, 2, 4), (1, 2, 1, 3, 3, 4),

(1, 2, 2, 3, 3, 4), (1, 2, 3, 1, 1, 4), (1, 2, 3, 1, 2, 4), (1, 2, 3, 2, 1, 4), (1, 2, 3, 2, 2, 4),

(1, 2, 3, 1, 3, 4), (1, 2, 3, 3, 1, 4), (1, 2, 3, 3, 3, 4), (1, 2, 3, 2, 3, 4), (1, 2, 3, 3, 2, 4) .

\dots

I will give, in the following, some results about the calculations of

g (k, i)

numbers, starting with the next lemma that states a recursive formula.

Lemma 2.

The number of k-length

oFOS

on an ordered subset

O_{i}

, as function of the cardinality on an ordered subset

O_{i - 1}

, is

g (k, i) = \sum_{j = i - 1}^{k - 1} g (j, i - 1) {(i - 1)}^{k - j - 1}, for 2 \leq i \leq k,

(36)

starting with

g (k, 1)

,

k \geq 1

, given by (33).

Proof.

Equation (36) follows directly from Definition 5 and Remark 8, considering that from any j-length oFOS on an ordered set

O_{i - 1}

(i - 1 \leq j \leq k - 1)

, one can obtain

{(i - 1)}^{k - j - 1}

k-length oFOS on an ordered set

O_{i}

having the i-th symbol in the last position k of the sequence.

Note that this lemma holds independently from Assumption 1 of equally probable symbols. □
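The recursion (36), with the starting values (33), translates directly into a few lines of code. The following is a minimal sketch (the function name `g` mirrors the notation of the text; memoization is an implementation choice, not something prescribed by the paper) that reproduces the counts listed in Examples 2, 3 and 4.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def g(k, i):
    """Number of k-length oFOS on an ordered set of i symbols, via (36)."""
    if i < 1 or k < i:
        return 0
    if i == 1:
        # Starting values (33): g(1, 1) = 1 and g(k, 1) = 0 for k > 1.
        return 1 if k == 1 else 0
    # A j-length oFOS on i - 1 symbols is followed by k - j - 1 free
    # positions (i - 1 choices each) and by the i-th symbol in position k.
    return sum(g(j, i - 1) * (i - 1) ** (k - j - 1) for j in range(i - 1, k))
```

Calling `g(5, 3)` and `g(6, 4)` gives the values 7 and 25 obtained by enumeration in Examples 3 and 4.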

The result reported below is simply the sum of the first k + 1 terms of a geometric series and will be used in the proof of the next theorem.

Lemma 3.

Given q real,

| q | < 1

, then

\sum_{j = 0}^{k} q^{j} = \frac{1 - q^{k + 1}}{1 - q}, k \geq 0 integer .

Theorem 3.

Let

g (k, i)

be the number of k-length oFOS on an ordered set of i distinct elements. Then the following statements hold:

g (k, i) = \sum_{j = 1}^{i - 2} a_{i, j} [{(i - 1)}^{k - i + 1} - j^{k - i + 1}], i \geq 3, k \geq i

(37)

with coefficients

a_{i, j}

defined recursively as

a_{i + 1, j} = - (\frac{j}{i - j}) a_{i, j}, i \geq 3, j = 1, 2, \dots, (i - 2)

(38)

a_{i + 1, i - 1} = (i - 1) \sum_{j = 1}^{i - 2} a_{i, j}

(39)

starting with

a_{3, 1} = 1 .

Proof.

Let us proceed by induction. Equation (37) holds for

i = 3

; indeed, from Lemma 2 and (34), we get

g (k, 3) = \sum_{j = 2}^{k - 1} 2^{k - j - 1} = 2^{k - 3} \sum_{j = 0}^{k - 3} \frac{1}{2^{j}}, k \geq 3,

and, hence, from Lemma 3

g (k, 3) = 2^{k - 2} - 1 .

Let us now prove that if (37) is true for i,

i \geq 3

, then it is true for

(i + 1)

. Substituting (37) in (36) for

g (k, i + 1)

we get

g (k, i + 1) = \sum_{s = i}^{k - 1} \sum_{j = 1}^{i - 2} a_{i, j} [{(i - 1)}^{s - i + 1} - j^{s - i + 1}] i^{k - s - 1}

from which, after some manipulations, one gets

g (k, i + 1) = i^{k} \sum_{j = 1}^{i - 2} \sum_{s = i}^{k - 1} [\frac{a_{i, j}}{{(i - 1)}^{i}} {(\frac{i - 1}{i})}^{s + 1} - \frac{a_{i, j}}{j^{i}} {(\frac{j}{i})}^{s + 1}] .

The single sum

i^{k} \sum_{s = i}^{k - 1} [\frac{a_{i, j}}{{(i - 1)}^{i}} {(\frac{i - 1}{i})}^{s + 1} - \frac{a_{i, j}}{j^{i}} {(\frac{j}{i})}^{s + 1}] =

= i^{k} [\frac{a_{i, j}}{{(i - 1)}^{i}} {(\frac{i - 1}{i})}^{i + 1} \sum_{s = 0}^{k - i - 1} {(\frac{i - 1}{i})}^{s} - \frac{a_{i, j}}{j^{i}} {(\frac{j}{i})}^{i + 1} \sum_{s = 0}^{k - i - 1} {(\frac{j}{i})}^{s}]

after the application of Lemma 3

\sum_{s = 0}^{k - i - 1} {(\frac{i - 1}{i})}^{s} = \frac{1 - {(\frac{i - 1}{i})}^{k - i}}{1 - (\frac{i - 1}{i})}

\sum_{s = 0}^{k - i - 1} {(\frac{j}{i})}^{s} = \frac{1 - {(\frac{j}{i})}^{k - i}}{1 - (\frac{j}{i})}

and some manipulations, can be written as

i^{k} \sum_{s = i}^{k - 1} [\frac{a_{i, j}}{{(i - 1)}^{i}} {(\frac{i - 1}{i})}^{s + 1} - \frac{a_{i, j}}{j^{i}} {(\frac{j}{i})}^{s + 1}] =

= a_{i, j} (i - 1) (i^{k - i} - {(i - 1)}^{k - i}) - a_{i, j} (\frac{j}{i - j}) (i^{k - i} - j^{k - i}) .

Finally we can write

g (k, i + 1) = \sum_{j = 1}^{i - 2} [a_{i, j} (i - 1) (i^{k - i} - {(i - 1)}^{k - i}) - a_{i, j} (\frac{j}{i - j}) (i^{k - i} - j^{k - i})]

g (k, i + 1) = (i - 1) (i^{k - i} - {(i - 1)}^{k - i}) \sum_{j = 1}^{i - 2} a_{i, j} - \sum_{j = 1}^{i - 2} a_{i, j} (\frac{j}{i - j}) (i^{k - i} - j^{k - i})

hence

g (k, i + 1) = \sum_{j = 1}^{i - 1} a_{i + 1, j} (i^{k - i} - j^{k - i})

with

a_{i + 1, j}

given by (38),

j = 1, 2, \dots, i - 2

,

a_{i + 1, i - 1}

given by (39). □
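As a sanity check of Theorem 3, the coefficients can be generated row by row from (38) and (39) in exact rational arithmetic, and Equation (37) compared with the recursion (36). This is only an illustrative sketch; the helper names are assumptions introduced here.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def g(k, i):
    """g(k, i) from the recursion (36) with starting values (33)."""
    if i < 1 or k < i:
        return 0
    if i == 1:
        return 1 if k == 1 else 0
    return sum(g(j, i - 1) * (i - 1) ** (k - j - 1) for j in range(i - 1, k))


def coefficient_rows(max_i):
    """rows[i][j] = a_{i,j} for 3 <= i <= max_i and 1 <= j <= i - 2."""
    rows = {3: {1: Fraction(1)}}
    for i in range(3, max_i):
        prev = rows[i]
        # Equation (38): entries j = 1, ..., i - 2 of row i + 1.
        nxt = {j: -Fraction(j, i - j) * prev[j] for j in range(1, i - 1)}
        # Equation (39): last entry of row i + 1.
        nxt[i - 1] = (i - 1) * sum(prev.values())
        rows[i + 1] = nxt
    return rows


def g_from_coefficients(k, i, rows):
    """Right-hand side of (37) for i >= 3 and k >= i."""
    e = k - i + 1
    return sum(a * ((i - 1) ** e - j ** e) for j, a in rows[i].items())
```

For instance, `coefficient_rows(5)[4]` contains a_{4,1} = -1/2 and a_{4,2} = 2, and `g_from_coefficients(6, 4, rows)` agrees with g(6, 4) = 25 from Example 4.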

The last theorem gives us a method to calculate each coefficient of the row

i + 1

from the coefficients of the previous row i. The following corollary state an alternative way to calculate the last term of each row, as function of the other coefficients of the same row.

Corollary 3.

It is

a_{i + 1, i - 1} = 1 - \sum_{j = 1}^{i - 2} (i - j) a_{i + 1, j}, i \geq 3 .

(40)

Proof.

The result follows directly from (35) and (37):

\sum_{j = 1}^{i - 2} a_{i, j} (i - j - 1) = 1, i \geq 3,

(i - 1) \sum_{j = 1}^{i - 2} a_{i, j} = 1 + \sum_{j = 1}^{i - 2} j a_{i, j} .

Hence, remembering (38) and (39), after some manipulations, one gets (40). □

The first rows of coefficients

a_{i, j}

are reported in Table 2.

Theorem 3 and the subsequent corollary state a method of calculating coefficients

a_{i, j}

by rows. The next proposition establishes another way to find them, proceeding by columns.

Corollary 4.

Given the coefficient

a_{j + 2, j}

, head of column j, the successive coefficients in the same column can be calculated as

a_{i, j} = \frac{{(- j)}^{i - j - 2}}{(i - j - 1)!} a_{j + 2, j}, j \geq 1, i \geq j + 3 .

(41)

The column head coefficients can be calculated as a function of the previous ones only, through the following formula

a_{j + 2, j} = j \sum_{l = 1}^{j - 1} \frac{{(- l)}^{j - l - 1}}{(j - l)!} a_{l + 2, l}, j \geq 2,

(42)

starting with

a_{3, 1} = 1 .

Proof.

Equation (41) is derived simply through a recursive application of Equation (38), from the row index

j + 3

to the required row index i.

By substituting (41) into (39), we obtain the second Equation (42). □

The next theorem states a general non-recursive formula to get the coefficients

a_{i, j}

directly as a function of the indices

i, j

.

Theorem 4.

The following equation holds for each coefficient

a_{i, j}

a_{i, j} = {(- 1)}^{i - j - 2} \frac{j^{i - 3}}{(i - j - 1)! (j - 1)!}, i \geq 3, j = 1, 2, \dots, (i - 2) .

(43)

Proof.

First of all, let us prove by induction that the formula holds in the case of column head coefficients. In this case, using the notation of Corollary 4, Equation (43) becomes

a_{l + 2, l} = \frac{l^{l - 1}}{(l - 1)!}, l \geq 1 .

(44)

The above equation is true for

l = 1

, since we know it is

a_{3, 1} = 1

. Assuming it is true for

l = 1, 2, \dots, j

, from (42) we get the identity

j \sum_{l = 1}^{j - 1} \frac{{(- 1)}^{j - l - 1} l^{j - 2}}{(j - l)! (l - 1)!} = \frac{j^{j - 1}}{(j - 1)!},

(45)

where the term on the left-hand side of the equation is the value of

a_{j + 2, j}

calculated through (42), the one on the right-hand side is the value of the same coefficient given by (44). Then, the same identity holds for the next head of column coefficient

a_{j + 3, j + 1}

; indeed Equation (42) leads to the result

a_{j + 3, j + 1} = (j + 1) \sum_{l = 1}^{j} \frac{{(- 1)}^{j - l} l^{j - 1}}{(j - l + 1)! (l - 1)!} = \frac{{(j + 1)}^{j}}{j!} .

The proof of (43) in the case of non-head of column coefficients follows simply from (41) after assuming (44). □
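The closed form (43) can also be tested numerically against the defining recursions. The sketch below (function names are illustrative assumptions) evaluates (43) with exact fractions and checks that the resulting numbers satisfy (38), (39) and the column-head formula (44) for the first rows.

```python
from fractions import Fraction
from math import factorial


def a(i, j):
    """Closed form (43) for a_{i,j}, with i >= 3 and 1 <= j <= i - 2."""
    sign = (-1) ** (i - j - 2)
    return Fraction(sign * j ** (i - 3),
                    factorial(i - j - 1) * factorial(j - 1))


def satisfies_row_recursions(i):
    """Check (38) and (39) for the step from row i to row i + 1."""
    ok_38 = all(a(i + 1, j) == -Fraction(j, i - j) * a(i, j)
                for j in range(1, i - 1))
    ok_39 = a(i + 1, i - 1) == (i - 1) * sum(a(i, j) for j in range(1, i - 1))
    return ok_38 and ok_39


def column_head(l):
    """Column head a_{l+2,l} as given by (44)."""
    return Fraction(l ** (l - 1), factorial(l - 1))
```

A loop such as `all(satisfies_row_recursions(i) for i in range(3, 12))` exercises both recursions at once; it is a check on small cases, not a replacement for the induction above.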

The identity (45) deserves some more attention: it expresses the normalization property of the FOS probabilities

\sum_{j = n}^{\infty} ϕ_{n / n}^{'} (j) = 1

(see Equation (76) and Remark 10 below). After some simple manipulations, it can be written as

\sum_{l = 0}^{j} {(- 1)}^{l} (\begin{matrix} j + 1 \\ l \end{matrix}) l^{j} = {(- 1)}^{j} {(j + 1)}^{j},

(46)

which appears to be “complementary” to the well-known identity (see Equation (1.13), [14])

\sum_{l = 0}^{j} {(- 1)}^{l} (\begin{matrix} j \\ l \end{matrix}) l^{j} = {(- 1)}^{j} j! .

(47)
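Both alternating sums are easy to evaluate with integer arithmetic. The following short sketch (names chosen here for illustration) computes the left-hand sides of (46) and (47) so that they can be compared with the stated right-hand sides for small j.

```python
from math import comb, factorial


def lhs_46(j):
    """Left-hand side of (46): the binomial is taken on j + 1."""
    return sum((-1) ** l * comb(j + 1, l) * l ** j for l in range(j + 1))


def lhs_47(j):
    """Left-hand side of (47): the binomial is taken on j."""
    return sum((-1) ** l * comb(j, l) * l ** j for l in range(j + 1))


def both_hold(j):
    """True if (46) and (47) hold for this value of j."""
    sign = (-1) ** j
    return lhs_46(j) == sign * (j + 1) ** j and lhs_47(j) == sign * factorial(j)
```

For example, `lhs_46(1)` equals -2, which matches (-1)^1 (1 + 1)^1.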

3.3. Ordered FOS and Stirling Numbers of the Second Kind

The solution we have found about the number of elementary sequences of the event

S_{i / n}^{'} (k / O_{i})

, presented in Definition 5 as oFOS of length k with i distinct symbols on the ordered choice

O_{i}

, allows us to show the equivalence of

g (k, i)

numbers and Stirling numbers of the second kind. Indeed, from Equation (37) of Theorem 3, it follows

g (k, i) = {(i - 1)}^{k - i + 1} \sum_{j = 1}^{i - 2} a_{i, j} - \sum_{j = 1}^{i - 2} a_{i, j} j^{k - i + 1}, i \geq 3, k \geq i

and remembering (39) of the same theorem

g (k, i) = a_{i + 1, i - 1} {(i - 1)}^{k - i} - \sum_{j = 1}^{i - 2} a_{i, j} j^{k - i + 1} .

(48)

Substituting in the above equation

a_{i, j}

and

a_{i + 1, i - 1}

as given by (43) and (44) of Theorem 4, we get

g (k, i) = \sum_{j = 1}^{i - 1} {(- 1)}^{i - j - 1} \frac{j^{k - 2}}{(i - j - 1)! (j - 1)!} .

Since

\frac{1}{(i - j - 1)! (j - 1)!} = (\begin{matrix} i - 1 \\ j \end{matrix}) \frac{j}{(i - 1)!}

we finally arrive at

g (k, i) = \frac{{(- 1)}^{i - 1}}{(i - 1)!} \sum_{j = 0}^{i - 1} {(- 1)}^{j} (\begin{matrix} i - 1 \\ j \end{matrix}) j^{k - 1}, i \geq 3, k \geq i .

(49)

The right-hand side of the previous equation is the explicit formula for the Stirling number of the second kind

S (k - 1, i - 1)

(see Equation (9.21), [9]). The equivalence can be extended to values

i = 1

and

i = 2

, since from Equation (34) for

i = 2

it follows

S (k - 1, 1) = 1, k \geq 2

and for

i = 1

from (33)

S (k - 1, 0) = 0, k > 1,

S (0, 0) = 1 .

Note that the last equation solves the “bit tricky” case of the Stirling number

S (0, 0)

(p. 258, [15]), by calculating it instead of assuming it “by convention” (this assumption is common to all the treatments of the subject, see for example (p. 73, [10])). We have thus proved

Theorem 5.

Given

g (k, i)

, the cardinality of the oFOS set

S_{i / n}^{'} (k / O_{i})

defined in Definition 5, and the Stirling number of the second kind

S (k - 1, i - 1)

, the following equation holds true:

g (k, i) = S (k - 1, i - 1), i \geq 1, k \geq i .

(50)

Remark 9.

Due to the equivalence established by the previous theorem, Equation (36) of Lemma 2 can be rewritten as

S (k, i) = \sum_{j = i}^{k} S (j - 1, i - 1) i^{k - j},

that can be obtained through the repeated application of the recursive equation of the Stirling numbers of the second kind (see Equation (9.1), [9])

S (k + 1, n) = n S (k, n) + S (k, n - 1) .

(51)

When

k = i

we know it is (see Equation (35))

g (i, i) = 1

and, hence, from the general Formula (49) we get

g (i, i) = \frac{{(- 1)}^{i - 1}}{(i - 1)!} \sum_{j = 0}^{i - 1} {(- 1)}^{j} (\begin{matrix} i - 1 \\ j \end{matrix}) j^{i - 1} = 1

or

\sum_{j = 0}^{i - 1} {(- 1)}^{j} (\begin{matrix} i - 1 \\ j \end{matrix}) j^{i - 1} = {(- 1)}^{i - 1} (i - 1)!

which is the identity (47) mentioned above. This identity is thus closely related to the Stirling numbers and to the general property

g (i, i) = S (i - 1, i - 1) = 1

,

i \geq 1

(see Equation (4) and [13] and references therein). Note that by substituting (31) with

g (k + 1, i + 1)

given by (50), into Equation (27) we obtain for

# S_{i / n} (k)

the same Equation (30).
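The equivalence (50) and the rewritten recursion of Remark 9 can be verified for small indices. The sketch below builds a table of Stirling numbers of the second kind from the recurrence (51) and compares it with g (k, i) computed from (36); the table layout and function names are assumptions made for this illustration.

```python
from functools import lru_cache


def stirling2_table(kmax):
    """S[k][n] for 0 <= n <= k <= kmax, built from the recurrence (51)."""
    S = [[0] * (kmax + 1) for _ in range(kmax + 1)]
    S[0][0] = 1
    for k in range(1, kmax + 1):
        for n in range(1, k + 1):
            S[k][n] = n * S[k - 1][n] + S[k - 1][n - 1]
    return S


@lru_cache(maxsize=None)
def g(k, i):
    """g(k, i) from the recursion (36) with starting values (33)."""
    if i < 1 or k < i:
        return 0
    if i == 1:
        return 1 if k == 1 else 0
    return sum(g(j, i - 1) * (i - 1) ** (k - j - 1) for j in range(i - 1, k))


def remark_9_rhs(k, i, S):
    """Right-hand side of the formula for S(k, i) given in Remark 9."""
    return sum(S[j - 1][i - 1] * i ** (k - j) for j in range(i, k + 1))
```

With `S = stirling2_table(12)`, the entry `S[5][3]` is 25, the same value as g(6, 4) in Example 4.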

We know Stirling numbers of the second kind have a combinatorial meaning directly related to the problem of partitioning a set into a fixed number of subsets,

S (n, m)

, counting the number of ways a set of n elements can be partitioned into m nonempty disjoint subsets (Ch. 9, p. 113, [9], Ch. 1.9, p. 73, [10], and see also [15] for a complete combinatorial treatment of the Stirling numbers). The analysis performed in this section highlights an alternative combinatorial interpretation of these numbers, together with the close connection of oFOS with set partitions, that will be shown to be strictly related to the sequence of primes in the next section.

4. First Occurrence Sequences, Set Partitions, and the Sequence of Primes

In this section, we will delve into FOS sequences as a model of the distribution of prime numbers. The simple oFOS sequences can already act as a first-level model of the prime number distribution, as we will see in the next subsection, while the complete model based on FOS will be developed at the end of this section. In order to develop the tools to build this model, we continue in the following to deepen the combinatorial implications of Theorem 5, exploring some interesting identities involving Stirling numbers of the second kind and harmonic numbers, derived through the probabilities of FOS.

4.1. Ordered FOS and the Prime Number Theorem

The equivalence (50) stated by Theorem 5 and the combinatorial meaning associated with S and g numbers suggest the following analogy with the distribution of primes. Given any integer set of the type

I_{k + 1} = {1, 2, 3, 4, \dots, k, k + 1}, k \geq 2, (k + 1) prime,

there exists a sequence of

(n + 1)

successive primes

O_{n + 1} = {p_{1} = 2, p_{2} = 3, \dots, p_{n}, p_{n + 1} = k + 1} \subset I_{k + 1}

such that the prime-counting function

π (k) = n

. The set

O_{n + 1}

can be viewed as an ordered choice of

n + 1

integers and the set

I_{k + 1}

as an oFOS over

O_{n + 1}

(remember Definition 5). Note that the integers i between two consecutive primes,

p_{j} < i < p_{j + 1}

, are multiples of the primes from

p_{1}

to

p_{j}

, hence they can be considered as repetitions of these symbols, thus confirming the oFOS schema. In the light of this analogy, some classical results of the theory of random partitions of finite sets can be reinterpreted as a general form of the Prime Number Theorem, which holds for all oFOS-like sequences.

The oFOS model of the distribution of prime numbers can be defined as follows. Let us consider the number of primes less than or equal to k, represented by the prime-counting function

π (k)

, as a random variable

π_{o} (k)

with probability mass function defined by

P_{k} [π_{o} (k) = n] = \frac{g (k + 1, n + 1)}{\sum_{j = 0}^{k} g (k + 1, j + 1)}, n = 1, 2, \dots, k,

(52)

and, hence, from Theorem 5

P_{k} [π_{o} (k) = n] = \frac{S (k, n)}{\sum_{j = 0}^{k} S (k, j)}, n = 1, 2, \dots, k .

(53)

Equation (53) defines a uniform probability distribution on the class

Π_{k}

of partitions of a set of k elements, obtained by assigning the same probability

B_{k}^{- 1}

to each partition

σ \in Π_{k}

,

P_{k} (σ) = \frac{1}{B_{k}},

where the total number of partitions

B_{k}

is equal to the k-th Bell number defined as (9.4, p. 133, [9])

B_{k} = \sum_{j = 1}^{k} S (k, j) .

(54)

The combinatorics of

Π_{k}

is studied in [16,17] (see also Ch. IX, p. 692 [11], for a brief summary of these results) in connection with the asymptotic (

k \to \infty

) distribution of the probability measure

P_{k}

(Harper points out that the methods used are based on the combinatorial implications of probability theory, and ascribes the first applications in this field to W. Feller (see Ch. X, p. 256 [18]) and V. Goncharov (see [16] and references therein)). From this approach, it follows that the random variable

π_{o} (k)

with uniform probability (53), defined as a model of

π (k)

, has mathematical expectation and variance asymptotically equal to (see Ch. 4, pp. 114–115 [19])

E [π_{o} (k)] = \frac{k}{r} (1 + o (1))

Var [π_{o} (k)] = \frac{k}{r^{2}} (1 + o (1)),

where

r = W (k)

, the unique positive solution of

r e^{r} = k

,

W (x)

being the Lambert function [20] defined in the domain

x \geq - 1 / e

. Since for the function

W (k)

, we know it is

r \sim ln k

as

k \to \infty

, the above equations say the average and variance of the random variable

π_{o} (k)

are asymptotic to

E [π_{o} (k)] \sim \frac{k}{ln k}

(55)

Var [π_{o} (k)] \sim \frac{k}{{ln}^{2} k} .

(56)

Equation (55), and the related variance (56), represents the form assumed by PNT when considering oFOS-like sequences with uniform probability distribution of set partitions. This is not the only admissible distribution. About this problem and the general conditions in order that a uniform distribution is obtained, through random allocation algorithms (or random urn models), see [21] and the references therein. The following result (see Theorem 1.1, p. 115, [19]) completes the application of the methods derived from the theory of random partitions of sets, to the oFOS model of the prime number distribution. The probability distribution of the normalized random variable

η_{k} = \frac{π_{o} (k) - E [π_{o} (k)]}{{(Var [π_{o} (k)])}^{1 / 2}}

converges to that of the standard normal distribution as

k \to \infty

, that is:

lim_{k \to \infty} P {η_{k} < x} = \frac{1}{\sqrt{2 π}} \int_{- \infty}^{x} e^{- u^{2} / 2} d u .
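To get a feeling for the oFOS model, the distribution (53) can be tabulated exactly for moderate k and its mean compared with k / W (k). The sketch below is illustrative only: the function names are assumptions, and the Lambert function is approximated here by a plain Newton iteration rather than taken from a library. Since the asymptotic formulas hold up to a factor 1 + o (1), only rough agreement should be expected at small k.

```python
import math
from fractions import Fraction


def stirling2_row(k):
    """[S(k, 0), ..., S(k, k)] built from the recurrence (51)."""
    row = [1]
    for m in range(1, k + 1):
        new = [0] * (m + 1)
        for n in range(1, m + 1):
            same_size = row[n] if n < m else 0  # S(m - 1, m) = 0
            new[n] = n * same_size + row[n - 1]
        row = new
    return row


def pmf(k):
    """Probability mass function (53) of pi_o(k), as exact fractions."""
    row = stirling2_row(k)
    bell = sum(row)  # Bell number B_k, Equation (54)
    return {n: Fraction(row[n], bell) for n in range(1, k + 1)}


def mean_and_variance(k):
    """Exact mean and variance of pi_o(k) under (53)."""
    p = pmf(k)
    mean = sum(n * q for n, q in p.items())
    second = sum(n * n * q for n, q in p.items())
    return mean, second - mean ** 2


def lambert_w(x, iterations=50):
    """Positive solution r of r * exp(r) = x for x > 0 (Newton's method)."""
    w = math.log(1 + x)
    for _ in range(iterations):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (w + 1))
    return w
```

A typical use is to evaluate `float(mean_and_variance(k)[0])` next to `k / lambert_w(k)` and `k / math.log(k)` for increasing k, to see how slowly the three quantities approach one another.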

4.2. FOS Probabilities and Some Combinatorial Identities Involving Stirling Numbers

Let us collect, in this subsection, some formulas about the probability function

ϕ_{i / n}^{'} (k)

, which will be referred to in the rest of this paper. Under Assumption 1, from Equation (31) of Theorem 2 the probability of FOS is given by

ϕ_{i / n}^{'} (k) = \frac{n!}{(n - i)! n^{k}} g (k, i), 1 \leq i \leq n, k \geq i .

(57)

By substituting

g (k, i)

with Equations (37) and (39) of Theorem 3, we get the following expressions for

3 \leq i \leq n

,

k \geq i

ϕ_{i / n}^{'} (k) = \frac{n!}{(n - i)! n^{k}} \sum_{j = 1}^{i - 2} a_{i, j} [{(i - 1)}^{k - i + 1} - j^{k - i + 1}]

(58)

ϕ_{i / n}^{'} (k) = \frac{n!}{(n - i)! n^{k}} [{(i - 1)}^{k - i} a_{i + 1, i - 1} - \sum_{j = 1}^{i - 2} a_{i, j} j^{k - i + 1}] .

(59)

Finally, from (43) of Theorem 4 the explicit expression for

ϕ_{i / n}^{'} (k)

follows

ϕ_{i / n}^{'} (k) = \frac{n!}{(n - i)! n^{k}} \sum_{j = 1}^{i - 1} {(- 1)}^{i - j - 1} \frac{j^{k - 2}}{(i - j - 1)! (j - 1)!} .

(60)

If we set

l = i - j - 1

, the above equation can be written

ϕ_{i / n}^{'} (k) = \frac{n!}{(n - i)! n^{k}} \sum_{l = 0}^{i - 2} {(- 1)}^{l} \frac{{(i - l - 1)}^{k - 2}}{(i - l - 2)! l!} .

(61)

When

i = n

(

n \geq 2

,

k \geq n

), Equation (60) becomes (it is easy to check that this equation, and (64) below, also apply to the case

n = 2

)

ϕ_{n / n}^{'} (k) = \frac{n!}{n^{k}} \sum_{j = 1}^{n - 1} {(- 1)}^{n - j - 1} \frac{j^{k - 2}}{(n - j - 1)! (j - 1)!},

(62)

which, observing that

\frac{n!}{n^{k}} \frac{j^{k - 2}}{(n - j - 1)! (j - 1)!} = (\begin{matrix} n - 1 \\ j \end{matrix}) {(\frac{j}{n})}^{k - 1},

can be written as

ϕ_{n / n}^{'} (k) = \sum_{j = 1}^{n - 1} {(- 1)}^{n - j - 1} (\begin{matrix} n - 1 \\ j \end{matrix}) {(\frac{j}{n})}^{k - 1} .

(63)

Equation (61) when

i = n

(

n \geq 2

,

k \geq n

), becomes

ϕ_{n / n}^{'} (k) = \frac{n!}{n^{k}} \sum_{l = 0}^{n - 2} {(- 1)}^{l} \frac{{(n - l - 1)}^{k - 2}}{(n - l - 2)! l!},

(64)

which, observing that

\frac{n!}{n^{k}} \frac{{(n - l - 1)}^{k - 2}}{(n - l - 2)! l!} = (\begin{matrix} n - 1 \\ l \end{matrix}) {(1 - \frac{l + 1}{n})}^{k - 1}

can be written as

ϕ_{n / n}^{'} (k) = \sum_{l = 0}^{n - 2} {(- 1)}^{l} (\begin{matrix} n - 1 \\ l \end{matrix}) {(1 - \frac{l + 1}{n})}^{k - 1} .

(65)

Let us now derive the sequence probabilities expressed through the Stirling numbers of the second kind. Remembering the equivalence (50), Equation (57) can be written as

ϕ_{i / n}^{'} (k) = \frac{n!}{(n - i)! n^{k}} S (k - 1, i - 1), 1 \leq i \leq n, k \geq i .

(66)

From this equation and (27), always under the hypothesis of equally probable symbols, hence (19), (20), it is easy to get

ϕ_{i / n} (k) = \frac{n!}{(n - i)! n^{k}} S (k, i), 1 \leq i \leq n, k \geq i .

(67)

The above equations, combined with the general properties of the probability functions (11)–(18), give rise to a series of combinatorial identities involving the Stirling numbers of the second kind. The following corollary reports the less trivial ones.

Corollary 5.

The following identities hold true.

From (14) it follows

\sum_{i = 1}^{n} \frac{S (k, i)}{(n - i)!} = \frac{n^{k}}{n!}, k \geq n .

(68)

From (15) it follows

\sum_{k = i}^{\infty} \frac{S (k - 1, i - 1)}{n^{k}} = \frac{(n - i)!}{n!}, 1 \leq i \leq n .

(69)

From (16) it follows

\frac{S (k, n)}{n^{k}} = \sum_{j = n}^{k} \frac{S (j - 1, n - 1)}{n^{j}} .

(70)

From (17) it follows

\sum_{i = 1}^{n - 1} \frac{S (k, i)}{(n - i)!} + n^{k} \sum_{j = n}^{k} \frac{S (j - 1, n - 1)}{n^{j}} = \frac{n^{k}}{n!} .

(71)

From (18) it follows

\sum_{k = n + 1}^{\infty} \frac{S (k - 1, n - 1)}{n^{k}} = \frac{1}{n^{n}} \sum_{i = 1}^{n - 1} \frac{S (n, i)}{(n - i)!} .

(72)

Note that identity (68) is the same as (28) and identities (68), (71) imply (70).
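Some of these identities lend themselves to a direct numerical check. The sketch below (helper names are assumptions) verifies (68) and (70) exactly for small n and k, and evaluates partial sums of the series in (69), which converges geometrically because S (k - 1, i - 1) grows like (i - 1)^{k} while the denominator grows like n^{k}.

```python
from fractions import Fraction
from math import factorial


def stirling2_table(kmax):
    """S[k][n] for 0 <= n <= k <= kmax, built from the recurrence (51)."""
    S = [[0] * (kmax + 1) for _ in range(kmax + 1)]
    S[0][0] = 1
    for k in range(1, kmax + 1):
        for n in range(1, k + 1):
            S[k][n] = n * S[k - 1][n] + S[k - 1][n - 1]
    return S


S = stirling2_table(80)


def holds_68(n, k):
    """Exact check of identity (68) for k >= n."""
    lhs = sum(Fraction(S[k][i], factorial(n - i)) for i in range(1, n + 1))
    return lhs == Fraction(n ** k, factorial(n))


def holds_70(n, k):
    """Exact check of identity (70) for k >= n."""
    rhs = sum(Fraction(S[j - 1][n - 1], n ** j) for j in range(n, k + 1))
    return Fraction(S[k][n], n ** k) == rhs


def partial_sum_69(n, i, terms):
    """First `terms` terms of the series on the left-hand side of (69)."""
    return sum(Fraction(S[k - 1][i - 1], n ** k) for k in range(i, i + terms))
```

For n = 5 and i = 3, the partial sums `partial_sum_69(5, 3, terms)` approach (5 - 3)!/5! = 1/60 as `terms` grows.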

I report in the following two more results about the sum of probabilities

ϕ_{n / n}^{'} (k)

, that clarify the probabilistic meaning of identity (46) and the close connection between these probabilities and the Stirling numbers of the second kind.

Corollary 6.

The following equations hold true

\sum_{s = n}^{k} ϕ_{n / n}^{'} (s) = 1 - \sum_{j = 0}^{n - 2} {(- 1)}^{j} (\begin{matrix} n \\ j + 1 \end{matrix}) {(1 - \frac{j + 1}{n})}^{k},

(73)

\sum_{s = n}^{k} ϕ_{n / n}^{'} (s) = \frac{n!}{n^{k}} S (k, n) .

(74)

Proof of Equation (73).

From (65)

\sum_{s = n}^{k} ϕ_{n / n}^{'} (s) = \sum_{j = 0}^{n - 2} {(- 1)}^{j} (\begin{matrix} n - 1 \\ n - j - 1 \end{matrix}) (\frac{n}{n - j - 1}) \sum_{s = n}^{k} {(\frac{n - j - 1}{n})}^{s} .

By applying Lemma 3

(\frac{n}{n - j - 1}) \sum_{s = n}^{k} {(\frac{n - j - 1}{n})}^{s} = \frac{{(1 - \frac{j + 1}{n})}^{n - 1} - {(1 - \frac{j + 1}{n})}^{k}}{(\frac{j + 1}{n})}

hence

\sum_{s = n}^{k} ϕ_{n / n}^{'} (s) = \sum_{j = 0}^{n - 2} {(- 1)}^{j} (\begin{matrix} n \\ j + 1 \end{matrix}) [{(1 - \frac{j + 1}{n})}^{n - 1} - {(1 - \frac{j + 1}{n})}^{k}] .

(75)

Since we know that

\sum_{s = n}^{\infty} ϕ_{n / n}^{'} (s) = 1

and

{(1 - \frac{j + 1}{n})}^{k} \to 0 as k \to \infty

it is

\sum_{j = 0}^{n - 2} {(- 1)}^{j} (\begin{matrix} n \\ j + 1 \end{matrix}) {(1 - \frac{j + 1}{n})}^{n - 1} = 1 .

(76)

Finally from (75) and (76), Equation (73) follows. □

Proof of Equation (74).

Remembering (66) we can write

\sum_{j = n}^{k} ϕ_{n / n}^{'} (j) = n! \sum_{j = n}^{k} \frac{S (j - 1, n - 1)}{n^{j}}

and, hence, through identity (70), we simply get (74). □

Remark 10.

Note that Equation (76) above is the same as (45), (46). Indeed, after some manipulations, it can be written as

\frac{{(- 1)}^{n - 1}}{n^{n - 1}} \sum_{j = 0}^{n - 1} {(- 1)}^{j} (\begin{matrix} n \\ j \end{matrix}) j^{n - 1} = 1 .

This identity thus has a probabilistic meaning connected with the normalization property of FOS probabilities.

Remark 11.

We know (Equation (26.8.42), [22]) the Stirling number of the second kind, for fixed value of n, as

k \to \infty

is asymptotic to

S (k, n) \sim \frac{n^{k}}{n!} .

This asymptotic behavior follows very simply from Equation (74) due to the connection established between FOS probabilities and

S (k, n)

.
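The two expressions (73) and (74) for the cumulative FOS probability can be compared exactly, and the same quantity shows the asymptotic behavior of Remark 11. The following is a minimal sketch with illustrative function names.

```python
from fractions import Fraction
from math import comb, factorial


def stirling2_table(kmax):
    """S[k][n] for 0 <= n <= k <= kmax, built from the recurrence (51)."""
    S = [[0] * (kmax + 1) for _ in range(kmax + 1)]
    S[0][0] = 1
    for k in range(1, kmax + 1):
        for n in range(1, k + 1):
            S[k][n] = n * S[k - 1][n] + S[k - 1][n - 1]
    return S


S = stirling2_table(60)


def cumulative_73(n, k):
    """Right-hand side of (73)."""
    tail = sum((-1) ** j * comb(n, j + 1) * Fraction(n - j - 1, n) ** k
               for j in range(n - 1))
    return 1 - tail


def cumulative_74(n, k):
    """Right-hand side of (74): n! S(k, n) / n^k."""
    return Fraction(factorial(n) * S[k][n], n ** k)
```

Evaluating `float(cumulative_74(3, k))` for increasing k shows the ratio n! S (k, n) / n^{k} approaching 1 from below, in line with Remark 11.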

4.3. Bounds of g (k, i) Numbers and Stirling Numbers of the Second Kind

We speak of

g (k, i)

numbers since the results are directly related to their formulation through the coefficients

a_{i, j}

, as stated by Theorem 3. The results are obviously applicable to the Stirling numbers of the second kind.

Definition 6.

Let us define for

i \geq 3

,

1 \leq j \leq (i - 2)

σ_{i, j} = \sum_{s = 1}^{j} {(- 1)}^{j - s} | a_{i, s} | = | a_{i, j} | - | a_{i, j - 1} | + \dots \pm | a_{i, 1} |

and

τ_{i, j} (k) = \sum_{s = 1}^{j} {(- 1)}^{j - s} | a_{i, s} | s^{k - i + 1} = | a_{i, j} | j^{k - i + 1} - | a_{i, j - 1} | {(j - 1)}^{k - i + 1} + \dots \pm | a_{i, 1} |,

where the sign of

| a_{i, 1} |

is negative if j is an even number, positive if it is an odd number.

Lemma 4.

The following statements hold true:

lim_{i \to \infty} \frac{a_{i + 1, i - 1}}{a_{i, i - 2}} = e;

(77)

lim_{i \to \infty} \frac{σ_{i, i - 3}}{a_{i, i - 2}} = 1;

(78)

a_{i, i - 2} > σ_{i, i - 3}, i \geq 5;

(79)

σ_{i, i - 3} > 0, i \geq 5;

(80)

σ_{i, i - 2} < a_{i, i - 2};

(81)

| a_{i, i - 3} | > σ_{i, i - 4}, i \geq 5 .

(82)

Proof.

Equation (77) follows simply from the values of the coefficients given by (44).

Equation (39) of Theorem 3 gives

\frac{a_{i + 1, i - 1}}{(i - 1) a_{i, i - 2}} = 1 - \frac{σ_{i, i - 3}}{a_{i, i - 2}}

from that and (77), Equation (78) follows.

Considering Definition 6, from (39) we can write

a_{i, i - 2} - σ_{i, i - 3} = \frac{a_{i + 1, i - 1}}{(i - 1)}

with

a_{i + 1, i - 1} > 0

, hence (79).

From Equation (44) it is easy to prove that

0 < \frac{a_{i + 1, i - 1}}{(i - 1)} < a_{i, i - 2} for i \geq 4

hence the previous equation also implies (80).

Since

σ_{i, i - 2} = a_{i, i - 2} - σ_{i, i - 3} and

σ_{i, i - 3} = | a_{i, i - 3} | - σ_{i, i - 4}

(80) implies both (81) and (82). □

Lemma 5.

Given

σ_{i, j}

and

τ_{i, j} (k)

of Definition 6, it is

τ_{i, j} (k) > σ_{i, j} .

Proof.

Suppose j even, then we can write

| a_{i, s} | s^{k - i + 1} - | a_{i, s - 1} | {(s - 1)}^{k - i + 1} > | a_{i, s} | - | a_{i, s - 1} |, s = 2, 4, \dots, j .

By adding the pairs of the previous inequality, we obtain the thesis.

In the case when j is odd, we can repeat the above procedure considering the pairs for

s = 3, 5, 7, \dots, j

, then adding up these pairs and the last term

| a_{i, 1} |

taken with a positive sign. □

Theorem 6.

Given the numbers

g (k, i)

,

i \geq 3, k \geq i

, the following bounds hold

a_{i + 1, i - 1} {(i - 1)}^{k - i} - a_{i, i - 2} {(i - 2)}^{k - i + 1} < g (k, i) < a_{i + 1, i - 1} {(i - 1)}^{k - i}

(83)

and, for i fixed

lim_{k \to \infty} \frac{g (k, i)}{a_{i + 1, i - 1} {(i - 1)}^{k - i}} = 1

(84)

Proof.

From Definition 6 we can write in general

σ_{i, j} = | a_{i, j} | - σ_{i, j - 1}

τ_{i, j} (k) = | a_{i, j} | j^{k - i + 1} - τ_{i, j - 1} (k),

and, remembering (48),

g (k, i) = a_{i + 1, i - 1} {(i - 1)}^{k - i} - τ_{i, i - 2} (k)

g (k, i) = a_{i + 1, i - 1} {(i - 1)}^{k - i} - a_{i, i - 2} {(i - 2)}^{k - i + 1} + τ_{i, i - 3} (k) .

The thesis (83) follows from Lemma 5 and (80), (81) of Lemma 4.

The inequalities (83) can be rewritten as

1 - (i - 2) \frac{a_{i, i - 2}}{a_{i + 1, i - 1}} \frac{{(i - 1)}^{i}}{{(i - 2)}^{i}} \frac{{(i - 2)}^{k}}{{(i - 1)}^{k}} < \frac{g (k, i)}{a_{i + 1, i - 1} {(i - 1)}^{k - i}} < 1,

from which Equation (84) follows. □

Remark 12.

The closed-form (44) of coefficients allows us to rewrite (83) and (84) as

\frac{{(i - 1)}^{k - 1}}{(i - 1)!} - \frac{{(i - 2)}^{k - 1}}{(i - 2)!} < g (k, i) < \frac{{(i - 1)}^{k - 1}}{(i - 1)!},

(85)

g (k, i) \sim \frac{{(i - 1)}^{k - 1}}{(i - 1)!} .

(86)

When speaking of Stirling numbers of the second kind, the previous bounds become

\frac{i^{k}}{i!} - \frac{{(i - 1)}^{k}}{(i - 1)!} < S (k, i) < \frac{i^{k}}{i!} .

(87)

Note that (87) gives an alternative proof of the asymptotic behavior of

S (k, i)

for fixed i (see Remark 11).
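The bounds (87) can be inspected numerically with exact arithmetic. In the sketch below (names are illustrative), the check is run for i ≥ 3; for i = 2 the lower bound is attained with equality, since S (k, 2) = 2^{k - 1} - 1, so that case is treated separately.

```python
from fractions import Fraction
from math import factorial


def stirling2_table(kmax):
    """S[k][n] for 0 <= n <= k <= kmax, built from the recurrence (51)."""
    S = [[0] * (kmax + 1) for _ in range(kmax + 1)]
    S[0][0] = 1
    for k in range(1, kmax + 1):
        for n in range(1, k + 1):
            S[k][n] = n * S[k - 1][n] + S[k - 1][n - 1]
    return S


def bounds_87(k, i):
    """Lower and upper bounds of S(k, i) as written in (87)."""
    upper = Fraction(i ** k, factorial(i))
    lower = upper - Fraction((i - 1) ** k, factorial(i - 1))
    return lower, upper


def relative_gap(k, i, S):
    """(upper - S(k, i)) / upper, which tends to 0 as k grows for fixed i."""
    _, upper = bounds_87(k, i)
    return float((upper - S[k][i]) / upper)
```

For fixed i, `relative_gap(k, i, S)` shrinks roughly like i ((i - 1)/i)^{k}, which is the rate implied by the lower bound.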

Remark 13.

Assuming the left-hand side of the above inequality is positive both for

S (k, i)

and

S (k - 1, i - 1)

, that is

a - \frac{ln (i - a)}{ln (1 - \frac{1}{i - a})} < k, a = 0, 1

we get the bounds of the ratio

\frac{{(i - 1)}^{k - 1}}{i^{k - 1}} - (i - 1) \frac{{(i - 2)}^{k - 1}}{i^{k - 1}} < \frac{S (k - 1, i - 1)}{S (k, i)} < \frac{{(i - 1)}^{k - 1}}{i^{k - 1} - {(i - 1)}^{k}},

that for

i ≫ 1

becomes

e^{- \frac{k}{i}} (1 - i e^{- \frac{k}{i}}) < \frac{S (k - 1, i - 1)}{S (k, i)} < e^{- \frac{k}{i}} \frac{1}{(1 - i e^{- \frac{k}{i}})},

(88)

under condition

i ln i < k .

4.4. FOS Average Length and Harmonic Numbers

As anticipated in the Introduction, calculating the average length of FOS sequences, that is, of elementary events

ω_{i / n}^{'} (k)

,

k \geq i

of the countable space

Ω_{i / n}^{'}

, is a simple task leading to the result (2) in the case

i = n

. This calculation is based on the following well-known results, in particular on Lemma 7, expressing the average number of trials one has to wait before obtaining the first success in a sequence of independent Bernoulli trials with p probability of success.

Lemma 6.

Given q real,

| q | < 1

, then

\sum_{k = 1}^{\infty} k q^{k} = \frac{q}{{(1 - q)}^{2}} .

Lemma 7.

Given p real,

0 < p \leq 1

, then

\sum_{k = 1}^{\infty} k p {(1 - p)}^{k - 1} = \frac{1}{p} .

Therefore, if we denote, in general, the quantity we are looking for with

L_{i / n}^{'}

, we can establish the following recursive relation

L_{i / n}^{'} = L_{i - 1 / n}^{'} + \frac{n}{n - i + 1}, i = 1, 2, \dots, n

(89)

starting with

L_{0 / n}^{'} = 0 .

It is easy to see the above equation leads to

L_{i / n}^{'} = n (H_{n} - H_{n - i}), i = 1, 2, \dots, n,

(90)

where

H_{n}

is the n-th harmonic number

H_{n} = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n}

and

H_{0} = 0 .

If we consider the set of discrete random variables

q_{i / n}

with probability mass function

ϕ_{i / n}^{'} (k) = Prob {q_{i / n} = k}, 1 \leq i \leq n, k \geq i,

then the quantity

L_{i / n}^{'}

can be also obtained as the mathematical expectation of the corresponding random variable

q_{i / n}

, hence through

L_{i / n}^{'} = E [q_{i / n}] = \sum_{k = i}^{\infty} k ϕ_{i / n}^{'} (k) .

(91)

From (91) and (90), remembering (66), we get the following relation between harmonic numbers and Stirling numbers of the second kind.

Corollary 7.

H_{n} - H_{n - i} = \frac{(n - 1)!}{(n - i)!} \sum_{k = i}^{\infty} \frac{k}{n^{k}} S (k - 1, i - 1), 1 \leq i \leq n .

(92)
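Identity (92) can be probed numerically by truncating the series. The sketch below assumes the standard recurrence S(k, i) = i S(k-1, i) + S(k-1, i-1) with S(0, 0) = 1; the truncation length `terms` and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial


def stirling2_table(kmax, imax):
    """S[k][i] for 0 <= k <= kmax and 0 <= i <= imax."""
    S = [[0] * (imax + 1) for _ in range(kmax + 1)]
    S[0][0] = 1
    for k in range(1, kmax + 1):
        for i in range(1, min(k, imax) + 1):
            S[k][i] = i * S[k - 1][i] + S[k - 1][i - 1]
    return S


def corollary7_rhs(i, n, terms=200):
    """Right-hand side of (92), with the infinite series cut after `terms` terms."""
    S = stirling2_table(i + terms, i)
    prefactor = Fraction(factorial(n - 1), factorial(n - i))
    partial = sum(Fraction(k * S[k - 1][i - 1], n ** k) for k in range(i, i + terms))
    return float(prefactor * partial)


def corollary7_lhs(i, n):
    """H_n - H_{n-i} written as the sum of 1/j for j = n-i+1, ..., n."""
    return sum(1 / j for j in range(n - i + 1, n + 1))


n = 5
for i in range(1, n + 1):
    print(i, corollary7_lhs(i, n), corollary7_rhs(i, n))
```

The terms of the series decay roughly like k ((i - 1) / n)^k, so a few hundred terms are ample for small n; larger n with i close to n would need a longer truncation.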

Other combinatorial identities involving harmonic numbers and finite sums can be derived from the direct calculation of (91), as established by the following proposition. (These identities appear to be new, as does the method of deriving them. Of course, it is not simple to establish the novelty of an identity involving harmonic numbers, given the long history of the subject and the extensive development of its theory and applications. For an extensive collection of these identities, see [23,24,25] and the references therein.)

Theorem 7.

Assuming n and i positive integers,

n \geq 3

and

3 \leq i \leq n

, the following identity holds true

H_{n} - H_{n - i} =

= \frac{n!}{(n - i)! n^{i}} [\frac{a_{i + 1, i - 1}}{{(n - i + 1)}^{2}} [n i - {(i - 1)}^{2}] - \sum_{j = 1}^{i - 2} a_{i, j} (\frac{j}{n - j}) (\frac{n i + j - i j}{n - j})]

(93)

where

a_{i, j}

are the coefficients defined by Theorem 3.

Proof.

By substituting in (91) the term

ϕ_{i / n}^{'} (k)

as given by (58), we get

L_{i / n}^{'} = \sum_{k = i}^{\infty} \frac{n!}{(n - i)! n^{k}} k \sum_{j = 1}^{i - 2} a_{i, j} [{(i - 1)}^{k - i + 1} - j^{k - i + 1}]

from which it follows

L_{i / n}^{'} = \frac{n!}{(n - i)! n^{i - 1}} \sum_{j = 1}^{i - 2} a_{i, j} [\sum_{k = i}^{\infty} k (\frac{i - 1}{n})^{k - i + 1} - \sum_{k = i}^{\infty} k (\frac{j}{n})^{k - i + 1}] .

(94)

We can write

\sum_{k = i}^{\infty} k (\frac{i - 1}{n})^{k - i + 1} = \sum_{k = 1}^{\infty} k (\frac{i - 1}{n})^{k} + \frac{{(i - 1)}^{2}}{n} \sum_{k = 0}^{\infty} (\frac{i - 1}{n})^{k}

and remembering Lemma 6 and the sum of geometric series

\sum_{k = i}^{\infty} k (\frac{i - 1}{n})^{k - i + 1} = \frac{(i - 1)}{{(n - i + 1)}^{2}} [n i - {(i - 1)}^{2}] .

In a similar way, the second term becomes

\sum_{k = i}^{\infty} k (\frac{j}{n})^{k - i + 1} = (\frac{j}{n - j}) (\frac{j + n i - i j}{n - j}) .

Inserting the previous formulas into (94), we get

L_{i / n}^{'} = \frac{n!}{(n - i)! n^{i - 1}} \sum_{j = 1}^{i - 2} a_{i, j} [\frac{(i - 1)}{{(n - i + 1)}^{2}} [n i - {(i - 1)}^{2}] - (\frac{j}{n - j}) (\frac{j + n i - i j}{n - j})]

from which, together with Equations (39) and (90), identity (93) follows. □

The following corollary is a simple application of the formulas of Theorem 4 to that of the previous theorem.

Corollary 8.

For n and i positive integers,

n \geq 3

and

3 \leq i \leq n

, the following identity holds true

H_{n} - H_{n - i} =

= \frac{n!}{(n - i)! n^{i}} [\frac{{(i - 1)}^{i - 2}}{(i - 2)!} \frac{[n i - {(i - 1)}^{2}]}{{(n - i + 1)}^{2}} - \sum_{j = 1}^{i - 2} {(- 1)}^{i - j - 2} \frac{j^{i - 2}}{(i - j - 1)! (j - 1)!} \frac{(n i + j - i j)}{{(n - j)}^{2}}] .

When

i = n

we get for the n-th harmonic number

H_{n} = \frac{n!}{n^{n}} [\frac{{(n - 1)}^{n - 2}}{(n - 2)!} (2 n - 1) - \sum_{j = 1}^{n - 2} {(- 1)}^{n - j - 2} \frac{j^{n - 2}}{(n - j - 1)! (j - 1)!} \frac{(n^{2} + j - n j)}{{(n - j)}^{2}}] .
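Because Corollary 8 involves only finite sums, it can be verified exactly for small n and i. The following sketch uses rational arithmetic; `corollary8_rhs` is a hypothetical helper name, and the check covers only the small range shown.

```python
from fractions import Fraction
from math import factorial


def harmonic(n):
    """Exact n-th harmonic number, with H_0 = 0."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


def corollary8_rhs(i, n):
    """Right-hand side of Corollary 8 for 3 <= i <= n."""
    prefactor = Fraction(factorial(n), factorial(n - i) * n ** i)
    first = Fraction((i - 1) ** (i - 2), factorial(i - 2)) * Fraction(
        n * i - (i - 1) ** 2, (n - i + 1) ** 2
    )
    second = sum(
        Fraction((-1) ** (i - j - 2) * j ** (i - 2), factorial(i - j - 1) * factorial(j - 1))
        * Fraction(n * i + j - i * j, (n - j) ** 2)
        for j in range(1, i - 1)
    )
    return prefactor * (first - second)


for n in range(3, 7):
    for i in range(3, n + 1):
        assert corollary8_rhs(i, n) == harmonic(n) - harmonic(n - i)
print("H_5 =", corollary8_rhs(5, 5))
```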

4.5. FOS as a Model of the Distribution of Primes

This subsection aims to analyze analogies and limits of FOS with respect to the true sequence of prime numbers. There is a correspondence between prime numbers and FOS as stated by the following remark.

Remark 14.

Given any prime

p_{n}

,

n \geq 2

, there exists an event FOS

S_{n / n}^{'} (p_{n}) \in Ω_{n / n}^{'} = ⋃_{k = n}^{\infty} S_{n / n}^{'} (k)

(see Remark 7), and the whole sequence of prime numbers corresponds to the element

(S_{n / n}^{'} (p_{n}), n = 2, 3, \dots) \in U

of the product set

U = \prod_{n = 2}^{\infty} Ω_{n / n}^{'},

which we can define as the FOS universe.

The model based on FOS then takes the set

Ω_{n / n}^{'}

as the combinatorial counterpart of the n-th prime

p_{n}

, considered as a discrete random variable assuming integer values between n and ∞. The statistics of this random variable are then used to define a “counting variable”, which is the FOS model equivalent of the prime-counting function. In order to avoid any confusion when dealing with our model, we keep the symbol

p_{n}

to denote the “true” value of the n-th prime, and adopt the symbol

q_{n}

for the random variable modeling it. The basic assumption of the model is the following.

Definition 7.

The discrete random variable

q_{n}

has probability mass function

ϕ_{n / n}^{'} (k) = P r o b {q_{n} = k}, k = n, n + 1, n + 2, \dots,

with

ϕ_{n / n}^{'} (k)

, defined in Equation (20).

The results of the model are referred to the sequence of random variables

{q_{n}}

and compared with the analytical or numerical results known for the true prime sequence

{p_{n}}

. These results, some of which are reported in two sample tables, justify a posteriori the model’s adoption and the not completely rigorous approach we follow in deriving it.

From (66) and (90) respectively we get the probability

ϕ_{n / n}^{'} (k) = \frac{n!}{n^{k}} S (k - 1, n - 1), k \geq n

and the mean value of

q_{n}

E [q_{n}] = L_{n / n}^{'} = n H_{n} \sim n ln n .

(95)

Hence, given

k \geq n

, k integer, the probability distribution function of

q_{n}

P r o b {q_{n} \leq k} = Φ_{n} (k)

(96)

with

Φ_{n} (k) = \sum_{j = n}^{k} ϕ_{n / n}^{'} (j) = n! \sum_{j = n}^{k} \frac{S (j - 1, n - 1)}{n^{j}},

(97)

is simply expressed through the number of ways a set of k elements can be partitioned into n disjoint nonempty subsets

Φ_{n} (k) = \frac{(n - 1)!}{n^{k - 1}} S (k, n),

(98)

where the last equation follows from (74).
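The equality between the partial sum (97) and the closed form (98) can be confirmed exactly for small arguments. This is a sketch under the usual convention S(0, 0) = 1; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def stirling2_table(kmax):
    """Square table S[k][n] of Stirling numbers of the second kind, 0 <= k, n <= kmax."""
    S = [[0] * (kmax + 1) for _ in range(kmax + 1)]
    S[0][0] = 1
    for k in range(1, kmax + 1):
        for n in range(1, k + 1):
            S[k][n] = n * S[k - 1][n] + S[k - 1][n - 1]
    return S


def phi_cdf_sum(n, k, S):
    """Equation (97): n! * sum_{j=n}^{k} S(j-1, n-1) / n^j."""
    return factorial(n) * sum(Fraction(S[j - 1][n - 1], n ** j) for j in range(n, k + 1))


def phi_cdf_closed(n, k, S):
    """Equation (98): (n-1)! S(k, n) / n^(k-1)."""
    return Fraction(factorial(n - 1) * S[k][n], n ** (k - 1))


S = stirling2_table(12)
for k in range(1, 13):
    for n in range(1, k + 1):
        assert phi_cdf_sum(n, k, S) == phi_cdf_closed(n, k, S)
print("Phi_2(3) =", phi_cdf_closed(2, 3, S))
```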

Let us define the discrete random variable

ξ_{k}

whose probability distribution is induced by

q_{n}

through the following

Definition 8.

P r o b {ξ_{k} \geq n} = P r o b {q_{n} \leq k} = Φ_{n} (k)

(99)

and obviously

P r o b {ξ_{k} < n} = 1 - Φ_{n} (k) .

(100)

In our combinatorial model,

ξ_{k}

is associated with the value

π (k)

of the prime-counting function, considered as a random variable assuming integer values between 1 and k.

Theorem 8.

The discrete random variable

ξ_{k}

with probability distribution (99) assumes values between 1 and k, has probability mass function

P r o b {ξ_{k} = n} = Φ_{n} (k) - Φ_{n + 1} (k), n = 1, 2, \dots, k,

(101)

and mathematical expectation

E [ξ_{k}] = \sum_{n = 1}^{k} Φ_{n} (k) .

(102)

Proof.

From (98), due to the properties of

S (k, n)

numbers, we get for

n = 1

P r o b {ξ_{k} \geq 1} = Φ_{1} (k) = 1

P r o b {ξ_{k} < 1} = 1 - Φ_{1} (k) = 0;

for

n = k

P r o b {ξ_{k} \geq k} = Φ_{k} (k) = \frac{(k - 1)!}{k^{k - 1}}

P r o b {ξ_{k} < k} = 1 - Φ_{k} (k) = 1 - \frac{(k - 1)!}{k^{k - 1}};

for

n > k

P r o b {ξ_{k} \geq n} = Φ_{n} (k) = 0

P r o b {ξ_{k} < n} = 1 - Φ_{n} (k) = 1 .

This proves the first statement.

Given n and m positive integers with

n < m

, from (100) we get

P r o b {n \leq ξ_{k} < m} = Φ_{n} (k) - Φ_{m} (k) .

(103)

From this relation, assuming

m = n + 1

the probability mass function (101) follows.

The mathematical expectation of

ξ_{k}

is given by

E [ξ_{k}] = \sum_{n = 1}^{k} n (Φ_{n} (k) - Φ_{n + 1} (k)) = \sum_{n = 1}^{k} Φ_{n} (k) - k Φ_{k + 1} (k),

that is the same as Equation (102) since

Φ_{k + 1} (k) = 0

. □
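The two statements of Theorem 8 can be illustrated on a small case. The sketch below builds the probability mass function (101) from the closed form (98) and checks that it sums to 1 and that its mean agrees with (102); the helper names `stirling2`, `Phi` and `pmf` are assumptions for illustration.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def stirling2(k, n):
    """Stirling number of the second kind S(k, n), with S(0, 0) = 1."""
    if k == n:
        return 1
    if n == 0 or n > k:
        return 0
    return n * stirling2(k - 1, n) + stirling2(k - 1, n - 1)


def Phi(n, k):
    """Equation (98); it vanishes for n > k because S(k, n) = 0 there."""
    return Fraction(factorial(n - 1) * stirling2(k, n), n ** (k - 1))


def pmf(k):
    """Equation (101): Prob{xi_k = n} for n = 1, ..., k."""
    return {n: Phi(n, k) - Phi(n + 1, k) for n in range(1, k + 1)}


k = 20
masses = pmf(k)
assert sum(masses.values()) == 1
mean_from_pmf = sum(n * p for n, p in masses.items())
mean_from_cdf = sum(Phi(n, k) for n in range(1, k + 1))  # Equation (102)
assert mean_from_pmf == mean_from_cdf
print("E[xi_20] =", float(mean_from_cdf))
```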

Remark 15.

The mean value of the discrete random variable

ξ_{k}

is equal to

E [ξ_{k}] = \sum_{n = 1}^{k} P r o b {q_{n} \leq k} = \sum_{n = 1}^{k} \sum_{j = n}^{k} ϕ_{n / n}^{'} (j) .

The previous theorem suggests taking

E [ξ_{k}]

as an estimate

\hat{π} (k)

derived from the combinatorial model of the prime-counting function

π (k)

, that is, remembering the equality (98)

\hat{π} (k) = E [ξ_{k}] = \sum_{n = 1}^{k} \frac{(n - 1)!}{n^{k - 1}} S (k, n) .

(104)

An alternative expression for

\hat{π} (k)

can be derived from Remark 15 and Equation (73)

\hat{π} (k) = k - \sum_{j = 0}^{k - 2} {(- 1)}^{j} \sum_{n = j + 2}^{k} (\begin{matrix} n \\ j + 1 \end{matrix}) {(1 - \frac{j + 1}{n})}^{k} .

To check the quality of the function

\hat{π} (k)

as a numerical estimate of

π (k)

, Equation (104) looks more suitable since it can be written in a recursive form. Indeed, if we put

\hat{π} (k) = \sum_{n = 1}^{k} w (k, n)

with

w (k, n) = \frac{(n - 1)!}{n^{k - 1}} S (k, n),

from the recursive equation of the Stirling numbers of the second kind (51), we get the following recurrence for the terms

w (k, n)

w (k + 1, n) = w (k, n) + (\frac{n - 1}{n})^{k} w (k, n - 1), k \geq 1, 1 \leq n \leq (k + 1)

initialized as

w (k, 1) = 1, k \geq 1

w (1, n) = 0, n > 1 .
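The recurrence for w(k, n) translates directly into a row-by-row computation of the estimate. The following is a minimal floating-point sketch; the function name `pi_hat` and the comparison with k / ln k are illustrative choices, and the printed values are not taken from Table 3.

```python
import math


def pi_hat(kmax):
    """Return [pi_hat(1), ..., pi_hat(kmax)] using the recurrence for w(k, n)."""
    w = [0.0, 1.0]  # w[n] = w(1, n); index 0 is padding standing for w(k, 0) = 0
    estimates = [1.0]
    for k in range(1, kmax):
        new = [0.0] * (k + 2)
        for n in range(1, k + 2):
            same_column = w[n] if n <= k else 0.0  # w(k, n) = 0 for n > k
            new[n] = same_column + ((n - 1) / n) ** k * w[n - 1]
        w = new
        estimates.append(sum(w[1:]))
    return estimates


values = pi_hat(1000)
for k in (10, 100, 1000):
    print(k, values[k - 1], k / math.log(k))
```

Each new row costs O(k) operations, so the whole table up to k costs O(k^2); this is the practical advantage of (104) over the alternating double sum.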

Table 3 reports a series of values of the recursive estimate

\hat{π} (k)

, compared with the prime-counting function

π (k)

and the logarithmic integral function

l i (k)

. The last column of the table reports the values obtained through the continuous approximation formulas presented below.

In order to analyze the asymptotic behavior of

\hat{π} (k)

, we look for a continuous approximation of the probability function

Φ_{n} (k)

. After promoting k and n to be continuous variables, which is a justifiable assumption for

k ≫ 1

and considering the relation (98) between

Φ

and S, Equation (97) is written as

Φ_{n} (k) = \int_{n}^{k} {(\frac{n - 1}{n})}^{u - 1} Φ_{n - 1} (u - 1) d u .

Taking the derivative of the above equation, we get the differential equation for

Φ

\frac{\partial Φ_{n} (k)}{\partial k} = {(\frac{n - 1}{n})}^{k - 1} Φ_{n - 1} (k - 1)

(105)

that can be written as

\frac{1}{Φ_{n} (k)} \frac{\partial Φ_{n} (k)}{\partial k} = \frac{S (k - 1, n - 1)}{S (k, n)} .

We set

\frac{S (k - 1, n - 1)}{S (k, n)} = \frac{λ_{n} (k)}{n} e^{- \frac{k}{n}}

(106)

where

λ_{n} (k)

is a parameter function depending on n and k. Then the general solution of the differential equation is

Φ_{n} (k) = C e^{\frac{1}{n} \int λ_{n} (k) e^{- \frac{k}{n}} d k}

(107)

and the value of the constant C is determined by the asymptotic condition

Φ_{n} (k \to \infty) = 1

.

Remark 16.

It is interesting to note that the method, based on probabilistic arguments, that led to Equation (107) provides an alternative approach to the problem of exploring the asymptotics (

k \to \infty

) of Stirling numbers of the second kind. Since this topic is outside the scope of this work, I cite only the following example. If we choose

λ_{n} (k) = (n + 1 - \frac{k}{n})

then it is

Φ_{n} (k) = e^{(\frac{k}{n} - n) e^{- \frac{k}{n}}}

and from (98), we get the well known asymptotic approximation of

S (k, n)

(see [26] and references therein)

S (k, n) \approx \frac{n^{k}}{n!} e^{(\frac{k}{n} - n) e^{- \frac{k}{n}}},

which is valid in the region

n < \frac{k}{ln k}

, coinciding asymptotically with the region of validity of (88),

n ln n < k

.

As a continuous approximation of the probability function

Φ_{n} (k)

we are looking for, let us choose a parameter function depending on n only through the following simple relation

λ_{n} (k) = λ_{n} = a n, a > 0 constant,

where the constant a varies between 1 and e as suggested by the boundary conditions of the Stirling number ratio. Indeed for

n = k

the ratio (106) is

\frac{S (k - 1, k - 1)}{S (k, k)} = 1,

that implies

λ_{n} = e n

, while for

k ≫ n

, we know it is

\frac{S (k - 1, n - 1)}{S (k, n)} \sim e^{- \frac{k}{n}},

hence

λ_{n} = n

.

Therefore (107) becomes

Φ_{n} (k) = e^{- a n e^{- \frac{k}{n}}}

(108)

and from (102) and (104), we get the continuous approximation of

\hat{π} (k)

\hat{π} (k) = \int_{1}^{k} Φ_{n} (k) d n = \int_{1}^{k} e^{- a n e^{- \frac{k}{n}}} d n .

(109)
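The integral (109) has no elementary closed form, but it is easy to evaluate numerically. The sketch below uses a composite Simpson rule written by hand; the step count and the function names are assumptions, and the output is meant only as a rough comparison with k / ln k.

```python
import math


def phi_cont(n, k, a):
    """Continuous approximation (108): exp(-a n exp(-k / n))."""
    return math.exp(-a * n * math.exp(-k / n))


def pi_hat_cont(k, a, steps=20000):
    """Composite Simpson approximation of the integral (109) over [1, k]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (k - 1) / steps
    total = phi_cont(1, k, a) + phi_cont(k, k, a)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * phi_cont(1 + j * h, k, a)
    return total * h / 3


for k in (10**3, 10**4, 10**5):
    for a in (1.0, (1 + math.e) / 2, math.e):
        print(k, round(a, 4), pi_hat_cont(k, a), k / math.log(k))
```

Since the integrand decreases with a at every n, larger values of a give smaller estimates; the choices a = 1 and a = e therefore bracket the family considered in the text.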

The last column of Table 3 reports some examples from this approximation formula with

a = (1 + e) / 2

. Table 4 reports the values calculated through (109) with

a = e

compared with the values of

π (k)

up to

k = 10^{27}

(data with

π (k)

values are taken from N. J. A. Sloane, ‘The On-Line Encyclopedia of Integer Sequences. Sequence A006880’, http://oeis.org (accessed on 31 March 2021); Wikipedia contributors, ‘Prime-counting function’, Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Prime-counting_function&oldid=987382574 (accessed on 31 March 2021)).

The following proposition, which is the FOS equivalent of the Prime Number Theorem, can be stated for the function

\hat{π} (k)

defined by (109).

Theorem 9.

Given

ϵ > 0

, however small, there exists

k_{0}

such that

1 - ϵ < \frac{\hat{π} (k)}{\frac{k}{ln k}} < 1 + ϵ

for all

k > k_{0}

.

Proof.

Φ_{n} (k)

,

n > 0

,

k > 0

, is a decreasing function with n. Indeed,

\frac{\partial Φ_{n} (k)}{\partial n} = - a Φ_{n} (k) e^{- \frac{k}{n}} (1 + \frac{k}{n}) < 0 .

Given

n_{0}

and

n_{1}

such that

1 < n_{0} < n_{1} < k

, we can write

\hat{π} (k) = \int_{1}^{n_{0}} Φ_{n} (k) d n + \int_{n_{0}}^{n_{1}} Φ_{n} (k) d n + \int_{n_{1}}^{k} Φ_{n} (k) d n

and, owing to the monotonic decrease of

Φ_{n} (k)

with n,

Φ_{n_{0}} (k) (n_{0} - 1) + Φ_{n_{1}} (k) (n_{1} - n_{0}) + Φ_{k} (k) (k - n_{1}) \leq

\leq \hat{π} (k) \leq

Φ_{1} (k) (n_{0} - 1) + Φ_{n_{0}} (k) (n_{1} - n_{0}) + Φ_{n_{1}} (k) (k - n_{1}) .

Given

ϵ

,

0 < ϵ < 1

, let

\frac{1}{γ} = 1 + \frac{ϵ}{2}

, hence

\frac{2}{3} < γ < 1

, and

n_{0} = n_{0} (k) = \frac{k}{ln k}

n_{1} = n_{1} (k) = \frac{k}{γ ln k}

then the previous inequalities become (note that

n_{1} (k) < k

for

\frac{1}{γ} < ln k

)

(1 - \frac{ln k}{k}) e^{- \frac{a}{ln k}} + (\frac{1}{γ} - 1) e^{- a (\frac{k^{1 - γ}}{γ ln k})} + (ln k - \frac{1}{γ}) e^{- \frac{a}{e} k} \leq

\leq \frac{\hat{π} (k)}{\frac{k}{ln k}} \leq

\leq (1 + \frac{ln k}{k}) e^{- a e^{- k}} + (\frac{1}{γ} - 1) e^{- \frac{a}{ln k}} + (ln k - \frac{1}{γ}) e^{- a (\frac{k^{1 - γ}}{γ ln k})} .

The limit for

k \to \infty

of the left-hand and the right-hand side of the previous inequality, say

L (k)

and

R (k)

respectively, gives

lim_{k \to \infty} L (k) = 1,

that is

1 - ϵ < L (k) < 1 + ϵ, for k > k_{L},

and

lim_{k \to \infty} R (k) = \frac{1}{γ},

that is

\frac{1}{γ} - \frac{ϵ}{2} < R (k) < \frac{1}{γ} + \frac{ϵ}{2}, for k > k_{R} .

Let

k_{0} = M a x {k_{L}, k_{R}}

, then for

k > k_{0}

we get the thesis. □

What does the model say about the Riemann Hypothesis [27]? We know that the most famous millennium problem (E. Bombieri, “Problems of the Millennium: The Riemann Hypothesis”, Clay Math. Institute (2000), online at http://www.claymath.org/sites/default/files/official_problem_description.pdf, accessed on 31 March 2021) is equivalent to establishing the following bound on the prime-counting function

π (x)

[28]

| π (x) - l i (x) | < M \sqrt{x} ln x

with

M > 0

constant, for all sufficiently large x (

l i (x)

is the logarithmic integral function defined for

x > 1

as

l i (x) = \int_{0}^{x} \frac{d t}{ln t}

, where the integral is to be understood as a Cauchy principal value). We can prove that this bound, which I will refer to as the RH rule throughout this work, occurs w.p. 0 or, more precisely, that the RH rule does not hold w.p. 1 for the random variable

ξ_{k}

, defined above as the stochastic counterpart of

π (k)

in our model.

Theorem 10.

Let

ξ_{k}

be the random variable with probability distribution (99); then

P r o b {| ξ_{k} - l i (k) | < M \sqrt{k} ln k} \to 0 as k \to \infty .

Proof.

Let

n (k) = l i (k) - M \sqrt{k} ln k

and

m (k) = l i (k) + M \sqrt{k} ln k .

Since

l i (k) = \frac{k}{ln k} + O (\frac{k}{{ln}^{2} k})

it is easy to see that

Φ_{n (k)} (k)

and

Φ_{m (k)} (k)

have the same asymptotic behavior:

Φ_{n (k)} (k) \sim Φ_{m (k)} (k) \sim e^{- e \frac{1}{ln k}} \to 1 as k \to \infty .

From (103) then it follows

lim_{k \to \infty} P r o b {n (k) \leq ξ_{k} < m (k)} = lim_{k \to \infty} (Φ_{n (k)} (k) - Φ_{m (k)} (k)) = 0 .

□

Theorems 9 and 10 say that the statement we know as the PNT for primes holds for the counting variable

ξ_{k}

associated with the set of sequences generated through the random sampling with replacement game I proposed in the Introduction, whereas the Riemann Hypothesis imposes much stricter requirements, which the same counting variable satisfies with zero probability.

5. Model Generalization and the Connection with the Riemann Hypothesis

This section deals with the generalization of the FOS model so that we can discriminate between two classes of models, compliant and not compliant with the RH rule. This allows us to use the different stochastic properties, depending on different models, of the random variables

q_{n}

and

ξ_{k}

to deduce some properties of the whole sequence of primes. In particular, we apply the model to infer the counting of successive prime pairs and then compare the result with actual sequences of prime pairs.

5.1. Model Generalization

In this section, we will consider each of the random variables

q_{n}

of Definition 7 as a continuous random variable, assuming values in

[n, \infty)

according to the following probability distribution function

P r o b {q_{n} \leq x} = \{\begin{matrix} 0 & , & x < n \\ Φ_{n} (x) & , & x \geq n \end{matrix}

(110)

with

Φ_{n} (x)

given by the solution of the differential equation

\frac{1}{Φ_{n} (x)} \frac{\partial Φ_{n} (x)}{\partial x} = \frac{λ_{n}}{f (n)} e^{- \frac{x}{f (n)}}

obtained by considering a more general expression for the ratio (106), where n is substituted by

f (n)

, a positive function of n.

Notation.

The probability density function of

q_{n}

, that is the function

ϕ_{n / n}^{'} (x)

of the FOS model, will be denoted as follows:

ϕ_{n} (x) = ϕ_{n / n}^{'} (x), x \geq n .

Therefore, the probability functions of

q_{n}

become

Φ_{n} (x) = C e^{- λ_{n} e^{- \frac{x}{f (n)}}} + \mathrm{const} .

ϕ_{n} (x) = \frac{\partial Φ_{n} (x)}{\partial x} = C \frac{λ_{n}}{f (n)} e^{- \frac{x}{f (n)}} e^{- λ_{n} e^{- \frac{x}{f (n)}}} .

The constant C is determined by the normalization condition

\int_{n}^{\infty} ϕ_{n} (x) d x = C {[Φ_{n} (x)]}_{n}^{\infty} = 1

hence

C = \frac{1}{(1 - e^{- λ_{n} e^{- n / f (n)}})} .

The complete expressions of the probability density and distribution function are

ϕ_{n} (x) = \frac{1}{(1 - e^{- λ_{n} e^{- n / f (n)}})} \frac{λ_{n}}{f (n)} e^{- \frac{x}{f (n)}} e^{- λ_{n} e^{- \frac{x}{f (n)}}}

(111)

Φ_{n} (x) = \frac{e^{- λ_{n} e^{- \frac{x}{f (n)}}} - e^{- λ_{n} e^{- n / f (n)}}}{1 - e^{- λ_{n} e^{- n / f (n)}}} .

(112)

Note that

ϕ_{n} (x)

has a single maximum point at

x = x_{M} = f (n) ln (λ_{n}) .

(113)

The mathematical expectation of

q_{n}

(this calculation is reported in [29] for the case

f (n) = n

) is

E [q_{n}] = \int_{n}^{\infty} x ϕ_{n} (x) d x = n + \frac{f (n)}{1 - e^{- λ_{n} e^{- n / f (n)}}} [ln (λ_{n}) + E_{1} (λ_{n} e^{- \frac{n}{f (n)}}) + γ - \frac{n}{f (n)}]

(114)

where

γ = 0.57721 \dots

is the Euler–Mascheroni constant and

E_{1} (x)

is the exponential integral function with positive real argument

x > 0

, sometimes referred to as upper (or complementary) incomplete gamma function

Γ (0, x)

, defined by (see functions 5.1.1, 6.5.3, [30])

E_{1} (x) = Γ (0, x) = \int_{x}^{\infty} \frac{e^{- t}}{t} d t .

We require the mathematical model to satisfy a similar condition to (95) about the mean of

q_{n}

, that leads to the following formula

E [q_{n}] = f (n) [ln (λ_{n}) + γ] \sim n ln n

(115)

obtained from (114), assuming

n ≫ 1

sufficiently large, so we can neglect the terms

e^{- λ_{n} e^{- n / f (n)}}

and

E_{1} (λ_{n} e^{- \frac{n}{f (n)}})

. Note that the function

E_{1} (x)

decreases rapidly as

x \to \infty

, in particular [31], (1.4)

\frac{e^{- x}}{2} ln (1 + \frac{2}{x}) \leq E_{1} (x) \leq e^{- x} ln (1 + \frac{1}{x}) .
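The two-sided bound on E_1(x) can be checked numerically for moderate arguments. The sketch below evaluates E_1 through its standard power series, E_1(x) = -γ - ln x - Σ_{k≥1} (-x)^k / (k · k!), which is an assumption of this example rather than a formula from the paper, and is numerically reliable only for moderate x (roughly x up to 5 with the defaults used here).

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp_integral_e1(x, terms=60):
    """E_1(x) for moderate x > 0, from the convergent power series."""
    total = 0.0
    term = 1.0
    for k in range(1, terms + 1):
        term *= -x / k  # term now equals (-x)^k / k!
        total += term / k
    return -EULER_GAMMA - math.log(x) - total


def e1_bounds(x):
    """Lower and upper bounds quoted in the text from [31], (1.4)."""
    lower = 0.5 * math.exp(-x) * math.log(1 + 2 / x)
    upper = math.exp(-x) * math.log(1 + 1 / x)
    return lower, upper


for x in (0.1, 0.5, 1.0, 2.0, 5.0):
    lower, upper = e1_bounds(x)
    value = exp_integral_e1(x)
    assert lower <= value <= upper
    print(x, lower, value, upper)
```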

Remark 17.

The FOS combinatorial model studied in Section 4.5 corresponds to the function

f (n) = n

. In this case (115) is satisfied for

λ_{n} = a n, a > 0, constant

(in particular when

a = e^{- γ}

it is

E [q_{n}] = n ln n

).

We know from Theorem 10 that the RH rule does not hold w.p. 1 for this class of models, which I will refer to as n-Model.

Another class of models of particular interest to our study, as we will see in the following, is the one determined by the choice

f (n) = ln n

. In this case, condition (115) can be satisfied by different functions, with opposite implications for the compliance of the model with the RH rule. We focus, in particular, on a solution ensuring that the RH rule holds w.p. 1.

Remark 18.

The model with

f (n) = ln n

will be referred to as log(n)-Model.

The simplest log(n)-Model is the one defined by the choice

E [q_{n}] = n ln n

(116)

hence by the parameter value

λ_{n} = a e^{n}, a = e^{- γ} .

(117)

Through the methods of the previous section, it is possible to prove the same statements as Theorems 9 and 10, that is, this log(n)-Model is compliant with PNT and not with the RH rule.

Let us consider instead the log(n)-Model defined by the assumption

E [q_{n}] = a l i (n)

(118)

where

a l i : R \to (1, + \infty)

is the inverse function of the logarithmic integral function

l i (x)

(I use for this function the same notation as in [32]). In this case, we get

λ_{n} = a e^{\frac{a l i (n)}{ln n}}, a = e^{- γ},

(119)

and condition (115) is satisfied since we know it is (Theorem 3.3, [32])

\frac{a l i (n)}{n ln n} = 1 + O (\frac{ln ln n}{ln n}) .

(120)

Equations (111) and (112), under the approximation for

n ≫ 1

, become

ϕ_{n} (x) = \frac{a}{ln n} e^{\frac{- x + r_{n}}{ln n}} e^{- a e^{\frac{- x + r_{n}}{ln n}}}

(121)

Φ_{n} (x) = e^{- a e^{\frac{- x + r_{n}}{ln n}}},

(122)

with

r_{n} = a l i (n)

.

The double exponential distribution function (122) is often called Gumbel distribution [33,34].
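A quick way to see what the choice a = e^{-γ} does is to sample from (122) by inverting the distribution function and to look at the sample mean and variance. The sketch below is only illustrative: it ignores the cut at x = n, it takes r_n as a free parameter (here the stand-in n ln n rather than ali(n), which would require inverting the logarithmic integral), and the sample size and seed are arbitrary.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def sample_q(n, r_n, rng):
    """Inverse-CDF sample from Phi_n(x) = exp(-a exp(-(x - r_n) / ln n)), a = e^{-gamma}."""
    a = math.exp(-EULER_GAMMA)
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return r_n - math.log(n) * math.log(-math.log(u) / a)


def sample_moments(n, r_n, size=200000, seed=0):
    """Sample mean and variance of q_n under the untruncated Gumbel law."""
    rng = random.Random(seed)
    draws = [sample_q(n, r_n, rng) for _ in range(size)]
    mean = sum(draws) / size
    variance = sum((x - mean) ** 2 for x in draws) / (size - 1)
    return mean, variance


n = 1000
r_n = n * math.log(n)  # stand-in for ali(n), an assumption of this sketch
mean, variance = sample_moments(n, r_n)
print("sample mean:", mean, "target r_n:", r_n)
print("sample variance:", variance, "pi^2/6 * ln^2 n:", math.pi**2 / 6 * math.log(n) ** 2)
```

With a = e^{-γ} the location shift ln a exactly cancels the Gumbel mean offset γ ln n, which is why the sample mean settles near r_n.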

Remark 19.

In the case of the model with

f (n) = ln n

, the ratio (106) is equal to

\frac{a}{ln n} e^{\frac{- k + r_{n}}{ln n}};

it is

o (e^{- \frac{k}{n}})

as

k \to \infty

and deviates asymptotically from the Stirling number ratio which is

\sim e^{- \frac{k}{n}}

.

Briefly speaking, the log(n)-Model defined above corresponds to a subset of the FOS universe

U

which is different from the one of the n-Model, containing sequences conforming to both PNT and the RH rule, as stated by the following theorems.

The continuous approximation function

\hat{π} (k)

becomes

\hat{π} (k) = \int_{1}^{k} Φ_{n} (k) d n = \int_{1}^{k} e^{- a e^{\frac{- k + r_{n}}{ln n}}} d n .

(123)

For this function too, we can state a proposition that is the equivalent of the Prime Number Theorem.

Theorem 11.

Given

ϵ > 0

however small, there exists

k_{0}

such that for

k > k_{0}

it holds

1 - ϵ < \frac{\hat{π} (k)}{\frac{k}{ln k}} < 1 + ϵ,

where the function

\hat{π} (k)

is defined by (123).

The proof is completely analogous to that of Theorem 9.

Remark 20.

Associated with the log(n)-Model we can define the counting random variable

ξ_{k}

as in Definition 8, after assuming the probability distribution (110). It is easy to prove the analog of Theorem 8 for this variable, in particular Equations (101) and (103). We can therefore apply the probability method developed for the n-Model also in the case of the log(n)-Model.

The RH rule holds w.p. 1 for the log(n)-Model with

E [q_{n}]

defined by (118).

Lemma 8.

Let

f (x) = a l i (x) : R \to (1, + \infty)

be the inverse function of the logarithmic integral function

l i (x)

and

f^{(n)} (x) = \frac{d^{n} f (x)}{d x^{n}}

its n-th derivative. Then we have

f^{(1)} (x) = ln (a l i (x))

and for

n \geq 2

f^{(n)} (x) = \frac{1}{f^{n - 1} (x)} \sum_{k = 1}^{n - 1} a_{k, n} (f^{(1)} (x))^{k}

with

a_{1, n} = 1

a_{k, n} = k a_{k, n - 1} - (n - 2) a_{k - 1, n - 1}, k = 2, 3, \dots, n - 2

a_{n - 1, n} = - (n - 2) a_{n - 2, n - 1} .

Proof.

Note that the function

f (x) = a l i (x)

is real analytic. The first derivative follows directly from the definition, since

l i (a l i (x)) = x

. We proceed by induction. The thesis is true for

n = 2

, indeed

f^{(2)} (x) = \frac{1}{f (x)} a_{1, 2} f^{(1)} (x) = \frac{f^{(1)} (x)}{f (x)} .

Let us assume it is true for (

n - 1

)

f^{(n - 1)} (x) = \frac{1}{f^{n - 2} (x)} \sum_{k = 1}^{n - 2} a_{k, n - 1} (f^{(1)} (x))^{k}

then we have for the n-th derivative

f^{(n)} (x) = \frac{1}{f^{n - 1} (x)} \sum_{k = 1}^{n - 2} a_{k, n - 1} k (f^{(1)} (x))^{k} - \frac{(n - 2)}{f^{n - 1} (x)} \sum_{k = 1}^{n - 2} a_{k, n - 1} (f^{(1)} (x))^{k + 1},

from which, grouping the terms by powers of

f^{(1)} (x)

, the thesis follows. □
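The coefficients of Lemma 8 are easy to tabulate from the recurrence. The sketch below stores each row as a list; the function name `ali_derivative_coeffs` is a hypothetical label. The rows for n = 3 and n = 4 can be confirmed by differentiating f'' = f'/f by hand.

```python
def ali_derivative_coeffs(nmax):
    """Return {n: [a_{1,n}, ..., a_{n-1,n}]} for 2 <= n <= nmax (Lemma 8)."""
    coeffs = {2: [1]}
    for n in range(3, nmax + 1):
        prev = coeffs[n - 1]  # prev[k - 1] = a_{k, n-1} for k = 1, ..., n-2
        row = [1]  # a_{1,n} = 1
        for k in range(2, n - 1):  # k = 2, ..., n-2
            row.append(k * prev[k - 1] - (n - 2) * prev[k - 2])
        row.append(-(n - 2) * prev[n - 3])  # a_{n-1,n} = -(n-2) a_{n-2,n-1}
        coeffs[n] = row
    return coeffs


table = ali_derivative_coeffs(6)
for n in sorted(table):
    print(n, table[n])
```

For example, the row for n = 3 encodes f''' = (f' - f'^2) / f^2, which is the term used at third order in the Taylor expansion of the proof of Theorem 12.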

Theorem 12.

Let

ξ_{k}

be the random counting variable of Definition 8 with probability distribution (110) associated with the log(n)-Model (118), then it is

P r o b {| ξ_{k} - l i (k) | < M \sqrt{k} ln k} \to 1 as k \to \infty, M > 0, constant .

Proof.

Let

n (k) = l i (k) - M \sqrt{k} ln k

and

m (k) = l i (k) + M \sqrt{k} ln k,

then from (103) we know it is

P r o b {n (k) \leq ξ_{k} < m (k)} = (Φ_{n (k)} (k) - Φ_{m (k)} (k))

with probability distribution functions given by (122),

Φ_{n (k)} (k) = e^{- a e^{\frac{- k + a l i (n (k))}{ln n (k)}}},

Φ_{m (k)} (k) = e^{- a e^{\frac{- k + a l i (m (k))}{ln m (k)}}} .

Since we know

a l i (x)

is real analytic, consider its Taylor series

a l i (x_{0} + h)

at

x_{0} = l i (k)

h = - M \sqrt{k} ln k,

from Lemma 8

a l i (n (k)) = k - M \sqrt{k} {ln}^{2} k + \frac{1}{2} M^{2} {ln}^{3} k - \frac{1}{6} \frac{(1 - ln k)}{\sqrt{k}} M^{3} {ln}^{4} k + \dots,

which can be written

a l i (n (k)) = k - M \sqrt{k} {ln}^{2} k + O ({ln}^{3} k) .

(124)

From the asymptotic expansion of

l i (x)

we get

l i (k) = \frac{k}{ln k} (1 + O (\frac{1}{ln k}))

which leads to the following expression for

ln n (k)

ln n (k) = ln [l i (k) - M \sqrt{k} ln k] = ln k - ln ln k + O (\frac{1}{ln k}) .

(125)

From (124) and (125) we finally get

\frac{- k + a l i (n (k))}{ln n (k)} = \frac{- M \sqrt{k} ln k + O ({ln}^{2} k)}{1 - \frac{ln ln k}{ln k} + O (\frac{1}{{ln}^{2} k})}

Φ_{n (k)} (k) \sim e^{- a e^{- M \sqrt{k} ln k}}

and the limits for

k \to \infty

lim_{k \to \infty} \frac{- k + a l i (n (k))}{ln n (k)} = - \infty

lim_{k \to \infty} Φ_{n (k)} (k) = 1 .

Through analogous calculations for

m (k)

we find

\frac{- k + a l i (m (k))}{ln m (k)} = \frac{M \sqrt{k} ln k + O ({ln}^{2} k)}{1 - \frac{ln ln k}{ln k} + O (\frac{1}{{ln}^{2} k})}

Φ_{m (k)} (k) \sim e^{- a e^{M \sqrt{k} ln k}}

and the limits for

k \to \infty

lim_{k \to \infty} \frac{- k + a l i (m (k))}{ln m (k)} = \infty

lim_{k \to \infty} Φ_{m (k)} (k) = 0 .

From (103) then it follows

lim_{k \to \infty} P r o b {n (k) \leq ξ_{k} < m (k)} = lim_{k \to \infty} (Φ_{n (k)} (k) - Φ_{m (k)} (k)) = 1 .

□

We can get an estimate of the speed of convergence to 1 of the probability of the RH rule as follows. If we denote with

U^{'} \subset U

the subset of random sequences of positive integers distributed according to the log(n)-Model with mean

a l i (n)

, under the assumption that the sequence of prime numbers

{p_{n}, n = 1, 2, 3, \dots}

belongs to the set

U^{'}

, we can use the properties of the Gumbel distribution and Chebyshev’s inequality to get the probability of the RH rule, more generally the probability of any difference

| p_{n} - a l i (n) |

, as a function of n. In other words, we can consider

U^{'}

as a stochastic process

{q_{n}, n = 1, 2, 3, \dots}

of differently distributed random variables and the sequence of primes as a particular realization of the process.

Different bounds of the difference

| p_{n} - a l i (n) | \leq δ_{n}

have been established for the sequence of prime numbers, depending on the conditions assumed. From ((1.3), [35]) we know that it holds unconditionally for

δ_{n} = O (n e^{- c \sqrt{ln n}}), c > 0,

while under the Riemann Hypothesis, we get (Theorem 6.2, [32])

δ_{n} = \frac{1}{π} \sqrt{n} {(ln n)}^{5 / 2}, n \geq 11 .

From the properties of the Gumbel distribution we know the variance of

q_{n}

is (note that the probability distribution (112) is not exactly a Gumbel distribution because of the cut at

x = n

of definition (110). Assuming n sufficiently large, so that (121) and (122) hold, we can consider the probability distribution of the model as a Gumbel distribution).

E [{(q_{n} - a l i (n))}^{2}] = \frac{π^{2}}{6} {(ln n)}^{2}

hence, from Chebyshev’s inequality (see Chapter IX, p. 233, [18]), it follows that the probability of the unconditioned bound converges to 1 with a quadratic law

P r o b {| p_{n} - a l i (n) | \leq C n e^{- c \sqrt{ln n}}} \geq 1 - \frac{π^{2}}{6} \frac{{(ln n)}^{2} e^{2 c \sqrt{ln n}}}{C^{2} n^{2}}, C, c > 0 const .,

while in the case the Riemann Hypothesis holds

P r o b {| p_{n} - a l i (n) | \leq \frac{1}{π} \sqrt{n} {(ln n)}^{5 / 2}} \geq 1 - \frac{π^{4}}{6} \frac{1}{n {(ln n)}^{3}} .
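The two Chebyshev lower bounds can be tabulated to see how quickly they approach 1. In the sketch below, the constants `C` and `c` of the unconditional bound are placeholders chosen only for illustration (the text leaves them unspecified), and the bounds are meaningful only where they are positive.

```python
import math


def chebyshev_unconditional(n, C=1.0, c=1.0):
    """Lower bound for Prob{|p_n - ali(n)| <= C n exp(-c sqrt(ln n))}; C, c are placeholders."""
    log_n = math.log(n)
    return 1 - (math.pi**2 / 6) * log_n**2 * math.exp(2 * c * math.sqrt(log_n)) / (C**2 * n**2)


def chebyshev_under_rh(n):
    """Lower bound for Prob{|p_n - ali(n)| <= sqrt(n) (ln n)^(5/2) / pi}, n >= 11."""
    return 1 - math.pi**4 / (6 * n * math.log(n) ** 3)


for n in (11, 10**2, 10**4, 10**6):
    print(n, chebyshev_unconditional(n), chebyshev_under_rh(n))
```

Both expressions come from dividing the Gumbel variance π² (ln n)² / 6 by the square of the corresponding bound δ_n.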

5.2. Counting Successive Prime Pairs through the log(n)-Model

The general problem of prime pairs can be stated as follows: given d and N positive integers, d even, what is the number

H_{N} (d)

of pairs of primes (not necessarily consecutive),

p, p^{'} \leq N

such that

p^{'} = p + d

? Connected with this is the following problem, which we can call the problem of counting successive prime pairs: given positive integers d and N, d even, what is the number

h_{N} (d)

of pairs of primes

p_{i}, p_{i + 1} \leq N

, separated by the gap d, that is such that

p_{i + 1} = p_{i} + d

?

There are no proven results for these two problems, which differ from the so-called “bounded gaps problem”, recently solved by Yitang Zhang [36], who proved that as

i \to \infty

there are infinitely many prime pairs

p_{i + 1} - p_{i} = d

, with

d < 7.0 \times 10^{7}

. Subsequently, this bound was lowered to 246 by Zhang and other contributors through extensions of his method. Nevertheless, the twin prime conjecture, which requires the bound to be reduced to

d = 2

, remains unproven. Important results towards this goal can be found in [37,38], showing in particular that

{lim inf}_{i \to \infty} \frac{d}{ln p_{i}} = 0

, that is, there are always pairs of primes

p_{i + 1} = p_{i} + d

with d less than any fraction of

ln p_{i}

, the average spacing between successive primes.

In 1923, Hardy and Littlewood proposed the following formula as a solution of the general problem, stressing that it was impossible for them “to offer anything approaching a rigorous proof” (see Conjecture B, [5]):

H_{N} (d) \sim 2 c_{2} J (d) \frac{N}{{ln}^{2} N}

(126)

where

c_{2} = \prod_{p > 2} (1 - \frac{1}{{(p - 1)}^{2}})

(127)

J (d) = \prod_{p \mid d} \frac{p - 1}{p - 2} .

(128)

The product is over all odd primes in (127), over odd primes dividing d in (128). The constant

C_{2} = 2 c_{2} = 1.32032363 \dots

is called the twin prime constant, while the term

J (d)

is responsible for the irregularities and the oscillating pattern of the counting functions. A formula similar to (126) for the counting of successive prime pairs separated by a gap d, was suggested by Wolf ([39], and the references therein) after enumerating, by means of a computer program, all gaps between consecutive primes up to

N = 2^{44} \approx 1.76 \times 10^{13}

. Starting from the observed pattern, which showed an exponential decay of the number of pairs depending on d, at the end of a heuristic procedure including approximations due to PNT, he deduced the complete formula (the procedure is reported in M. Wolf (2018), Some Heuristics on the Gaps between Consecutive Primes, available at https://arxiv.org/abs/1102.0481 accessed on 31 March 2021).

h_{N} (d) \sim 2 c_{2} J (d) \frac{π^{2} (N)}{N} e^{- d π (N) / N} .

(129)
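The two arithmetic ingredients of (126) and (129), the constant c_2 of (127) and the factor J(d) of (128), are straightforward to compute. The following sketch truncates the infinite product at a chosen prime limit (an assumption of the example, with an error of roughly 1 / (limit · ln limit)); function names are illustrative.

```python
import math
from fractions import Fraction


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def twin_prime_c2(limit=100000):
    """Partial product (127) over odd primes p <= limit."""
    c2 = 1.0
    for p in primes_up_to(limit):
        if p > 2:
            c2 *= 1 - 1 / (p - 1) ** 2
    return c2


def J(d):
    """Equation (128): product of (p - 1) / (p - 2) over odd primes p dividing d."""
    result = Fraction(1)
    m = d
    while m % 2 == 0:
        m //= 2
    p = 3
    while p * p <= m:
        if m % p == 0:
            result *= Fraction(p - 1, p - 2)
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:  # remaining odd prime factor
        result *= Fraction(m - 1, m - 2)
    return result


print("2 * c_2 ~", 2 * twin_prime_c2())
for d in (2, 4, 6, 10, 30, 210):
    print(d, J(d))
```

Gaps divisible by many small odd primes, such as 6, 30 or 210, receive the largest J(d), which is the source of the oscillating pattern mentioned above.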

The functional dependence on N of the formula above can be transformed into a dependence on the sequence number n of the greatest prime

p_{n}

involved in the count. Indeed, we can set

N = p_{n} \sim n ln n

and

π (N) = n

and, hence, it is easy to get

\frac{π^{2} (N)}{N} e^{- d π (N) / N} \sim F_{n} (d) = \frac{n}{ln n} e^{- \frac{d}{ln n}}

(130)

and the counting formula becomes

h_{n} (d) \sim 2 c_{2} J (d) \frac{n}{ln n} e^{- \frac{d}{ln n}}

(131)
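Formula (131) can be set against actual gap counts obtained from a sieve. The sketch below is a small-scale illustration only: the limit 10^6 is far below the range explored by Wolf, the value of the twin prime constant is the one quoted in the text, and agreement at this scale should be expected to be rough since (131) is an asymptotic statement.

```python
import math
from collections import Counter

C2 = 1.32032363  # twin prime constant 2 * c_2, as quoted in the text


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def J(d):
    """Product of (p - 1) / (p - 2) over odd primes p dividing d, as in (128)."""
    value, m, p = 1.0, d, 3
    while m % 2 == 0:
        m //= 2
    while p * p <= m:
        if m % p == 0:
            value *= (p - 1) / (p - 2)
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:
        value *= (m - 1) / (m - 2)
    return value


def gap_counts(limit):
    """Primes up to `limit` and a Counter of gaps between successive primes."""
    primes = primes_up_to(limit)
    return primes, Counter(q - p for p, q in zip(primes, primes[1:]))


def wolf_estimate(n, d):
    """Right-hand side of (131) as a function of the sequence number n."""
    return C2 * J(d) * n / math.log(n) * math.exp(-d / math.log(n))


primes, counts = gap_counts(10**6)
n = len(primes)
for d in (2, 6, 12, 30, 60):
    print(d, counts[d], round(wolf_estimate(n, d)))
```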

where I changed the capital letter N to lower case n so as to remember the different meaning of the functional dependence.

The original Hardy and Littlewood conjecture was empirically tested by several authors and used to calculate the number of pairs

h_{N} (d)

in the case of small gaps d between successive primes [40]. I show in the following that the conjecture of Wolf, which is based on a large amount of empirical data, can be derived from the combinatorial model we defined as log(n)-Model in Remark 18.

Remark 21.

Note that the conjecture (131) can be written as

h_{n} (d) \propto \int_{1}^{n} \frac{\partial F_{t} (d)}{\partial t} d t

where

F_{n} (d)

is the function defined in (130) and

\frac{\partial F_{n} (d)}{\partial n} = \frac{1}{ln n} e^{- \frac{d}{ln n}} (1 + o (1)) .

(132)

Definition 9.

Given the sequence of primes

{p_{n}, n = 1, 2, 3, \dots}

, let us define the gap

g_{n}

between successive primes and the number of successive prime pairs

h_{n} (d)

separated by a gap d depending on the sequence number n as

g_{n} = p_{n + 1} - p_{n}

h_{n} (d) = \# {p_{i} : g_{i} = d, i \leq n} .

The gap

g_{n}

can be modeled as the difference between the random variables

g_{n} = q_{n + 1} - q_{n}

with probability mass function

σ_{n} (d) = \int_{n}^{\infty} ϕ_{n, n + 1} (x, x + d) d x

(133)

where

ϕ_{n, n + 1} (x, y)

is the joint probability density function of the couple

(q_{n}, q_{n + 1})

. We can obtain the estimate of the function

h_{n} (d)

, as resulting from the combinatorial model adopted, through the equation

h_{n} (d) = \sum_{i = 1}^{n} σ_{i} (d) \approx \int_{1}^{n} σ_{t} (d) d t .

(134)

We make the following assumption.

Assumption 2.

(Large gaps independence law) For

d ≫ 2

sufficiently large, the two events

q_{n} = x

and

q_{n + 1} = x + d

are independent.

Under the previous assumption, the joint probability

ϕ_{n, n + 1} (x, x + d)

can be written as the product of the two probability density functions

ϕ_{n, n + 1} (x, x + d) = ϕ_{n} (x) ϕ_{n + 1} (x + d), d ≫ 1

(135)

and its value depends on the type of model we choose.

Let us calculate the joint probability (135) for the log(n)-Model defined in Remark 18.

We adopt the following approximations

ln (n + 1) \approx ln n

and

r_{n + 1} \approx r_{n} + ln n, n > 1,

valid for both

r_{n} = n ln n

and

r_{n} = ali (n)

(this means the results we are going to obtain depend on PNT only, not on the Riemann Hypothesis).

Then (135) becomes

ϕ_{n} (x) ϕ_{n + 1} (x + d) = \frac{e a^{2}}{{ln}^{2} n} e^{- \frac{d}{ln n}} e^{- 2 (\frac{x - r_{n}}{ln n})} e^{- a e^{- (\frac{x - r_{n}}{ln n})} (1 + e^{1 - \frac{d}{ln n}})},

and (133), after the variable change

\frac{x - r_{n}}{ln n} = t,

is written

σ_{n} (d) = \frac{e a^{2}}{ln n} e^{- \frac{d}{ln n}} \int_{\frac{n - r_{n}}{ln n}}^{\infty} e^{- 2 t} e^{- a e^{- t} (1 + e^{1 - \frac{d}{ln n}})} d t .

Let

α

be

α = a (1 + e^{1 - \frac{d}{ln n}}),

then the integral evaluates to

\int_{\frac{n - r_{n}}{ln n}}^{\infty} e^{- 2 t} e^{- e^{- t} α} d t = \frac{1}{α^{2}} (1 - e^{- α e^{- \frac{n}{ln n} + n}} - α e^{- \frac{n}{ln n} + n} e^{- α e^{- \frac{n}{ln n} + n}}) \approx \frac{1}{α^{2}}, for n ≫ 1 .

Finally, the probability mass function (133) resulting from the log(n)-Model under Assumption 2 and

n ≫ 1

is (note it is independent of parameter a)

σ_{n} (d) \approx \frac{e}{{(1 + e^{1 - \frac{d}{ln n}})}^{2}} \frac{e^{- \frac{d}{ln n}}}{ln n} .

(136)

Comparing the previous equation with (132), we see the log(n)-Model confirms Wolf’s conjecture, in particular the exponential decay of the pair number depending on d, at least for large values of the gap d, when Assumption 2 can be considered valid. Note that in the case of the n-Model, we would get the same expression as in (136), but with the function n in place of

ln n

. The fact that only the RH rule compliant model agrees with the counting function of consecutive prime pairs, derived from empirical data, is a notable achievement of the theory, since it is a strong indication that the counting problem is deeply connected with the Riemann Hypothesis.
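The closed form (136) can be checked numerically. The sketch below, written under the same approximations as the text (Assumption 2 and n ≫ 1), integrates the product density after the substitution t = (x - r_n)/ln n and compares the result with (136). The lower integration limit is replaced by a finite cutoff, which is an assumption justified only for large n; the helper names `simpson`, `sigma_closed` and `sigma_numeric` are illustrative.

```python
import math

A = math.exp(-0.5772156649015329)  # a = e^{-gamma}; (136) does not depend on it


def sigma_closed(n, d):
    """Equation (136): approximate probability of a gap d at sequence number n."""
    log_n = math.log(n)
    return (math.e / (1.0 + math.exp(1.0 - d / log_n)) ** 2
            * math.exp(-d / log_n) / log_n)


def simpson(f, lo, hi, steps=20000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def sigma_numeric(n, d, a=A):
    """Integrate e^{-2t} exp(-alpha e^{-t}) numerically and rebuild sigma_n(d).

    The true lower limit (n - r_n)/ln n is very negative for large n, so it is
    replaced by the cutoff -8, below which the integrand is numerically zero.
    """
    log_n = math.log(n)
    alpha = a * (1.0 + math.exp(1.0 - d / log_n))

    def integrand(t):
        return math.exp(-2.0 * t - alpha * math.exp(-t))

    integral = simpson(integrand, -8.0, 40.0)
    return math.e * a * a / log_n * math.exp(-d / log_n) * integral


if __name__ == "__main__":
    for gap in (10, 30, 100):
        print(gap, sigma_closed(10**6, gap), sigma_numeric(10**6, gap))
```

Changing the value of a in `sigma_numeric` should leave the result unchanged, in line with the remark that (136) is independent of that parameter.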

Figure 1 and Figure 2 show the results obtained from the log(n)-Model with respect to the actual numbers of pairs in two cases:

49 \times 10^{6} < n \leq 50 \times 10^{6}

and

2^{47} \leq p_{n} \leq 2^{48}

.

5.3. A Heuristic Model of the Distribution of Prime Numbers

The generalized model we presented in Section 5.1 can be directly related to the sequence of primes by changing condition (115) into the following

E [q_{n}] = f (n) [ln (λ_{n}) + γ] = p_{n}

(137)

which leads to the value

λ_{n} = a e^{\frac{p_{n}}{f (n)}}, with a = e^{- γ}

while the point of maximum of the probability density function (113) becomes

x_{M} = p_{n} - γ f (n) .

It is interesting to note we would obtain the same expressions, with

a = 1

, if we substitute the condition (137) with the following one, on the point of maximum of the function

ϕ_{n} (x)

x_{M} = f (n) ln (λ_{n}) = p_{n} .

(138)

In this case, the mean value of

q_{n}

changes to

E [q_{n}] = p_{n} + γ f (n) .

This model has the property of expressing the prime counting function

π (k)

through the values of the sequence of primes

p_{n}

from

n = 1

to

n = k

. Indeed, let us consider the case when

f (n) = σ

is constant. In this case, the average of the random counting variable (102) is

E [ξ_{k}] = \hat{π} (k, σ) = \sum_{n = 1}^{k} e^{- a e^{\frac{- k + p_{n}}{σ}}}

(139)

and it approximates the value of

π (k)

to any desired precision, depending on

σ

.

Remark 22.

It holds

\sum_{n = 1}^{π (k)} e^{- a e^{\frac{- k + p_{n}}{σ}}} \leq \hat{π} (k, σ) \leq \sum_{n = 1}^{π (k)} e^{- a e^{\frac{- k + p_{n}}{σ}}} + 1

(140)

for

σ < \frac{2}{ln ln (k - π (k)) + γ}

and

lim_{σ \to 0^{+}} \hat{π} (k, σ) = lim_{σ \to 0^{+}} \sum_{n = 1}^{π (k)} e^{- a e^{\frac{- k + p_{n}}{σ}}} = π (k) .

(141)

Indeed

\hat{π} (k, σ) = \sum_{n = 1}^{l} e^{- a e^{\frac{- k + p_{n}}{σ}}} + \sum_{n = l + 1}^{k} e^{- a e^{\frac{- k + p_{n}}{σ}}}

since

e^{- a e^{\frac{- k + p_{l + 1}}{σ}}} > e^{- a e^{\frac{- k + p_{n}}{σ}}}, n = l + 2, \dots, k

we can write

\sum_{n = 1}^{l} e^{- a e^{\frac{- k + p_{n}}{σ}}} \leq \hat{π} (k, σ) \leq \sum_{n = 1}^{l} e^{- a e^{\frac{- k + p_{n}}{σ}}} + e^{- a e^{\frac{- k + p_{l + 1}}{σ}}} (k - l) .

Assuming

l = π (k)

and

e^{- a e^{\frac{- k + p_{l + 1}}{σ}}} (k - l) \leq 1

, (140) easily follows.

Note that for each term of the sum (139) it is

lim_{σ \to 0^{+}} e^{- a e^{\frac{- k + p_{n}}{σ}}} = \begin{cases} 1, & p_{n} < k \\ 0, & p_{n} > k \end{cases}

hence (141).
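The limit (141) is easy to observe numerically. The following sketch evaluates the sum (139) for a constant f(n) = σ and shows how it approaches π(k) as σ decreases. The sieve-based helper `first_primes` and the overflow guard are implementation choices made for this illustration, not part of the model.

```python
import math

EULER_GAMMA = 0.5772156649015329


def first_primes(count):
    """Return the first `count` primes using a sieve of Eratosthenes."""
    if count < 1:
        return []
    if count < 6:
        limit = 15  # the first six primes are all below 15
    else:
        # p_n < n (ln n + ln ln n) holds for n >= 6
        limit = int(count * (math.log(count) + math.log(math.log(count)))) + 1
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    primes = [i for i in range(limit + 1) if sieve[i]]
    return primes[:count]


def pi_hat(k, sigma, a=math.exp(-EULER_GAMMA)):
    """Equation (139): sum over n = 1..k of exp(-a exp((p_n - k) / sigma))."""
    total = 0.0
    for p in first_primes(k):
        z = (p - k) / sigma
        if z > 700.0:
            # exp(z) would overflow; the term is zero to machine precision
            continue
        total += math.exp(-a * math.exp(z))
    return total


if __name__ == "__main__":
    for sigma in (5.0, 1.0, 0.5, 0.1):
        print(sigma, pi_hat(100, sigma))
```

Note that when k itself is prime the term with p_n = k equals e^{-a} for every σ, a case not covered by the two branches of the limit above, so the example uses values of k that are not prime.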

From Equation (137) we get in the case of the log(n)-Model,

f (n) = ln n

:

ϕ_{n} (x) = \frac{a}{ln n} e^{- \frac{x - p_{n}}{ln n}} e^{- a e^{- \frac{x - p_{n}}{ln n}}},

(142)

Φ_{n} (x) = e^{- a e^{- \frac{x - p_{n}}{ln n}}};

(143)

in the case of the n-Model,

f (n) = n

:

ϕ_{n} (x) = \frac{a}{n} e^{- \frac{x - p_{n}}{n}} e^{- a e^{- \frac{x - p_{n}}{n}}},

(144)

Φ_{n} (x) = e^{- a e^{- \frac{x - p_{n}}{n}}} .

(145)

The probability functions above are valid for

x \geq n

(recall (110)),

n > 1

sufficiently large, and

a = e^{- γ}

.

Definition 10.

Equations (142)–(145) define the heuristic log(n)-Model and the heuristic n-Model respectively.
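A minimal sketch of the two heuristic models follows. It implements the density and distribution function of (142)–(145) with a single `scale` argument (ln n for the heuristic log(n)-Model, n for the heuristic n-Model) and checks by quadrature that the mean of q_n equals p_n, as required by (137). The integration window and the function names are assumptions of this illustration, and the restriction x ≥ n is ignored because the mass below it is negligible in the example.

```python
import math

A = math.exp(-0.5772156649015329)  # a = e^{-gamma}


def phi(x, p_n, scale):
    """Density (142) or (144); scale = ln n or scale = n."""
    z = (x - p_n) / scale
    return A / scale * math.exp(-z - A * math.exp(-z))


def Phi(x, p_n, scale):
    """Distribution function (143) or (145)."""
    return math.exp(-A * math.exp(-(x - p_n) / scale))


def mean_by_quadrature(p_n, scale, half_width=60.0, steps=20000):
    """Trapezoidal estimate of E[q_n], the integral of x * phi(x) around p_n."""
    lo = p_n - half_width * scale
    hi = p_n + half_width * scale
    h = (hi - lo) / steps
    total = 0.5 * (lo * phi(lo, p_n, scale) + hi * phi(hi, p_n, scale))
    for i in range(1, steps):
        x = lo + i * h
        total += x * phi(x, p_n, scale)
    return total * h


if __name__ == "__main__":
    n, p_n = 20, 71  # the 20th prime is 71
    print(mean_by_quadrature(p_n, math.log(n)))  # heuristic log(n)-Model
    print(mean_by_quadrature(p_n, float(n)))     # heuristic n-Model
```

Both calls should return a value very close to p_n; the two models differ only in the dispersion around that mean, which is of order ln n in one case and of order n in the other.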

Examples of the probability density functions

ϕ_{n} (x)

resulting from these heuristic models are given in Figure 3, for

n = 20, 30, 40

.

From Figure 3, we see the dispersion of the random variable

q_{n}

around its mean value

p_{n}

is very small in the case of the heuristic log(n)-Model, so that we can assume the estimate

\hat{π} (k)

is very close to the prime-counting function itself

π (k)

for this model as stated by Remark 22 in the particular case

f (n) = constant

.

Table 5 shows some estimates of

π (k)

, calculated through the heuristic log(n)-Model with

a = e^{- γ}

, for different orders of magnitude.

The following considerations, which we present for the heuristic log(n)-Model because of its relevance, are obviously also valid in the case

f (n) = σ

, constant.

The following theorem gives a quantitative evaluation of the probability of error we incur when considering the random variable

ξ_{k}

(in practice its mean value for small variances) as an estimate of

π (k)

.

Theorem 13.

Given the random variable

ξ_{k}

of Definition 8,

ϵ > 0

, let

ξ_{p_{n} - ⌈ ϵ ⌉} \geq n = π (p_{n})

be the first kind error,

ξ_{p_{n} + ⌈ ϵ ⌉} < n = π (p_{n})

the second kind error of

ξ_{k}

at

k = p_{n}

and

W_{ξ} (n, ϵ) = Prob {ξ_{p_{n} - ⌈ ϵ ⌉} \geq n} + Prob {ξ_{p_{n} + ⌈ ϵ ⌉} < n}

the sum of the two error probabilities. Then for the heuristic log(n)-Model defined by Equations (142) and (143) with

a = e^{- γ}

it is

W_{ξ} (n, ϵ) \leq \frac{π^{2}}{6} \frac{{ln}^{2} n}{ϵ^{2}} .

(146)

Proof.

From the theory of Gumbel distribution we know the variance of

q_{n}

is

E [{(q_{n} - p_{n})}^{2}] = \frac{π^{2}}{6} {ln}^{2} n

hence from Chebyshev’s inequality

Prob {| q_{n} - p_{n} | \geq ϵ} \leq \frac{π^{2}}{6} \frac{{ln}^{2} n}{ϵ^{2}} .

It is

Prob {| q_{n} - p_{n} | \geq ϵ} = Φ_{n} (p_{n} - ϵ) + (1 - Φ_{n} (p_{n} + ϵ))

and

Φ_{n} (p_{n} - ⌈ ϵ ⌉) + (1 - Φ_{n} (p_{n} + ⌈ ϵ ⌉)) \leq Φ_{n} (p_{n} - ϵ) + (1 - Φ_{n} (p_{n} + ϵ)),

and, therefore, we can also write

Φ_{n} (p_{n} - ⌈ ϵ ⌉) + (1 - Φ_{n} (p_{n} + ⌈ ϵ ⌉)) \leq \frac{π^{2}}{6} \frac{{ln}^{2} n}{ϵ^{2}} .

From (99), (100) it follows

Φ_{n} (p_{n} - ⌈ ϵ ⌉) = Prob {ξ_{p_{n} - ⌈ ϵ ⌉} \geq n}

(1 - Φ_{n} (p_{n} + ⌈ ϵ ⌉)) = 1 - Prob {ξ_{p_{n} + ⌈ ϵ ⌉} \geq n} = Prob {ξ_{p_{n} + ⌈ ϵ ⌉} < n}

Prob {ξ_{p_{n} - ⌈ ϵ ⌉} \geq n} + Prob {ξ_{p_{n} + ⌈ ϵ ⌉} < n} \leq \frac{π^{2}}{6} \frac{{ln}^{2} n}{ϵ^{2}} .

□
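To get a feeling for how conservative the bound (146) is, the sketch below computes the two error probabilities directly from the distribution function (143) and compares their sum with the Chebyshev bound. The function names are illustrative, and the example values (n = 1000, whose prime is 7919) are chosen only for demonstration.

```python
import math

A = math.exp(-0.5772156649015329)  # a = e^{-gamma}


def Phi(x, p_n, n):
    """Distribution function (143) of the heuristic log(n)-Model."""
    return math.exp(-A * math.exp(-(x - p_n) / math.log(n)))


def error_probability(n, p_n, eps):
    """W_xi(n, eps): sum of the first and second kind error probabilities."""
    c = math.ceil(eps)
    first_kind = Phi(p_n - c, p_n, n)          # Prob{xi_{p_n - c} >= n}
    second_kind = 1.0 - Phi(p_n + c, p_n, n)   # Prob{xi_{p_n + c} < n}
    return first_kind + second_kind


def chebyshev_bound(n, eps):
    """Right-hand side of (146)."""
    return (math.pi ** 2 / 6.0) * (math.log(n) / eps) ** 2


if __name__ == "__main__":
    n, p_n = 1000, 7919  # the 1000th prime is 7919
    for eps in (10, 20, 50):
        print(eps, error_probability(n, p_n, eps), chebyshev_bound(n, eps))
```

Since the tails of the Gumbel distribution decay exponentially (and doubly exponentially on the left), the exact sum falls well below the quadratic Chebyshev bound as ϵ grows.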

Remark 23.

Through the heuristic model we can derive new lower and upper bounds of the prime-counting function

π (k)

:

\int_{1}^{k} e^{- a e^{\frac{- k + β (n)}{ln n}}} d n \leq π (k) \leq \int_{1}^{k} e^{- a e^{\frac{- k + α (n)}{ln n}}} d n,

(147)

where

α (n) = n (ln n + ln ln n - 1),

(148)

β (n) = n (ln n + ln ln n) .

(149)

We know ([41,42]) the following lower and upper bounds of the n-th prime

p_{n}

hold

α (n) < p_{n} < β (n)

for

n \geq 2

the left-hand side and

n \geq 6

the right-hand side, hence for

Φ_{n} (k)

given by (143) it follows

e^{- a e^{\frac{- k + β (n)}{ln n}}} \leq Φ_{n} (k) \leq e^{- a e^{\frac{- k + α (n)}{ln n}}}, n \geq 6,

(150)

and assuming

k ≫ 6

sufficiently large so that the condition

n \geq 6

may be neglected, we can write (147) for

\hat{π} (k)

given by

\hat{π} (k) = \sum_{n = 1}^{k} Φ_{n} (k) .

From Remark 22 and Theorem 13, we know we can consider the approximation

\hat{π} (k) \approx π (k)

as valid for the heuristic model, from which we get (147).
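The two integrals in (147) can be evaluated numerically, as in the following sketch. Because ln n vanishes and ln ln n is undefined at n = 1, the integration is started at n = 2; this is an assumption of the illustration, not part of (147), and it changes each integral by at most 1. The trapezoidal rule, the overflow guard and the function names are likewise implementation choices.

```python
import math

A = math.exp(-0.5772156649015329)  # a = e^{-gamma}


def alpha(n):
    """Lower bound (148) of the n-th prime."""
    return n * (math.log(n) + math.log(math.log(n)) - 1.0)


def beta(n):
    """Upper bound (149) of the n-th prime."""
    return n * (math.log(n) + math.log(math.log(n)))


def integrand(n, k, bound):
    """exp(-a exp((bound(n) - k) / ln n)), guarded against overflow."""
    z = (bound(n) - k) / math.log(n)
    if z > 700.0:
        return 0.0  # exp(z) would overflow; the value is zero to machine precision
    return math.exp(-A * math.exp(z))


def integral_bound(k, bound, start=2.0, steps=50000):
    """Trapezoidal estimate of the integral in (147), started at n = 2."""
    h = (k - start) / steps
    total = 0.5 * (integrand(start, k, bound) + integrand(k, k, bound))
    for i in range(1, steps):
        total += integrand(start + i * h, k, bound)
    return total * h


if __name__ == "__main__":
    k = 1000  # pi(1000) = 168
    print(integral_bound(k, beta), 168, integral_bound(k, alpha))
```

Each integrand is close to 1 while the bound on the n-th prime is below k and drops to 0 shortly after it exceeds k, so each integral is approximately the value of n at which β(n) or α(n) crosses k.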

6. Concluding Remarks

The first aim of this work was to examine more closely the problem of randomness in the distribution of prime numbers through such simple combinatorial objects as First Occurrence Sequences, showing new analogies between the classical set partition problem and the distribution of primes themselves. First Occurrence Sequences define a general class of objects for which the Prime Number Theorem holds, as it does for the prime sequence, but they fail to represent the more stringent constraints required by the Riemann Hypothesis, such as the equivalent condition established by Helge von Koch, which I called the RH rule. In order to investigate this second step, the simple model must be generalized (or the class of combinatorial objects must be restricted) to discriminate between RH rule compliant and noncompliant models. The analysis based on probability methods shows the Riemann Hypothesis holds w.p. 1 (together with the Prime Number Theorem, of course) for the class of random sequences represented by the log(n)-Model with mean equal to

ali (n)

, the inverse function of the logarithmic integral function. Therefore, we can conclude that the property represented by the Riemann Hypothesis through the RH rule is largely independent of the primes and belongs to a large class of random sequences. Whether this class also contains the sequence of prime numbers cannot be decided by the model except statistically. Something similar happens within the context of analytical theory when speaking of zeta function

ζ (s)

, Dirichlet L-functions

L (s, χ)

(Chap. 12, [43]) (s is the complex variable

s = σ + i t

), and the Generalized Riemann Hypothesis “that is the hypothesis that not only

ζ (s)

but all the functions

L (s, χ)

have their zeros in the critical strip on the line

σ = \frac{1}{2}

” (p. 124, [44]).

Attempts to prove the Riemann hypothesis through the methods of physics go back as far as the so-called Hilbert–Polya conjecture, which associates the imaginary part of non-trivial zeros of

ζ (s)

to the eigenvalues of some self-adjoint (Hermitian) operator that might be considered like the Hamiltonian of a physical system (the story of Hilbert–Polya conjecture is documented on Odlyzko’s personal website; see http://www.dtc.umn.edu/~odlyzko/ accessed on 31 March 2021). Each type of combinatorial model we presented in the previous sections, based on probability equations such as (111) and more generally (25), can be transformed into a single particle Hamiltonian of an equivalent quantum system that emerges as a solution to an underlying combinatorial problem. This topic is outside the scope of this work and will not be treated here; I just want to mention that the analogy with the physical problem allows us to overcome some critical points of the combinatorial model and eliminate any arbitrariness in the choice of the mean of the random variable

q_{n}

(the combinatorial counterpart of the n-th prime). The solution

ali (n)

emerges as a general asymptotic solution for such models due to concepts such as interaction potential and energy levels, that play an important role also in describing integer sequences such as Fibonacci numbers and prime numbers, and the application of the Hellmann–Feynman theorem to the whole system (see [45] for a mathematical foundation of the theorem). These methods result in computational benefits improving the estimates obtained from the combinatorial model and may suggest new conjectures about the distribution of primes. In [29], the quantum model derived from what we have called n-Model is developed and applied to explore the region beyond the Skewes number in connection with the numerical results known in the literature derived through analytical methods.

These two topics (the parallelism between the treatment from the point of view of random sequences and of Riemann zeta and Dirichlet L-functions; the transposition of the problem into a quantum physics framework) seem worthy of being addressed in future work.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

I am grateful to Marek Wolf (Cardinal Stefan Wyszynski University in Warsaw, Department of Physics) for his data about the number of successive prime pairs up to

p_{n} \leq 2^{48}

.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Cramér, H. Prime numbers and probability. Skand. Math. Kongr. 1935, 8, 107–115.
2. Cramér, H. On the order of magnitude of the difference between consecutive prime numbers. Acta Arith. 1936, 2, 23–46.
3. Granville, A. Harald Cramér and the Distribution of Prime Numbers. Scand. Actuar. J. 1995, 1, 12–28.
4. Maier, H. Primes in short intervals. Mich. Math. J. 1985, 32, 221–225.
5. Hardy, G.H.; Littlewood, J.E. Some problems of “Partitio numerorum”, III: On the expression of a number as a sum of primes. Acta Math. 1923, 44, 1–70.
6. Pintz, J. Cramér vs. Cramér. On Cramér’s probabilistic model for primes. Funct. Approx. Comment. Math. 2007, 37, 361–376.
7. Sinai, Y.G. Probability Theory. An Introductory Course; Springer: Berlin/Heidelberg, Germany, 1992.
8. Kingman, J.F.C.; Taylor, S.J. Introduction to Measure and Probability; Cambridge University Press: Cambridge, UK, 1966.
9. Quaintance, J.; Gould, H.W. Combinatorial Identities for Stirling Numbers: The Unpublished Notes of H. W. Gould; World Scientific Publishing: Singapore, 2016.
10. Stanley, R.P. Enumerative Combinatorics, 2nd ed.; Cambridge University Press: Cambridge, UK, 2012; Volume 1.
11. Flajolet, P.; Sedgewick, R. Analytic Combinatorics; Cambridge University Press: Cambridge, UK, 2009.
12. Pitman, J. Some Probabilistic Aspects of Set Partitions. Am. Math. Mon. 1997, 104, 201–209.
13. Boyadzhiev, K.N. Close Encounters with the Stirling Numbers of the Second Kind. Math. Mag. 2012, 85, 252–266.
14. Gould, H.W. Combinatorial Identities: A Standardized Set of Tables Listing 500 Binomial Coefficient Summations, Revised ed.; Morgantown Printing and Binding Company: Morgantown, WV, USA, 1972.
15. Graham, R.L.; Knuth, D.E.; Patashnik, O. Concrete Mathematics; Addison-Wesley: New York, NY, USA, 1994.
16. Harper, L.H. Stirling Behaviour is Asymptotically Normal. Ann. Math. Stat. 1967, 38, 410–414.
17. Sachkov, V.N. Random Partitions of Sets. SIAM Theory Probab. Appl. 1974, 19, 184–190.
18. Feller, W. An Introduction to Probability Theory and Its Applications, 3rd ed.; John Wiley and Sons: New York, NY, USA, 1968; Volume I.
19. Sachkov, V.N. Probabilistic Methods in Combinatorial Analysis; Cambridge University Press: Cambridge, UK, 1997.
20. Corless, R.M.; Gonnet, G.H.; Hare, D.E.G.; Jeffrey, D.J.; Knuth, D.E. On the Lambert W Function. Adv. Comput. Math. 1996, 5, 329.
21. Stam, A.J. Generation of a Random Partition of a Finite Set by a Urn Model. J. Comb. Theory Ser. A 1983, 35, 231–240.
22. NIST. Digital Library of Mathematical Functions; Olver, F.W.J., Olde Daalhuis, A.B., Lozier, D.W., Schneider, B.I., Boisvert, R.F., Clark, C.W., Miller, B.R., Saunders, B.V., Cohl, H.S., McClain, M.A., Eds.; NIST: Gaithersburg, MD, USA, 15 September 2020; Release 1.0.28. Available online: http://dlmf.nist.gov/ (accessed on 31 March 2021).
23. Choi, J.; Srivastava, H.M. Some summation formulas involving harmonic numbers and generalized harmonic numbers. Math. Comput. Model. 2011, 54, 2220–2234.
24. Chu, W.; De Donno, L. Hypergeometric series and harmonic number identities. Adv. Appl. Math. 2005, 34, 123–137.
25. Chu, W.; De Donno, L. Identità Binomiali e Numeri Armonici. Boll. Dell’Unione Mat. Ital. Ser. 8 2007, 10, 213–235.
26. Louchard, G. Asymptotics of the Stirling numbers of the second kind revisited. Appl. Anal. Discret. Math. 2013, 7, 193–218.
27. Conrey, J.B. The Riemann Hypothesis. Not. AMS 2003, 50, 341–353.
28. von Koch, H. Sur la distribution des nombres premiers. Acta Math. 1901, 24, 159–182.
29. Barbarani, V. A Quantum Model of the Distribution of Prime Numbers and the Riemann Hypothesis. Int. J. Theor. Phys. 2020, 59, 2425–2470.
30. Abramowitz, M.; Stegun, I.A. (Eds.) Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables; National Bureau of Standards: Washington, DC, USA, 1964.
31. Alzer, H. On Some Inequalities for the Incomplete Gamma Function. Math. Comput. 1997, 66, 771–778.
32. De Reyna, J.A.; Toulisse, J. The n-th prime asymptotically. J. Theor. Nombres Bordx. 2013, 25, 521–555.
33. Beirlant, J.; Goegebeur, Y.; Segers, J.J.J.; Teugels, J. Statistics of Extremes: Theory and Applications; Wiley: Chichester, UK, 2004.
34. Gumbel, E.J. Les valeurs extrêmes des distributions statistiques. Ann. L’Inst. Henri Poincaré 1935, 5, 115–158.
35. Massias, J.-P.; Robin, G. Bornes effectives pour certaines fonctions concernant les nombres premiers. J. Theor. Nombres Bordx. 1996, 8, 215–242.
36. Zhang, Y. Bounded gaps between primes. Ann. Math. 2014, 179, 1121–1174.
37. Goldston, D.A.; Pintz, J.; Yildirim, C.Y. Primes in tuples I. Ann. Math. 2009, 170, 819–862.
38. Goldston, D.A.; Pintz, J.; Yildirim, C.Y. Primes in tuples II. Acta Math. 2010, 204, 1–47.
39. Wolf, M. Application of statistical mechanics in number theory. Phys. A 1999, 274, 149–157.
40. Brent, R.P. The Distribution of small Gaps Between Successive Primes. Math. Comput. 1974, 28, 315–324.
41. Dusart, P. The k-th prime is greater than k(ln k + ln ln k - 1) for k ≥ 2. Math. Comput. 1999, 68, 411–415.
42. Rosser, J.B. Explicit bounds for some functions of prime numbers. Am. J. Math. 1941, 63, 211–232.
43. Apostol, T.M. Introduction to Analytic Number Theory; Undergraduate Texts in Mathematics; Springer: New York, NY, USA; Heidelberg, Germany, 1976.
44. Davenport, H. Multiplicative Number Theory; Graduate Texts in Mathematics; Springer: New York, NY, USA, 2000.
45. Carfì, D. The Pointwise Hellmann-Feynman Theorem. AAPP Phys. Math. Nat. Sci. 2010, 88, 1.

Figure 1. Numbers of successive prime pairs as a function of the gap

d = p_{n + 1} - p_{n}

in 49,000,000

\leq n \leq

50,000,000 (logarithmic scale).

Figure 2. Numbers of successive prime pairs as a function of the gap

d = p_{n + 1} - p_{n}

in 4,461,632,979,716

\leq n \leq

8,731,188,863,469 (logarithmic scale).

Figure 3. Heuristic model probability distribution functions

ϕ_{n} (x)

for

n = 20, 30, 40

.

Table 1. FOS average length and primes

p_{n}

.

n	$L_{n / n}^{'}$	$p_{n}$
2	3	3
3	5.5	5
4	8.333	7
5	11.416	11
6	14.7	13

Table 2. Coefficients

a_{i, j}

,

3 \leq i \leq 8

.

	$j = 1$	$j = 2$	$j = 3$	$j = 4$	$j = 5$	$j = 6$
$i = 3$	1
$i = 4$	$- \frac{1}{2}$	2
$i = 5$	$\frac{1}{6}$	$- 2$	$\frac{9}{2}$
$i = 6$	$- \frac{1}{24}$	$\frac{4}{3}$	$- \frac{27}{4}$	$\frac{32}{3}$
$i = 7$	$\frac{1}{120}$	$- \frac{2}{3}$	$\frac{27}{4}$	$- \frac{64}{3}$	$\frac{625}{24}$
$i = 8$	$- \frac{1}{720}$	$\frac{4}{15}$	$- \frac{81}{16}$	$\frac{256}{9}$	$- \frac{3125}{48}$	$\frac{324}{5}$

Table 3.

\hat{π} (k)

estimate through recursive and continuous formulas.

k	$π (k)$	$li (k)$	$\hat{π} (k) = \sum_{n = 1}^{k} w (k, n)$	$\hat{π} (k) = \int_{1}^{k} e^{- a n e^{- \frac{k}{n}}} d n$
100	25	$30.1261$	$26.9462$	$23.1623$
200	46	$50.1921$	$47.0309$	$41.3716$
300	62	$68.3336$	$65.50659$	$58.2265$
400	78	$85.4178$	$83.06021$	$74.2987$
500	95	$101.7939$	$99.98131$	$89.8312$
600	109	$117.6465$	$116.4275$	$104.9572$
700	125	$133.0889$	$132.4971$	$119.7596$
800	139	$148.1967$	$148.2565$	$134.2952$
900	154	$163.0236$	$163.7537$	$148.6046$
1000	168	$177.6097$	$179.0246$	$162.7185$
2000	303	$314.8092$	$323.4725$	$296.6907$
3000	430	$442.7592$	$458.9438$	$422.8337$
4000	550	$565.3645$	$589.1406$	$544.3509$
5000	669	$684.2808$	$715.6488$	$662.6328$
6000	783	$800.4141$		$778.4514$
7000	900	$914.3308$		$892.2944$
8000	1007	$1026.416$		$1004.4960$
9000	1117	$1136.949$		$1115.2989$
10,000	1229	$1246.137$		$1224.8866$

Table 4.

\hat{π} (k)

estimate through continuous approximation.

k	$π (k)$	$\hat{π} (k) = \int_{1}^{k} e^{- a n e^{- \frac{k}{n}}} d n$	$\hat{π} (k) / π (k)$
$10^{5}$	9592	$9428.02$	$0.98291$
$10^{6}$	78,498	78,480.93	$0.99978$
$10^{7}$	664,579	671,099.45	$1.00981$
$10^{8}$	5,761,455	5,855,689.76	$1.01636$
$10^{9}$	50,847,534	51,900,660.41	$1.02071$
$10^{10}$	455,052,511	465,792,892.49	$1.02360$
$10^{11}$	4,118,054,813	4,223,145,802.17	$1.02552$
$10^{12}$	37,607,912,018	38,614,679,105.06	$1.02677$
$10^{13}$	346,065,536,839	355,603,668,431.86	$1.02756$
$10^{14}$	3,204,941,750,802	3,294,779,143,238.6	$1.02803$
$10^{15}$	29,844,570,422,669	30,688,289,307,555	$1.02827$
$10^{16}$	279,238,341,033,925	287,153,196,808,146	$1.02834$
$10^{17}$	2,623,557,157,654,233	$2.69779945552531 \times 10^{15}$	$1.02830$
$10^{18}$	24,739,954,287,740,860	$2.54367476772712 \times 10^{16}$	$1.02816$
$10^{19}$	234,057,667,276,344,607	$2.40603623244741 \times 10^{17}$	$1.02797$
$10^{20}$	2,220,819,602,560,918,840	$2.28238863108907 \times 10^{18}$	$1.02772$
$10^{21}$	21,127,269,486,018,731,928	$2.17071405049150 \times 10^{19}$	$1.02745$
$10^{22}$	201,467,286,689,315,906,290	$2.06936330707283 \times 10^{20}$	$1.02715$
$10^{23}$	1,925,320,391,606,803,968,923	$1.97697565027884 \times 10^{21}$	$1.02683$
$10^{24}$	18,435,599,767,349,200,867,866	$1.89241852576407 \times 10^{22}$	$1.02650$
$10^{25}$	176,846,309,399,143,769,411,680	$1.81474179606716 \times 10^{23}$	$1.02617$
$10^{26}$	1,699,246,750,872,437,141,327,603	$1.74314253336239 \times 10^{24}$	$1.02583$
$10^{27}$	16,352,460,426,841,680,446,427,399	$1.676937646378480 \times 10^{25}$	$1.02550$

Table 5.

π (k)

estimate through heuristic log(n)-Model.

k	$π (k)$	$\hat{π} (k)$
10	4	$4.33$
100	25	$25.50$
1000	168	$167.65$
10,000	1229	$1229.34$
100,000	9592	9592.09
1,000,000	78,498	78,498.23

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Barbarani, V. Combinatorial Models of the Distribution of Prime Numbers. Mathematics 2021, 9, 1224. https://doi.org/10.3390/math9111224
