2. Short History of Large Gap Results
Starting with the papers [
4,
22] of Erdős, all the results on large gaps between primes are based on modifications of the Erdős–Rankin method. Its basic features are as follows:
Let
. All steps are considered for
. Let
By the prime number theorem we have
A system of congruence classes
(with
being the primes less than
x) is constructed, such that the congruence classes
cover the interval
.
Associated with the system (2) is the system of congruences
By the Chinese Remainder Theorem the system (3)
has a unique solution
Let
,
. Then, there is a
j,
, such that
From (2) and (3)
If
is sufficiently large, then all integers
are composite. If
then it follows that
, a large gap result.
The large gap problem has thus been reduced to a covering problem: Find a system of congruence classes that cover the interval , where y is as large as possible.
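The reduction from a covering system to a gap can be sketched in a few lines. This is a minimal illustration, not the construction of any cited paper: the function names, the tiny covering system, and the sign convention (u ≡ a_p mod p for the covering, m ≡ −a_p mod p for the associated congruences) are assumptions, since the displays (2) and (3) are not reproduced here.

```python
from math import prod

def crt(residues, moduli):
    """Solve m ≡ r_i (mod n_i) for pairwise coprime moduli; result lies in [0, prod)."""
    m, n = 0, 1
    for r, p in zip(residues, moduli):
        # Choose t with m + n*t ≡ r (mod p); pow(n, -1, p) is the inverse of n mod p.
        t = ((r - m) * pow(n, -1, p)) % p
        m, n = m + n * t, n * p
    return m

def gap_from_covering(classes, y):
    """classes maps each prime p to a residue a_p such that every u in [1, y]
    satisfies u ≡ a_p (mod p) for some p. Returns m with m+1, ..., m+y composite."""
    ps = sorted(classes)
    # Check that the classes really cover the interval [1, y].
    assert all(any(u % p == classes[p] % p for p in ps) for u in range(1, y + 1))
    m = crt([(-classes[p]) % p for p in ps], ps)
    # Shift by the product of the primes so that m + u exceeds every p.
    return m + prod(ps)

# Hypothetical toy covering of [1, 9] by classes modulo 2, 3, 5, 7.
m = gap_from_covering({2: 1, 3: 2, 5: 4, 7: 6}, 9)
print(m)  # 211: the integers 212, ..., 220 are all composite
```

The shift by the product of the primes plays the role of the condition that m be sufficiently large: each m + u is divisible by some p and is larger than p, hence composite.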
In all papers since Erdős [
4,
22], the covering system (2) has been constructed by a sequence of sieving steps. The set
is partitioned into a disjoint union of subsets:
Associated with each sieving step
is a choice of congruence classes
for
. We also consider the sequence
of residual sets. It is recursively defined as follows:
The 0-th residual set
covers the entire interval
. Thus,
The
residual set
is obtained by removing from
all the integers from
congruent to
for some
. The sequence
,
is complete if
,
, which means that all integers in
have been removed. For a complete sequence of sieving steps the union
thus covers all of
and the choice
in (2) gives a covering system of the desired kind.
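The recursive definition of the residual sets can be sketched as follows. The representation of a sieving step as a dictionary from primes to residue classes, and the names `residual_sets` and `is_complete`, are assumptions made for illustration.

```python
def residual_sets(y, steps):
    """steps is a list of sieving steps, each a dict {prime: residue class}.
    Returns the list of residual sets, starting with the full interval [1, y]."""
    residual = set(range(1, y + 1))          # the 0-th residual set
    history = [set(residual)]
    for classes in steps:
        # Remove every integer lying in one of the chosen classes of this step.
        residual = {u for u in residual
                    if all(u % p != a % p for p, a in classes.items())}
        history.append(set(residual))
    return history

def is_complete(y, steps):
    """A sequence of sieving steps is complete if the last residual set is empty."""
    return not residual_sets(y, steps)[-1]

steps = [{2: 1}, {3: 2}, {5: 4, 7: 6}]       # hypothetical toy example
print([len(r) for r in residual_sets(9, steps)])  # [9, 4, 2, 0]
```

With y = 10 the same steps are no longer complete, because the integer 10 survives all three of them.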
In all versions of the Erdős–Rankin method, the first sieving steps have been very similar.
We describe—with minor modifications, adjusting to our notations—the construction of the covering system (2) in Erdős [
4,
22].
One sets
The sets
of primes are defined as follows:
For the first two sieving steps, one defines the congruence classes
by
A simple consideration shows that for the second residual set
the intersection
is the union of a set
Q of prime numbers
with a set of
Z-smooth integers, i.e., integers whose largest prime factor is
. A crucial fact in all variants of the Erdős–Rankin method is that the number of smooth integers is very small. This fact was established by Rankin [
5] and de Bruijn [
23].
A central idea of Rankin’s method is “Rankin’s trick”. Let us write
for the largest prime factor of
m. Let
mean summation over all integers
n with
. Then, one has for
:
The bound needed follows by evaluating the product by the prime number theorem and by choosing
optimally.
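For orientation, the standard form of Rankin's trick can be written as below. The symbols $\Psi(y,z)$ (the number of $z$-smooth integers up to $y$), $P^{+}(n)$ (the largest prime factor of $n$) and the exponent $\eta$ are notation chosen here and may differ from the display in the original. The inequality holds because $(y/n)^{\eta} \ge 1$ for $n \le y$, after which the sum is extended to all $z$-smooth $n$ and written as an Euler product.

```latex
\Psi(y,z) \;=\; \sum_{\substack{n \le y \\ P^{+}(n) \le z}} 1
\;\le\; \sum_{P^{+}(n) \le z} \Bigl(\frac{y}{n}\Bigr)^{\eta}
\;=\; y^{\eta} \prod_{p \le z} \Bigl(1 - \frac{1}{p^{\eta}}\Bigr)^{-1},
\qquad \eta > 0 .
```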
Thus, the elements of the second residual set essentially only consist of prime numbers, the number of Z-smooth numbers of the second residual set being negligible.
In the third sieving step in Erdős [
4], the classes
are chosen via a greedy algorithm. In each step, the congruence class not belonging to the previous congruence classes that contains the most elements of the residual set
is removed.
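A greedy choice of this kind can be sketched as follows. This is a simplified variant under stated assumptions: one class is chosen per prime, the primes are processed in the given order, and ties are broken arbitrarily. The function name `greedy_classes` is hypothetical.

```python
def greedy_classes(residual, primes):
    """For each prime in turn, pick the residue class that contains the most
    surviving elements and remove it. Returns (chosen classes, what is left)."""
    residual = set(residual)
    chosen = {}
    for p in primes:
        counts = {}
        for u in residual:
            counts[u % p] = counts.get(u % p, 0) + 1
        # If nothing survives, the choice is irrelevant; take class 0.
        a = max(counts, key=counts.get) if counts else 0
        chosen[p] = a
        residual = {u for u in residual if u % p != a}
    return chosen, residual

chosen, left = greedy_classes(range(1, 10), [2, 3, 5, 7])
print(chosen[2], chosen[3], left)
```

On this toy input the classes 1 mod 2 and 2 mod 3 are chosen first, and the remaining two primes remove the last two survivors.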
Each version of the Erdős–Rankin method contains a sieving step whose position in the sequence differs from version to version, so we do not number it. We call it the weak sieving step, since it removes only a few elements of the residual set.
In the first paper [
4] of Erdős discussed here, in the fourth sieving step
one uses the primes
to remove the elements from the set
.
An important quantity is the hitting number of the weak sieving step
. The hitting number of the prime
is defined as the number of elements belonging to the congruence class
. In all papers prior to [
6], this hitting number was bounded below by 1. Thus, for each element
u of the residual set
a prime
could be found such that
and thus the removal of a single element from the congruence class
could be guaranteed. The progress in the papers was achieved not by changing the estimate for the hitting number, but by better estimates for the number of smooth integers.
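The hitting number and the trivial lower bound of 1 can be illustrated as follows. The residual set and the primes in the example are hypothetical, and the function names are introduced here only for illustration.

```python
def hitting_number(residual, p, a):
    """Number of elements of the residual set lying in the class a mod p."""
    return sum(1 for u in residual if u % p == a % p)

def one_element_per_prime(residual, primes):
    """Pair each surviving element with its own prime, so that every
    hitting number is at least 1. Requires at least as many primes as elements."""
    residual = sorted(residual)
    assert len(primes) >= len(residual), "not enough primes for this step"
    return {p: u % p for u, p in zip(residual, primes)}

residual = {101, 103, 107}
classes = one_element_per_prime(residual, [11, 13, 17])
print({p: hitting_number(residual, p, a) for p, a in classes.items()})
```

A hitting number of 2 or more for many primes means that fewer primes are needed to clear the residual set, which is the source of the later improvements.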
In the paper [
6] by Maier and Pomerance, the hitting number in the weak sieving step for a positive proportion of the primes
was at least 2.
A further improvement was obtained in the paper [
9] by Pintz, where the hitting number was at least 2 for almost all primes in
. We give a short sketch of these two papers.
The paper [
6] consists of an arithmetic part and a graph-theoretic part, combined with a modification of the Erdős–Rankin method. The arithmetic information needed concerns the distribution of generalized twin primes in arithmetic progressions on average.
We recall definitions and theorems from [
6]. Fix some arbitrary, positive numbers
. For a given large number
N, let
satisfy
If
n is a positive integer, let
where as usual
p denotes a prime.
Further, if
are positive integers, let
Let
and let
Let
Then, one has with a fixed constant
:
The result (5) is proven by application of the Hardy–Littlewood Circle method. We now come to the graph-theoretic part:
We have the following definitions:
Definition 1 ([
6], Definition 4.1′).
Say that a graph G is N-colored if there is a function χ from the edge set of G to .
In the paper [
6], first a graph is discussed, whose properties are idealized and thus simpler to formulate than the properties really needed for the applications. A proof of the existence of certain colored subgraphs (partial matchings) is given. Then, the graphs with properties needed for the applications are discussed. The existence of certain colored subgraphs is given without proof. The proof can easily be obtained by a modification of the proof for the idealized graphs mentioned above. For the sketch of the details, we cite ([
6], Definition 4.2).
Say an N-colored graph G is K-uniform if and there are integers such that
- (i)
Each color in is assigned to exactly S edges of G.
- (ii)
For each and each vertex V in G, there are exactly edges E coincident at V with color in . Thus, each vertex of G has valence T.
One has
Theorem 1 ([
6], Theorem 4.1).
Say G is a K-uniform, N-colored graph with N vertices, where . Then, there is a set of B mutually non-coincident edges with distinct colors such that
We describe the construction of these edges:
Let
be as in ([
6], Definition 4.2).
Let
B, be the largest collection of mutually noncoincident edges with distinct colors in
. After
have been chosen and
, let
be the largest collection of edges of
G with distinct colors in
such that the members of
are mutually noncoincident. Let
be such that
and let
It can be shown that
We now describe the modifications suited for applications.
Definition 2 ([
6], Definition 4.2′).
Let K be a positive integer and let , be arbitrary. Say an N-colored graph G with N vertices is -uniform if there are numbers such that
- (i)
For at most exceptions, each color in is assigned to between and edges of G;
- (ii)
If we let denote the number of edges coincident at the vertex V with color in , then for each , but for at most exceptional vertices V, we have for each .
Then, we have the following result:
Theorem 2 ([
6], Theorem 4.1′).
Let , be arbitrary. There is a number such that for each integer there is some with the property that each -uniform, N-colored graph with vertices, where , has a set of B mutually noncoincident edges with distinct colors, where
We now describe the application of the Erdős–Rankin method in the paper [
6] and its combination with the arithmetic and graph-theoretic results just mentioned.
Let
The first two sieving steps are as follows:
For the system of congruence classes
as described in (2), we choose:
The first residual set
is the disjoint union
, where
is the set of integers in
divisible by some prime
and
is the set of
v-smooth integers in
. Let
be the members of the second residual set that are in
and let
be the members of the second residual set that are in
. Then
where
It is again important that the number of smooth integers is small and it easily follows that
For the weak sieving step, one now applies the graph-theoretic results (Theorem 2).
One defines a graph whose vertex set is
. Let
Define
Let
denote the set of primes
q in the interval
Let
be the graph with vertex set
and such that
are connected by an edge if and only if
for some
.
Define the “color” of an edge by the prime
q, so that
is a
-colored graph. From the arithmetic information, combined with standard sieves, it can easily be deduced that the graphs
satisfy the conditions of the graph-theoretic result ([
6], Definition 4.2). Thus, the graphs
contain a sufficient number of edges
and thus pairs
with
We consider the system
for
If we determine
by
then the hitting number for the prime
is 2. Thus, by the weak sieving step, two members of the residual set are removed for each prime
q. The weak sieving step is completed by removing one member of the residual set for the remaining primes.
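The way a colored partial matching produces hitting number 2 can be sketched as follows. This is a greedy toy version, not the argument of [6]: the edge rule (u and v are joined with color q when q divides u − v), the vertex set and the color set are assumptions made for illustration.

```python
from itertools import combinations

def rainbow_matching(vertices, colours):
    """Greedily pick vertex-disjoint edges with distinct colours.
    An edge (u, v) has colour q when q divides u - v (an assumed edge rule)."""
    used_vertices, matching = set(), {}
    for q in colours:
        for u, v in combinations(sorted(vertices), 2):
            free = u not in used_vertices and v not in used_vertices
            if free and (u - v) % q == 0:
                matching[q] = (u, v)
                used_vertices.update((u, v))
                break
    return matching

vertices = [101, 103, 107, 109, 113, 127, 131, 137]   # hypothetical residual primes
matching = rainbow_matching(vertices, [11, 13, 17])
# Each colour q receives the class u mod q, which contains both u and v.
classes = {q: u % q for q, (u, v) in matching.items()}
print(matching, classes)
```

Every prime q that receives an edge removes two elements of the residual set at once, which is exactly the statement that its hitting number is 2.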
The paper [
9] by Pintz contains exactly the same arithmetic information as the paper [
6] by Maier and Pomerance, whereas the graph-theoretic construction is different. The edges of the graphs are obtained by a random construction and a hitting number of 2 for almost all primes in the weak sieving step is achieved.
The order of magnitude of
could finally be improved in the paper [
10]. The result is:
with
for
.
The paper is related to the work on long arithmetic progressions consisting of primes by Green and Tao [
12,
13] and work by Green, Tao and Ziegler [
14] on linear equations in primes. The authors manage to remove long arithmetic progressions of primes in the weak sieving step and thus are able to obtain a hitting number tending to infinity with
X. We shall not describe any more details of this paper. Simultaneously and independently, James Maynard [
15] achieved progress based on multidimensional sieve methods. The authors of the paper [
10] and Maynard in [
19] joined their efforts to prove
for a constant
.
Again the hitting number in the weak sieving step tends to infinity for
. Whereas in the papers [
6,
9] by Maier and Pomerance and Pintz, the pairs of integers removed in the weak sieving step were interpreted as edges of a graph, now the tuples of integers removed are seen as edges of a hypergraph. One uses a hypergraph covering theorem generalizing a result of Pippenger and Spencer [
24] using the Rödl nibble method [
25].
The choice of sieve weights is related to the great breakthrough results on small gaps between consecutive primes, based on the Goldston–Pintz–Yildirim (GPY) sieve and Maynard’s improvement of it. We give a short overview.
3. Small Gaps, GPY Sieve and Maynard’s Improvement
The first non-trivial bound was proved by Erdős [
4,
22], who showed that
By applying Selberg’s sieve, he showed that pairs of primes
with a fixed difference cannot appear too often.
The first major breakthrough was achieved by Bombieri and Davenport [
26], who showed that
Let
Then,
with
One now considers the integral
By orthogonality, one obtains:
One now tries to establish a lower bound for
I(
x). This bound can be combined with upper bounds for
for large values of
m to obtain estimates
for small values of
m. Thus, gaps of size
exist.
These estimates became possible by application of the Bombieri–Vinogradov theorem, proven one year before [
27].
For its formulation, the following definition will be useful:
Definition 3. Let
We say that the primes have an admissible level of distribution θ if holds for any and any .
The Bombieri–Vinogradov theorem now states that:
For any
, there is a
such that, for
This implies that the primes have an admissible level of distribution
.
Definition 4. We say that the primes have an admissible level of distribution ϑ if (11) holds for any and any with .
A great breakthrough was achieved in the paper [
16]. Its authors consider admissible
k-tuples for which we reproduce the definition:
Definition 5. is called admissible if for each prime p the number of distinct residue classes modulo p occupied by elements of satisfies .
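Admissibility is easy to test by machine. The sketch below uses the standard observation that a k-tuple can only occupy all residue classes modulo p when p ≤ k, so only those primes need to be checked. The function name `is_admissible` is introduced here for illustration.

```python
def is_admissible(H):
    """H is admissible if, for every prime p, its elements miss at least
    one residue class modulo p. Only primes p <= len(H) need checking."""
    H = list(H)
    k = len(H)
    for p in range(2, k + 1):
        if all(p % d for d in range(2, int(p**0.5) + 1)):   # p is prime
            if len({h % p for h in H}) == p:
                return False
    return True

print(is_admissible((0, 2, 6)))   # True
print(is_admissible((0, 2, 4)))   # False: all three classes modulo 3 are occupied
```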
The two main results in the paper [
16] of Goldston, Pintz and Yildirim are
Theorem 3 ([
16], Theorem 3.3).
Suppose the primes have a level of distribution . Then, there exists an explicitly calculable constant depending only on ϑ such that any admissible k-tuple with contains at least two primes infinitely often. Specifically, if , then this is true for .
Theorem 4 ([
16], Theorem 3.4).
We have
The method of Goldston, Pintz and Yildirim has also become known as the GPY sieve.
There are several overview articles on the history of the GPY method (cf. [
18,
28]).
The overview article most relevant for this paper is due to Maynard [
29], whose improvement of the GPY sieve is of crucial importance for the large gap results described in this paper.
Before we recall Maynard’s description, we should mention another milestone which, however, is not relevant for large gap results. This result was obtained by Yitang Zhang [
30] in 2014. He proved the existence of infinitely many gaps of bounded length between primes. He does not establish an admissible level of distribution
, which would imply the result, but succeeds in replacing the sum
by a sum over the smooth moduli.
We now come to a short description of the GPY method and its improvement by Maynard, closely following the paper “Small gaps between primes” by Maynard [
29]. One of the main results of [
29] is:
Theorem 5 (of [
29]).
Let . We have
Tao (in private communication to Maynard) independently proved Theorem 5 (with a slightly weaker bound) at much the same time.
Theorem 5 implies that for every there exist intervals whose lengths depend only on H with arbitrarily large initial point that contain at least H primes.
Now, we follow [
29] for a short description of the GPY sieve and its improvement.
Let
be an admissible
k-tuple. One considers the sum
Here,
is the characteristic function of the primes, and
and
are non-negative weights. If one can show that
, then at least one term in the sum over
n must have a positive contribution. By the non-negativity of
, this means that there must be some integer
such that at least
of the
are prime.
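In the formulation of [29], the sum and the conclusion drawn from its positivity take the following shape. The symbols $N$, $\rho$, $w_n$, $h_i$ and $\chi_{\mathbb{P}}$ follow the usual notation of that paper and are assumptions here, since the display itself is not reproduced above.

```latex
S(N,\rho) \;=\; \sum_{N \le n < 2N}
\Bigl( \sum_{i=1}^{k} \chi_{\mathbb{P}}(n + h_i) - \rho \Bigr) w_n ,
\qquad
S(N,\rho) > 0 \;\Longrightarrow\;
\#\{\, i : n + h_i \text{ prime} \,\} \ge \lfloor \rho + 1 \rfloor
\ \text{ for some } n \in [N, 2N).
```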
The weights
are typically chosen to mimic Selberg sieve weights. The standard Selberg
k-dimensional weights are
The key new idea in the paper [
16] of Goldston, Pintz and Yildirim was to consider more general sieve weights of the form
for a suitable smooth function
F.
Goldston, Pintz and Yildirim chose for suitable , which has been shown to be essentially optimal, when k is large.
The new ingredient in Maynard’s method is to consider a more general form of the sieve weights
The results of [
29] were modified and extended in the paper [
15] “Dense clusters of primes in subsets” of Maynard. Some of his results and their applications will be described later in this paper.
4. Large Gaps with Improved Order of Magnitude and Its K-Version, Part I
Here, we state the theorems from [
19,
20] and sketch their proofs.
We number definitions and theorems in the following manner:
Definition (resp. Theorem) X of paper [n] (in the list of references) is referred to as ([n], Definition (resp. Theorem) X).
We start with a list of the theorems from [
19] and the definitions relevant for them:
Theorem 6 ([
19], Theorem 1, large prime gaps).
For any sufficiently large X, one has
The implied constant is effective.
Definition 6 ([
19], Definition (3.1)).
where c is a certain (small) fixed positive constant.
Definition 7 ([
19], Definition (3.2)).
Definition 8 ([
19], Definitions (3.3)–(3.5)).
For congruence classes
and
define the sifted sets
and likewise
Theorem 7 ([
19], Theorem 2—sieving primes).
Let x be sufficiently large and suppose that y obeys (7). Then, there are vectors such that
Theorem 8 ([
19], Theorem 3, probabilistic covering).
There exists a constant such that the following holds. Let , , and let be an integer. Let satisfy the smallness bound Let be disjoint finite non-empty sets and let V be a finite set. For each and , let be a random finite subset of V. Assume the following:
Corollary 1 ([
19], Corollary 4).
Let . Let be sets with and . For each , let be a random subset of satisfying the size bound:
Assume the following:
(Sparsity) For all and
(Small codegrees) For any distinct
(Elements covered more than once in expectation) For all but at most elements , we have: for some quantity C, independent of q, satisfying
Then, for any positive integer m with
we can find random sets for each such that is either empty or a subset of which attains with positive probability and that with probability . More generally, for any with cardinality at least , one has with probability . The decay rates in the and ∼ notation are uniform in .
Theorem 9 ([
19], Theorem 4, random construction).
Let x be a sufficiently large real number and define y by (7). Then, there is a quantity with with the implied constants independent of c, a tuple of positive integers with and some way to choose random vectors and of congruence classes and integers respectively, obeying the following:
For every in the essential range of , one has where
With probability , we have that
Call an element in the essential range of good if, for all but at most elements , one has Then, is good with probability .
The following theorems and definitions are from [20].
Theorem 10 ([
20], Theorem 1.1).
There is a constant and infinitely many n, such that and the interval contains the K-th power of a prime.
Definition 9 ([
20], Definitions (3.1)–(3.5)).
where c is a fixed positive constant. Let and introduce the three disjoint sets of primes
For residue classes and define the sifted sets and likewise
We set
Theorem 11 ([
20], Theorem 3.1, sieving primes).
Let x be sufficiently large and suppose that y obeys (7). Then, there are vectors and , such that
Theorem 12 ([
20], Theorem 4.1).
(Has wording identical to [19], Theorem 3.)
Corollary 2 ([
20], Corollary 4.2).
(Has wording identical to [19], Corollary 4.)
Theorem 13 ([
20], Theorem 4.3, random construction).
(Has wording identical to [19], Theorem 4.)
Definition 10 ([
20], Definition 6.1).
An admissible r-tuple is a tuple of distinct integers that do not cover all residue classes modulo p for any prime p.
For , we define
For , let
We set
For an admissible r-tuple to be specified later and for primes p with , we set
Theorem 14 ([
20], Theorem 6.2—Existence of good sieve weights).
Let x be a sufficiently large real number and let y be any quantity obeying (7). Let be defined by Definitions 7 and 8. Let r be a positive integer with for some sufficiently large absolute constant and some sufficiently small .
Let be an admissible r-tuple contained in . Then, one can find a positive quantity and a positive quantity depending only on r and a non-negative function supported on with the following properties:
Uniformly for every , one has
Uniformly for every and , one has
Uniformly for every that is not equal to any of the , one has uniformly for all and .
In [
19], we have the following dependency graph for the proof of ([
19], Theorem 1).
Replacing these theorems by their
K-versions we obtain the following dependency graph for the
K-version ([
20], Theorem 1.1):
The graphs (13) and (14) can be combined in the graph:
(with Theorems 1, 2, 4, 5 corresponding to [
19] and Theorems 1.1, 3.1, 4.3, 6.2 corresponding to [
20]).
The horizontal arrows indicate the deduction of Theorem B from Theorem A; the vertical arrows indicate the transition from Theorem A to its K-version Theorem A’.
Part I of “Large gaps with improved order of magnitude and its
K-version” (
Section 4) deals with the graph (16). The end of the graph, Theorem 5 and its
K-version Theorem 6.2 is deduced from results of Maynard’s paper [
15] “Dense clusters of primes in subsets”. The
K-version, Theorem 6.2 is deduced from its
K-version. These deductions make up Part II and are the contents of
Section 5.
The graph (15) consists of segments, the last one being
(with Theorems 1, 2 corresponding to [
19] and Theorems 1.1, 3.1 corresponding to [
20]).
We shall proceed segment by segment starting with (16). In this way, the transition from a theorem to its K-version should become more transparent.
We start with the upper string in (16):
Let
and
be as in ([
19], Definitions (3.3)–(3.5)). We extend the tuple
of congruence classes
for all primes
by setting
for
and
for
and consider the sifted set
As in previous versions, one shows that the second residual set consists of a negligible set of smooth numbers and the set
Q of primes. Thus, we find that
Next let
C be a sufficiently large constant such that
is less than the number of primes in
. By matching each of these surviving elements to a distinct prime in
and choosing congruence classes appropriately, we thus find congruence classes
for
which cover all of the integers in
. This finishes the deduction of Theorem 1 from Theorem 2.
K-version: deduction of ([20], Theorem 1.1) from ([20], Theorem 3.1).
The first two sieving steps are the same as in the “upper string” of ([
19], Theorem 2 ⇒ Theorem 1). Thus, the second residual set is again
Q apart from a negligible set of smooth integers. The random choice in the remaining sieving steps now has to be modified.
Theorem 15 ([
20], Theorem 3.1).
Let x be sufficiently large and suppose that y obeys Definition 9. Then, there are vectors and , such that
We now sketch the deduction of ([
20], Theorem 1.1) from ([
20], Theorem 3.1).
Let
and
be as in ([
20], Theorem 3.1). We extend the tuple
to a tuple
of congruence classes
for all primes
by setting
for
and
for
. Again the sifted set
differs from the set
only by a negligible set of
z-smooth integers. We find ([
20], Lemma 3.2)
As in the “upper string deduction” ([
19], Theorem 2) ⇒ ([
19], Theorem 1) we now further reduce the sifted set
by using the prime numbers from the interval
,
being a sufficiently large constant.
One follows—with some modification in the notation—the papers [
20,
21]. One distinguishes the cases
K odd and
K even. We recall the following definition:
Definition 11 ([
20], Definition 3.3).
Let
For K even and , we set
Lemma 1. .
Lemma 2. There are pairs with , , such that all satisfy a congruence with the possible exceptions of u from an exceptional set V with
Proof. If
K is odd, the congruence
is solvable, whenever
.
If K is even, the congruence is solvable whenever and . The claim now follows from Lemma 1. □
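The solvability statements in this proof rest on standard facts about K-th power residues. The exact conditions used above are not reproduced here, so the sketch below only illustrates the general fact that, modulo a prime p, the number of nonzero K-th power residues equals (p − 1)/gcd(K, p − 1); in particular every class is a K-th power when gcd(K, p − 1) = 1. The function names are hypothetical.

```python
from math import gcd

def kth_power_residues(K, p):
    """Set of nonzero residues a mod p for which x**K ≡ a (mod p) is solvable."""
    return {pow(x, K, p) for x in range(1, p)}

def every_class_is_kth_power(K, p):
    """True if every nonzero residue class modulo p is a K-th power."""
    return len(kth_power_residues(K, p)) == p - 1

for p in (5, 7, 11, 13):
    for K in (2, 3):
        # Compare the brute-force count with the formula (p - 1) / gcd(K, p - 1).
        print(p, K, len(kth_power_residues(K, p)), (p - 1) // gcd(K, p - 1))
```

For K = 3 and p = 11 every class is a cube, whereas for p = 7 only the classes 1 and 6 are.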
We now conclude the deduction of Theorem 1.1 by the application of the matrix method. The following definition is borrowed from [
31].
Definition 12. Let us call an integer a “good” modulus if for all characters and all with
This definition depends on the size of .
Lemma 3. There is a constant , such that, in terms of , there exist arbitrarily large values of x, for which the modulus is good.
Lemma 4. Let q be a good modulus. Then, where denotes Euler’s totient function, uniformly for and . Here, the constant D depends only on the value of in Lemma 3.
Remark 2. This result, which is due to Gallagher [32], is Lemma 2 from [31].
We now define the matrix .
Definition 13. Choose x, such that is a good modulus. Let and be given. From the definition of and , there are such that
We now determine by and the congruences
By the Chinese Remainder Theorem
is uniquely determined. We let
with
For
, we denote by
the
r-th row of
and for
, we denote by
the
u-th column of
.
Lemma 5. We have that , is composite unless .
Proof. From the congruences
in (21), it follows that for
we have
□
Remark 3. We observe that each row with has as its first element the K-th power of the prime . If , is the K-th power of a prime of the desired kind. To deduce Theorem 10 from Theorem 15, it thus remains to show that is nonempty.
Proof. This follows from Lemma 4. □
We obtain an upper estimate for
by the observation that, if
contains a prime number, then
are primes for some
.
The number
is estimated by standard sieves as in Lemma 6.1 of [
21].
This concludes the deduction of Theorem 10 from Theorem 15. We now come to the next section in graph (16).
We first state a hypergraph covering theorem (Theorem 3 of [
19]) of a purely combinatorial nature, generalizing a result of Pippenger and Spencer [
24] using the Rödl nibble method [
25]. We also state a corollary.
Both the deduction of Theorem 2 (Theorem 7) from Theorem 4 (Theorem 17) and its
K-version, the deduction of Theorem 15 from Theorem 18, are based on Theorem 3 of [
19].
Theorem 16 (Theorem 3 of [
19], Probabilistic covering).
There exists a constant such that the following holds. Let and let , be an integer. Let satisfy the smallness bound Let be disjoint finite non-empty sets and let V be a finite set. For each and , let be a random finite subset of V. Assume the following:
We have the following:
Corollary 3 (Corollary 4 of [
19]).
Let . Let be sets with and . For each , let be a random subset of satisfying the size bound:
Assume the following:
(Sparsity) For all and
(Uniform covering) For all but at most elements , we have: for some quantity C, independent of q, satisfying
(Small codegrees) For any distinct
Then, for any positive integer m with
we can find random sets for each such that with probability . More generally, for any with cardinality at least , one has with probability . The decay rates in the and ∼ notation are uniform in .
Proof. For the proof, we refer to [
19]. □
Theorem 17 ([
19], Theorem 4, Random construction).
Let x be a sufficiently large real number and define y by Definition 9. Then, there is a quantity with with the implied constants independent of c, a tuple of positive integers with and some way to choose random vectors and of congruence classes and integers respectively, obeying the following:
For every in the essential range of , one has where
With probability , we have that
Call an element in the essential range of good if, for all but at most elements , one has Then, is good with probability .
We now show that Theorem 17 implies Theorem 7. By (38), we may choose
small enough so that (35) holds. Take
Now, let
and
be the random vectors guaranteed by Theorem 17. Suppose that we are in the probability
event that
takes a value
which is good and such that (40) holds. Fix some
within this event. We may apply Corollary 3 with
and
for the random variables
conditioned to
. A few hypotheses of the corollary must be verified. First, (34) follows easily. The small codegree condition (36) is also quickly checked. Indeed, for distinct
if
then
. But
is a nonzero integer of size at most
and is thus divisible by at most one prime
. Hence
the sum on the left side being zero if
does not exist.
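The divisibility step in the codegree argument can be checked numerically. The sizes below are hypothetical: the sketch assumes that every prime of the set exceeds the square root of the bound on the integer, so that a product of two of them is already too large to divide it.

```python
def primes_in(lo, hi):
    """Primes p with lo < p <= hi, by trial division (tiny ranges only)."""
    return [p for p in range(max(lo + 1, 2), hi + 1)
            if all(p % d for d in range(2, int(p**0.5) + 1))]

def prime_divisors_in(d, primes):
    """Primes from the given list that divide the nonzero integer d."""
    return [p for p in primes if d % p == 0]

primes = primes_in(50, 100)     # every prime here exceeds sqrt(2500) = 50
worst = max(len(prime_divisors_in(d, primes)) for d in range(1, 2501))
print(worst)  # 1: no integer up to 2500 has two prime divisors in (50, 100]
```

Once the bound on the integer is dropped, two such primes can occur: 53 · 59 = 3127 is divisible by both 53 and 59.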
By Corollary 3, there exist random variables
, whose essential range is contained in the essential range of
together with ∅ and satisfying
with probability
, where we have used (40). Since
for some random integer
, it follows that
with probability
. Taking a specific
for which this relation holds and setting
for all
p concludes the proof of claim (17) and establishes Theorem 7 (Theorem 2 of [
19]).
We now come to the K-version of the deduction Theorem 4 ⇒ Theorem 2, “the lower string” Theorem 4.3 ⇒ Theorem 3.1 of the section
Theorem 18 ([
20], Theorem 4.3—Random construction).
Let x be a sufficiently large real number and define y by Definition 9. Then, there is a quantity C with with the implied constants independent of c, a tuple of positive integers with and some way to choose random vectors and of congruence classes and integers , respectively, obeying the following:
For every in the essential range of , one has where .
With probability , we have that
Call an element in the essential range of good if, for all but at most elements , one has Then, is good with probability .
Remark 4. The wording of Theorem 18 is the same as the wording of ([19], Theorem 4). However, the contents of these two theorems are different, since the term essential range has different meaning. In Theorem 17
and
assume values of the form
and
, whereas in Theorem 18 they are of the form
Also, the wording of the deduction of Theorem 15 from Theorem 18 is the same as the deduction of Theorem 7 (Theorem 2 of [
19]) from Theorem 17 (Theorem 4 of [
19]).
We come to the section:
of graph (16).
The proof of this theorem relies on the estimates for multidimensional prime-detecting sieves established by Maynard in [
15].
We show now that Theorem 14 implies Theorem 17.
Let
be as in Theorem 17. We set
r to be the maximum value permitted by Theorem 14, namely
and let
be the admissible
r-tuple consisting of the first
r primes larger than
r; thus,
for
. From the prime number theorem, we have
for
and so we have
for
if
x is large enough. We now invoke Theorem 14 to obtain quantities
and a weight
with the stated properties.
For each
, let
denote the random integer with probability density
for all
(we will not need to impose any independence condition on
). We have
Also, one has
for all
and
.
We choose the random vector by selecting each uniformly at random from , independently in s and independently of the .
The resulting sifted set
is a random periodic subset of
with density
From the prime number theorem (with sufficiently strong error term),
so in particular we see that
We also see from (43) that
We have a useful correlation bound:
Lemma 7. Let be a natural number and let be distinct integers of magnitude . Then, one has
Proof. For each
, the integers
occupy
t distinct residue classes modulo
s, unless
s divides one of
for
. Since
and
are of size
, the latter possibility occurs at most
times. Thus, the probability that
avoids all of the
is equal to
except for
values of
s, where it is instead
Thus,
□
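The mechanism of this proof can be verified exactly on a toy example. The sketch assumes that one residue class is chosen uniformly and independently for each prime of a small hypothetical set; the probability that given integers all avoid the chosen classes is then the product of the factors 1 − (number of occupied residues)/s, which equals 1 − t/s unless s divides a difference of two of the integers.

```python
from fractions import Fraction
from itertools import product

def survival_probability(ns, primes):
    """Exact probability that all of ns avoid a uniformly random class a_s mod s,
    chosen independently for each prime s: product of (1 - #residues / s)."""
    prob = Fraction(1)
    for s in primes:
        prob *= 1 - Fraction(len({n % s for n in ns}), s)
    return prob

def survival_probability_bruteforce(ns, primes):
    """Same quantity by enumerating every choice of classes (tiny inputs only)."""
    good = total = 0
    for a in product(*(range(s) for s in primes)):
        total += 1
        good += all(n % s != a_s for n in ns for s, a_s in zip(primes, a))
    return Fraction(good, total)

ns, primes = [1, 4, 11], [3, 5, 7]
print(survival_probability(ns, primes), survival_probability_bruteforce(ns, primes))
```

In this example each of the three primes divides one of the differences, so every factor is 1 − 2/s rather than 1 − 3/s; both computations give 1/7.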
Among other things, this gives claim (40):
Corollary 4. With probability , we have and and so by the prime number theorem we see that the random variable has mean and variance The claim then follows from Chebyshev’s inequality (with plenty of room to spare).
For each
, we consider the quantity
and let
denote the set of all the primes
such that
In light of Lemma 7, we expect most primes in
P to lie in
and this will be confirmed below in Lemma 9. We now define the random variables
as follows. Suppose we are in the event
for some
in the range of
. If
, we set
. Otherwise, if
, we define
to be the random integer with conditional probability distribution
with the
jointly independent, conditionally on the event
. From (47), we see that these random variables are well defined.
Lemma 8. With probability , we have for all but at most of the primes .
Let
be good and
. Substituting definition (49) into the left-hand side of (50), using (48), and observing that
is only possible if
, we find that
where
is as defined in Theorem 17 (Theorem 4 of [
19]). Relation (41) (that is,
is good with probability
) follows upon noting that by (43) and (46),
Before proving Lemma 8, we first confirm that
is small with high probability.
Lemma 9. With probability , contains all but of the primes . In particular,
Proof. By linearity of expectation and Markov’s inequality, it suffices to show that for each
, we have
with probability
. It suffices to show that
and
where
,
are independent copies of
that are also independent of
. □
The claim (50) follows from Lemma 7 (performing the conditional expectation over
first). A similar application of Lemma 7 allows one to write the left-hand side of (52) as
From (44), we see that the quantity
is equal to
with probability
and is less than
otherwise. The claim now follows from (46).
(Proof of Lemma 8). We first show that replacing
with
P has negligible effect on the sum, with probability
. Fix
i and substitute
. By Markov’s inequality, it suffices to show that
by Lemma 7, we have
Next, by (47) and Lemma 9 we have
subtracting, we conclude that the left-hand side of (53) is
. The claim then follows from (42). By (53), it suffices to show that with probability
, for all but at most
primes
, one has
Call a prime
bad if
but (55) fails. Using Lemma 7 and (44), we have
and
where
and
are independent copies of
over
. In the last step, we used the fact that the terms with
contribute negligibly.
By Chebyshev’s inequality, it follows that the number of bad
q is
with probability
. □
We now come to the K-version, the “lower string” Theorem 6.2 ⇒ Theorem 4.3 of section (42).
As in the “upper string” in Theorem 5 of [
19], a certain weight function
w is of importance. The construction of
w will be modelled on the construction of the function
w in [
19], Theorem 5.
The restrictions , bring some additional complications. The function will be different from zero only if n belongs to a set of p-good integers. The definition of is based on the set of good integers.
Definition 15. For , we define For , let We set For an admissible r-tuple to be specified later and for primes p with , we set
Theorem 19 (Theorem 6.2 of [
20], Existence of good sieve weights).
Let x be a sufficiently large real number and let y be any quantity obeying Definition 9. Let be defined by Definition 9. Let r be a positive integer with for some sufficiently large absolute constant and some sufficiently small . Let be an admissible r-tuple contained in . Then, one can find a positive quantity and a positive quantity depending only on r with and a non-negative function supported on with the following properties:
- unless for some , and .
- Uniformly for every , one has
- Uniformly for every and , one has
- Uniformly for every that is not equal to any of the , one has
- Uniformly for all and
We now show how Theorem 19 implies Theorem 18.
Let
be as in Theorem 16. We set
We now invoke Theorem 19 to obtain quantities
and weight
with the stated properties.
For each
, let
, denote the random integer with probability density
for all
. From (59), (60), we have
Also, from (57), (59), (63), one has
for all
and
.
We choose the random vector by selecting each uniformly at random from independently in s.
Lemma 10. Let be a natural number and let be distinct integers from . Then, one has
Proof. For
, let
be the set of
for which
, for
. Then, since
we have
Let
,
,
. We write
where
We set
□
We have
We now use certain well-known facts from the theory of
K-th power residues.
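The facts on K-th power residues used here can be checked numerically. The following is a minimal sketch, assuming a prime modulus p and illustrative values p = 31, K = 6 that are not taken from the paper; the function names are hypothetical. It verifies that the number of nonzero K-th power residues modulo p equals (p − 1)/gcd(K, p − 1), and that membership is detected by the generalised Euler criterion.

```python
from math import gcd


def kth_power_residues(p: int, K: int) -> set:
    """Nonzero residues n mod p for which x^K = n (mod p) is solvable (p prime)."""
    return {pow(x, K, p) for x in range(1, p)}


def is_kth_power_residue(n: int, p: int, K: int) -> bool:
    """Generalised Euler criterion for a prime p not dividing n:
    n is a K-th power residue iff n^((p-1)/d) = 1 (mod p), with d = gcd(K, p-1)."""
    d = gcd(K, p - 1)
    return pow(n, (p - 1) // d, p) == 1


# Illustrative values only (assumptions, not from the paper).
p, K = 31, 6
d = gcd(K, p - 1)
residues = kth_power_residues(p, K)
print(len(residues), (p - 1) // d)
```

The brute-force enumeration is only meant for small p; the criterion itself needs a single modular exponentiation.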
There are
possible choices for the
. From these, for each
h,
there are
choices such that
Thus, the total number of choices for
for which not all
,
is
Since the choices for the components
are independent, we have
We have
Since
for
, we have by the definition for
:
From (65) and (66), we thus obtain
Corollary 5 (to Lemma 10).
With probability , we have:
Proof. From Lemma 10, we have
and
and so by the prime number theorem we see that the random variable
has mean
and variance
The claim then follows from Chebyshev’s inequality. □
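The second-moment argument used in this proof (control the mean and the variance, then apply Chebyshev's inequality) can be illustrated on a toy distribution. The sketch below is an assumption-laden illustration, not the random variable of Corollary 5: it takes a binomial variable with hypothetical parameters n = 400, q = 0.5 and compares its exact tail probability with the Chebyshev bound Var/t².

```python
from math import comb


def binomial_pmf(n: int, q: float, k: int) -> float:
    """Probability that a Binomial(n, q) variable equals k."""
    return comb(n, k) * q**k * (1 - q) ** (n - k)


def tail_probability(n: int, q: float, t: float) -> float:
    """Exact P(|X - E X| >= t) for X ~ Binomial(n, q)."""
    mean = n * q
    return sum(binomial_pmf(n, q, k) for k in range(n + 1) if abs(k - mean) >= t)


# Toy parameters (assumptions for illustration only).
n, q = 400, 0.5
mean = n * q                  # expectation of X
variance = n * q * (1 - q)    # variance of X
t = 0.2 * mean                # deviation of 20% of the mean

chebyshev_bound = variance / t**2
exact_tail = tail_probability(n, q, t)
print(exact_tail, chebyshev_bound)
```

When the variance is small compared with the square of the mean, the bound forces the variable to be close to its mean with high probability, which is the shape of the argument above.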
For each
, we consider the quantity
and let
denote the set of primes
, such that
We now define the random variables
as follows. Suppose we are in the event
for some
in the range of
. If
, we set
. Otherwise, if
, we define
to be the random integer with conditional probability distribution
where
with the
jointly conditionally independent on the event
.
Lemma 11. With probability , we have for all but at most of the primes .
Before proving Lemma 11, we first confirm that is small with high probability.
Lemma 12. With probability contains all but of the primes . In particular
Proof. By linearity of expectation and Markov’s inequality, it suffices to show that for each
we have
with probability
By Chebyshev’s inequality it suffices to show that
and
where
are independent copies of
that are also independent of
.
To prove claim (69), we first select the value
n for
according to the distribution (63):
Because of the property
, if
we have with probability 1:
Relation (69) now follows from Lemma 10 with
, applying the formula for total probability
A similar application of Lemma 10 allows one to write the left-hand side of (70) as
From (69), we see that the quantity
is equal to
with probability
and is less than
otherwise.
The claim now follows from . □
(Proof of Lemma 11). We first show that replacing with P has negligible effect on the sum with probability . Fix i and substitute .
By Lemma 10, we have
Next by
and Lemma 12 we have
Subtracting, we conclude that the difference of the two expectations above is
. The claim then follows from (56).
By this, it suffices to show that
for all but at most
primes
, one has
We call a prime
“bad” if
, but (71) fails. Using Lemma 12 and (63) we have
By the definition of
, we have
unless
. By Definition 15 this means that
.
We may thus apply Lemma 12 with
and obtain for all
i:
With (71), we thus obtain
Next, we obtain
where
and
are independent copies of
over
. In the last step, we used the fact that the terms with
contribute negligibly.
By Chebyshev’s inequality, it follows that the number of bad
q’s is
We may now prove Theorem 16.
Relation (40) is precisely Corollary 5 (to Lemma 10). In order to prove (14), we assume that is good and .
Substituting (67) into the left-hand side of (68) using
and observing that
is only possible if
, we find that
where
is as defined in Theorem 16. The fact that
is good with probability
follows upon noticing that
This concludes the proof of Theorem 16. □
5. Large Gaps with Improved Order of Magnitude and Its K-Version, Part II
We first state definitions and results from “Dense clusters of primes in subsets” by Maynard [
15].
We make use of the notation given in
Section 7: “Multidimensional Sieve Estimates” of [
15].
Definition 16. A linear form is a function of the form with integer coefficients and . Let be a set of integers. Given a linear form , we define the sets for any and congruence class and define the quantity where ϕ is the Euler totient function. A finite set of linear forms is said to be admissible if has no fixed prime divisor; that is, for every prime p there exists an integer such that is not divisible by p.
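The admissibility condition of Definition 16 can be tested mechanically. The following is a minimal sketch, assuming each linear form is given as a pair (a, b) representing n ↦ a·n + b with a ≥ 1; the function names are hypothetical. It relies on the elementary observation that a product of k such forms, none of which vanishes identically modulo p, has at most k roots modulo p, so only the primes p ≤ k and the primes dividing some gcd(a, b) need to be examined.

```python
from math import gcd


def primes_up_to(n: int) -> list:
    """All primes <= n by trial division (fine for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p**0.5) + 1))]


def prime_factors(m: int) -> set:
    """Set of prime divisors of |m| (empty for m in {0, 1})."""
    m, out, q = abs(m), set(), 2
    while q * q <= m:
        while m % q == 0:
            out.add(q)
            m //= q
        q += 1
    if m > 1:
        out.add(m)
    return out


def is_admissible(forms: list) -> bool:
    """forms: list of (a, b) with a >= 1, each standing for n -> a*n + b.
    Returns True iff the product of the forms has no fixed prime divisor."""
    candidates = set(primes_up_to(len(forms)))
    for a, b in forms:
        candidates |= prime_factors(gcd(a, b))
    for p in candidates:
        # p is a fixed divisor iff every residue n mod p kills some form.
        if all(any((a * n + b) % p == 0 for a, b in forms) for n in range(p)):
            return False
    return True


print(is_admissible([(1, 0), (1, 2), (1, 6)]))  # forms n, n+2, n+6
print(is_admissible([(1, 0), (1, 2), (1, 4)]))  # forms n, n+2, n+4
```

The second example fails at p = 3, since one of n, n + 2, n + 4 is always divisible by 3.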
Definition 17. Let x be a large quantity, let be a set of integers, a finite set of linear forms and B a natural number. We allow to vary with x. Let be a quantity independent of . Let be a subset of . We say that the tuple obeys Hypothesis 1 at if we have the following three estimates:
- (1) ( is well-distributed in arithmetic progressions). We have
- (2) ( is well-distributed in arithmetic progressions). For any , we have
- (3) ( not too concentrated). For any and , we have
In [
15], this definition was only given in the case
, but we will need the (mild) generalization to the case in which
is a (possibly empty) subset of
.
As is common in analytic number theory, we will have to address the possibility of a Siegel zero. As we want to keep all our estimates effective, we will not rely on Siegel’s theorem or its consequences. Instead, we will rely on the Landau–Page theorem, which we now recall. Throughout, denotes a Dirichlet character.
Lemma 13 (Landau–Page Theorem).
Let . Suppose that for some primitive character χ of modulus at most Q and some . Then, either or else and χ is a quadratic character , which is unique. Furthermore, if exists, then its conductor is square-free apart from a factor of at most 4 and obeys the lower bound
Proof. See, e.g., ([
27], Chapter 14). The final estimate follows from the bound
for a real zero
of
with
of modulus
q, which can also be found in ([
27], Chapter 14).
We can then eliminate the exceptional character by deleting at most one prime factor of . □
Corollary 6. Let . Then, there exists a quantity which is either equal to 1 or is a prime of size with the property that whenever and χ is a character of modulus at most Q and coprime to .
Proof. If the exceptional character from Lemma 13 does not exist, then take ; otherwise, we take to be the largest prime factor of . As is square-free apart from a factor of at most 4, we have by the prime number theorem and the claim follows. □
Lemma 14. Let x be a large quantity. Then, there exists a natural number , which is either 1 or a prime, such that the following holds.
Let , let and be a finite set of linear forms (which may depend on x) with , and .
Let and let be a subset of such that is non-negative on and is coprime to B for all . Then, obeys Hypothesis 1 at with absolute implied constants (i.e., the bounds in Hypothesis 1 are uniform over all such choices of and y).
Proof. Parts (1) and (3) of Hypothesis 1 are easy to see; the only difficult verification is (2). We apply Corollary 6 with
for some small absolute constant
to obtain a quantity
with the stated properties. By the Landau–Page theorem (see [
27], Chapter 20), we have that if
is sufficiently small then we have the effective bound
for all
with
and all
. Here, the summation is over all primitive
and
Following a standard proof of the Bombieri–Vinogradov Theorem (cf. [
27], Chapter 28), we have (for a suitable constant
):
Combining these two statements and using the triangle inequality gives the bound required for (2). □
We now recall the construction of sieve weights from ([
15],
Section 7).
Let
For each prime
p not dividing
B, let
be the elements
n of
for which
If
p is also coprime to
w, then for each
, let
denote the least element of
such that
Let
denote the set
Define the singular series
the function
and let
R be a quantity of size
Let
be a smooth function supported on the simplex
For any
, define
For any
, define
and then define the function
by
We then have the following slightly modified form of Proposition 6.1 of [
15].
Theorem 20. Fix θ, . Then, there exists a constant C depending only on such that the following holds. Suppose that obeys Hypothesis 1 at some subset of . Write and suppose that , and . Moreover, assume that the coefficients of the linear forms in obey the size bound and . Moreover, assume that the coefficients , of the linear forms in obey the size bound for all . Then, there exists a smooth function depending only on k and supported on the simplex and quantities , depending only on k with and such that, for given in terms of F as above, the following assertions hold uniformly for .
- For any linear form in with coprime to B and on , we have
- Let be a linear form such that the discriminant is non-zero (in particular L is not in ). Then,
- We have the crude upper bound for all n .
Proof. The first estimate (78) is given by [
15], Proposition 9.1, (79) follows from [
15], Proposition 9.2, in the case of
, (80) is given by [
15], Proposition 9.4, (taking
and
) and the final statement (81) is given by part (iii) of [
15], Lemma 8.5. The bounds for
and
are given by [
15], Lemma 8.6.
We can now prove Theorem 20. Let
be as in that theorem. We set
and let
be the quantity from Lemma 14.
We define the function
by setting
for
and
, where
is the (ordered) collection of linear forms
for
and
was defined in (76). Note that the admissibility of the
r-tuple
implies the admissibility of the linear forms
.
An important point is that many of the key components of
are essentially uniform in
p. Indeed, for any prime s, the polynomial
is divisible by
s only at the residue classes -
. From this, we see that
In particular,
is independent of
p as long as
s is distinct from
p; therefore,
for some
independent of
p, with the error terms uniform in
p. Moreover, if
then
, so all the
are distinct
(since the
are less than
). Therefore, if
we have
and
Since all
are at least
, we have
whenever
. From this, we see that
is independent of
p and where the error term is independent of
.
It is clear that
w is non-negative and supported on
and from (81) we have (57). We set
and
Since
B is either 1 or prime, we have
and from the definition of
R we also have
From (77), we thus obtain (57). From [
15], Lemma 8.1(i), we have
and from [
15], Lemma 8.6, we have
and so we have the lower bound (56a). (In fact, we also have a matching upper bound
, but we will not need this.)
It remains to verify the estimates (59) and (60). We begin with (59). Let
p be an element of
. We shift the
n variable by
and rewrite
where
denotes the set of linear forms
for
. (The
error arises from (61) and roundoff effect if
y is not an integer.) This set of linear forms remains admissible and
The claim (59) now follows from (75) and the first conclusion (78) of Theorem 20 (with
x replaced by
,
and
), using Lemma 14 to obtain Hypothesis 1.
Now, we prove (60). Fix
and
. We introduce the set
of linear forms
, where
and
We claim that this set of linear forms is admissible. Indeed, for any prime
, the solutions of
are
and
the number of which is equal to
. Thus
as before. Again, for
we have that the
are distinct
and so if
and
we have
and
In particular
is independent of
and so
where again the
error is independent of
. From this, since
takes values in
, we have that
whenever
(note that the
summation variable implicit on both sides of this equation is necessarily equal to 1). Thus, recalling that
we can write the left-hand side of (60) as
Applying the second conclusion (79) of Theorem 20 (with
x replaced by
,
and
) and using Lemma 14 to obtain Hypothesis 1, this expression becomes
Clearly
and from the prime number theorem, one has
for any fixed
. Using (83), we can thus write the left-hand side of (79) as
From (42) and (56a), the second error term may be absorbed into the first and (59) follows.
Finally, we prove (60). Fix
not equal to any of the
and fix
. By the prime number theorem, it suffices to show that
By construction, the left-hand side is the same as
which we can shift as
where again the
error is a generous upper bound for round-off errors. This error is acceptable and may be discarded. Applying (80), we may then bound the main term by
where
Applying (83), we may simplify the above upper bound as
Now,
for each
i; hence,
and it follows from (82) and (56), observing
This concludes the proof of Theorem 20 and hence Theorem 4. □
The K-version deduction of Theorem 19 (of [20]). We now modify the weights
to incorporate (for fixed primes
p) the conditions
and
We carry out the modification in two steps. In a first step, we replace by . Here, p is a fixed prime with .
Here, we have to be more specific about the set . We set .
Definition 18. Let be as in (76), , p a fixed prime with . Let also . We set
We first express the solvability of (86) by means of Dirichlet characters.
Lemma 15. Let p be a prime number. Let , and be the principal character . There are non-principal characters , such that for all we have
Proof. Let
be a primitive root
,
Setting
we see that the congruence
is solvable if and only if
has a solution
y.
By the theory of linear congruences, this is equivalent to
. We have
We now define the Dirichlet character
, (
),
and obtain the claim of Lemma 15. □
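The mechanism behind Lemma 15 can be made concrete for a small prime. The sketch below is an illustration under stated assumptions: the values p = 31, K = 6 and all function names are hypothetical, and the characters are built explicitly from a primitive root g as χ_j(g^t) = exp(2πi·j·t/d) with d = gcd(K, p − 1), where j = 0 gives the principal character. For n coprime to p, the average of these d characters equals 1 if x^K ≡ n (mod p) is solvable and 0 otherwise.

```python
import cmath
from math import gcd


def primitive_root(p: int) -> int:
    """Smallest primitive root modulo the prime p (brute force)."""
    for g in range(1, p):
        if len({pow(g, t, p) for t in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be prime")


def index_table(p: int, g: int) -> dict:
    """Map each nonzero residue g^t mod p to its index t."""
    return {pow(g, t, p): t for t in range(p - 1)}


def solvable_indicator(n: int, p: int, K: int, ind: dict) -> complex:
    """(1/d) * sum over j of chi_j(n), for n coprime to p.
    By the theory of linear congruences this is 1 iff d divides ind(n)."""
    d = gcd(K, p - 1)
    t = ind[n % p]
    return sum(cmath.exp(2j * cmath.pi * j * t / d) for j in range(d)) / d


# Illustrative values only (assumptions, not from the paper).
p, K = 31, 6
g = primitive_root(p)
ind = index_table(p, g)
kth_powers = {pow(x, K, p) for x in range(1, p)}
for n in (1, 2, 3):
    print(n, round(solvable_indicator(n, p, K, ind).real), n in kth_powers)
```

The indicator is only defined here for n not divisible by p; the lookup raises a KeyError otherwise.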
Theorem 21. Let , as in the Definition of , . Then, we have
Proof. By Lemma 15, we have
The sum belonging to the principal character
differs from the sum
only by
, since there are only
terms with
, each of which has size at most
. We therefore have
Let now
. Here, we closely follow the proof of Proposition 9.1 of [
15]. We split the sum into residue classes
. We recall that
If
then we have
and so we restrict our attention to
with
We substitute the definition of
, expand the square and swap the order of summation. This gives
The congruence conditions in the inner sum may be combined via the Chinese Remainder Theorem into a single congruence condition
where
stands for the least common multiple.
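Combining congruence conditions with possibly non-coprime moduli into one condition modulo the least common multiple can be sketched as follows. This is a generic illustration with a hypothetical helper name, not the specific system of the proof: two congruences n ≡ a1 (mod m1) and n ≡ a2 (mod m2) are compatible exactly when a1 ≡ a2 (mod gcd(m1, m2)), and in that case they are equivalent to a single congruence modulo lcm(m1, m2).

```python
from math import gcd


def combine(a1: int, m1: int, a2: int, m2: int):
    """Return (a, lcm) such that n = a (mod lcm) is equivalent to
    n = a1 (mod m1) and n = a2 (mod m2), or None if the system is unsolvable."""
    g = gcd(m1, m2)
    if (a2 - a1) % g != 0:
        return None  # incompatible residues: the solution set is empty
    lcm = m1 // g * m2
    m2g = m2 // g
    # Solve a1 + m1*t = a2 (mod m2), i.e. (m1/g)*t = (a2-a1)/g (mod m2/g).
    if m2g == 1:
        t = 0
    else:
        t = ((a2 - a1) // g * pow(m1 // g, -1, m2g)) % m2g
    return ((a1 + m1 * t) % lcm, lcm)


print(combine(2, 6, 5, 9))
print(combine(1, 4, 2, 6))
```

The modular inverse via `pow(x, -1, m)` requires Python 3.8 or later. An unsolvable system corresponds to an empty inner sum.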
There are
Dirichlet characters
such that
We thus may write
with a suitable absolute constant
A, an interval
I of length
and the
non-principal Dirichlet characters
of conductor
and modulus
.
By the Pólya–Vinogradov bound, we obtain:
The claim of Theorem 21 now follows from (89) and (90). □
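The size of character sums controlled by the Pólya–Vinogradov bound can be observed numerically. The sketch below is only an illustration: it uses the Legendre symbol modulo a few small primes (a choice made here, not the characters of the proof) and compares the largest initial partial sum with √p·log p. The constant 1 in front of √p·log p is an assumption adopted for the comparison; the bound as used above is an O-estimate.

```python
from math import log, sqrt


def legendre(n: int, p: int) -> int:
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0, 1 or p - 1


def max_partial_sum(p: int) -> int:
    """max over N < p of |sum_{n <= N} (n/p)|."""
    s = best = 0
    for n in range(1, p):
        s += legendre(n, p)
        best = max(best, abs(s))
    return best


for p in (101, 1009, 10007):  # small primes chosen for illustration
    print(p, max_partial_sum(p), sqrt(p) * log(p))
```

Sums over an arbitrary interval are differences of two initial partial sums, so they are at most twice the quantity computed here.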
As a preparation for the proof of Theorem 22 which is a modification of Proposition 9.2 of [
15], we state a lemma on character sums over shifted primes.
Lemma 16. Let χ be a Dirichlet character . Then, for we have
Proof. This is Theorem 1 of [
33]. □
Theorem 22. Let , satisfy for and Then, we have for sufficiently small θ:
Proof. By Lemma 15, we have
The sum belonging to the principal character
differs from the sum
only by
and thus, by [
15], Proposition 9.2, we have
For
, we follow closely the proof of Proposition 9.2 in [
15]. We again split the sum into residue classes
If
then we have
and so we restrict our attention to
with
We substitute the definition of
, expand the square and swap the order of summation. Setting
, we obtain
If
runs through the arithmetic progression
then also
runs through an arithmetic progression
Thus, we have
Also, the condition
may be expressed with the help of Dirichlet characters
using orthogonality relations.
Theorem 22 thus follows from (91) and Lemma 16. □
For the definition of the weight whose existence is claimed in Theorem 19, we now have to be more specific about the set of linear forms.
Definition 19. Let the tuple be given. For and , let be the (ordered) collection of linear forms for and set
In the sequel, we show that in the sums
appearing in (58) and (59) of Theorem 19, the function
may be replaced by the function
with a negligible error.
Since these sums have been treated in Theorem 21 and Theorem 22, this will essentially conclude the proof of Theorem 19 and thus of Theorem 5. □
Definition 20. Let be an admissible r-tuple, . For , , let
Proof. This follows immediately from Definitions 5 and 20. □
Lemma 19. Let , be as in (76), as in Definition 19. Let . Then
Proof. We only give the proof for the hardest case
and briefly indicate the proof for
.
□
In the inner sum, we only deal with the case
; the case
has a negligible contribution. The inner sum is non-empty if and only if the system
is solvable. In this case, (93) is equivalent to a single congruence
where
is uniquely determined by the system (93) and
We apply Theorem 20 with
B independent of
and with
We have
and obtain
This proves the claim for
. The proof of the case
is analogous but simpler, since there is only the single variable of summation
. □
Lemma 20. Let the conditions be as in Lemma 19. Then, we have
Theorem 23. Let the conditions be as in the previous lemmas. For sufficiently small , we have
Proof. Let
. By Definition 20, we have
which yields
Thus,
and therefore
The claim of Theorem 23 follows by summation over all pairs
if
is sufficiently small. □
We now investigate the sum (60) of Theorem 19.
Definition 21. Let , . Let : . Then, we define
Lemma 21. Let be as in Definition 20. Let . Then, we have
Proof. We only give the proof for the hardest case
. The case
is analogous but simpler. We have
We deal only with the case
for the inner sum, the case
having a negligible contribution. The inner sum is non-empty if and only if the system
is solvable.
In this case, the system is equivalent to a single congruence
uniquely determined by the system (95) and
. The inner sum then takes the form
By the substitution
, we obtain
We set
, where
is replaced by the set
, where
We thus have
We apply Theorem 22 with
,
instead of
x,
,
. We have
From Bombieri’s Theorem, it can easily be seen that conditions (78) are satisfied for all
s with the possible exception of
,
being an exceptional set, satisfying
For
, we use the trivial bound
. Thus, we obtain the claim of Lemma 21 for the case
.
The proof for is analogous but simpler, since we have only to sum over the single variable . □
Lemma 22. Let be as in Definition 20. We have
Proof. By Definition 21, we have
□
Theorem 24. Let be as in Definition 21. Then, we have
Proof. Let
. By Definition 20, we have
It follows that
Thus
The second term is absorbed in the first one, since by the definition
and thus
Therefore
The claim of Theorem 24 now follows by summing over all pairs
. □
We can now conclude the proof of Theorem 19 and therefore also the proof of Theorem 1.1.
By Theorems 21–24, we have
and
Equations (58) and (59) of Theorem 19 can thus be deduced from results on the sums on the right-hand side of Equations (96) and (97).