Recent Results on Large Gaps Between Primes

Rassias, Michael Th.

doi:10.3390/axioms14030198

by

Michael Th. Rassias

Department of Mathematics and Engineering Sciences, Hellenic Military Academy, 16673 Vari Attikis, Greece

Axioms 2025, 14(3), 198; https://doi.org/10.3390/axioms14030198

Submission received: 5 February 2025 / Revised: 3 March 2025 / Accepted: 4 March 2025 / Published: 6 March 2025

(This article belongs to the Section Algebra and Number Theory)

Abstract

One of the themes of this paper is recent results on large gaps between primes. The first of these results was achieved in the paper by Ford, Green, Konyagin and Tao. It was later improved in the joint paper of these four authors with Maynard. One of the main ingredients of these results is old methods from Erdős and Rankin. Other ingredients are important breakthrough results from Goldston, Pintz and Yildirim, and their extension by Maynard on small gaps between primes. All these previous results are discussed in brief. The results on the appearance of k-th powers of primes contained in those large gaps obtained by the author in joint work with Maier are based on a combination of the results just described with the matrix method of Maier.

Keywords:

large gaps between primes; Erdős-Rankin method; small gaps between primes; Maier matrix method; sieve methods

MSC:

11N02; 11N05; 11N35; 05C70

1. Introduction

Let

p_{n}

denote the n-th prime number,

d_{n} = p_{n + 1} - p_{n}

. The topic of this article is recent results on large values of

d_{n}

.

We start with a short overview on the historical development of the subject. The prime number theorem easily implies that

\frac{d_{n}}{log p_{n}}

is 1 on average:

lim_{x \to \infty} \frac{1}{x} \sum_{n \leq x} \frac{d_{n}}{log p_{n}} = 1 .

For an infinite sequence of values of n, this average value is exceeded by a factor tending to infinity as

n \to \infty

. Let

G (x) : = max_{p_{n + 1} \leq x} (p_{n + 1} - p_{n}) .

In 1931, Westzynthius [1], improving on prior results of Backlund [2] and Brauer-Zeitz [3], proved that

G (X) ≫ \frac{log X {log}_{3} X}{{log}_{4} X} .

(Here and in the sequel, we define

{log}_{k} x

by

{log}_{1} x : = log x

and

{log}_{k} x : = log ({log}_{k - 1} x)

,

(k > 1)

). In 1935, Erdős [4] sharpened this to

G (X) ≫ \frac{log X {log}_{2} X}{{({log}_{3} X)}^{2}}

and in 1938 Rankin [5] made a subsequent improvement

G (X) \geq (c + o (1)) \frac{log X {log}_{2} X {log}_{4} X}{{({log}_{3} X)}^{2}} .

(1)

The constant c was improved several times (cf. [6,7,8,9]). It was a famous problem of Erdős to improve on the order of magnitude of the lower bound in (1). This problem was solved recently. In two independent papers, the paper [10] by the four authors K. Ford, B. Green, S. Konyagin and T. Tao and the paper [11] by J. Maynard, it was shown that the constant in (1) could be taken to be arbitrarily large.
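To get a feeling for how slowly the iterated logarithms grow, the following minimal Python sketch evaluates the right-hand sides of the bounds of Erdős and Rankin. The function names are ours, and the implied constant and the constant c are simply set to 1 and the o(1) term is dropped, which is an illustrative assumption rather than a statement about the true constants.

```python
import math

def iterated_log(x, k):
    """Return log_k x: the natural logarithm applied k times (log_1 x = log x)."""
    value = x
    for _ in range(k):
        value = math.log(value)
    return value

def erdos_gap_bound(X):
    """log X * log_2 X / (log_3 X)^2, with the implied constant set to 1 (assumption)."""
    return iterated_log(X, 1) * iterated_log(X, 2) / iterated_log(X, 3) ** 2

def rankin_gap_bound(X, c=1.0):
    """c * log X * log_2 X * log_4 X / (log_3 X)^2, i.e. (1) without the o(1) term."""
    return c * erdos_gap_bound(X) * iterated_log(X, 4)

for exponent in (10, 50, 100, 300):
    X = 10.0 ** exponent
    print(exponent, iterated_log(X, 4), erdos_gap_bound(X), rankin_gap_bound(X))
```

The ratio of the two expressions is exactly log_4 X, which stays below 1 for every X that fits into a floating-point number, so the gain in (1) is only visible asymptotically.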

The methods of the two papers differed in some key aspects. The arguments in [10] used recent results from the papers [12,13] by Green and Tao and the paper [14] by Green, Tao and Ziegler on the number of solutions to linear equations in primes. The arguments in [11] by J. Maynard instead relied on multidimensional sieves introduced in [15], which in turn heavily relied on the breakthrough results of D. A. Goldston, J. Pintz and C. Y. Yildirim (cf. [16,17,18]).

In this article, we shall restrict our description to the approach of the paper [19]. (We follow the notation in [19], since this is also used in Maier-Rassias’ work [20].)

Later on, the author of the present paper in a joint paper [20] with Maier obtained large gaps of the order of that in [19] that contain a perfect K-th power of a prime for a fixed natural number

K \geq 2

. They combined the results and methods of the paper [19] and the method of the paper [21] of Ford, Heath-Brown and Konyagin with the Maier matrix method. The bulk of this paper deals with the results of the paper [19] and its K-version, the paper [20].

The paper will be concluded with results about large gaps containing K-th powers of primes of special types: Beatty primes and Piatetski–Shapiro primes.

2. Short History of Large Gap Results

Starting with the papers [4,22] of Erdős, all the results on large gaps between primes are based on modifications of the Erdős–Rankin method. Its basic features are as follows:

Let

x > 1

. All steps are considered for

x \to \infty

. Let

P (x) : = \prod_{p < x} p, y > x .

By the prime number theorem we have

P (x) = e^{x (1 + o (1))} .

A system of congruence classes

\begin{matrix} {v : v \equiv & h_{p_{1}} mod p_{1}} \\ ⋮ \\ {v : v \equiv & h_{p_{l}} mod p_{l}}, \end{matrix}

(2)

(with

p_{1} < \dots < p_{l}

being the primes less than x) is constructed, such that the congruence classes

h_{p_{j}} mod p_{j} (1 \leq j \leq l)

cover the interval

(0, y]

.

Associated with the system (2) is the system of congruences

\begin{matrix} {m \equiv & - h_{p_{1}} mod p_{1}} \\ ⋮ \\ {m \equiv & - h_{p_{l}} mod p_{l}} . \end{matrix}

(3)

By the Chinese Remainder Theorem, the system (3) has a unique solution

m_{0} mod P (x)

with

1 \leq m_{0} < P (x) .

Let

u \in N

,

1 \leq u \leq y

. Then, there is a j,

1 \leq j \leq l

, such that

u \equiv h_{p_{j}} mod p_{j} .

From (2) and (3)

m_{0} + u \equiv 0 mod p_{j} .

If

m_{0}

is sufficiently large, then all integers

w \in (m_{0}, m_{0} + y]

are composite. If

p_{n} = max {p prime : p \leq m_{0}}

then it follows that

p_{n + 1} - p_{n} \geq y

, a large gap result.

The large gap problem has thus been reduced to a covering problem: Find a system of congruence classes that cover the interval

(0, y]

, where y is as large as possible.
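The reduction from gaps to coverings can be checked on a toy scale. The sketch below is only an illustration under our own assumptions: it uses a naive greedy choice of one residue class per prime (not the construction of any of the cited papers), takes the hypothetical value x = 12, solves the system (3) by stepwise lifting, and then verifies that every integer in (m_0, m_0 + y] is composite.

```python
def primes_below(x):
    """Return all primes p < x by trial division (adequate for tiny x)."""
    return [p for p in range(2, x) if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def greedy_cover(primes, y):
    """Choose one class h_p mod p per prime, greedily, trying to cover 1..y.

    Returns (classes, uncovered), where classes maps p to h_p.
    """
    uncovered = set(range(1, y + 1))
    classes = {}
    for p in primes:
        # pick the residue class that contains the most uncovered integers
        best = max(range(p), key=lambda h: sum(1 for u in uncovered if u % p == h))
        classes[p] = best
        uncovered = {u for u in uncovered if u % p != best}
    return classes, uncovered

def solve_congruences(classes):
    """Solve m = -h_p (mod p) for all p; return (m, product of the primes)."""
    m, modulus = 0, 1
    for p, h in classes.items():
        while (m + h) % p != 0:   # lift m until the new congruence also holds
            m += modulus
        modulus *= p
    return m, modulus

def is_composite(n):
    return n > 1 and any(n % d == 0 for d in range(2, int(n ** 0.5) + 1))

x = 12                                   # toy value (assumption)
primes = primes_below(x)
y = 1
while not greedy_cover(primes, y + 1)[1]:    # enlarge y while the greedy cover succeeds
    y += 1
classes, uncovered = greedy_cover(primes, y)
m0, P = solve_congruences(classes)
if m0 <= max(primes):                    # make m0 "sufficiently large"
    m0 += P
print("y =", y, "m0 =", m0, "classes =", classes)
print("all composite:", all(is_composite(m0 + u) for u in range(1, y + 1)))
```

Shifting m_0 by P(x) when it is too small keeps all congruences intact and guarantees that each m_0 + u is a proper multiple of some prime below x.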

In all papers since Erdős [4,22], the covering system (2) has been constructed by a sequence of sieving steps. The set

S (x) : = {p_{1}, \dots, p_{l}}

is partitioned into a disjoint union of subsets:

S (x) = S_{1} \cup \dots \cup S_{g} .

(4)

Associated with each sieving step

{st}_{j}

is a choice of congruence classes

f_{p} mod p

for

p \in S_{j}

. We also consider the sequence

R_{j}

of residual sets. It is recursively defined as follows:

The 0-th residual set

R_{0}

covers the entire interval

(0, y]

. Thus,

R_{0} = {n \in N \cap (0, y]} .

The

{(j + 1)}^{s t}

residual set

R_{j + 1}

is obtained by removing from

R_{j}

all the integers from

(0, y]

congruent to

f_{p} mod p

for some

p \in S_{j + 1}

. The sequence

({st}_{j})

,

1 \leq j \leq g

is called complete if

R_{g - 1} \neq \emptyset

and

R_{g} = \emptyset

, which means that all integers in

(0, y]

have been removed. For a complete sequence of sieving steps the union

⋃_{1 \leq j \leq g} ⋃_{p \in S_{j}} (f_{p} mod p)

thus covers all of

(0, y]

and the choice

h_{p} = f_{p} (p \in S (x))

in (2) gives a covering system of the desired kind.

In all versions of the Erdős–Rankin method, the first sieving steps have been very similar.

We describe, with minor modifications to match our notation, the construction of the covering system (2) in Erdős [4,22].

One sets

X = log x, y = δ x \frac{log x {log}_{3} x}{{log}_{2}^{2} x}, Z = \frac{1}{2} x,

Y = exp (α \frac{log x {log}_{3} x}{{log}_{2} x}) = exp (α \frac{log x {log}_{3} x}{{log}_{2} x} (1 + o (1))) .

The sets

S_{j}

of primes are defined as follows:

S_{1} : = {p : p \in (0, X]}, S_{2} : = {p : p \in (Y, Z]},

S_{3} : = {p : p \in (X, Y]}, S_{4} : = {p : p \in (Z, x]}, S_{5} : = {p : p \in (x, y]} .

For the first two sieving steps, one defines the congruence classes

h_{p_{i}} mod p_{i}, p_{i} \in S_{i} (i = 1, 2)

by

h_{p_{1}} \equiv 0 mod p_{1}, h_{p_{2}} \equiv 0 mod p_{2} .

A simple consideration shows that for the second residual set

R_{2}

the intersection

R_{2} \cap (x, y]

is the union of a set Q of prime numbers

Q : = {q prime : x < q \leq y}

with a set of Y-smooth integers, i.e., integers whose largest prime factor is

\leq Y

. A crucial fact in all variants of the Erdős–Rankin method is that the number of smooth integers is very small. This fact was established by Rankin [5] and de Bruijn [23].

A central idea of Rankin’s method is “Rankin’s trick”. Let us write

p^{+} (m)

for the largest prime factor of m. Let

Σ^{'}

denote summation over all integers m with

p^{+} (m) \leq y

. Then, one has for

η > 0

:

\begin{matrix} \sum_{\begin{matrix} m \leq x \\ p^{+} (m) \leq y \end{matrix}} 1 & = \sum_{m \leq x}^{'} 1 \leq \sum_{m \leq x}^{'} {(\frac{x}{m})}^{η} \\ & = x^{η} \sum_{m \leq x}^{'} \frac{1}{m^{η}} \leq x^{η} \prod_{p \leq y} {(1 - \frac{1}{p^{η}})}^{- 1} . \end{matrix}

The bound needed follows by evaluating the product by the prime number theorem and by choosing

η

optimally.
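Rankin's trick is easy to test numerically. The following sketch, with helper names of our own choosing and the toy parameters x = 10^4, y = 10, counts the y-smooth integers up to x directly and compares the count with the Euler-product bound x^η ∏_{p ≤ y} (1 - p^{-η})^{-1} for a few values of η. It is meant only as a sanity check of the inequality, not as an optimal choice of η.

```python
from math import prod

def largest_prime_factor(m):
    """Return p^+(m), with the convention p^+(1) = 1."""
    largest, d = 1, 2
    while d * d <= m:
        while m % d == 0:
            largest, m = d, m // d
        d += 1
    return m if m > 1 else largest

def smooth_count(x, y):
    """Number of integers m <= x whose largest prime factor is <= y."""
    return sum(1 for m in range(1, x + 1) if largest_prime_factor(m) <= y)

def rankin_trick_bound(x, y, eta):
    """x^eta times the Euler product over primes p <= y of (1 - p^(-eta))^(-1)."""
    primes = [p for p in range(2, y + 1) if largest_prime_factor(p) == p]
    return x ** eta * prod(1.0 / (1.0 - p ** -eta) for p in primes)

x, y = 10 ** 4, 10
count = smooth_count(x, y)
for eta in (0.2, 0.5, 0.8, 1.0):
    print(eta, count, rankin_trick_bound(x, y, eta))
```

Scanning over η shows that the quality of the bound depends strongly on this parameter, which is why η is optimized in the actual argument.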

Thus, the second residual set essentially consists of prime numbers only, the number of Y-smooth integers in it being negligible.

In the third sieving step in Erdős [4], the classes

h_{p_{3}} mod p_{3} (p_{3} \in S_{3})

are chosen via a greedy algorithm. In each step, the congruence class not belonging to the previous congruence classes that contains the most elements of the residual set

R_{3}

is removed.

In each version of the Erdős–Rankin method, there is a weak sieving step, which we will not number, since this number might be different in different versions. Instead, we call it the weak sieving step, since only a few elements of the residual set are removed.

In the first paper [4] of Erdős, which is being discussed right now, in the fourth sieving step

{st}_{4}

one uses the primes

p \in (0, x] ∖ (S_{1} \cup S_{2} \cup S_{3})

to remove the elements from the set

R_{4}

.

An important quantity is the hitting number of the weak sieving step

{st}_{w_{0}}

. The hitting number of the prime

p \in S_{w_{0}}

is defined as the number of elements belonging to the congruence class

h_{p} mod p

. In all papers prior to [6], this hitting number was bounded below by 1. Thus, for each element u of the residual set

R_{w_{0}}

a prime

p (u) \in S_{w_{0}}

could be found such that

u \equiv h_{p (u)} mod p (u)

and thus the removal of a single element from the congruence class

(h_{p (u)} mod p (u))

could be guaranteed. The progress in the papers was achieved not by changing the estimate for the hitting number, but by better estimates for the number of smooth integers.

In the paper [6] by Maier and Pomerance, the hitting number in the weak sieving step for a positive proportion of the primes

p \in S_{w_{0}}

was at least 2.

A further improvement was obtained in the paper [9] by Pintz, where the hitting number was at least 2 for almost all primes in

S_{w_{0}}

. We give a short sketch of these two papers.

The paper [6] consists of an arithmetic part and a graph-theoretic part, combined with a modification of the Erdős–Rankin method. The arithmetic information needed concerns the distribution of generalized twin primes in arithmetic progressions on average.

We recall definitions and theorems from [6]. Fix some arbitrary, positive numbers

A, B

. For a given large number N, let

x_{1}, x_{2}

satisfy

\frac{N}{{(log N)}^{A}} \leq x_{1} < x_{2} \leq N, x_{2} - x_{1} \geq \frac{N}{{(log N)}^{B}} .

If n is a positive integer, let

\mathcal{T} (n) = {p prime : x_{1} \leq p \leq x_{2} - n}

where as usual p denotes a prime.

Further, if

l, M

are positive integers, let

\mathcal{T} (n, l, M) = {p \in \mathcal{T} (n) : p \equiv l mod M} .

Let

T (n, l, M) = | \mathcal{T} (n, l, M) |

and let

T (n) = \sum_{x_{1} < k \leq x_{2} - n} \frac{1}{log k log (k + n)} .

Let

α_{0} = 2 \prod_{p > 2} \frac{p (p - 2)}{{(p - 1)}^{2}} = 1.3203 \dots

Then, one has with a fixed constant

c_{1} > 0

:

\begin{matrix} \sum_{\begin{matrix} n \leq x \\ n \equiv 0 mod 2 \end{matrix}} \sum_{M \leq x^{c_{1}}} max_{\begin{matrix} l \\ (l, M) = (n + l, M) = 1 \end{matrix}} & max |T (x, n, l, M) - \frac{α_{0} T (x, n)}{ϕ (M)} \prod_{\begin{matrix} p ∣ M \\ p > 2 \end{matrix}} \frac{p - 1}{p - 2}| \\ & ≪_{E} x^{2} {(log x)}^{- E} . \end{matrix}

(5)

The result (5) is proven by application of the Hardy–Littlewood Circle method. We now come to the graph-theoretic part:

We have the following definitions:

Definition 1

([6], Definition 4.1′). Say that a graph G is N-colored if there is a function χ from the edge set of G to

{1, \dots, N}

.

In the paper [6], first a graph is discussed, whose properties are idealized and thus simpler to formulate than the properties really needed for the applications. A proof of the existence of certain colored subgraphs (partial matchings) is given. Then, the graphs with properties needed for the applications are discussed. The existence of certain colored subgraphs is given without proof. The proof can easily be obtained by a modification of the proof for the idealized graphs mentioned above. For the sketch of the details, we cite ([6], Definition 4.2).

Say an N-colored graph G is K-uniform if

K ∣ N

and there are integers

S, T

such that

(i): Each color in ${1, \dots, N}$ is assigned to exactly S edges of G.
(ii): For each $i = 1, \dots, K$ and each vertex V in G, there are exactly $T / K$ edges E coincident at V with color in $((i - 1) N / K, i N / K]$ . Thus, each vertex of G has valence T.

One has

Theorem 1

([6], Theorem 4.1). Say G is a K-uniform, N-colored graph with cN vertices, where

c \geq 1

. Then, there is a set of B mutually non-coincident edges with distinct colors such that

B > \frac{c N}{4} (1 - exp (- \frac{4}{c} + \frac{8}{c^{2} K}))

We describe the construction of these edges:

Let

S, T

be as in ([6], Definition 4.2).

Let B_{1} be the largest collection of mutually noncoincident edges with distinct colors in

(0, N / K]

. After

B_{1}, \dots, B_{i - 1}

have been chosen and

i \leq K

, let

B_{i}

be the largest collection of edges of G with distinct colors in

((i - 1) N / K, i N / K)

such that the members of

B_{1} \cup \dots \cup B_{i}

are mutually noncoincident. Let

β_{i}

be such that

| B_{i} | = β_{i} N

and let

β = β_{1} + \dots + β_{K} .

It can be shown that

β > \frac{c}{4} (1 - exp (- \frac{4}{c} + \frac{8}{c^{2} K})) .

We now describe the modifications suited for applications.

Definition 2

([6], Definition 4.2′). Let K be a positive integer and let

C > 0

,

δ \geq 0

be arbitrary. Say an N-colored graph G with N vertices is

(K, C, δ)

-uniform if there are numbers

S, T

such that

(i): For at most $δ N$ exceptions, each color in ${1, \dots, N}$ is assigned to between $(1 - δ) S$ and $(1 + δ) S$ edges of G;
(ii): If we let $n (V, i)$ denote the number of edges coincident at the vertex V with color in $((i - 1) N / K, i N / K)$ , then

$n (V, i) \leq C T / K$

for each $i = 1, \dots, K$ , but for at most $δ N$ exceptional vertices V, we have

$(1 - δ) T / K \leq n (V, i) \leq (1 + δ) T / K$

for each $i = 1, \dots, K$ .

Then, we have the following result:

Theorem 2

([6], Theorem 4.1′). Let

C > 0

,

η > 0

be arbitrary. There is a number

K (C, η)

such that for each integer

K \geq K (C, η)

there is some

δ = δ (C, η, K) > 0

with the property that each

(K, C, δ)

-uniform, N-colored graph with

c N

vertices, where

c \geq 1

, has a set of B mutually noncoincident edges with distinct colors, where

B > (1 - η) \frac{c N}{4} (1 - exp (- \frac{4}{c})) .

We now describe the application of the Erdős–Rankin method in the paper [6] and its combination with the arithmetic and graph-theoretic results just mentioned.

Let

\begin{matrix} y & : = c^{'} e^{γ} x log x log log log x {(log log x)}^{- 2} \\ z & : = x / log log x \\ v & : = exp ((1 - ϵ) log x log log log x {(log log x)}^{- 1}) . \end{matrix}

The first two sieving steps are as follows:

For the system of congruence classes

h_{p_{1}} mod p_{1}

as described in (2), we choose:

\begin{matrix} h_{p_{1}} = 0 for every prime p_{1} \in S_{1} : = (v, z], \\ h_{p_{2}} = 1 for every prime p_{2} \in S_{2} : = (1, v] . \end{matrix}

The first residual set

R_{1}

is the disjoint union

R_{(1)} \cup R_{(2)}

, where

R_{(1)}

is the set of integers in

(1, y]

divisible by some prime

p > z

and

R_{(2)}

is the set of v-smooth integers in

(1, y]

. Let

R

be the members of the second residual set that are in

R_{(1)}

and let

R^{'}

be the members of the second residual set that are in

R_{(2)}

. Then

R = ⋃_{m \leq y / z} R_{m},

where

\begin{matrix} R_{m} : = {m p : z < p \leq y / m, (m p - 1, P (v)) = 1}, \\ R^{'} : = {n \leq y : p ∣ n \Rightarrow p \leq v, q ∣ n - 1 \Rightarrow q > v} . \end{matrix}

It is again important that the number of smooth integers is small and it easily follows that

| R^{'} | ≪ \frac{x}{{(log x)}^{1 + ϵ}} .

For the weak sieving step, one now applies the graph-theoretic results (Theorem 2).

One defines a graph whose vertex set is

R_{m}

. Let

k_{0} : = \prod_{r < log log log x} r .

Define

r_{m} = \frac{α_{0}}{m log log x} \prod_{r ∣ m} \frac{r - 1}{r - 2} .

Let

Q_{m}

denote the set of primes q in the interval

((1 - \sum_{j = 1}^{m} r_{j}) x, (1 - \sum_{j = 1}^{m - 1} r_{j}) x] .

Let

G_{m}

be the graph with vertex set

R_{m}

and such that

m p, m p^{'} \in R_{m}

are connected by an edge if and only if

| p^{'} - p | = k_{0} q

for some

q \in Q_{m}

.

Define the “color” of an edge by the prime q, so that

G_{m}

is a

| Q_{m} |

-colored graph. From the arithmetic information, combined with standard sieves, it can easily be deduced that the graphs

G_{m}

satisfy the conditions of the graph-theoretic result ([6], Definition 4.2). Thus, the graphs

G_{m}

contain a sufficient number of edges

(m p, m p^{'})

and thus pairs

(p, p^{'})

with

p \equiv p^{'} (mod q) .

We consider the system

\begin{matrix} {v : v & \equiv h_{p_{1}} mod p_{1}}, \\ ⋮ \\ {v : v & \equiv h_{p_{l}} mod p_{l}}, \end{matrix}

(6)

for

p_{j} = q \in Q_{m} .

If we determine

h_{p_{j}} mod p_{j} = h_{q} mod q

by

h_{q} \equiv p \equiv p^{'} mod q

then the hitting number for the prime

p_{j} = q

is 2. Thus, by the weak sieving step, two members of the residual set are removed for each prime q. The weak sieving step is completed by removing one member of the residual set for the remaining primes.

The paper [9] by Pintz contains exactly the same arithmetic information as the paper [6] by Maier and Pomerance, whereas the graph-theoretic construction is different. The edges of the graphs are obtained by a random construction and a hitting number of 2 for almost all primes in the weak sieving step is achieved.

The order of magnitude of

G (X)

could finally be improved in the paper [10]. The result is:

G (X) \geq f (X) \frac{log X {log}_{2} X {log}_{4} X}{{({log}_{3} X)}^{2}},

with

f (X) \to \infty

for

X \to \infty

.

The paper is related to the work on long arithmetic progressions consisting of primes by Green and Tao [12,13] and work by Green, Tao and Ziegler [14] on linear equations in primes. The authors manage to remove long arithmetic progressions of primes in the weak sieving step and thus are able to obtain a hitting number tending to infinity with X. We shall not describe any more details of this paper. Simultaneously and independently, James Maynard [15] achieved progress based on multidimensional sieve methods. The authors of the paper [10] and Maynard joined their efforts in [19] to prove

G (X) \geq C log X {log}_{2} X {log}_{4} X {({log}_{3} X)}^{- 1}

for a constant

C > 0

.

Again the hitting number in the weak sieving step tends to infinity for

x \to \infty

. Whereas in the papers [6,9] by Maier and Pomerance and by Pintz the pairs of integers removed in the weak sieving step were interpreted as edges of a graph, now the tuples of integers removed are seen as edges of a hypergraph. One uses a hypergraph covering theorem generalizing a result of Pippenger and Spencer [24] using the Rödl nibble method [25].

The choice of sieve weights is related to the great breakthrough results on small gaps between consecutive primes, based on the Goldston–Pintz–Yildirim (GPY) sieve and Maynard’s improvement of it. We give a short overview.

3. Small Gaps, GPY Sieve and Maynard’s Improvement

The first non-trivial bound was proved by Erdős [4,22], who showed that

\underset{n \to \infty}{lim inf} \frac{d_{n}}{log p_{n}} < 1 .

By applying Selberg’s sieve, he showed that pairs of primes

(\tilde{p_{1}}, \tilde{p_{2}})

with a fixed difference cannot appear too often.

The first major breakthrough was achieved by Bombieri and Davenport [26], who showed that

\underset{n \to \infty}{lim inf} \frac{d_{n}}{log p_{n}} \leq 0.46650 \dots

Let

Z (2 n) : = \sum_{\begin{matrix} p, p^{'} \leq x \\ p^{'} - p = 2 n \end{matrix}} (log p) (log p^{'}),

(7)

S (α) : = \sum_{p \leq x} (log p) e (p α)

(8)

U (α) : = \sum_{m = - L}^{L} e (2 m α) (e (α) = e^{2 π i α}, L \in N) .

Then,

T (α) : = {| U (α) |}^{2} = \sum_{j = - 2 L}^{2 L} t (j) e (2 j α),

with

t (j) : = 2 L + 1 - | j | .

One now considers the integral

I (x) = \int_{0}^{1} {| S (α) |}^{2} T (α) d α .

(9)

By orthogonality, one obtains:

I (x) = t (0) Z (0) + 2 \sum_{m = 1}^{2 L} t (m) Z (2 m) .

(10)

One now tries to establish a lower bound for I(x). This bound can be combined with upper bounds for

Z (2 m)

for large values of m to obtain estimates

Z (2 m) > 0

for small values of m. Thus, gaps of size

2 m

exist.
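The orthogonality step leading from (9) to (10) can be verified numerically for small parameters. In the sketch below the values x = 30 and L = 3 are arbitrary toy choices of ours. Since the integrand is a trigonometric polynomial with integer frequencies of absolute value at most x + 4L, an equally spaced Riemann sum with more than x + 4L nodes evaluates the integral exactly up to rounding errors.

```python
import cmath
import math

def primes_up_to(x):
    return [p for p in range(2, x + 1) if all(p % d for d in range(2, math.isqrt(p) + 1))]

def e(alpha):
    """e(alpha) = exp(2 pi i alpha)."""
    return cmath.exp(2j * math.pi * alpha)

x, L = 30, 3                      # toy parameters (assumption)
primes = primes_up_to(x)
prime_set = set(primes)

def S(alpha):
    return sum(math.log(p) * e(p * alpha) for p in primes)

def U(alpha):
    return sum(e(2 * m * alpha) for m in range(-L, L + 1))

def Z(two_n):
    """Z(2n) as in (7); for n = 0 this is the sum of (log p)^2 over p <= x."""
    return sum(math.log(p) * math.log(p + two_n) for p in primes if p + two_n in prime_set)

def t(j):
    return 2 * L + 1 - abs(j)

nodes = x + 4 * L + 1             # more nodes than the largest frequency
integral = sum(abs(S(k / nodes)) ** 2 * abs(U(k / nodes)) ** 2 for k in range(nodes)) / nodes
right_side = t(0) * Z(0) + 2 * sum(t(m) * Z(2 * m) for m in range(1, 2 * L + 1))
print(integral, right_side)
```

Pairs involving the prime 2 have odd difference and therefore do not contribute, because T(α) contains only even frequencies.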

These estimates became possible by application of the Bombieri–Vinogradov theorem, proven one year before [27].

For its formulation, the following definition will be useful:

Definition 3.

Let

θ (x, q, a) = \sum_{\begin{matrix} p \leq x \\ p \equiv a (mod q) \end{matrix}} log p .

We say that the primes have an admissible level of distribution ϑ if

\sum_{q \leq x^{ϑ - ϵ}} max_{(a, q) = 1} |θ (x; q, a) - \frac{x}{ϕ (q)}| ≪ \frac{x}{{(log x)}^{A}}

holds for any

A > 0

and any

ϵ > 0

.

The Bombieri–Vinogradov theorem now states that:

For any

A > 0

, there is a

B = B (A)

such that, for

Q = x^{1 / 2} {(log x)}^{- B} :

\sum_{q \leq Q} max_{(a, q) = 1} |θ (x; q, a) - \frac{x}{ϕ (q)}| ≪ \frac{x}{{(log x)}^{A}} .

(11)

This implies that the primes have an admissible level of distribution

1 / 2

.

Definition 4.

We say that the primes have an admissible level of distribution ϑ if (11) holds for any

A > 0

and any

ϵ > 0

with

Q = x^{ϑ - ϵ}

.

A great breakthrough was achieved in the paper [16]. They consider admissible k-tuples for which we reproduce the definition:

Definition 5.

U

is called admissible if for each prime p the number

v_{p} (U)

of distinct residue classes modulo p occupied by elements of

U

satisfies

v_{p} (U) < p

.
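Admissibility is a finite condition: for a tuple with k elements only the primes p ≤ k need to be checked, because k elements cannot occupy all p residue classes when p > k. A minimal sketch of such a check, with function names of our own choosing, could look as follows.

```python
def primes_up_to(n):
    """Return all primes p <= n by trial division."""
    return [p for p in range(2, n + 1) if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def occupied_classes(shifts, p):
    """v_p(U): number of distinct residue classes mod p hit by the tuple."""
    return len({h % p for h in shifts})

def is_admissible(shifts):
    """Check v_p(U) < p for every prime p <= k, where k is the tuple size."""
    k = len(shifts)
    return all(occupied_classes(shifts, p) < p for p in primes_up_to(k))

for shifts in [(0, 2), (0, 1), (0, 2, 4), (0, 2, 6), (0, 4, 6, 10, 12, 16)]:
    print(shifts, is_admissible(shifts))
```

For instance, the triple (0, 2, 4) fails because it occupies all three residue classes modulo 3, whereas (0, 2, 6) leaves the class 1 modulo 3 free.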

The two main results in the paper [16] of Goldston, Pintz and Yildirim are

Theorem 3

([16], Theorem 3.3). Suppose the primes have a level of distribution

ϑ > 1 / 2

. Then, there exists an explicitly calculable constant

C (ϑ)

depending only on ϑ such that any admissible k-tuple with

k \geq C (ϑ)

contains at least two primes infinitely often. Specifically, if

ϑ \geq 0.971

, then this is true for

k \geq 6

.

Theorem 4

([16], Theorem 3.4). We have

Δ_{1} : = \underset{n \to \infty}{lim inf} \frac{p_{n + 1} - p_{n}}{log p_{n}} = 0 .

The method of Goldston, Pintz and Yildirim has also become known as the GPY sieve.

There are several overview articles on the history of the GPY method (cf. [18,28]).

The overview article most relevant for this paper is due to Maynard [29], whose improvements of the GPY sieve are of crucial importance for the large gap results described in this paper.

Before we recall Maynard’s description, we should mention another milestone which, however, is not relevant for large gap results. It was obtained by Yitang Zhang [30] in 2014, who proved the existence of infinitely many bounded gaps between primes. He did not establish an admissible level of distribution

ϑ > 1 / 2

, which would imply the result, but succeeds in replacing the sum

\sum_{q \leq Q} max_{(a, q) = 1} |θ (x, q, a) - \frac{x}{ϕ (q)}|

by a sum over the smooth moduli.

We now come to the short description of the GPY method and its improvement by Maynard, closely following the paper “Small gaps between primes” by Maynard [29]. One of the main results of [29] is:

Theorem 5

(of [29]). Let

m \in N

. We have

\underset{n \to \infty}{lim inf} (p_{n + m} - p_{n}) ≪ m^{3} e^{4 m} .

Tao (in private communication to Maynard) independently proved Theorem 5, with a slightly weaker bound, at much the same time.

Theorem 5 implies that for every

H > 0

there exist intervals whose lengths depend only on H with arbitrarily large initial point that contain at least H primes.

Now, we follow [29] for a short description of the GPY sieve and its improvement.

Let

U : = {h_{1}, \dots, h_{k}}

be an admissible k-tuple. One considers the sum

S (N, ρ) = \sum_{N \leq n \leq 2 N} (\sum_{i = 1}^{k} χ_{P} (n + h_{i}) - ρ) w_{n} .

Here,

χ_{P}

is the characteristic function of the primes, and

ρ > 0

and

w_{n}

are non-negative weights. If one can show that

S (N, ρ) > 0

, then at least one term in the sum over n must have a positive contribution. By the non-negativity of

w_{n}

, this means that there must be some integer

n \in [N, 2 N]

such that at least

⌊ ρ + 1 ⌋

of the

n + h_{i}

are prime.

The weights

w_{n}

are typically chosen to mimic Selberg sieve weights. The standard Selberg k-dimensional weights are

w_{n} : = {(\sum_{\begin{matrix} d ∣ \prod_{i = 1}^{k} (n + h_{i}) \\ d < R \end{matrix}} λ_{d})}^{2}, λ_{d} : = μ (d) {(log \frac{R}{d})}^{k} .

The key new idea in the paper [16] of Goldston, Pintz and Yildirim was to consider more general sieve weights of the form

λ_{d} : = μ (d) F (log \frac{R}{d})

for a suitable smooth function F.

Goldston, Pintz and Yildirim chose

F (x) : = x^{k + l}

for suitable

l \in N

, which has been shown to be essentially optimal, when k is large.
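The detection principle behind the sum S(N, ρ) can be illustrated with a small computation. The sketch below uses GPY-type weights with F(t) = t^{k+l}, where the divisor sum is squared so that the weights are non-negative, as the argument requires. All numerical parameters (N = 100, R = 10, the tuple (0, 2, 6), ρ = 1, l = 1) are toy assumptions of ours. At this scale the computation only demonstrates the logic "S(N, ρ) > 0 forces some n with more than ρ primes among the n + h_i" and says nothing about the asymptotic analysis.

```python
import math

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, math.isqrt(n) + 1))

def mobius(d):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:        # square factor
                return 0
            result = -result
        p += 1
    return -result if d > 1 else result

def gpy_weight(n, shifts, R, ell):
    """Square of the sum of mu(d) * log(R/d)^(k+ell) over d | prod(n+h_i), d < R."""
    k = len(shifts)
    product = math.prod(n + h for h in shifts)
    divisor_sum = sum(
        mobius(d) * math.log(R / d) ** (k + ell)
        for d in range(1, R)
        if product % d == 0
    )
    return divisor_sum ** 2

def prime_count(n, shifts):
    return sum(is_prime(n + h) for h in shifts)

def gpy_sum(N, shifts, rho, R, ell):
    return sum(
        (prime_count(n, shifts) - rho) * gpy_weight(n, shifts, R, ell)
        for n in range(N, 2 * N + 1)
    )

N, shifts, rho, R, ell = 100, (0, 2, 6), 1, 10, 1   # toy parameters (assumption)
S_value = gpy_sum(N, shifts, rho, R, ell)
best = max(prime_count(n, shifts) for n in range(N, 2 * N + 1))
print("S(N, rho) =", S_value, " max number of primes in a translate =", best)
```

The weights are largest when the product of the n + h_i has few small prime factors, which is exactly the situation in which several of the n + h_i have a chance to be prime.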

The new ingredient in Maynard’s method is to consider a more general form of the sieve weights

w_{n} = {(\sum_{d_{i} ∣ n + h_{i}, \forall i} λ_{d_{1} \dots d_{k}})}^{2} .

The results of [29] were modified and extended in the paper [15] “Dense clusters of primes in subsets” of Maynard. Some of his results and their applications will be described later in this paper.

4. Large Gaps with Improved Order of Magnitude and Its K-Version, Part I

Here, we state the theorems from [19,20] and sketch their proofs.

We number definitions and theorems in the following manner:

Definition (resp. Theorem) X of paper

[Y]

(in the list of references) is referred to as (

[Y]

, Definition (resp. Theorem) X).

We start with a list of the theorems from [19] and the definitions relevant for them:

Theorem 6

([19], Theorem 1, large prime gaps). For any sufficiently large X, one has

G (X) ≫ \frac{log X {log}_{2} X {log}_{4} X}{{log}_{3} X}

(12)

The implied constant is effective.

Definition 6

([19], Definition (3.1)).

y : = c x \frac{log x {log}_{3} x}{{log}_{2} x},

where c is a certain (small) fixed positive constant.

Definition 7

([19], Definition (3.2)).

z : = x^{{log}_{3} x / (4 {log}_{2} x)} .

Definition 8

([19], Definitions (3.3)–(3.5)).

\begin{matrix} S & : = {s prime : {log}^{20} x < s \leq z} \\ P & : = {p prime : x / 2 < p \leq x} \\ Q & : = {q prime : x < q \leq y} . \end{matrix}

For congruence classes

\vec{a} : = {(a_{s} mod s)}_{s \in S}

and

\vec{b} : = {(b_{p} mod p)}_{p \in P}

define the sifted sets

S (\vec{a}) : = {n \in Z : n \not\equiv a_{s} (mod s) for all s \in S}

and likewise

S (\vec{b}) : = {n \in Z : n \not\equiv b_{p} (mod p) for all p \in P} .

Theorem 7

([19], Theorem 2, sieving primes). Let x be sufficiently large and suppose that y is given by Definition 6. Then, there are vectors

\vec{a} = {(a_{s} mod s)}_{s \in S} and \vec{b} = {(b_{p} mod p)}_{p \in P},

such that

# (Q \cap S (\vec{a}) \cap S (\vec{b})) ≪ \frac{x}{log x} .

Theorem 8

([19], Theorem 3, probabilistic covering). There exists a constant

C_{0} \geq 1

such that the following holds. Let

D, r, A \geq 1

,

0 < κ \leq 1 / 2

, and let

m \geq 0

be an integer. Let

δ > 0

satisfy the smallness bound

δ \leq {(\frac{κ^{A}}{C_{0} exp (A D)})}^{10^{m + 2}}

Let

I_{1}, \dots, I_{m}

be disjoint finite non-empty sets and let V be a finite set. For each

1 \leq j \leq m

and

i \in I_{j}

, let

e_{i}

be a random finite subset of V. Assume the following:

(Edges not too large) With probability 1, we have for all $j = 1, \dots, m$ and $i \in I_{j}$

$# e_{i} \leq r$
(Each sieving step is sparse) For all $j = 1, \dots, m$ , $i \in I_{j}$ and $v \in V$ ,

$P (v \in e_{i}) \leq \frac{δ}{{(# I_{j})}^{1 / 2}}$
(Very small codegrees) For every $j = 1, \dots, m$ and distinct $v_{1}, v_{2} \in V$ ,

$\sum_{i \in I_{j}} P (v_{1}, v_{2} \in e_{i}) \leq δ$
(Degree bound) If for every $v \in V$ and $j = 1, \dots, m$ , we introduce the normalized degrees

$d_{I_{j}} (v) : = \sum_{i \in I_{j}} P (v \in e_{i})$

and then recursively define the quantities $P_{j} (v)$ for $j = 0, \dots, m$ and $v \in V$ by setting

$P_{0} (v) : = 1$

and

$P_{j + 1} (v) : = P_{j} (v) exp (- d_{I_{j + 1}} (v) / P_{j} (v))$

for $j = 0, \dots, m - 1$ and $v \in V$ , then we have

$d_{I_{j}} (v) \leq D P_{j - 1} (v), (1 \leq j \leq m, v \in V)$

and

$P_{j} (v) \geq κ (0 \leq j \leq m, v \in V) .$

Then, we can find random variables $e_{i}^{'}$ for each $i \in ⋃_{j = 1}^{m} I_{j}$ with the following properties:
(a)
For each $i \in ⋃_{j = 1}^{m} I_{j}$ , the essential support of $e_{i}^{'}$ is contained in the essential support of $e_{i}$ , union the empty set singleton ${\emptyset}$ . In other words, almost surely $e_{i}^{'}$ is either empty or is a set that $e_{i}$ also attains with positive probability.
(b)
For any $0 \leq J \leq m$ and any finite subset e of V with $# e \leq A - 2 r J$ , one has

$P (e \subseteq V ∖ ⋃_{j = 1}^{J} ⋃_{i \in I_{j}} e_{i}^{'}) = (1 + O_{\leq} (δ^{1 / 10^{J + 1}})) P_{J} (e)$

where

$P_{j} (e) : = \prod_{v \in e} P_{j} (v) .$

Corollary 1

([19], Corollary 4). Let

x \to \infty

. Let

P^{'}, Q^{'}

be sets with

# P^{'} \leq x

and

{({log}_{2} x)}^{3} < # Q^{'} \leq x^{100}

. For each

p \in P^{'}

, let

e_{p}

be a random subset of

Q^{'}

satisfying the size bound:

# e_{p} \leq r = O (\frac{log x {log}_{3} x}{{log}_{2}^{2} x}) (p \in P^{'})

Assume the following:

(Sparsity) For all $p \in P^{'}$ and $q \in Q^{'}$

$P (q \in e_{p}) \leq x^{- 1 / 2 - 1 / 10} .$
(Small codegrees) For any distinct $q_{1}, q_{2} \in Q^{'}$

$\sum_{p \in P^{'}} P (q_{1}, q_{2} \in e_{p}) \leq x^{- 1 / 20} .$
(Elements covered more than once in expectation) For all but at most $\frac{1}{{({log}_{2} x)}^{2}} # Q^{'}$ elements $q \in Q^{'}$ , we have:

$\sum_{p \in P^{'}} P (q \in e_{p}) = C + O_{\leq} (\frac{1}{{({log}_{2} x)}^{2}})$

for some quantity C, independent of q, satisfying

$\frac{5}{4} log 5 \leq C ≪ 1 .$

Then, for any positive integer m with

$m \leq \frac{{log}_{3} x}{log 5}$

we can find random sets $e_{p}^{'} \subseteq Q^{'}$ for each $p \in P^{'}$ such that $e_{p}^{'}$ is either empty or a subset of $Q^{'}$ which $e_{p}$ attains with positive probability and that

$# {q \in Q^{'} : q \notin e_{p}^{'} for all p \in P^{'}} \sim 5^{- m} # Q^{'}$

with probability $1 - o (1)$ . More generally, for any $Q^{''} \subset Q^{'}$ with cardinality at least $(# Q^{'}) / \sqrt{{log}_{2} x}$ , one has

$# {q \in Q^{''} : q \notin e_{p}^{'} for all p \in P^{'}} \sim 5^{- m} # Q^{''}$

with probability $1 - o (1)$ . The decay rates in the $o (1)$ and ∼ notation are uniform in $P^{'}, Q^{'}, Q^{''}$ .
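One way to see how the factor 5^{-m} and the constant (5/4) log 5 fit together with the recursion for P_j(v) in Theorem 8 is the following numerical sketch. It assumes, purely for illustration, that the normalized degree in the (j + 1)-st step is 5^{-j} log 5 for a fixed vertex v. This particular choice of degrees is our assumption for the example and is not quoted from [19].

```python
import math

def survival_probabilities(degrees):
    """Iterate P_0 = 1, P_{j+1} = P_j * exp(-d_{j+1} / P_j) for one fixed vertex."""
    probs = [1.0]
    for d in degrees:
        probs.append(probs[-1] * math.exp(-d / probs[-1]))
    return probs

m = 6                                                      # number of sieving steps (toy value)
degrees = [math.log(5) * 5.0 ** (-j) for j in range(m)]    # assumed degrees d_{I_{j+1}}(v)
probs = survival_probabilities(degrees)
total_degree = sum(degrees)

print("P_j:", probs)
print("total degree:", total_degree, " (5/4) log 5 =", 1.25 * math.log(5))
```

With these degrees each step multiplies the survival probability by exactly 1/5, so P_m = 5^{-m}, while the total expected number of times an element is covered is (5/4)(1 - 5^{-m}) log 5, which stays below (5/4) log 5.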

Theorem 9

([19], Theorem 4, random construction). Let x be a sufficiently large real number and define y as in Definition 6. Then, there is a quantity

C

with

C ≍ \frac{1}{c}

with the implied constants independent of c, a tuple of positive integers

(h_{1}, \dots, h_{r})

with

r \leq \sqrt{log x}

and some way to choose random vectors

\vec{a} = {(a_{s} mod s)}_{s \in S}

and

\vec{n} = {(n_{p})}_{p \in P}

of congruence classes

a_{s} mod s

and integers

n_{p}

respectively, obeying the following:

For every $\vec{a}$ in the essential range of $\vec{a}$ , one has

$P (q \in e_{p} (\vec{a}) | \vec{a} = \vec{a}) \leq x^{- 1 / 2 - 1 / 10} (p \in P),$

where

$e_{p} (\vec{a}) : = {n_{p} + h_{i} p : 1 \leq i \leq r} \cap Q \cap S (\vec{a}) .$
With probability $1 - o (1)$ , we have that

$# (Q \cap S (\vec{a})) \sim 80 c \frac{x}{log x} {log}_{2} x .$
Call an element $\vec{a}$ in the essential range of $\vec{a}$ good if, for all but at most $\frac{x}{log x {log}_{2} x}$ elements $q \in Q \cap S (\vec{a})$ , one has

$\sum_{p \in P} P (q \in e_{p} (\vec{a}) | \vec{a} = \vec{a}) = C + O_{\leq} (\frac{1}{{({log}_{2} x)}^{2}})$

Then, $\vec{a}$ is good with probability $1 - o (1)$ .
The following theorem and definitions are from [20].

Theorem 10

([20], Theorem 1.1). There is a constant

c > 0

and infinitely many n, such that

p_{n + 1} - p_{n} \geq c \frac{log p_{n} {log}_{2} p_{n} {log}_{4} p_{n}}{{log}_{3} p_{n}}

and the interval

[p_{n}, p_{n + 1}]

contains the K-th power of a prime.

Definition 9

([20], Definitions (3.1)–(3.5)).

y : = c x \frac{log x {log}_{3} x}{{log}_{2} x},

where c is a fixed positive constant. Let

z : = x^{{log}_{3} x / (4 {log}_{2} x)} .

and introduce the three disjoint sets of primes

\begin{matrix} S & : = {s prime : {log}^{20} x < s \leq z} \\ P & : = {p prime : x / 2 < p \leq x} \\ Q & : = {q prime : x < q \leq y} . \end{matrix}

For residue classes

\vec{a} : = {(a_{s} mod s)}_{s \in S}

and

\vec{b} : = {(b_{p} mod p)}_{p \in P}

define the sifted sets

S (\vec{a}) : = {n \in Z : n \not\equiv a_{s} (mod s) for all s \in S}

and likewise

S (\vec{b}) : = {n \in Z : n \not\equiv b_{p} (mod p) for all p \in P} .

We set

\begin{matrix} A_{(K)} & : = {\vec{a} = {(a_{s} mod s)}_{s \in S} : \exists c_{s} such that \\ a_{s} \equiv 1 - {(c_{s} + 1)}^{K} (mod s), c_{s} \not\equiv - 1 (mod s)} \end{matrix}

\begin{matrix} B_{(K)} & : = {\vec{b} = {(b_{p} mod p)}_{p \in P} : \exists d_{p} such that \\ b_{p} \equiv 1 - {(d_{p} + 1)}^{K} (mod p), d_{p} \not\equiv - 1 (mod p)} . \end{matrix}

Theorem 11

([20], Theorem 3.1, sieving primes). Let x be sufficiently large and suppose that y obeys (7). Then, there are vectors

\vec{a} \in A_{(K)}

and

\vec{b} \in B_{(K)}

, such that

# (Q \cap S (\vec{a}) \cap S (\vec{b})) ≪ \frac{x}{log x} .

Theorem 12

([20], Theorem 4.1). (Has wording identical to [19], Theorem 3.)

Corollary 2

([20], Corollary 4.2). (Has wording identical to [19], Corollary 3.)

Theorem 13

([20], Theorem 4.3, random construction). (Has wording identical to [19], Theorem 4.)

Definition 10

([20], Definition 6.1). An admissible r-tuple is a tuple

(h_{1}, \dots, h_{r})

of distinct integers that do not cover all residue classes modulo p for any prime p.
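The admissibility condition can be checked mechanically. The following is a minimal sketch, not taken from [19] or [20]: the helper names `primes_up_to`, `is_admissible` and `first_primes_above` are hypothetical. It uses the observation that r integers cannot occupy more than r residue classes, so only primes p ≤ r need to be tested. As an illustration it confirms that the first r primes larger than r form an admissible tuple, since none of them is divisible by a prime p ≤ r.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]


def is_admissible(h):
    """True if the tuple h misses at least one residue class modulo every prime p.

    Only primes p <= len(h) need checking: r integers cannot cover
    more than r residue classes.
    """
    h = list(h)
    if len(set(h)) != len(h):
        return False  # the definition asks for distinct integers
    for p in primes_up_to(len(h)):
        if len({x % p for x in h}) == p:
            return False
    return True


def first_primes_above(r):
    """The first r primes larger than r (hypothetical helper for the demo)."""
    bound = max(16, 4 * r)
    while True:
        candidates = [p for p in primes_up_to(bound) if p > r]
        if len(candidates) >= r:
            return candidates[:r]
        bound *= 2


if __name__ == "__main__":
    print(is_admissible((0, 2, 6)))              # misses class 1 mod 2 and class 1 mod 3
    print(is_admissible((0, 2, 4)))              # covers all classes mod 3
    print(is_admissible(first_primes_above(10)))
```

The tuple (0, 2, 4) fails because its entries cover the classes 0, 2, 1 modulo 3, whereas (0, 2, 6) leaves the class 1 modulo 3 free.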

For

(u, K) = 1

, we define

S_{u} : = {s : s prime, s \equiv u (mod K), {(log x)}^{20} < s \leq z}

d (u) = (u - 1, K), r^{*} (u) = \frac{1}{d (u)} \sum_{s \in S_{u}} s^{- 1} .

For

n \in [x, y]

, let

r (n, u) = \sum_{\begin{matrix} s \in S_{u} : \exists c_{s} : n \equiv 1 - {(c_{s} + 1)}^{K} (mod s) \\ c_{s} \not\equiv - 1 (mod s) \end{matrix}} s^{- 1} .

We set

\begin{matrix} G & = {n : n \in [x, y], | r (n, u) - r^{*} (u) | \leq {(log x)}^{- 1 / 40} \\ for all u (mod K), (u, K) = 1} . \end{matrix}

For an admissible r-tuple to be specified later and for primes p with

x / 2 < p \leq x

, we set

G (p) = {n \in G : n + (h_{i} - h_{l}) p \in G, \forall i, l \leq r} .

Theorem 14

([20], Theorem 6.2—Existence of good sieve weights). Let x be a sufficiently large real number and let y be any quantity obeying (7). Let

P, Q

be defined by Definitions 7 and 8. Let r be a positive integer with

r_{0} \leq r \leq {log}^{η_{0}} x

for some sufficiently large absolute constant

r_{0}

and some sufficiently small

η_{0} > 0

.

Let

(h_{1}, \dots, h_{r})

be an admissible r-tuple contained in

[2 r^{2}]

. Then, one can find a positive quantity

τ \geq x^{- o (1)}

and a positive quantity

u = u (r)

depending only on r with

u ≍ log r

and a non-negative function

w : P \times Z \to R^{+}

supported on

P \times (Z \cap [- y, y])

with the following properties:

$w (p, n) = 0$ unless

$n \equiv 1 - {(d_{p} + 1)}^{K} (mod p), for some d_{p} \in Z,$

$d_{p} \not\equiv - 1 (mod p) and n \in G (p) .$
Uniformly for every $p \in P$ , one has

$\sum_{n \in Z} w (p, n) = (1 + O (\frac{1}{{log}_{2}^{10} x})) τ \frac{y}{{log}^{r} x}$
Uniformly for every $q \in Q$ and $i = 1, \dots, r$ , one has

$\sum_{p \in P} w (p, q - h_{i} p) = (1 + O (\frac{1}{{log}_{2}^{10} x})) τ \frac{u}{r} \frac{x}{2 {log}^{r} x}$
Uniformly for every $h = O (y / x)$ that is not equal to any of the $h_{i}$ , one has

$\sum_{q \in Q} \sum_{p \in P} w (p, q - h p) = O (\frac{1}{{log}_{2}^{10} x} τ \frac{x}{{log}^{r} x} \frac{y}{log x})$
Uniformly for all $p \in P$ and $n \in Z$ , one has

$w (p, n) = O (x^{1 / 3 + o (1)}) .$

In [19], we have the following dependency graph for the proof of ([19], Theorem 1).

([19], Theorem 5) \Rightarrow ([19], Theorem 4) \Rightarrow ([19], Theorem 2) \Rightarrow ([19], Theorem 1) .

(13)

Replacing these theorems by their K-versions we obtain the following dependency graph for the K-version ([20], Theorem 1.1):

([20], Theorem 6.2) \Rightarrow ([20], Theorem 4.3) \Rightarrow ([20], Theorem 3.1) \Rightarrow ([20], Theorem 1.1) .

(14)

The graphs (13) and (14) can be combined in the graph:

\begin{matrix} Thm . 5 & \Rightarrow Thm . 4 & \Rightarrow Thm . 2 & \Rightarrow Thm . 1 \\ ↓ & ↓ & ↓ & ↓ \\ Thm . 6.2 & \Rightarrow Thm . 4.3 & \Rightarrow Thm . 3.1 & \Rightarrow Thm . 1.1 \end{matrix}

(15)

(with Theorems 1, 2, 4, 5 corresponding to [19] and Theorems 1.1, 3.1, 4.3, 6.2 corresponding to [20]).

The horizontal arrows indicate the deduction of Theorem B from Theorem A; the vertical arrows indicate the transition from Theorem A to its K-version Theorem A’.

Part I of “Large gaps with improved order of magnitude and its K-version” (Section 4) deals with the graph (15). The end of the graph, Theorem 5, is deduced from results of Maynard’s paper [15] “Dense clusters of primes in subsets”; its K-version, Theorem 6.2, is deduced from the K-version of those results. These deductions make up Part II and are the contents of Section 5.

The graph (15) consists of segments, the last one being

\begin{matrix} Thm . 2 & \Rightarrow Thm . 1 \\ ↓ & ↓ \\ Thm . 3.1 & \Rightarrow Thm . 1.1 \end{matrix}

(16)

(with Theorems 1, 2 corresponding to [19] and Theorems 1.1, 3.1 corresponding to [20]).

We shall proceed segment by segment starting with (16). In this way, the transition from a theorem to its K-version should become more transparent.

We start with the upper string in (16):

Theorem 2 \Rightarrow Theorem 1 .

Let

\vec{a}

and

\vec{b}

be as in ([19], Definitions (3.3)–(3.5)). We extend the tuple

\vec{a}

to a tuple

{(a_{p})}_{p \leq x}

of congruence classes

a_{p} mod p

for all primes

p \leq x

by setting

a_{p} : = b_{p}

for

p \in P

and

a_{p} : = 0

for

p \notin S \cup P

and consider the sifted set

T : = {n \in [y] ∖ [x] : n \not\equiv a_{p} (mod p) for all p \leq x} .

As in previous versions, one shows that the second residual set consists of a negligible set of smooth numbers and the set Q of primes. Thus, we find that

# T ≪ \frac{x}{log x} .

Next let C be a sufficiently large constant such that

# T

is less than the number of primes in

[x, C x]

. By matching each of these surviving elements to a distinct prime in

[x, C x]

and choosing congruence classes appropriately, we thus find congruence classes

a_{p} mod p

for

p \leq C x

which cover all of the integers in

(x, y)

. This finishes the deduction of Theorem 1 from Theorem 2.

K-version deduction of ([20], Theorem 1.1) from ([20], Theorem 3.1)

The first two sieving steps are the same as in the “upper string” of ([19], Theorem 2 ⇒ Theorem 1). Thus, the second residual set is again Q apart from a negligible set of smooth integers. The random choice in the remaining sieving steps now has to be modified.

Let

\begin{matrix} A_{(K)} & : = {\vec{a} = {(a_{s} mod s)}_{s \in S} : \exists c_{s} such that \\ a_{s} \equiv 1 - {(c_{s} + 1)}^{K} (mod s), c_{s} \not\equiv - 1 (mod s)} \end{matrix}

(17)

\begin{matrix} B_{(K)} & : = {\vec{b} = {(b_{p} mod p)}_{p \in P} : \exists d_{p} such that \\ b_{p} \equiv 1 - {(d_{p} + 1)}^{K} (mod p), d_{p} \not\equiv - 1 (mod p)} . \end{matrix}

(18)

One then has:

Theorem 15

([20], Theorem 3.1). Let x be sufficiently large and suppose that y obeys Definition 9. Then, there are vectors

\vec{a} \in A_{(K)}

and

\vec{b} \in B_{(K)}

, such that

# (Q \cap S (\vec{a}) \cap S (\vec{b})) ≪ \frac{x}{log x} .

(19)

We now sketch the deduction of ([20], Theorem 1.1) from ([20], Theorem 3.1).

Let

\vec{a}

and

\vec{b}

be as in ([20], Theorem 3.1). We extend the tuple

\vec{a}

to a tuple

{(a_{p})}_{p \leq x}

of congruence classes

a_{p} (mod p)

for all primes

p \leq x

by setting

a_{p} : = b_{p}

for

p \in P

and

a_{p} : = 0

for

p \notin S \cup P

. Again the sifted set

T : = {n \in [y] ∖ [x] : n \not\equiv a_{p} (mod p) for all p \leq x},

differs from the set

Q \cap S (\vec{a}) \cap S (\vec{b})

only by a negligible set of z-smooth integers. We find ([20], Lemma 3.2)

# T ≪ \frac{x}{log x} .

(20)

As in the “upper string deduction” ([19], Theorem 2) ⇒ ([19], Theorem 1) we now further reduce the sifted set

T

by using the prime numbers from the interval

[x, C_{0} x]

,

C_{0} > 1

being a sufficiently large constant.

One follows—with some modification in the notation—the papers [20,21]. One distinguishes the cases K odd and K even. We recall the following definition:

Definition 11

([20], Definition 3.3). Let

\tilde{P} = \{\begin{matrix} p : & x < p \leq C_{0} x, p \equiv 2 (mod 3), if K is odd \\ p : & x < p \leq C_{0} x, p \equiv 3 (mod 3 K), if K is even . \end{matrix}

For K even and

δ > 0

, we set

U = {u \in [0, y] : (\frac{- u}{p}) = 1 for at most \frac{δ x}{log x} primes p \in \tilde{P}} .

By [21], we have:

Lemma 1.

# U ≪_{ϵ} x^{1 / 2 + ϵ}

.

Lemma 2.

There are pairs

(u, p_{u})

with

u \in T

,

p_{u} \in \tilde{P}

, such that all

u \in T

satisfy a congruence

u \equiv 1 - {(e_{u} + 1)}^{K} (mod p_{u}) where e_{u} \not\equiv - 1 (mod p_{u})

with the possible exceptions of u from an exceptional set V with

# V ≪ x^{1 / 2 + 2 ϵ} .

Proof.

If K is odd, the congruence

u \equiv 1 - {(e_{u} + 1)}^{K} (mod p) (with the variable e_{u})

is solvable, whenever

p \equiv 2 (mod 3)

.

If K is even, the congruence is solvable whenever

p \equiv 3 (mod 2 K)

and

(\frac{- u}{p}) = 1

. The claim now follows from Lemma 1. □

We now conclude the deduction of Theorem 1.1 by the application of the matrix method. The following definition is borrowed from [31].

Definition 12.

Let us call an integer

q > 1

a “good” modulus if

L (s, χ) \neq 0

for all characters

χ mod q

and all

s = σ + i t

with

σ > 1 - \frac{c_{2}}{log (q (| t | + 1))} .

This definition depends on the size of

c_{2} > 0

.

Lemma 3.

There is a constant

c_{2} > 0

, such that, in terms of

c_{2}

, there exist arbitrarily large values of x, for which the modulus

P (x) = \prod_{p < x} p

is good.

Remark 1.

This is Lemma 1 of [31].

Lemma 4.

Let q be a good modulus. Then,

π (x; q, a) ≫ \frac{x}{ϕ (q) log x},

where

ϕ (\cdot)

denotes Euler’s totient function, uniformly for

(a, q) = 1

and

x \geq q^{D}

. Here, the constant D depends only on the value of

c_{2}

in Lemma 3.

Remark 2.

This result, which is due to Gallagher [32], is Lemma 2 from [31].

We now define the matrix

M

.

Definition 13.

Choose x, such that

P (C_{0} x)

is a good modulus. Let

\vec{a} \in A_{(K)}

and

\vec{b} \in B_{(K)}

be given. From the definition of

A_{(K)}

and

B_{(K)}

, there are

{(c_{s} mod s)}_{s \in S} and {(d_{p} mod p)}_{p \in P}, c_{s} \not\equiv - 1 (mod s), d_{p} \not\equiv - 1 (mod p),

such that

\vec{a} = {(1 - {(c_{s} + 1)}^{K} mod s)}_{s \in S} and \vec{b} = {(1 - {(d_{p} + 1)}^{K} mod p)}_{p \in P} .

We now determine

m_{0}

by

1 \leq m_{0} < P (C_{0} x)

and the congruences

\begin{matrix} m_{0} & \equiv c_{s} (mod s) \\ m_{0} & \equiv d_{p} (mod p) \\ m_{0} & \equiv 0 (mod q), q \in (1, x], q \notin S \cup P \\ m_{0} & \equiv e_{u} (mod p_{u}), (e_{u}, p_{u}) given by Lemma 2 \\ m_{0} & \equiv g_{p} (mod p), for all other primes p \leq C_{0} x, g_{p} arbitrary . \end{matrix}

(21)

By the Chinese Remainder Theorem,

m_{0}

is uniquely determined. We let

M = {(a_{r, u})}_{\begin{matrix} 1 \leq r \leq P {(x)}^{D - 1} \\ 1 \leq u \leq y \end{matrix}}

with

a_{r, u} = {(m_{0} + 1 + r P (x))}^{K} + u - 1 .

For

1 \leq r \leq P {(x)}^{D - 1}

, we denote by

R (r) = {(a_{r, u})}_{1 \leq u \leq y}

the r-th row of

M

and for

1 \leq u \leq y

, we denote by

C (u) = {(a_{r, u})}_{1 \leq r \leq P {(x)}^{D - 1}}

the u-th column of

M

.
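The construction of the matrix entries can be illustrated on a toy scale. The sketch below is only an illustration under stated assumptions: the moduli 5, 7, 11, 13, the residues `RESIDUES`, the exponent `K = 2` and the helper names are all hypothetical, and `STEP` (the product of the toy moduli) stands in for the primorial used in the definition. It solves the simultaneous congruences by the Chinese Remainder Theorem and then checks the mechanism behind the columns of the matrix: whenever u ≡ 1 - (c_s + 1)^K (mod s), the entry in column u is divisible by s in every row.

```python
from math import prod


def crt(residues, moduli):
    """Unique m in [0, prod(moduli)) with m ≡ residues[i] (mod moduli[i]).

    The moduli must be pairwise coprime.
    """
    big = prod(moduli)
    m = 0
    for a, q in zip(residues, moduli):
        cofactor = big // q
        m += a * cofactor * pow(cofactor, -1, q)  # modular inverse, Python 3.8+
    return m % big


K = 2                      # toy exponent (assumption)
MODULI = [5, 7, 11, 13]    # toy stand-ins for the sieving primes (assumption)
RESIDUES = [1, 3, 4, 6]    # toy classes c_s, none congruent to -1 mod s (assumption)

m0 = crt(RESIDUES, MODULI)
STEP = prod(MODULI)        # plays the role of the primorial in this toy model


def entry(r, u):
    """Toy analogue of a_{r,u} = (m0 + 1 + r * STEP)^K + u - 1."""
    return (m0 + 1 + r * STEP) ** K + u - 1


def forced_divisor(u):
    """Return a modulus s with u ≡ 1 - (c_s + 1)^K (mod s), or None."""
    for c, s in zip(RESIDUES, MODULI):
        if (u - 1 + (c + 1) ** K) % s == 0:
            return s
    return None


if __name__ == "__main__":
    for u in range(2, 30):
        s = forced_divisor(u)
        if s is not None:
            print(u, s, entry(1, u) % s)  # last value is always 0
```

Because `STEP` is divisible by every toy modulus, the residue of m0 + 1 + r * STEP modulo s does not depend on the row index r, which is why a whole column is divisible by s at once.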

Lemma 5.

We have that

a_{r, u}

,

2 \leq u \leq y

is composite unless

u \in V

.

Proof.

From the congruences

m_{0} \equiv c_{s} (mod s), m_{0} \equiv d_{p} (mod p), m_{0} \equiv 0 (mod q), m_{0} \equiv e_{u} (mod p_{u})

in (21), it follows that for

u \equiv 1 - {(c_{s} + 1)}^{K} (mod s), u \equiv 1 - {(d_{p} + 1)}^{K} (mod p), u \equiv 0 (mod q), u \equiv 1 - {(e_{u} + 1)}^{K} (mod p_{u})

we have, respectively,

a_{r, u} \equiv 0 (mod s), a_{r, u} \equiv 0 (mod p), a_{r, u} \equiv 0 (mod q), a_{r, u} \equiv 0 (mod p_{u}) .

□

Definition 14.

Let

R_{0} (M) : = {r : 1 \leq r \leq P {(x)}^{D - 1}, m_{0} + 1 + r P (x) is prime},

R_{1} (M) : = {r : 1 \leq r \leq P {(x)}^{D - 1}, r \in R_{0} (M), R (r) contains a prime number} .

Remark 3.

We observe that each row

R (r)

with

r \in R_{0} (M)

has as its first element

a_{r, 1} = {(m_{0} + 1 + r P (x))}^{K},

the K-th power of the prime

m_{0} + 1 + r P (x)

.

If

r \in R_{0} (M) ∖ R_{1} (M)

,

a_{r, 1}

is the K-th power of a prime of the desired kind. To deduce Theorem 5 from Theorem 15, it thus remains to show that

R_{0} (M) ∖ R_{1} (M)

is nonempty.

Lemma 6.

We have

# R_{0} (M) ≫ \frac{P {(x)}^{D}}{ϕ (P (x)) log (P {(x)}^{D})} .

Proof.

This follows from Lemma 4. □

We obtain an upper estimate for

R_{1} (M)

by the observation that, if

R (r)

contains a prime number, then

m_{0} + 1 + r P (x) and {(m_{0} + 1 + r P (x))}^{K} + v - 1

are primes for some

v \in V

.

The number

\begin{matrix} t (v) = & # {r : 1 \leq r \leq P {(x)}^{D - 1}, m_{0} + 1 + r P (x) \\ and {(m_{0} + 1 + r P (x))}^{K} + v - 1 are primes} \end{matrix}

is estimated by standard sieves as in Lemma 6.1 of [21].

This concludes the deduction of Theorem 5 from Theorem 15. We now come to the next section in graph (15).

\begin{matrix} Thm . 4 & \Rightarrow Thm . 2 \\ ↓ & ↓ \\ Thm . 4.3 & \Rightarrow Thm . 3.1 \end{matrix}

We first state a hypergraph covering theorem (Theorem 3 of [19]) of a purely combinatorial nature, generalizing a result of Pippenger and Spencer [24] using the Rödl nibble method [25]. We also state a corollary.

Both the deduction of Theorem 2 (Theorem 7) from Theorem 4 (Theorem 17) and its K-version, the deduction of Theorem 15 from Theorem 18, are based on Theorem 3 of [19].

Theorem 16

(Theorem 3 of [19], Probabilistic covering). There exists a constant

C_{0} \geq 1

such that the following holds. Let

D, r, A \geq 1

and

0 < κ \leq 1 / 2

, and let

m \geq 0

be an integer. Let

δ > 0

satisfy the smallness bound

δ \leq {(\frac{κ^{A}}{C_{0} exp (A D)})}^{10^{m + 2}}

(22)

Let

I_{1}, \dots, I_{m}

be disjoint finite non-empty sets and let V be a finite set. For each

1 \leq j \leq m

and

i \in I_{j}

, let

e_{i}

be a random finite subset of V. Assume the following:

(Edges not too large) Almost surely for all $j = 1, \dots, m$ and $i \in I_{j}$ , we have

$# e_{i} \leq r$
(Each sieving step is sparse) For all $j = 1, \dots, m$ , $i \in I_{j}$ and $v \in V$ ,

$P (v \in e_{i}) \leq \frac{δ}{{(# I_{j})}^{1 / 2}}$

(23)
(Very small codegrees) For every $j = 1, \dots, m$ and distinct $v_{1}, v_{2} \in V$ ,

$\sum_{i \in I_{j}} P (v_{1}, v_{2} \in e_{i}) \leq δ$

(24)
(Degree bound) If for every $v \in V$ and $j = 1, \dots, m$ , we introduce the normalized degrees

$d_{I_{j}} (v) : = \sum_{i \in I_{j}} P (v \in e_{i})$

(25)

and then recursively define the quantities $P_{j} (v)$ for $j = 0, \dots, m$ and $v \in V$ by setting

$P_{0} (v) : = 1$

(26)

and

$P_{j + 1} (v) : = P_{j} (v) exp (- d_{I_{j + 1}} (v) / P_{j} (v))$

(27)

for $j = 0, \dots, m - 1$ and $v \in V$ , then we have

$d_{I_{j}} (v) \leq D P_{j - 1} (v), (1 \leq j \leq m, v \in V)$

(28)

and

$P_{j} (v) \geq κ (0 \leq j \leq m, v \in V) .$

(29)

Then, we can find random variables $e_{i}^{'}$ for each $i \in ⋃_{j = 1}^{m} I_{j}$ with the following properties:
(a)
For each $i \in ⋃_{j = 1}^{m} I_{j}$ , the essential support of $e_{i}^{'}$ is contained in the essential support of $e_{i}$ , union the empty set singleton ${\emptyset}$ . In other words, almost surely $e_{i}^{'}$ is either empty or is a set that $e_{i}$ also attains with positive probability.
(b)
For any $0 \leq J \leq m$ and any finite subset e of V with $# e \leq A - 2 r J$ , one has

$P (e \subseteq V ∖ ⋃_{j = 1}^{J} ⋃_{i \in I_{j}} e_{i}^{'}) = (1 + O_{\leq} (δ^{1 / 10^{J + 1}})) P_{J} (e)$

(30)

where

$P_{j} (e) : = \prod_{v \in e} P_{j} (v) .$

(31)

The proof, which we will not give in this paper, is given in Section 5 of [19].
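The recursion (26) and (27) for the quantities P_j(v) is easy to iterate numerically. The sketch below assumes one particular degree profile, d_{I_j}(v) = 5^{-(j-1)} log 5, chosen here purely for illustration so that each sieving step removes a proportion 4/5 of the surviving elements; the function name `survival_probabilities` is hypothetical. Under this assumption the recursion gives P_j(v) = 5^{-j}, which is the shape of the factor 5^{-m} appearing in the corollary that follows.

```python
from math import exp, log


def survival_probabilities(degrees):
    """Iterate P_0 = 1, P_{j+1} = P_j * exp(-d_{j+1} / P_j) for degrees d_1..d_m."""
    probs = [1.0]
    for d in degrees:
        probs.append(probs[-1] * exp(-d / probs[-1]))
    return probs


m = 6
# Assumed degree profile: d_j = 5^{-(j-1)} * log 5, so that d_j / P_{j-1} = log 5.
degrees = [5.0 ** -(j - 1) * log(5) for j in range(1, m + 1)]
probs = survival_probabilities(degrees)

if __name__ == "__main__":
    for j, p in enumerate(probs):
        print(j, p, 5.0 ** -j)
```

With this profile the degree bound (28) holds with D = log 5 and the lower bound (29) holds with κ = 5^{-m}.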

We have the following:

Corollary 3

(Corollary 4 of [19]). Let

x \to \infty

. Let

P^{'}, Q^{'}

be sets with

# P^{'} \leq x

and

# Q^{'} > {({log}_{2} x)}^{3}

. For each

p \in P^{'}

, let

e_{p}

be a random subset of

Q^{'}

satisfying the size bound:

# e_{p} \leq r = O (\frac{log x {log}_{3} x}{{log}_{2}^{2} x}) (p \in P^{'})

(32)

Assume the following:

(Sparsity) For all $p \in P^{'}$ and $q \in Q^{'}$

$P (q \in e_{p}) \leq x^{- 1 / 2 - 1 / 10} .$

(33)
(Uniform covering) For all but at most $\frac{1}{{({log}_{2} x)}^{2}} # Q^{'}$ elements $q \in Q^{'}$ , we have:

$\sum_{p \in P^{'}} P (q \in e_{p}) = C + O_{\leq} (\frac{1}{{({log}_{2} x)}^{2}})$

(34)

for some quantity C, independent of q, satisfying

$\frac{5}{4} log 5 \leq C ≪ 1 .$

(35)
(Small codegrees) For any distinct $q_{1}, q_{2} \in Q^{'}$

$\sum_{p \in P^{'}} P (q_{1}, q_{2} \in e_{p}) \leq x^{- 1 / 20} .$

(36)

Then, for any positive integer m with

$m \leq \frac{{log}_{3} x}{log 5}$

(37)

we can find random sets $e_{p}^{'} \subseteq Q^{'}$ for each $p \in P^{'}$ such that

$# {q \in Q^{'} : q \notin e_{p}^{'} for all p \in P^{'}} \sim 5^{- m} # Q^{'}$

with probability $1 - o (1)$ . More generally, for any $Q^{''} \subset Q^{'}$ with cardinality at least $(# Q^{'}) / \sqrt{{log}_{2} x}$ , one has

$# {q \in Q^{''} : q \notin e_{p}^{'} f o r a l l p \in P^{'}} \sim 5^{- m} # Q^{''}$

with probability $1 - o (1)$ . The decay rates in the $o (1)$ and ∼ notation are uniform in $P^{'}, Q^{'}, Q^{''}$ .

Proof.

For the proof, we refer to [19]. □

Theorem 17

([19], Theorem 4, Random construction). Let x be a sufficiently large real number and define y by Definition 9. Then, there is a quantity

C

with

C ≍ \frac{1}{c}

(38)

with the implied constants independent of c, a tuple of positive integers

(h_{1}, \dots, h_{r})

with

r \leq \sqrt{log x}

and some way to choose random vectors

\vec{a} = {(a_{s} mod s)}_{s \in S}

and

\vec{n} = {(n_{p})}_{p \in P}

of congruence classes

a_{s} mod s

and integers

n_{p}

respectively, obeying the following:

For every $\vec{a}$ in the essential range of $\vec{a}$ , one has

$P (q \in e_{p} (\vec{a}) | \vec{a} = \vec{a}) \leq x^{- 1 / 2 - 1 / 10} (p \in P),$

(39)

where

$e_{p} (\vec{a}) : = {n_{p} + h_{i} p : 1 \leq i \leq r} \cap Q \cap S (\vec{a}) .$

(40)
With probability $1 - o (1)$ , we have that

$# (Q \cap S (\vec{a})) \sim 80 c \frac{x}{log x} {log}_{2} x .$
Call an element $\vec{a}$ in the essential range of $\vec{a}$ good if, for all but at most $\frac{x}{log x {log}_{2} x}$ elements $q \in Q \cap S (\vec{a})$ , one has

$\sum_{p \in P} P (q \in e_{p} (\vec{a}) | \vec{a} = \vec{a}) = C + O_{\leq} (\frac{1}{{({log}_{2} x)}^{2}})$

(41)

Then, $\vec{a}$ is good with probability $1 - o (1)$ .

We now show that Theorem 17 implies Theorem 7 (Theorem 2 of [19]). By (38), we may choose

0 < c < 1 / 2

small enough so that (35) holds. Take

m : = [\frac{{log}_{3} x}{log 5}] .
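The effect of this choice of m can be made concrete. Since 5^m ≤ log_2 x < 5^{m+1}, the factor 5^{-m} is of exact order 1 / log_2 x, which is what converts a set of size about (x / log x) log_2 x into one of size O(x / log x). The sketch below is an illustration only: because iterated logarithms grow so slowly, the hypothetical function `nibble_rounds` takes log x rather than x as its argument.

```python
from math import floor, log


def nibble_rounds(log_x):
    """Return (m, log_2 x, 5^-m) for m = floor(log_3 x / log 5).

    The argument is log x and must exceed e, so that log_3 x > 0.
    """
    log2_x = log(log_x)
    log3_x = log(log2_x)
    m = floor(log3_x / log(5))
    return m, log2_x, 5.0 ** -m


if __name__ == "__main__":
    for log_x in (1e3, 1e6, 1e30, 1e300):
        m, log2_x, factor = nibble_rounds(log_x)
        # factor * log2_x stays in the interval [1, 5)
        print(log_x, m, factor * log2_x)
```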

Now, let

\vec{a}

and

\vec{n}

be the random vectors guaranteed by Theorem 17. Suppose that we are in the probability

1 - o (1)

event that

\vec{a}

takes a value

\vec{a}

which is good and such that (40) holds. Fix some

\vec{a}

within this event. We may apply Corollary 3 with

P^{'} = P

and

Q^{'} = Q \cap S (\vec{a})

for the random variables

n_{p}

conditioned to

\vec{a} = \vec{a}

. A few hypotheses of the corollary must be verified. First, (34) follows easily. The small codegree condition (36) is also quickly checked. Indeed, for distinct

q_{1}, q_{2} \in Q^{'},

if

q_{1}, q_{2} \in e_{p} (\vec{a})

then

p | q_{1} - q_{2}

. But

q_{1} - q_{2}

is a nonzero integer of size at most

x log x

and is thus divisible by at most one prime

p_{0} \in P^{'}

. Hence

\sum_{p \in P^{'}} P (q_{1}, q_{2} \in e_{p} (\vec{a})) = P (q_{1}, q_{2} \in e_{p_{0}} (\vec{a})) \leq x^{- 1 / 2 - 1 / 10},

the sum on the left side being zero if

p_{0}

does not exist.

By Corollary 3, there exist random variables

e_{p}^{'} (\vec{a})

, whose essential range is contained in the essential range of

e_{p} (\vec{a})

together with ∅ and satisfying

# {q \in Q \cap S (\vec{a}) : q \notin e_{p}^{'} (\vec{a}) for all p \in P} \sim 5^{- m} # (Q \cap S (\vec{a})) ≪ \frac{x}{log x}

with probability

1 - o (1)

, where we have used (40). Since

e_{p}^{'} (\vec{a}) = {n_{p}^{'} + h_{i} p : 1 \leq i \leq r} \cap Q \cap S (\vec{a})

for some random integer

n_{p}^{'}

, it follows that

# {q \in Q \cap S (\vec{a}) : q \not\equiv n_{p}^{'} (mod p) for all p \in P} ≪ \frac{x}{log x}

with probability

1 - o (1)

. Taking a specific

\vec{n^{'}} = \vec{n^{'}}

for which this relation holds and setting

b_{p} = n_{p}^{'}

for all p concludes the proof of claim (17) and establishes Theorem 7 (Theorem 2 of [19]).

We now come to the K-version of the deduction Theorem 4 ⇒ Theorem 2, “the lower string” Theorem 4.3 ⇒ Theorem 3.1 of the section

\begin{matrix} Thm . 4 & \Rightarrow Thm . 2 \\ ↓ & ↓ \\ Thm . 4.3 & \Rightarrow Thm . 3.1 \end{matrix}

Theorem 18

([20], Theorem 4.3, random construction). Let x be a sufficiently large real number and define y by Definition 9. Then, there is a quantity C with

C ≍ \frac{1}{c}

with the implied constants independent of c, a tuple of positive integers

(h_{1}, \dots, h_{r})

with

r \leq \sqrt{log x}

and some way to choose random vectors

\vec{a} = {(a_{s} mod s)}_{s \in S}

and

\vec{n} = {(n_{p})}_{p \in P}

of congruence classes

a_{s} mod s

and integers

n_{p}

, respectively, obeying the following:

For every $\vec{a}$ in the essential range of $\vec{a}$ , one has

$P (q \in e_{p} (\vec{a}) | \vec{a} = \vec{a}) \leq x^{- 1 / 2 - 1 / 10} (p \in P),$

where $e_{p} (\vec{a}) : = {n_{p} + h_{i} p : 1 \leq i \leq r} \cap Q \cap S (\vec{a})$ .
With probability $1 - o (1)$ , we have that

$# (Q \cap S (\vec{a})) \sim 80 c \frac{x}{log x} {log}_{2} x .$
Call an element $\vec{a}$ in the essential range of $\vec{a}$ good if, for all but at most $\frac{x}{log x {log}_{2} x}$ elements $q \in Q \cap S (\vec{a})$ , one has

$\sum_{p \in P} P (q \in e_{p} (\vec{a}) | \vec{a} = \vec{a}) = C + O_{\leq} (\frac{1}{{({log}_{2} x)}^{2}}) .$

Then, $\vec{a}$ is good with probability $1 - o (1)$ .

Remark 4.

The wording of Theorem 18 is the same as the wording of ([19], Theorem 4). However, the contents of these two theorems are different, since the term essential range has different meaning.

In Theorem 17,

\vec{a}

and

\vec{b}

assume values of the form

\vec{a} = {(a_{s} mod s)}_{s \in S}

and

\vec{b} = {(b_{p} mod p)}_{p \in P}

, whereas in Theorem 18 they are of the form

\vec{a} = (a_{s} mod s, s \in S, \exists c_{s} such that a_{s} \equiv 1 - {(c_{s} + 1)}^{K} (mod s), c_{s} \not\equiv - 1 (mod s))

\vec{b} = (b_{p} mod p, p \in P, \exists d_{p} such that b_{p} \equiv 1 - {(d_{p} + 1)}^{K} (mod p), d_{p} \not\equiv - 1 (mod p)) .

Also, the wording of the deduction of Theorem 15 from Theorem 18 is the same as the deduction of Theorem 7 (Theorem 2 of [19]) from Theorem 17 (Theorem 4 of [19]).

We come to the section:

\begin{matrix} Thm . 5 & \Rightarrow Thm . 4 \\ ↓ & ↓ \\ Thm . 6.2 & \Rightarrow Thm . 4.3 \end{matrix}

(42)

of graph (15).

The proof of Theorem 5 of [19] relies on the estimates for multidimensional prime-detecting sieves established by Maynard in [15].

We show now that Theorem 14 implies Theorem 17.

Let

x, y, z, S, P, Q

be as in Theorem 17. We set r to be the maximum value permitted by Theorem 14, namely

r = [{log}^{1 / 5} x]

(43)

and let

(h_{1}, \dots, h_{r})

be the admissible r-tuple consisting of the first r primes larger than r; thus,

h_{i} = p_{π (r) + i}

for

i = 1, \dots, r

. From the prime number theorem, we have

h_{i} = O (r log r)

for

i = 1, \dots, r

and so we have

h_{i} \in [2 r^{2}]

for

i = 1, \dots, r

if x is large enough. We now invoke Theorem 14 to obtain quantities

τ, u

and a weight

w : P \times Z \to R^{+}

with the stated properties.

For each

p \in P

, let

{\tilde{n}}_{p}

denote the random integer with probability density

P ({\tilde{n}}_{p} = n) : = \frac{w (p, n)}{\sum_{n^{'} \in Z} w (p, n^{'})}

for all

n \in Z

(we will not need to impose any independence condition on

{\tilde{n}}_{p}

). We have

\sum_{p \in P} P (q = {\tilde{n}}_{p} + h_{i} p) = (1 + O (\frac{1}{{log}_{2}^{10} x})) \frac{u}{r} \frac{x}{2 y} (q \in Q, 1 \leq i \leq r)

(44)

Also, one has

P ({\tilde{n}}_{p} = n) ≪ x^{- 1 / 2 - 1 / 6 + o (1)}

(45)

for all

p \in P

and

n \in Z

.
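For orientation, the two estimates (44) and (45) can be traced back to the properties of the weight w. The following worked computation is a sketch that assumes the normalization $\sum_{n} w(p,n) = (1 + O(1/\log_2^{10} x))\, \tau\, y / \log^{r} x$ from Theorem 6.2 of [20], together with $\tau \geq x^{-o(1)}$, $\log^{r} x = x^{o(1)}$ and $y = x^{1 + o(1)}$.

```latex
\begin{align*}
\sum_{p \in P} \mathbb{P}\bigl(q = \tilde{n}_p + h_i p\bigr)
 &= \sum_{p \in P} \frac{w(p,\, q - h_i p)}{\sum_{n' \in \mathbb{Z}} w(p, n')} \\
 &= \Bigl(1 + O\Bigl(\tfrac{1}{\log_2^{10} x}\Bigr)\Bigr)
    \frac{\tau\, \frac{u}{r}\, \frac{x}{2 \log^{r} x}}{\tau\, \frac{y}{\log^{r} x}}
  = \Bigl(1 + O\Bigl(\tfrac{1}{\log_2^{10} x}\Bigr)\Bigr) \frac{u}{r}\, \frac{x}{2 y}, \\[4pt]
\mathbb{P}\bigl(\tilde{n}_p = n\bigr)
 &= \frac{w(p, n)}{\sum_{n' \in \mathbb{Z}} w(p, n')}
  \ll \frac{x^{1/3 + o(1)}}{\tau\, y / \log^{r} x}
  = x^{1/3 - 1 + o(1)}
  = x^{-1/2 - 1/6 + o(1)} .
\end{align*}
```

The factors $\tau$ and $\log^{r} x$ cancel in the first computation, which is why neither appears in (44).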

We choose the random vector

\vec{a} : = {(a_{s} mod s)}_{s \in S}

by selecting each

a_{s} mod s

uniformly at random from

Z / s Z

, independently in s and independently of the

{\tilde{n}}_{p}

.

The resulting sifted set

S (\vec{a})

is a random periodic subset of

Z

with density

σ : = \prod_{s \in S} (1 - \frac{1}{s}) .

From the prime number theorem (with sufficiently strong error term),

σ = (1 + O (\frac{1}{{log}_{2}^{10} x})) \frac{log ({log}^{20} x)}{log z} = (1 + O (\frac{1}{{log}_{2}^{10} x})) \frac{80 {log}_{2} x}{log x {log}_{3} x / {log}_{2} x},

so in particular we see that

σ y = (1 + O (\frac{1}{{log}_{2}^{10} x})) 80 c x {log}_{2} x .

(46)
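The Mertens-type approximation behind this formula for σ can be checked numerically on a small scale. In the sketch below the endpoints a = 100 and b = 100000 are toy stand-ins for log^{20} x and z (the true values are far beyond direct computation), and the helper names are hypothetical. The product of (1 - 1/s) over primes a < s ≤ b is compared with the prediction log a / log b.

```python
from math import log


def primes_in(a, b):
    """Primes s with a < s <= b, by a simple sieve (requires b >= 2)."""
    sieve = [True] * (b + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(b ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, b + 1, i):
                sieve[j] = False
    return [s for s in range(a + 1, b + 1) if sieve[s]]


def sifting_density(a, b):
    """sigma = product over primes a < s <= b of (1 - 1/s)."""
    sigma = 1.0
    for s in primes_in(a, b):
        sigma *= 1.0 - 1.0 / s
    return sigma


a, b = 100, 100_000          # toy stand-ins for log^20 x and z (assumption)
sigma = sifting_density(a, b)
mertens = log(a) / log(b)    # Mertens-type prediction log(a) / log(b)
ratio = sigma / mertens

if __name__ == "__main__":
    print(sigma, mertens, ratio)
```

At this scale the two quantities agree to within a few percent; the agreement improves as the lower endpoint grows.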

We also see from (43) that

σ^{r} = x^{o (1)} .

We have a useful correlation bound:

Lemma 7.

Let

t \leq log x

be a natural number and let

n_{1}, \dots, n_{t}

be distinct integers of magnitude

O (x^{O (1)})

. Then, one has

P (n_{1}, \dots, n_{t} \in S (\vec{a})) = (1 + O (\frac{1}{{log}^{16} x})) σ^{t} .

Proof.

For each

s \in S

, the integers

n_{1}, \dots, n_{t}

occupy t distinct residue classes modulo s, unless s divides one of

n_{i} - n_{j}

for

1 \leq i < j \leq t

. Since

s \geq {log}^{20} x

and

n_{i} - n_{j}

are of size

O (x^{O (1)})

, the latter possibility occurs at most

O (t^{2} log x) = O ({log}^{3} x)

times. Thus, the probability that

\vec{a} mod s

avoids all of the

n_{1}, \dots, n_{t}

is equal to

1 - \frac{t}{s}

except for

O ({log}^{3} x)

values of s, where it is instead

(1 + O (\frac{1}{{log}^{19} x})) (1 - \frac{t}{s}) .

Thus,

\begin{matrix} P (n_{1}, \dots, n_{t} \in S (\vec{a})) & = {(1 + O (\frac{1}{{log}^{19} x}))}^{O ({log}^{3} x)} \prod_{s \in S} (1 - (\frac{t}{s})) \\ = (1 + O (\frac{1}{{log}^{16} x})) σ^{t} \prod_{s \in S} (1 + O (\frac{t^{2}}{s^{2}})) \\ = (1 + O (\frac{1}{{log}^{16} x})) σ^{t} . \end{matrix}

□
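The mechanism of this proof can be reproduced exactly on a toy example. When each a_s is uniform modulo s and independent in s, the probability that n_1, ..., n_t all lie in the sifted set is the product over s of (1 - k_s / s), where k_s is the number of distinct residue classes the n_i occupy modulo s. The sketch below uses rational arithmetic; the range of primes (100, 2000] and the integers 1, 2, 3 are assumptions chosen for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction


def primes_in(a, b):
    """Primes s with a < s <= b, by a simple sieve (requires b >= 2)."""
    sieve = [True] * (b + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(b ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, b + 1, i):
                sieve[j] = False
    return [s for s in range(a + 1, b + 1) if sieve[s]]


def prob_all_survive(ns, moduli):
    """Exact probability that every n in ns avoids a uniform random class mod each s."""
    prob = Fraction(1)
    for s in moduli:
        classes = len({n % s for n in ns})  # residue classes that must be avoided
        prob *= Fraction(s - classes, s)
    return prob


S = primes_in(100, 2000)     # toy sieving primes (assumption)
sigma = Fraction(1)
for s in S:
    sigma *= Fraction(s - 1, s)

ns = [1, 2, 3]               # distinct modulo every s in S
t = len(ns)
exact = prob_all_survive(ns, S)
ratio = float(exact / sigma ** t)

if __name__ == "__main__":
    print(ratio)  # slightly below 1, reflecting the factors 1 + O(t^2 / s^2)
```

The ratio is strictly less than 1 because (1 - 1/s)^3 exceeds 1 - 3/s for every s, and it is close to 1 because the sum of t^2 / s^2 over the sieving primes is small.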

Among other things, this gives claim (40):

Corollary 4.

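With probability

1 - o (1)

, we have

# (Q \cap S (\vec{a})) \sim σ \frac{y}{log x} \sim 80 c \frac{x}{log x} {log}_{2} x .

Indeed, from Lemma 7 we have

E # (Q \cap S (\vec{a})) = (1 + O (\frac{1}{{log}^{16} x})) σ # Q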

and

E # {(Q \cap S (\vec{a}))}^{2} = (1 + O (\frac{1}{{log}^{16} x})) (σ # Q + σ^{2} (# Q) (# Q - 1))

and so by the prime number theorem we see that the random variable

# Q \cap S (\vec{a})

has mean

(1 + o (\frac{1}{{log}_{2} x})) σ \frac{y}{log x}

and variance

O (\frac{1}{{log}^{16} x} {(σ \frac{y}{log x})}^{2}) .

The claim then follows from Chebyshev’s inequality (with plenty of room to spare).

For each

p \in P

, we consider the quantity

X_{p} (\vec{a}) : = P ({\tilde{n}}_{p} + h_{i} p \in S (\vec{a}) for all i = 1, \dots, r),

(47)

and let

P (\vec{a})

denote the set of all the primes

p \in P

such that

X_{p} (\vec{a}) = (1 + O_{\leq} (\frac{1}{{log}^{3} x})) σ^{r} .

(48)

In light of Lemma 7, we expect most primes in P to lie in

P (\vec{a})

and this will be confirmed below in Lemma 9. We now define the random variables

n_{p}

as follows. Suppose we are in the event

\vec{a} = \vec{a}

for some

\vec{a}

in the range of

\vec{a}

. If

p \in P ∖ P (\vec{a})

, we set

n_{p} = 0

. Otherwise, if

p \in P (\vec{a})

, we define

n_{p}

to be the random integer with conditional probability distribution

P (n_{p} = n | \vec{a} = \vec{a}) : = \frac{Z_{p} (\vec{a}; n)}{X_{p} (\vec{a})}, Z_{p} (\vec{a}; n) = 1_{n + h_{j} p \in S (\vec{a}) for j = 1, \dots, r} P ({\tilde{n}}_{p} = n),

(49)

with the

n_{p}

(p \in P (\vec{a}))

jointly independent, conditionally on the event

\vec{a} = \vec{a}

. From (47), we see that these random variables are well defined.

Lemma 8.

With probability

1 - o (1)

, we have

σ^{- r} \sum_{i = 1}^{r} \sum_{p \in P (\vec{a})} Z_{p} (\vec{a}; q - h_{i} p) = (1 + O (\frac{1}{{log}_{2}^{3} x})) \frac{u}{σ} \frac{x}{2 y}

(50)

for all but at most

\frac{x}{2 log x {log}_{2} x}

of the primes

q \in Q \cap S (\vec{a})

.

Let

\vec{a}

be good and

q \in Q \cap S (\vec{a})

. Substituting definition (49) into the left-hand side of (50), using (48), and observing that

q = n_{p} + h_{i} p

is only possible if

p \in P (\vec{a})

, we find that

\begin{matrix} σ^{- r} \sum_{i = 1}^{r} \sum_{p \in P (\vec{a})} Z_{p} (\vec{a}; q - h_{i} p) & = σ^{- r} \sum_{i = 1}^{r} \sum_{p \in P (\vec{a})} X_{p} (\vec{a}) P (n_{p} = q - h_{i} p | \vec{a} = \vec{a}) \\ = (1 + O (\frac{1}{{log}^{3} x})) \sum_{i = 1}^{r} \sum_{p \in P (\vec{a})} P (n_{p} = q - h_{i} p | \vec{a} = \vec{a}) \\ = (1 + O (\frac{1}{{log}^{3} x})) \sum_{p \in P} P (q \in e_{p} (\vec{a}) | \vec{a} = \vec{a}) . \end{matrix}

where

e_{p} (\vec{a}) = {n_{p} + h_{i} p : 1 \leq i \leq r} \cap Q \cap S (\vec{a})

is as defined in Theorem 17 (Theorem 4 of [19]). Relation (41) (that is,

\vec{a}

is good with probability

1 - o (1)

) follows upon noting that by (43) and (46),

C : = \frac{u}{σ} \frac{x}{2 y} ≍ \frac{1}{c} .

Before proving Lemma 8, we first confirm that

P ∖ P (\vec{a})

is small with high probability.

Lemma 9.

With probability

1 - O (1 / {log}^{3} x)

,

P (\vec{a})

contains all but

O (\frac{1}{{log}^{3} x} \frac{x}{log x})

of the primes

p \in P

. In particular,

E # P (\vec{a}) = # P (1 + O (\frac{1}{{log}^{3} x})) .

Proof.

By linearity of expectation and Markov’s inequality, it suffices to show that for each

p \in P

, we have

p \in P (\vec{a})

with probability

1 - O (\frac{1}{{log}^{6} x})

. It suffices to show that

E X_{p} (\vec{a}) = P ({\tilde{n}}_{p} + h_{i} p \in S (\vec{a}) for all i = 1, \dots r) = (1 + O (\frac{1}{{log}^{12} x})) σ^{r}

(51)

and

E X_{p} {(\vec{a})}^{2} = P ({\tilde{n}}_{p}^{(1)} + h_{i} p, {\tilde{n}}_{p}^{(2)} + h_{i} p \in S (\vec{a}) for all i = 1, \dots r) = (1 + O (\frac{1}{{log}^{12} x})) σ^{2 r}

(52)

where

{\tilde{n}}_{p}^{(1)}

,

{\tilde{n}}_{p}^{(2)}

are independent copies of

{\tilde{n}}_{p}

that are also independent of

\vec{a}

. □

The claim (50) follows from Lemma 7 (performing the conditional expectation over

{\tilde{n}}_{p}

first). A similar application of Lemma 7 allows one to write the left-hand side of (52) as

(1 + O (\frac{1}{{log}^{16} x})) E σ^{# {{\tilde{n}}_{p}^{(l)} + h_{i} p : i = 1, \dots, r; l = 1, 2}} .

From (44), we see that the quantity

# {{\tilde{n}}_{p}^{(l)} + h_{i} p : i = 1, \dots, r; l = 1, 2}

is equal to

2 r

with probability

1 - O (x^{- 1 / 2 - 1 / 6 + o (1)})

and is less than

2 r

otherwise. The claim now follows from (46). □

(Proof of Lemma 8).

We first show that replacing

P (\vec{a})

with P has negligible effect on the sum, with probability

1 - o (1)

. Fix i and substitute

n = q - h_{i} p

. By Markov’s inequality, it suffices to show that

E \sum_{n} σ^{- r} \sum_{p \in P ∖ P (\vec{a})} Z_{p} (\vec{a}; n) = o (\frac{u}{σ} \frac{x}{2 y} \frac{1}{r} \frac{1}{{log}_{2}^{3} x} \frac{x}{log x {log}_{2} x}) .

(53)

By Lemma 7, we have

\begin{matrix} E \sum_{n} σ^{- r} \sum_{p \in P} Z_{p} (\vec{a}; n) & = σ^{- r} \sum_{p \in P} \sum_{n} P ({\tilde{n}}_{p} = n) P (n + h_{j} p \in S (\vec{a}) for j = 1, \dots, r) \\ = (1 + O (\frac{1}{{log}^{16} x})) # P . \end{matrix}

(54)

Next, by (47) and Lemma 9 we have

\begin{matrix} E \sum_{n} σ^{- r} \sum_{p \in P (\vec{a})} Z_{p} (\vec{a}; n) & = σ^{- r} \sum_{\vec{a}} P (\vec{a} = \vec{a}) \sum_{p \in P (\vec{a})} X_{p} (\vec{a}) \\ = (1 + O (\frac{1}{{log}^{3} x})) E # P (\vec{a}) = (1 + O (\frac{1}{{log}^{3} x})) # P; \end{matrix}

subtracting, we conclude that the left-hand side of (53) is

O (# P / {log}^{3} x) = O (x / {log}^{4} x)

. The claim then follows from (42). By (53), it suffices to show that with probability

1 - o (1)

, for all but at most

\frac{x}{2 log x {log}_{2} x}

primes

q \in Q \cap S (\vec{a})

, one has

\sum_{i = 1}^{r} \sum_{p \in P} Z_{p} (\vec{a}; q - h_{i} p) = (1 + O_{\leq} (\frac{1}{{log}_{2}^{3} x})) σ^{r - 1} u \frac{x}{2 y} .

(55)

Call a prime

q \in Q

bad if

q \in Q \cap S (\vec{a})

but (55) fails. Using Lemma 7 and (44), we have

\begin{matrix} E [\sum_{q \in Q \cap S (\vec{a})} \sum_{i = 1}^{r} \sum_{p \in P} Z_{p} (\vec{a}; q - h_{i} p)] \\ = \sum_{q, i, p} P (q + (h_{j} - h_{i}) p \in S (\vec{a}) for all j = 1, \dots, r) P ({\tilde{n}}_{p} = q - h_{i} p) \\ = (1 + O (\frac{1}{{log}_{2}^{10} x})) \frac{σ y}{log x} σ^{r - 1} u \frac{x}{2 y} \end{matrix}

and

\begin{matrix} E [\sum_{q \in Q \cap S (\vec{a})} {(\sum_{i = 1}^{r} \sum_{p \in P} Z_{p} (\vec{a}; q - h_{i} p))}^{2}] \\ = \sum_{\begin{matrix} p_{1}, p_{2}, q \\ i_{1}, i_{2} \end{matrix}} P (q + (h_{j} - h_{i_{l}}) p_{l} \in S (\vec{a}) for all j = 1, \dots, r; l = 1, 2) \\ \times P ({\tilde{n}}_{p_{1}}^{(1)} = q - h_{i_{1}} p_{1}) P ({\tilde{n}}_{p_{2}}^{(2)} = q - h_{i_{2}} p_{2}) \\ = (1 + O (\frac{1}{{log}_{2}^{10} x})) \frac{σ y}{log x} {(σ^{r - 1} u \frac{x}{2 y})}^{2}, \end{matrix}

where

{({\tilde{n}}_{p_{1}}^{(1)})}_{p_{1} \in P}

and

{({\tilde{n}}_{p_{2}}^{(2)})}_{p_{2} \in P}

are independent copies of

{({\tilde{n}}_{p})}_{p \in P}

over

\vec{a}

. In the last step, we used the fact that the terms with

p_{1} = p_{2}

contribute negligibly.

By Chebyshev’s inequality, it follows that the number of bad q is

≪ \frac{σ y}{log x} \frac{1}{{log}_{2}^{3} x} ≪ \frac{x}{log x {log}_{2}^{2} x}

with probability

1 - O (1 / {log}_{2} x)

. □

We now come to the K-version, the “lower string” Theorem 6.2 ⇒ Theorem 4.3 of section (42).

As in the “upper string” (Theorem 5 of [19]), a certain weight function w is of importance. The construction of w will be modelled on the construction of the function w in [19], Theorem 5.

The restrictions

\vec{a} \in A_{(K)}

,

\vec{b} \in B_{(K)}

bring some additional complications. The function

w (p, n)

will be different from zero only if n belongs to a set

G (p)

of p-good integers. The definition of

G (p)

is based on the set

G

of good integers.

Definition 15.

For

(u, K) = 1

, we define

S_{u} : = {s : s prime, s \equiv u (mod K), {(log x)}^{20} < s \leq z}

d (u) = (u - 1, K), r^{*} (u) = \frac{1}{d (u)} \sum_{s \in S_{u}} s^{- 1} .

For

n \in [x, y]

, let

r (n, u) : = \sum_{\begin{matrix} s \in S_{u} : \exists c_{s}, n \equiv 1 - {(c_{s} + 1)}^{K} (mod s) \\ c_{s} \equiv - 1 (mod s) \end{matrix}} s^{- 1} .

We set

\begin{matrix} G : = & {n : n \in [x, y], | r (n, u) - r^{*} (u) | \leq {(log x)}^{- 1 / 40} \\ for all u (mod K), (u, K) = 1} . \end{matrix}

For an admissible r-tuple to be specified later and for primes p with

x / 2 < p < x

, we set

G (p) : = {n \in G : n + (h_{i} - h_{l}) p \in G, \forall i, l \leq r} .
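The quantities S_u, d(u) and r^*(u) of Definition 15 can be tabulated directly for small parameters. The following is a minimal sketch, assuming toy values for K, for the lower cutoff (which stands in for (log x)^{20}) and for z; the function names `primes_up_to` and `residue_class_data` are illustrative and do not come from [20]. It assumes K ≥ 2.

```python
from math import gcd


def primes_up_to(n):
    """Return all primes <= n using a simple sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def residue_class_data(K, lower, z):
    """For each u in [1, K) coprime to K, return (S_u, d(u), r*(u)).

    S_u lists the primes s with lower < s <= z and s = u (mod K);
    `lower` is a toy stand-in for (log x)^20 (an assumption of this sketch).
    """
    primes = [s for s in primes_up_to(z) if s > lower]
    data = {}
    for u in range(1, K):
        if gcd(u, K) != 1:
            continue
        S_u = [s for s in primes if s % K == u]
        d_u = gcd(u - 1, K)  # gcd(0, K) = K, so d(1) = K
        r_star = sum(1.0 / s for s in S_u) / d_u
        data[u] = (S_u, d_u, r_star)
    return data


toy = residue_class_data(K=3, lower=5, z=30)
for u, (S_u, d_u, r_star) in toy.items():
    print(u, S_u, d_u, round(r_star, 4))
```

In the actual construction the cutoffs depend on x; the toy range here only illustrates how the primes split among the residue classes u mod K and how d(u) rescales the reciprocal sum.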

Theorem 19

(Theorem 6.2 of [20], Existence of good sieve weights). Let x be a sufficiently large real number and let y be any quantity obeying Definition 9. Let

P, Q

be defined by Definition 9. Let r be a positive integer with

r_{0} \leq r \leq {log}^{η} x

(56)

for some sufficiently large absolute constant

r_{0}

and some sufficiently small

η > 0

.

Let

(h_{1}, \dots, h_{r})

be an admissible r-tuple contained in

[2 r^{2}]

. Then, one can find a positive quantity

τ \geq x^{- o (1)}

(56a)

and a positive quantity

u = u (r)

depending only on r with

u ≍ log r

(57)

and a non-negative function

w_{(K)} : P \times Z \to R^{+}

supported on

P \times (Z \cap [- y, y])

with the following properties:

w_{(K)} (p, n) = 0

(58)

unless

n \equiv 1 - {(d_{p} + 1)}^{K} (mod p)

for some

d_{p} \in Z

,

d_{p} \equiv - 1 (mod p)

and

n \in G (p)

.

Uniformly for every

p \in P

, one has

\sum_{n \in Z} w_{(K)} (p, n) = (1 + O (\frac{1}{{log}_{2}^{10} x})) τ \frac{y}{{log}^{r} x} .

(59)

Uniformly for every

q \in Q

and

i = 1, \dots, r

, one has

\sum_{p \in P} w_{(K)} (p, q - h_{i} p) = (1 + O (\frac{1}{{log}_{2}^{10} x})) τ \frac{u}{r} \frac{x}{2 {log}^{r} x} .

(60)

Uniformly for every

h = O (y / x)

that is not equal to any of the

h_{i}

, one has

\sum_{q \in Q} \sum_{p \in P} w_{(K)} (p, q - h p) = O (\frac{1}{{log}_{2}^{10} x} τ \frac{x y}{{log}^{r} x log log x}) .

(61)

Uniformly for all

p \in P

and

n \in Z

, one has

w_{(K)} (p, n) = O (x^{1 / 3 + o (1)}) .

(62)

We now show how Theorem 19 implies Theorem 18.

Let

x, c, y, z, S, P, Q

be as in Theorem 16. We set

r : = ⌊ {(log x)}^{η_{0}} ⌋, σ : = \prod_{s \in S} (1 - \frac{1}{s}) .

We now invoke Theorem 19 to obtain quantities

τ, u

and weight

w : P \times Z \to R^{+}

with the stated properties.

For each

p \in P

, let

{\tilde{n}}_{p}

denote the random integer with probability density

P ({\tilde{n}}_{p} = n) = \frac{w_{(K)} (p, n)}{\sum_{n^{'} \in Z} w_{(K)} (p, n^{'})}

(63)

for all

n \in Z

. From (59), (60), we have

\sum_{p \in P} P (q = {\tilde{n}}_{p} + h_{i} p) = (1 + O (\frac{1}{{log}_{2}^{10} x})) \frac{u}{r} \frac{x}{2 y} (q \in Q, 1 \leq i \leq r) .

(64)

Also, from (57), (59), (63), one has

P ({\tilde{n}}_{p} = n) ≪ x^{- 1 / 2 - 1 / 6 + o (1)}

for all

p \in P

and

n \in Z

.
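The passage from a non-negative weight to the probability density (63) is a plain normalization. The sketch below illustrates it on a toy weight table; the dictionary `toy_weights` and the helper names are hypothetical and are not the sieve weight w_{(K)} of Theorem 19.

```python
import random


def density_from_weights(weights):
    """Normalize non-negative weights n -> w(n) into a probability density."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have positive total mass")
    # Points of zero weight carry zero probability and are dropped.
    return {n: w / total for n, w in weights.items() if w > 0}


def sample(density, rng):
    """Draw one integer n with probability density[n]."""
    values = list(density)
    probs = [density[n] for n in values]
    return rng.choices(values, weights=probs, k=1)[0]


# Toy weights for a single fixed p (an assumption, for illustration only).
toy_weights = {-2: 1.0, 0: 3.0, 5: 0.0, 7: 4.0}
density = density_from_weights(toy_weights)
rng = random.Random(0)
draws = [sample(density, rng) for _ in range(5)]
print(density, draws)
```

The pointwise bound on P(ñ_p = n) stated above has the same shape: an upper bound on a single weight divided by a lower bound on the total mass.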

We choose the random vector

\vec{a} : = {(a_{s} mod s)}_{s \in S}

by selecting each

a_{s} mod s

uniformly at random from

A_{s}

independently in s.

Lemma 10.

Let

t \leq {(log x)}^{3 η_{0}}

be a natural number and let

n_{1}, \dots, n_{t}

be distinct integers from

G

. Then, one has

P (n_{1}, \dots, n_{t} \in S (\vec{a})) = (1 + O (\frac{1}{{log}_{2}^{10} x})) σ^{t} .

Proof.

For

\vec{n} = (n_{1}, \dots, n_{t})

, let

K (\vec{n})

be the set of

s \in S

for which

s ∣ n_{l} - n_{i}

, for

i \neq l

. Then, since

n_{i} - n_{l} = O (x^{O (1)}),

we have

| K (\vec{n}) | = O ({(log x)}^{3}) .

Let

\vec{a} \in A_{s}

,

1 \leq u \leq K - 1

,

(u, K) = 1

. We write

{\vec{a}}_{u} = (a_{s_{1, u}}, \dots, a_{s_{r_{u}, u}}),

where

S_{u} \cap K^{c} = {s_{1, u}, \dots, s_{r_{u}, u}} .

We set

ϵ (h, s) : = \{\begin{matrix} 1, if n_{h} \equiv 1 - {(c_{s, h} + 1)}^{K} (mod s) has a solution c_{s, h} \equiv - 1 (mod s) \\ 0, otherwise . \end{matrix}

We have

n_{h} \in S (\vec{a}) if and only if n_{h} \in S (\vec{a}, u), \forall u, (u, K) = 1 .

We now use certain well-known facts from the theory of K-th power residues.

There are

\frac{s_{i, u} - 1}{d (u)} - 1

possible choices for the

a_{s_{i, u}}

. From these, for each h,

1 \leq h \leq t

there are

ϵ (h, s_{i, u})

choices such that

a_{s_{i, u}} \equiv n_{h} (mod s_{i, u}) .

Thus, the total number of choices for

a_{s_{i, u}}

for which not all

n_{h} \in S (\vec{a})

,

(1 \leq h \leq t)

is

\sum_{h = 1}^{t} ϵ (h, s_{i, u}) .

Since the choices for the components

a_{s}

are independent, we have

\begin{matrix} P (n_{1}, \dots, n_{t} \in S (\vec{a})) \\ = \prod_{u : (u, K) = 1} \prod_{s \in S_{u}} {(\frac{s - 1 - d (u)}{d (u)})}^{- 1} (\frac{s - 1 - d (u)}{d (u)} - \sum_{h = 1}^{t} ϵ (h, s)) (1 + O (\frac{{(log x)}^{3}}{z_{0}})) \\ = \prod_{u : (u, K) = 1} \prod_{s \in S_{u}} (1 - d (u) s^{- 1} \sum_{h = 1}^{t} ϵ (h, s)) (1 + O (s^{- 2})) (1 + O ({(log x)}^{- 17})) . \end{matrix}

(65)

We have

\prod_{s \in S_{u}} (1 - d (u) s^{- 1} \sum_{h = 1}^{t} ϵ (h, s)) = exp (- \sum_{s \in S_{u}} d (u) s^{- 1} \sum_{h = 1}^{t} ϵ (h, s) + O (s^{- 2})) .

Since

n_{h} \in G

for

1 \leq h \leq t

, we have by the definition for

G

:

\sum_{s \in S_{u}} s^{- 1} ϵ (h, s) = \frac{1}{d (u)} \sum_{s \in S_{u}} s^{- 1} + O ({(log x)}^{- 1 / 40}) .

(66)

From (65) and (66), we thus obtain

P (n_{1}, \dots, n_{t} \in S (\vec{a})) = (1 + O (\frac{1}{{(log x)}^{1 / 40}})) σ^{t} .

□
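The shape of this estimate can be checked exactly in a simplified model. The sketch below assumes that each a_s is drawn uniformly from all residues modulo s (the simpler setting without the K-th power restriction on A_s), so that the probability factorizes over s. The function names and the toy sieving primes are illustrative assumptions.

```python
from fractions import Fraction


def survival_probability(ns, sieving_primes):
    """Exact P(n_1, ..., n_t all avoid a_s mod s for every s) when each a_s
    is uniform over Z/sZ and the a_s are independent (simplified model)."""
    prob = Fraction(1)
    for s in sieving_primes:
        hit = len({n % s for n in ns})  # residue classes a_s must avoid
        prob *= Fraction(s - hit, s)
    return prob


def sigma(sieving_primes):
    """sigma = prod over s of (1 - 1/s)."""
    out = Fraction(1)
    for s in sieving_primes:
        out *= Fraction(s - 1, s)
    return out


toy_primes = [101, 103, 107, 109, 113]
toy_ns = [1, 2, 3]
ratio = survival_probability(toy_ns, toy_primes) / sigma(toy_primes) ** len(toy_ns)
print(float(ratio))
```

When the n_i are distinct modulo every s, each factor (1 - t/s)(1 - 1/s)^{-t} is 1 + O(t^2 / s^2), which is why the ratio stays close to 1 once all sieving primes are large compared with t.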

Corollary 5

(to Lemma 10). With probability

1 - o (1)

, we have:

# (Q \cap S (\vec{a})) \sim σ \frac{y}{log x} \sim 80 c \frac{x}{log x} {log}_{2} x .

Proof.

From Lemma 10, we have

E # (Q \cap S (\vec{a})) = (1 + O (\frac{1}{{({log}_{2} x)}^{5}})) σ # Q

and

E # {(Q \cap S (\vec{a}))}^{2} = (1 + O (\frac{1}{{({log}_{2} x)}^{5}})) (σ # Q + σ^{2} (# Q) (# Q - 1))

and so by the prime number theorem we see that the random variable

# (Q \cap S (\vec{a}))

has mean

(1 + O (\frac{1}{{({log}_{2} x)}^{5}})) σ \frac{y}{log x}

and variance

O (\frac{1}{{({log}_{2} x)}^{5}} {(σ \frac{y}{log x})}^{2}) .

The claim then follows from Chebyshev’s inequality. □
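The final Chebyshev step is a one-line computation. The following sketch makes the arithmetic explicit; the variable L stands in for log_2 x and the deviation 1/L is an illustrative choice, not a quantity fixed by the text.

```python
import math


def chebyshev_tail_bound(relative_variance, relative_deviation):
    """Bound P(|X - m| >= relative_deviation * m) by Chebyshev's inequality,
    assuming Var(X) <= relative_variance * m**2 for the mean m."""
    if relative_deviation <= 0:
        raise ValueError("relative_deviation must be positive")
    return min(1.0, relative_variance / relative_deviation ** 2)


# With Var(X) = O(m^2 / L^5) and deviation m / L the bound is O(1 / L^3).
for L in (5.0, 10.0, 100.0):
    print(L, chebyshev_tail_bound(L ** -5, 1.0 / L))
```

Since the bound tends to 0 as L grows, the random variable is asymptotic to its mean with probability 1 - o(1), which is the form of the claim in Corollary 5.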

For each

p \in P

, we consider the quantity

X_{p} (\vec{a}) : = P ({\tilde{n}}_{p} + h_{i} p \in S (\vec{a}), for all i = 1, \dots, r)

(67)

and let

P (\vec{a})

denote the set of primes

p \in P

, such that

X_{p} (\vec{a}) = (1 + O (\frac{1}{{({log}_{2} x)}^{10}})) σ^{r} .

We now define the random variables

n_{p}

as follows. Suppose we are in the event

\vec{a} = \vec{a}

for some

\vec{a}

in the range of

\vec{a}

. If

p \in P ∖ P (\vec{a})

, we set

n_{p} : = 0

. Otherwise, if

p \in P (\vec{a})

, we define

n_{p}

to be the random integer with conditional probability distribution

P (n_{p} = n | \vec{a} = \vec{a}) : = \frac{Z_{p} (\vec{a}; n)}{X_{p} (\vec{a})},

where

Z_{p} (\vec{a}; n) : = 1_{n + h_{j} p \in S (\vec{a}), for j = 1, \dots, r} P ({\tilde{n}}_{p} = n)

with the

n_{p}

jointly conditionally independent on the event

\vec{a} = \vec{a}

.

Lemma 11.

With probability

1 - o (1)

, we have

σ^{- r} \sum_{i = 1}^{r} \sum_{p \in P (\vec{a})} Z_{p} (\vec{a}; q - h_{i} p) = (1 + O (\frac{1}{{({log}_{2} x)}^{5}})) \frac{u}{σ} \frac{x}{2 y}

(68)

for all but at most

x / (2 log x {log}_{2} x)

of the primes

q \in Q \cap S (\vec{a})

.

Before proving Lemma 11, we first confirm that

P ∖ P (\vec{a})

is small with high probability.

Lemma 12.

With probability

1 - O (\frac{1}{{({log}_{2} x)}^{10}}),

P (\vec{a})

contains all but

O (\frac{1}{{log}^{3} x} \frac{x}{log x})

of the primes

p \in P

. In particular,

E # P (\vec{a}) = # P (1 + O (\frac{1}{{log}^{3} x})) .

Proof.

By linearity of expectation and Markov’s inequality, it suffices to show that for each

p \in P

we have

p \in P (\vec{a})

with probability

1 - O (\frac{1}{{({log}_{2} x)}^{20}}) .

By Chebyshev’s inequality it suffices to show that

\begin{matrix} E X_{p} (\vec{a}) & = P ({\tilde{n}}_{p} + h_{i} p \in S (\vec{a}) for all i = 1, \dots, r) \\ = (1 + O (\frac{1}{{log}_{2} x})) σ^{r} \end{matrix}

(69)

and

\begin{matrix} E X_{p} {(\vec{a})}^{2} & = P ({\tilde{n}}_{p}^{(1)} + h_{i} p, {\tilde{n}}_{p}^{(2)} + h_{i} p \in S (\vec{a}) for all i = 1, \dots, r) \\ = (1 + O (\frac{1}{{log}_{2} x})) σ^{2 r}, \end{matrix}

(70)

where

{\tilde{n}}_{p}^{(1)}, {\tilde{n}}_{p}^{(2)}

are independent copies of

{\tilde{n}}_{p}

that are also independent of

\vec{a}

.

To prove claim (69), we first select the value n for

{\tilde{n}}_{p}

according to the distribution (63):

P ({\tilde{n}}_{p} = n) = \frac{w_{(K)} (p, n)}{\sum_{n^{'} \in Z} w_{(K)} (p, n^{'})} .

Because of the property

w (p, n) = 0

, if

n \notin G (p)

we have with probability 1:

n + h_{i} p \in G for 1 \leq i \leq r .

Relation (69) now follows from Lemma 10 with

n_{i} = n + h_{i} p

, applying the formula for total probability

P ({\tilde{n}}_{p} + h_{i} p \in S (\vec{a})) = \sum_{n} P ({\tilde{n}}_{p} + h_{i} p \in S (\vec{a}) | {\tilde{n}}_{p} = n) P ({\tilde{n}}_{p} = n) .

A similar application of Lemma 10 allows one to write the left-hand side of (70) as

(1 + O (\frac{1}{{({log}_{2} x)}^{5}})) E σ^{# {{\tilde{n}}_{p}^{(l)} + h_{i} p : i = 1, 2, \dots, r, l = 1, 2}} .

From (69), we see that the quantity

# {{\tilde{n}}_{p}^{(l)} + h_{i} p : i = 1, 2, \dots, r, l = 1, 2}

is equal to

2 r

with probability

1 - O (x^{- 1 / 2 - 1 / 6 + o (1)})

and is less than

2 r

otherwise.

The claim now follows from

σ^{- r} = x^{o (1)}

. □

(Proof of Lemma 11).

We first show that replacing

P (\vec{a})

with P has negligible effect on the sum with probability

1 - o (1)

. Fix i and substitute

n : = q - h_{i} p

.

By Lemma 10, we have

\begin{matrix} E \sum_{n} σ^{- r} \sum_{p \in P} Z_{p} (\vec{a}; n) & = σ^{- r} \sum_{p \in P} \sum_{n} P ({\tilde{n}}_{p} = n) P (n + h_{j} p \in S (\vec{a}) for j = 1, \dots, r) \\ = (1 + O (\frac{1}{{({log}_{2} x)}^{10}})) # P . \end{matrix}

Next by

X_{p} (\vec{a}) = (1 + O (\frac{1}{{log}^{3} x})) σ^{r}

and Lemma 12 we have

\begin{matrix} E \sum_{n} σ^{- r} \sum_{p \in P (\vec{a})} Z_{p} (\vec{a}; n) & = σ^{- r} \sum_{\vec{a}} P (\vec{a} = \vec{a}) \sum_{p \in P (\vec{a})} X_{p} (\vec{a}) \\ = (1 + O (\frac{1}{{({log}_{2} x)}^{10}})) E # P (\vec{a}) \\ = (1 + O (\frac{1}{{log}^{3} x})) # P . \end{matrix}

Subtracting, we conclude that the difference of the two expectations above is

O (# P / {log}_{2} x)

. The claim then follows from (56).

By this, it suffices to show that

σ^{- r} \sum_{i = 1}^{r} \sum_{p \in P} Z_{p} (\vec{a}; q - h_{i} p) = 1 + O (\frac{1}{{log}_{2} x})

for all but at most

\frac{x}{2 log x {log}_{2} x}

primes

q \in Q \cap S (\vec{a})

, one has

\sum_{i = 1}^{r} \sum_{p \in P} Z_{p} (\vec{a}; q - h_{i} p) = (1 + O_{\leq} (\frac{1}{{({log}_{2} x)}^{3}})) σ^{r - 1} u \frac{x}{2 y} .

(71)

We call a prime

q \in Q

“bad” if

q \in Q \cap S (\vec{a})

, but (71) fails. Using Lemma 12 and (63) we have

\begin{matrix} E (\sum_{q \in Q \cap S (\vec{a})} \sum_{i = 1}^{r} \sum_{p \in P} Z_{p} (\vec{a}; q - h_{i} p)) \\ = \sum_{q, i, p} P (q + (h_{j} - h_{i}) p \in S (\vec{a}) for all j = 1, \dots, r) P ({\tilde{n}}_{p} = q - h_{i} p) . \end{matrix}

(72)

By the definition of

G (p)

, we have

P (q + (h_{j} - h_{i}) p \in S (\vec{a})) = 0,

unless

q \in G (p)

. By Definition 15 this means that

q + (h_{j} - h_{i}) p \in G

.

We may thus apply Lemma 10 with

n_{j} : = (q - h_{i} p) + h_{j} p

and obtain for all i:

P (q + (h_{j} - h_{i}) p \in S (\vec{a}) for all j = 1, \dots, r) = σ^{r} (1 + O (\frac{1}{{({log}_{2} x)}^{10}})) .

With (71), we thus obtain

\begin{matrix} E (\sum_{q \in Q \cap S (\vec{a})} \sum_{i = 1}^{r} \sum_{p \in P} Z_{p} (\vec{a}; q - h_{i} p)) \\ = (1 + O (\frac{1}{{({log}_{2} x)}^{10}})) \frac{σ y}{log x} σ^{r - 1} u \frac{x}{2 y} . \end{matrix}

Next, we obtain

\begin{matrix} E (\sum_{q \in Q \cap S (\vec{a})} {(\sum_{i = 1}^{r} \sum_{p \in P} Z_{p} (\vec{a}; q - h_{i} p))}^{2}) \\ = \sum_{\begin{matrix} p_{1}, p_{2}, q \\ i_{1}, i_{2} \end{matrix}} P (q + (h_{j} - h_{i_{l}}) p_{l} \in S (\vec{a}) for j = 1, \dots, r; l = 1, 2) \\ \times P ({\tilde{n}}_{p_{1}}^{(1)} = q - h_{i_{1}} p_{1}) P ({\tilde{n}}_{p_{2}}^{(2)} = q - h_{i_{2}} p_{2}) \\ = (1 + O (\frac{1}{{({log}_{2} x)}^{10}})) \frac{σ y}{log x} {(σ^{r - 1} u \frac{x}{2 y})}^{2}, \end{matrix}

where

{({\tilde{n}}_{p_{1}}^{(1)})}_{p_{1} \in P}

and

{({\tilde{n}}_{p_{2}}^{(2)})}_{p_{2} \in P}

are independent copies of

{({\tilde{n}}_{p})}_{p \in P}

over

\vec{a}

. In the last step, we used the fact that the terms with

p_{1} = p_{2}

contribute negligibly.

By Chebyshev’s inequality, it follows that the number of bad q’s is

≪ \frac{σ y}{log x} \frac{1}{{log}_{2}^{2} x} ≪ \frac{x}{log x {log}_{2}^{2} x}, with probability 1 - O (\frac{1}{{log}_{2} x}) .

(73)

We may now prove Theorem 16.

Relation (40) is actually the corollary to Lemma 10. In order to prove (14), we assume that

\vec{a}

is good and

q \in Q \cap S (\vec{a})

.

Substituting (67) into the left-hand side of (68) using

σ^{- r} = x^{o (1)}

and observing that

q = n_{p} + h_{i} p

is only possible if

p \in P (\vec{a})

, we find that

\begin{matrix} σ^{- r} \sum_{i = 1}^{r} \sum_{p \in P (\vec{a})} Z_{p} (\vec{a}; q - h_{i} p) & = σ^{- r} \sum_{i = 1}^{r} \sum_{p \in P (\vec{a})} X_{p} (\vec{a}) P (n_{p} = q - h_{i} p | \vec{a} = \vec{a}) \\ = (1 + O (\frac{1}{{({log}_{2} x)}^{2}})) \sum_{i = 1}^{r} \sum_{p \in P (\vec{a})} P (n_{p} = q - h_{i} p | \vec{a} = \vec{a}) \\ = (1 + O (\frac{1}{{({log}_{2} x)}^{2}})) \sum_{p \in P} P (q \in e_{p} (\vec{a}) | \vec{a} = \vec{a}), \end{matrix}

where

e_{p} (\vec{a}) = {n_{p} + h_{i} p : 1 \leq i \leq r} \cap Q \cap S (\vec{a})

is as defined in Theorem 16. The fact that

\vec{a}

is good with probability

1 - o (1)

follows upon noticing that

C : = \frac{u}{σ} \frac{x}{2 y} \sim \frac{1}{c} .

This concludes the proof of Theorem 16. □

5. Large Gaps with Improved Order of Magnitude and Its K-Version, Part II

We first state definitions and results from “Dense clusters of primes in subsets” by Maynard [15].

We make use of the notation given in Section 7: “Multidimensional Sieve Estimates” of [15].

Definition 16.

A linear form is a function

L : Z \to Z

of the form

L (n) = l_{1} n + l_{2}

with integer coefficients

l_{1}, l_{2}

and

l_{1} \neq 0

. Let

A

be a set of integers. Given a linear form

L (n) = l_{1} n + l_{2}

, we define the sets

\begin{matrix} A (x) & : = {n \in A : x \leq n \leq 2 x}, \\ A (x; q, a) & : = {n \in A (x) : n \equiv a (mod q)}, \\ P_{L, A} (x) & : = L (A (x)) \cap P, \\ P_{L, A} (x; q, a) & : = L (A (x; q, a)) \cap P \end{matrix}

for any

x > 0

and congruence class

a mod q

and define the quantity

ϕ_{L} (q) : = ϕ (| l_{1} | q) / ϕ (| l_{1} |),

where ϕ is the Euler totient function.

A finite set

L = {L_{1}, \dots, L_{k}}

of linear forms is said to be admissible if

\prod_{i = 1}^{k} L_{i} (n)

has no fixed prime divisor; that is, for every prime p there exists an integer

n_{p}

such that

\prod_{i = 1}^{k} L_{i} (n_{p})

is not divisible by p.
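Admissibility of a finite set of linear forms can be tested by finite computation. The sketch below represents a form n ↦ l_1 n + l_2 as a pair (l_1, l_2) with l_1 ≠ 0; the function names are illustrative. It relies on the observation that a prime p larger than k that divides no gcd(l_1, l_2) cannot be a fixed divisor, since the product then has at most k < p roots modulo p.

```python
from math import gcd


def prime_factors(m):
    """Set of prime factors of |m| (empty for |m| <= 1)."""
    m = abs(m)
    factors = set()
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def is_admissible(forms):
    """forms: list of pairs (l1, l2), l1 != 0, for the maps n -> l1*n + l2.

    Only primes p <= k and primes dividing some gcd(l1, l2) need checking.
    """
    k = len(forms)
    candidates = {p for p in range(2, k + 1) if prime_factors(p) == {p}}
    for l1, l2 in forms:
        candidates |= prime_factors(gcd(l1, l2))
    for p in candidates:
        # p is a fixed divisor if every residue n kills some form mod p.
        if all(any((l1 * n + l2) % p == 0 for l1, l2 in forms) for n in range(p)):
            return False
    return True


print(is_admissible([(1, 0), (1, 2), (1, 6)]))  # n, n + 2, n + 6
print(is_admissible([(1, 0), (1, 2), (1, 4)]))  # n, n + 2, n + 4
```

The second set fails because n, n + 2, n + 4 cover every residue class modulo 3.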

Definition 17.

Let x be a large quantity, let

A

be a set of integers,

L = {L_{1}, \dots, L_{k}}

a finite set of linear forms and B a natural number. We allow

A, L, k, B

to vary with x. Let

0 < θ < 1

be a quantity independent of

x

. Let

L^{'}

be a subset of

L

. We say that the tuple

(A, L, P, B, x, θ)

obeys Hypothesis 1 at

L^{'}

if we have the following three estimates:

(1): ( $A (x)$ is well-distributed in arithmetic progressions). We have

$\sum_{q \leq x^{θ}} max_{a} |# A (x; q, a) - \frac{# A (x)}{q}| ≪ \frac{# A (x)}{{log}^{100 k^{2}} x}$
(2): ( $P_{L, A} (x)$ is well-distributed in arithmetic progressions). For any $L \in L^{'}$ , we have

$\sum_{q \leq x^{θ}, (q, B) = 1} max_{(L (a), q) = 1} |# P_{L, A} (x; q, a) - \frac{# P_{L, A} (x)}{ϕ_{L} (q)}| ≪ \frac{# P_{L, A} (x)}{{(log x)}^{100 k^{2}}} .$
(3): ( $A (x)$ not too concentrated). For any $q < x^{θ}$ and $a \in Z$ , we have

$# A (x; q, a) ≪ \frac{# A (x)}{q} .$

In [15], this definition was only given in the case

L^{'} = L

, but we will need the (mild) generalization to the case in which

L^{'}

is a (possibly empty) subset of

L

.

As is common in analytic number theory, we will have to address the possibility of a Siegel zero. As we want to keep all our estimates effective, we will not rely on Siegel’s theorem or its consequences. Instead, we will rely on the Landau–Page theorem, which we now recall. Throughout,

χ

denotes a Dirichlet character.

Lemma 13

(Landau–Page Theorem). Let

Q \geq 100

. Suppose that

L (s, χ) = 0

for some primitive character χ of modulus at most Q and some

s = σ + i t

. Then, either

1 - σ ≫ \frac{1}{log (Q (1 + | t |))},

or else

t = 0

and χ is a quadratic character

χ_{Q}

, which is unique. Furthermore, if

χ_{Q}

exists, then its conductor

q_{Q}

is square-free apart from a factor of at most 4 and obeys the lower bound

q_{Q} ≫ \frac{{log}^{2} Q}{{log}_{2}^{2} Q} .

Proof.

See, e.g., ([27], Chapter 14). The final estimate follows from the bound

1 - β ≫ q^{- 1 / 2} {log}^{- 2} q

for a real zero

β

of

L (s, χ)

with

χ

of modulus q, which can also be found in ([27], Chapter 14).

We can then eliminate the exceptional character by deleting at most one prime factor of

q_{Q}

. □

Corollary 6.

Let

Q \geq 100

. Then, there exists a quantity

B_{Q}

which is either equal to 1 or is a prime of size

B_{Q} ≫ {log}_{2} Q

with the property that

1 - σ ≫ \frac{1}{log (Q (1 + | t |))}

whenever

L (σ + i t, χ) = 0

and χ is a character of modulus at most Q and coprime to

B_{Q}

.

Proof.

If the exceptional character

χ_{Q}

from Lemma 13 does not exist, then take

B_{Q} : = 1

; otherwise, we take

B_{Q}

to be the largest prime factor of

q_{Q}

. As

q_{Q}

is square-free apart from a factor of at most 4, we have

log q_{Q} ≪ B_{Q}

by the prime number theorem and the claim follows. □
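The inequality log q_Q ≪ B_Q can be made concrete. If q is square-free apart from a factor of at most 4 and B is its largest prime factor, then q divides 4 times the product of the primes up to B, so log q ≤ θ(B) + log 4, where θ is the Chebyshev function. The sketch below checks this on a few toy moduli; the function names and the sample values of q are illustrative assumptions.

```python
from math import log


def largest_prime_factor(q):
    """Largest prime factor of q >= 2 by trial division (returns 1 for q = 1)."""
    largest = 1
    d = 2
    while d * d <= q:
        while q % d == 0:
            largest = d
            q //= d
        d += 1
    return q if q > 1 else largest


def theta(B):
    """Chebyshev function: sum of log p over primes p <= B."""
    return sum(
        log(p)
        for p in range(2, B + 1)
        if all(p % d for d in range(2, int(p ** 0.5) + 1))
    )


for q in (2310, 156, 420):  # toy moduli: 2*3*5*7*11, 4*3*13, 4*3*5*7
    B = largest_prime_factor(q)
    print(q, B, round(log(q), 3), round(theta(B) + log(4), 3))
```

Since θ(B) ≪ B, the largest prime factor B_Q is at least of order log q_Q, which combined with the lower bound on q_Q gives the stated size of B_Q.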

Lemma 14.

Let x be a large quantity. Then, there exists a natural number

B \leq x

, which is either 1 or a prime, such that the following holds.

Let

A : = Z

, let

θ : = 1 / 3

and

L : = {L_{1}, \dots, L_{k}}

be a finite set of linear forms

L_{i} (n) = a_{i} n + b_{i}

(which may depend on x) with

k \leq {log}^{1 / 5} x

,

1 \leq | a_{i} | \leq log x

and

| b_{i} | \leq x {log}^{2} x

.

Let

x \leq y \leq x {log}^{2} x

and let

L^{'}

be a subset of

L

such that

L_{i}

is non-negative on

[y, 2 y]

and

a_{i}

is coprime to B for all

L_{i} \in L^{'}

. Then,

(A, L, P, B, y, θ)

obeys Hypothesis 1 at

L^{'}

with absolute implied constants (i.e., the bounds in Hypothesis 1 are uniform over all such choices of

L

and y).

Proof.

Parts (1) and (3) of Hypothesis 1 are easy to see; the only difficult verification is (2). We apply Corollary 6 with

Q : = exp (c_{1} \sqrt{log x})

for some small absolute constant

c_{1}

to obtain a quantity

B : = B_{Q}

with the stated properties. By the Landau–Page theorem (see [27], Chapter 20), we have that if

c_{1}

is sufficiently small then we have the effective bound

ϕ {(q)}^{- 1} \sum_{χ}^{*} | ψ (z, χ) | ≪ x exp (- 3 c \sqrt{log x})

(74)

for all

1 < q < exp (2 c \sqrt{log x})

with

(q, B) = 1

and all

z \leq x {log}^{4} x

. Here, the summation is over all primitive

χ mod q

and

ψ (z, χ) = \sum_{n \leq z} χ (n) Λ (n) .

Following a standard proof of the Bombieri–Vinogradov Theorem (cf. [27], Chapter 28), we have (for a suitable constant

c > 0

):

\begin{matrix} \sum_{\begin{matrix} q < x^{1 / 2 - ϵ} \\ (q, B) = 1 \end{matrix}} sup_{\begin{matrix} (a, q) = 1 \\ z \leq x {log}^{4} x \end{matrix}} |π (z; q, a) - \frac{π (z)}{ϕ (q)}| \\ ≪ x exp (- c \sqrt{log x}) + log x \sum_{\begin{matrix} q < exp (2 c \sqrt{log x}) \\ (q, B) = 1 \end{matrix}} \sum_{χ}^{*} sup_{z \leq x {log}^{4} x} \frac{| ψ (z, χ) |}{ϕ (q)} \end{matrix}

(75)

Combining these two statements and using the triangle inequality gives the bound required for (2). □
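Part (1) of Hypothesis 1 is elementary for A = Z: each residue class modulo q contains either the floor or the ceiling of #A(x)/q integers of [x, 2x], so every term of the sum over q is less than 1. The sketch below verifies this numerically for integer x; the function names are illustrative.

```python
def count_in_progression(x, q, a):
    """Number of integers n with x <= n <= 2x and n = a (mod q)."""
    first = x + ((a - x) % q)  # smallest n >= x in the class a mod q
    if first > 2 * x:
        return 0
    return (2 * x - first) // q + 1


def max_discrepancy(x, q):
    """max over a of | #A(x; q, a) - #A(x)/q | for A = Z and integer x."""
    total = x + 1  # number of integers in [x, 2x]
    return max(abs(count_in_progression(x, q, a) - total / q) for a in range(q))


for q in (2, 7, 13, 100):
    print(q, max_discrepancy(1000, q))
```

Summing over q ≤ x^{θ} therefore gives at most x^{θ}, which is far below the bound required in (1) for the ranges of k considered here.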

We now recall the construction of sieve weights from ([15], Section 7).

Let

W : = \prod_{\begin{matrix} p \leq 2 k^{2} \\ p ∤ B \end{matrix}} p .

For each prime p not dividing B, let

r_{p, 1} (L) < \dots < r_{p, ω_{L} (p)} (L)

be the elements n of

[p]

for which

p ∣ \prod_{i = 1}^{k} L_{i} (n) .

If p is also coprime to W, then for each

1 \leq u \leq ω_{L} (p)

, let

j_{p, u} = j_{p, u} (L)

denote the least element of

[k]

such that

p ∣ L_{j_{p, u}} (r_{p, u} (L)) .

Let

D_{k} (L)

denote the set

\begin{matrix} D_{k} (L) & : = {(d_{1}, \dots, d_{k}) \in N^{k} : μ^{2} (d_{1} \dots d_{k}) = 1 : \\ (d_{1} \dots d_{k}, W B) = 1; (d_{j}, p) = 1 whenever p ∤ B W \\ and j \neq j_{p, 1}, \dots, j_{p, ω_{L (p)}}} . \end{matrix}

Define the singular series

S (L) : = \prod_{p ∤ B} (1 - \frac{ω_{L} (p)}{p}) {(1 - \frac{1}{p})}^{- k},

the function

ϕ_{ω_{L}} (d) : = \prod_{p ∣ d} (p - ω_{L} (p)),

and let R be a quantity of size

x^{θ / 10} \leq R \leq x^{θ / 3} .
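A truncated version of the singular series is easy to evaluate numerically. The sketch below multiplies the local factors (1 - ω_L(p)/p)(1 - 1/p)^{-k} over primes p up to a chosen bound, skipping primes dividing B. The truncation bound, the default B = 1 and the function names are assumptions made for illustration; forms are pairs (a, b) for n ↦ a n + b.

```python
def omega(forms, p):
    """Number of residues n mod p at which p divides the product of the forms."""
    return sum(1 for n in range(p) if any((a * n + b) % p == 0 for a, b in forms))


def truncated_singular_series(forms, prime_bound, B=1):
    """Product over primes p <= prime_bound, p not dividing B, of
    (1 - omega(p)/p) * (1 - 1/p)**(-k), where k = len(forms)."""
    k = len(forms)
    value = 1.0
    for p in range(2, prime_bound + 1):
        if any(p % d == 0 for d in range(2, int(p ** 0.5) + 1)):
            continue  # p is composite
        if B % p == 0:
            continue  # the prime B is excluded from the product
        value *= (1 - omega(forms, p) / p) * (1 - 1 / p) ** (-k)
    return value


print(truncated_singular_series([(1, 0), (1, 2)], 1000))  # forms n, n + 2
print(truncated_singular_series([(1, 0), (1, 1)], 1000))  # forms n, n + 1
```

An inadmissible set gives the value 0 because some local factor vanishes, while for an admissible set every factor is positive.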

Let

F : R^{k} \to R

be a smooth function supported on the simplex

R_{k} : = {(t_{1}, \dots, t_{k}) \in R_{+}^{k} : t_{1} + \dots + t_{k} \leq 1} .

For any

(r_{1}, \dots, r_{k}) \in N^{k}

, define

Y_{(r_{1}, \dots, r_{k})} (L) : = \frac{1_{D_{k} (L)} (r_{1}, \dots, r_{k}) W^{k} B^{k}}{ϕ {(W B)}^{k}} S_{W B} (L) F (\frac{log r_{1}}{log R}, \dots, \frac{log r_{k}}{log R}) .

For any

(d_{1}, \dots, d_{k}) \in D_{k} (L)

, define

λ_{(d_{1}, \dots, d_{k})} (L) : = μ (d_{1} \dots d_{k}) d_{1} \dots d_{k} \sum_{d_{i} ∣ r_{i} for i = 1, \dots, k} \frac{Y_{(r_{1}, \dots, r_{k})} (L)}{ϕ_{ω_{L}} (r_{1} \dots r_{k})},

and then define the function

w = w_{k, L, B, R} : Z \to R^{+}

by

w (n) : = {(\sum_{d_{1}, \dots, d_{k} : d_{i} ∣ L_{i} (n) for all i} λ_{(d_{1}, \dots, d_{k})} (L))}^{2} .

(76)

We then have the following slightly modified form of Proposition 6.1 of [15].

Theorem 20.

Fix θ,

α > 0

. Then, there exists a constant C depending only on

θ, α

such that the following holds. Suppose that

(A, L, P, B, x, θ)

obeys Hypothesis 1 at some subset

L^{'}

of

L

. Write

k : = # L

and suppose that

x \geq C

,

B \leq x^{α}

and

C \leq k \leq {log}^{1 / 5} x

. Moreover, assume that the coefficients

a_{i}, b_{i}

of the linear forms

L_{i} (n) = a_{i} n + b_{i}

in

L

obey the size bound

| a_{i} |, | b_{i} | \leq x^{α}

for all

i = 1, \dots, k

. Then, there exists a smooth function

F : R^{k} \to R

depending only on k and supported on the simplex

R_{k}

and quantities

I_{k}

,

J_{k}

depending only on k with

I_{k} ≫ {(2 k log k)}^{- k}

and

J_{k} ≍ \frac{log k}{k} I_{k}

(77)

such that, for

w (n)

given in terms of F as above, the following assertions hold uniformly for

x^{θ / 10} \leq R \leq x^{θ / 3}

.

We have

$\sum_{n \in A (x)} w (n) = (1 + O (\frac{1}{{log}^{1 / 10} x})) \frac{B^{k}}{ϕ {(B)}^{k}} S (L) # A (x) {(log R)}^{k} I_{k} .$

(78)
For any linear form $L (n) = a_{L} n + b_{L}$ in $L^{'}$ with $a_{L}$ coprime to B and $L (n) > R$ on $[x, 2 x]$ , we have

$\begin{matrix} \sum_{n \in A (x)} 1_{P} (L (n)) w (n) \\ = (1 + O (\frac{1}{{log}^{1 / 10} x})) \frac{ϕ (| a_{L} |)}{| a_{L} |} \frac{B^{k - 1}}{ϕ {(B)}^{k - 1}} S (L) # P_{L, A} (x) {(log R)}^{k + 1} J_{k} \\ + O (\frac{B^{k}}{ϕ {(B)}^{k}} S (L) # A (x) {(log R)}^{k - 1} I_{k}) . \end{matrix}$

(79)
Let $L (n) = a_{0} n + b_{0}$ be a linear form such that the discriminant

$Δ_{L} : = | a_{0} | \prod_{j = 1}^{k} | a_{0} b_{j} - a_{j} b_{0} |$

is non-zero (in particular L is not in $L$ ). Then,

$\sum_{n \in A (x)} 1_{P \cap [x^{θ / 10}, + \infty)} (L (n)) w (n) ≪ \frac{Δ_{L}}{ϕ (Δ_{L})} \frac{B^{k}}{ϕ {(B)}^{k}} S (L) # A (x) {(log R)}^{k - 1} I_{k} .$

(80)
We have the crude upper bound

$w (n) ≪ x^{2 θ / 3 + o (1)}$

(81)

for all n $\in Z$ .

Proof.

The first estimate (78) is given by [15], Proposition 9.1, (79) follows from [15], Proposition 9.2, in the case of

(a_{L}, B) = 1

, (80) is given by [15], Proposition 9.4, (taking

ξ : = θ / 10

and

D : = 1

) and the final statement (81) is given by part (iii) of [15], Lemma 8.5. The bounds for

J_{k}

and

I_{k}

are given by [15], Lemma 8.6.

We can now prove Theorem 20. Let

x, y, r, h_{1}, \dots, h_{r}

be as in that theorem. We set

\begin{matrix} A & : = Z, \\ θ & : = 1 / 3, \\ k & : = r, \\ R & : = {(x / 4)}^{θ / 3} \end{matrix}

and let

B = x^{o (1)}

be the quantity from Lemma 14.

We define the function

w : P \times Z \to R^{+}

by setting

w (p, n) : = 1_{[- y, y]} (n) w_{k, L_{p}, B, R} (n)

for

p \in P

and

n \in Z

, where

L_{p}

is the (ordered) collection of linear forms

n \mapsto n + h_{i} p

for

i = 1, \dots, r

and

w_{k, L_{p}, B, R}

was defined in (76). Note that the admissibility of the r-tuple

(h_{1}, \dots, h_{r})

implies the admissibility of the linear forms

n \mapsto n + h_{i} p

.

An important point is that many of the key components of

w_{k, L_{p}, B, R}

are essentially uniform in p. Indeed, for any prime s, the polynomial

\prod_{i = 1}^{k} (n + h_{i} p)

is divisible by s only at the residue classes

- h_{i} p mod s

. From this, we see that

ω_{L_{p}} (s) : = # {h_{i} (mod s)} whenever s \neq p .

In particular,

ω_{L_{p}} (s)

is independent of p as long as s is distinct from p; therefore,

\begin{matrix} S (L_{p}) & = (1 + O (\frac{k}{x})) S, \\ S_{B W} (L_{p}) & = (1 + O (\frac{k}{x})) S_{B W}, \end{matrix}

(82)

for some

S, S_{B W}

independent of p, with the error terms uniform in p. Moreover, if

s ∤ W B

then

s > 2 k^{2}

, so all the

h_{i}

are distinct

mod s

(since the

h_{i}

are less than

2 k^{2}

). Therefore, if

s ∤ p W B

we have

ω_{L_{p}} (s) = k

and

{j_{s, 1} (L_{p}), \dots, j_{s, ω (s)} (L_{p})} = {1, \dots, k} .
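The uniformity of ω_{L_p}(s) in p can be confirmed by direct enumeration. The sketch below counts the roots of the product of the forms n + h_i p modulo s and compares the count with the number of distinct h_i modulo s. The tuple h and the prime p are toy values chosen for illustration, and the function names are not taken from [15] or [19].

```python
def omega(forms, s):
    """Number of residues n mod s with s dividing some form a*n + b."""
    return sum(1 for n in range(s) if any((a * n + b) % s == 0 for a, b in forms))


def shifted_forms(h, p):
    """The collection L_p of forms n -> n + h_i * p, as pairs (1, h_i * p)."""
    return [(1, h_i * p) for h_i in h]


h = (0, 2, 6)  # toy admissible tuple
p = 101        # toy prime standing in for an element of P
for s in (3, 5, 7, 11, 101):
    distinct = len({h_i % s for h_i in h})
    print(s, omega(shifted_forms(h, p), s), distinct)
```

For s ≠ p the two counts agree because p is invertible modulo s. For s = p all forms reduce to n, so there is a single root, which is the exceptional case excluded in the text.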

Since all

p \in P

are at least

x / 2 > R

, we have

s \neq p

whenever

s \leq R

. From this, we see that

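D_{k} (L_{p}) \cap \{(d_{1}, \dots, d_{k}) : \prod_{i = 1}^{k} d_{i} \leq R\}

is independent of p, and so

λ_{(d_{1}, \dots, d_{k})} (L_{p}) = (1 + O (\frac{k}{x})) λ_{(d_{1}, \dots, d_{k})}

for some

λ_{(d_{1}, \dots, d_{k})}

independent of p, where the error term is independent of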

d_{1}, \dots, d_{k}

.

It is clear that w is non-negative and supported on

P \times [- y, y]

and from (81) we have (62). We set

τ : = 2 \frac{B^{k}}{ϕ {(B)}^{k}} S {(log R)}^{k} {(log x)}^{k} I_{k}

(83)

and

u : = \frac{ϕ (B)}{B} \frac{log R}{log x} \frac{k J_{k}}{2 I_{k}} .

Since B is either 1 or prime, we have

\frac{ϕ (B)}{B} ≍ 1,

and from the definition of R we also have

\frac{log R}{log x} ≍ 1 .

(84)

From (77), we thus obtain (57). From [15], Lemma 8.1(i), we have

S \geq x^{- o (1)}

and from [15], Lemma 8.6, we have

I_{k} = x^{o (1)}

and so we have the lower bound (56a). (In fact, we also have a matching upper bound

τ \leq x^{o (1)}

, but we will not need this.)

It remains to verify the estimates (59), (60) and (61). We begin with (59). Let p be an element of

P

. We shift the n variable by

3 y

and rewrite

\sum_{n \in Z} w (p, n) = \sum_{n \in A (2 y)} w_{k, L_{p} - 3 y, B, R} (n) + O (x^{1 - c + o (1)}),

where

L_{p} - 3 y

denotes the set of linear forms

n \mapsto n + h_{i} p - 3 y

for

i = 1, \dots, k

. (The

x^{1 - c + o (1)}

error arises from (61) and roundoff effect if y is not an integer.) This set of linear forms remains admissible and

S (L_{p} - 3 y) = S (L_{p}) = (1 + O (\frac{k}{x})) S .

The claim (59) now follows from (75) and the first conclusion (78) of Theorem 20 (with x replaced by

2 y

,

L^{'} = \emptyset

and

L = L_{p} - 3 y

), using Lemma 14 to obtain Hypothesis 1.

Now, we prove (60). Fix

q \in Q

and

i \in {1, \dots, k}

. We introduce the set

{\tilde{L}}_{q, i}

of linear forms

{\tilde{L}}_{q, i, 1}, \dots, {\tilde{L}}_{q, i, k}

, where

{\tilde{L}}_{q, i, i} (n) : = n

and

{\tilde{L}}_{q, i, j} (n) : = q + (h_{j} - h_{i}) n (1 \leq j \leq k, j \neq i) .

We claim that this set of linear forms is admissible. Indeed, for any prime

s \neq q

, the solutions of

n \prod_{j \neq i} (q + (h_{j} - h_{i}) n) \equiv 0 (mod s)

are

n \equiv 0

and

n \equiv - q {(h_{j} - h_{i})}^{- 1} (mod s) for h_{j} \not\equiv h_{i} (mod s),

the number of which is equal to

# {h_{j} (mod s)}

. Thus

\begin{matrix} S ({\tilde{L}}_{q, i}) & = (1 + O (\frac{k}{x})) S, \\ S_{B W} ({\tilde{L}}_{q, i}) & = (1 + O (\frac{k}{x})) S_{B W}, \end{matrix}

as before. Again, for

s ∤ W B

we have that the

h_{i}

are distinct

(mod s)

and so if

s < R

and

s ∤ W B

we have

ω_{{\tilde{L}}_{q, i}} (s) = k

and

{j_{s, 1} ({\tilde{L}}_{q, i}), \dots, j_{s, ω (s)} ({\tilde{L}}_{q, i})} = {1, \dots, k} .

In particular

D_{k} ({\tilde{L}}_{q, i}) \cap \{(d_{1}, \dots, d_{k}) : \prod_{i = 1}^{k} d_{i} \leq R\}

is independent of

q, i

and so

λ_{(d_{1}, \dots, d_{k})} ({\tilde{L}}_{q, i}) = (1 + O (\frac{k}{x})) λ_{(d_{1}, \dots, d_{k})}

where again the

O (k / x)

error is independent of

d_{1}, \dots, d_{k}

. From this, since

q - h_{i} p

takes values in

[- y, y]

, we have that

w_{k, {\tilde{L}}_{q, i}, B, R} (p) = (1 + O (\frac{k}{x})) w_{k, L_{p}, B, R} (q - h_{i} p)

whenever

p \in P

(note that the

d_{i}

summation variable implicit on both sides of this equation is necessarily equal to 1). Thus, recalling that

P = P \cap (\frac{x}{2}, x)

we can write the left-hand side of (60) as

(1 + O (\frac{k}{x})) \sum_{n \in A (x / 2)} 1_{P} ({\tilde{L}}_{q, i, i} (n)) w_{k, {\tilde{L}}_{q, i}, B, R} (n) .

Applying the second conclusion (79) of Theorem 20 (with x replaced by

x / 2

,

L^{'} = {{\tilde{L}}_{q, i, i}}

and

L = {\tilde{L}}_{q, i}

) and using Lemma 14 to obtain Hypothesis 1, this expression becomes

\begin{matrix} (1 + O (\frac{1}{{log}_{2}^{10} x})) \frac{B^{k - 1}}{ϕ {(B)}^{k - 1}} S # P_{{\tilde{L}}_{q, i, i}, A} (\frac{x}{2}) {(log R)}^{k + 1} J_{k} \\ + O (\frac{B^{k}}{ϕ {(B)}^{k}} S # A (\frac{x}{2}) {(log R)}^{k - 1} I_{k}) . \end{matrix}

Clearly

# A (x / 2) = O (x)

and from the prime number theorem, one has

# P_{{\tilde{L}}_{q, i, i}, A} (\frac{x}{2}) = (1 + O (\frac{1}{{log}_{2}^{10} x})) \frac{x}{2 log x}

for any fixed

C > 0

. Using (83), we can thus write the left-hand side of (60) as

(1 + O (\frac{1}{{log}_{2}^{10} x})) \frac{u}{k} τ \frac{x}{2 {log}^{k} x} + O (\frac{1}{log R} τ \frac{x}{{log}^{k} x}) .

From (42) and (56a), the second error term may be absorbed into the first and (60) follows.
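
The admissibility claim for the forms {\tilde{L}}_{q, i} used in the argument above can be checked numerically. The following is a minimal sketch, not part of the original proof: the shifts h and the prime q are hypothetical values chosen only for illustration, and the function names are our own. It counts, by brute force, the roots n (mod s) of n \prod_{j \neq i} (q + (h_{j} - h_{i}) n) and compares the count with # {h_{j} (mod s)}.

```python
def count_roots(q, h, i, s):
    """Count n mod s with n * prod_{j != i} (q + (h[j] - h[i]) * n) = 0 (mod s)."""
    roots = 0
    for n in range(s):
        value = n % s
        for j, hj in enumerate(h):
            if j != i:
                value = value * (q + (hj - h[i]) * n) % s
        if value == 0:
            roots += 1
    return roots


def distinct_residues(h, s):
    """Number of distinct residue classes occupied by the shifts h modulo s."""
    return len({hj % s for hj in h})


# Hypothetical data: an admissible 5-tuple of shifts and a prime q.
h = [0, 2, 6, 8, 12]
q = 101

for s in (2, 3, 5, 7, 11, 13):       # primes s different from q
    roots = count_roots(q, h, 0, s)
    # The root count equals #{h_j mod s}; admissibility means it is < s.
    print(s, roots, distinct_residues(h, s), roots < s)
```

For every prime s \neq q in the loop the two counts agree and stay below s, which is what admissibility of the new set of forms requires at s.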

Finally, we prove (60). Fix

h = O (y / x)

not equal to any of the

h_{i}

and fix

p \in P

. By the prime number theorem, it suffices to show that

\sum_{q \in Q} w (p, q - h p) ≪ \frac{1}{{log}_{2}^{10} x} τ \frac{y}{{log}^{k} x} .

By construction, the left-hand side is the same as

\sum_{x - h p < n \leq y - h p} 1_{P} (n + h p) w_{k, L_{p}, B, R} (n),

which we can shift as

\sum_{n \in A (y - x)} 1_{P \cap [x^{θ / 10}, + \infty)} (n - y + 2 x) w_{k, L_{p} - y + 2 x - h p, B, R} (n) + O (x^{1 - c + o (1)}),

where again the

O (x^{1 - c + o (1)})

error is a generous upper bound for round-off errors. This error is acceptable and may be discarded. Applying (80), we may then bound the main term by

\begin{matrix} ≪ \frac{Δ}{ϕ (Δ)} \frac{B^{k}}{ϕ {(B)}^{k}} G (L_{p} - y + 2 x - h p) y {(log R)}^{k - 1} I_{k} \\ = \frac{Δ}{ϕ (Δ)} \frac{B^{k}}{ϕ {(B)}^{k}} G (L_{p}) y {(log R)}^{k - 1} I_{k}, \end{matrix}

where

Δ : = \prod_{i = 1}^{k} | h p - h_{i} p | .

Applying (83), we may simplify the above upper bound as

\frac{Δ}{ϕ (Δ)} \frac{y}{(log R) {(log x)}^{k}} τ .

Now,

h - h_{i} = O (y / x) = O (log x)

for each i; hence,

Δ \leq O (x {(log x)}^{k})

and it follows from (82) and (56), together with the observation that

\frac{log R}{log x} ≍ 1,

that

\frac{Δ}{ϕ (Δ)} ≪ {log}_{2} Δ ≪ {log}_{2} x ≪ \frac{log R}{{log}_{2}^{10} x} .

This concludes the proof of Theorem 20 and hence Theorem 4. □
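
The bound Δ / ϕ (Δ) ≪ {log}_{2} Δ used in the last step is a classical estimate. As a hedged numerical illustration (our own sketch, not taken from the paper), the code below sieves Euler's function and compares n / ϕ (n) with the explicit bound e^{γ} {log}_{2} n + 2.50637 / {log}_{2} n of Rosser and Schoenfeld, which we assume here in the form valid for all n \geq 3. The range 5000 and all function names are arbitrary choices.

```python
import math

EULER_GAMMA = 0.5772156649015329


def totients(limit):
    """Euler's phi for 0..limit, computed with a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:                      # untouched so far, hence p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p        # multiply phi[m] by (1 - 1/p)
    return phi


def explicit_bound(n):
    """e^gamma * log log n + 2.50637 / log log n (assumed valid for n >= 3)."""
    ll = math.log(math.log(n))
    return math.exp(EULER_GAMMA) * ll + 2.50637 / ll


LIMIT = 5000
phi = totients(LIMIT)
# The ratio n / phi(n) is largest at products of the first few primes.
worst = max(range(3, LIMIT + 1), key=lambda n: n / phi[n])
print(worst, worst / phi[worst], explicit_bound(worst))
```

In this range the extremal n is the primorial 2310 = 2 · 3 · 5 · 7 · 11, which reflects why the worst case of Δ / ϕ (Δ) is of size {log}_{2} Δ.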

The K-version deduction of Theorem 19 (of [20]).

We now modify the weights

w_{n}

to incorporate (for fixed primes p) the conditions

n \equiv 1 - {(d_{p} + 1)}^{K} (mod p), d_{p} \not\equiv - 1 (mod p)

(85)

and

n \in G (p) .

We carry out the modification in two steps. In a first step, we replace

w_{n} = w_{n} (L)

by

w^{*} (p, n) = w^{*} (p, n, L)

. Here, p is a fixed prime with

x / 2 < p \leq x

.

Here, we have to be more specific about the set

A

. We set

A : = Z

.

Definition 18.

Let

w_{n}

be as in (76),

A = Z

, p a fixed prime with

x / 2 < p \leq x

. Let also

D = (p - 1, K)

. We set

w^{*} (p, n) : = \{\begin{matrix} D w_{n}, & if there is d_{p} \in Z with n \equiv 1 - {(d_{p} + 1)}^{K} (mod p), d_{p} \not\equiv - 1 (mod p), \\ 0, & otherwise . \end{matrix}

(86)

We first express the solvability of (86) by the use of Dirichlet characters.

Lemma 15.

Let p be a prime number. Let

D = (p - 1, K)

, and

χ_{0}

be the principal character

mod p

. There are

D - 1

non-principal characters

χ_{1}, \dots, χ_{D - 1} mod p

, such that for all

n \in Z

we have

\frac{1}{D} \sum_{l = 0}^{D - 1} χ_{l} (1 - n) = \{\begin{matrix} 1, if n \equiv 1 - c^{K} (mod p) is solvable with p ∤ c \\ 0, otherwise . \end{matrix}

Proof.

Let

ρ

be a primitive root

mod p

,

1 - n \equiv ρ^{s} (mod p), 0 \leq s \leq p - 2 .

Setting

c \equiv ρ^{y} (mod p)

we see that the congruence

c^{K} \equiv 1 - n (mod p)

(87)

is solvable if and only if

K y \equiv s (mod p - 1)

(88)

has a solution y.

By the theory of linear congruences, this is equivalent to

D ∣ s

. We have

\frac{1}{D} \sum_{l = 0}^{D - 1} e (\frac{l s}{D}) = \{\begin{matrix} 1, if D ∣ s, \\ 0, otherwise . \end{matrix}

We now define the Dirichlet character

χ_{l}

, (

0 \leq l \leq D - 1

),

χ_{l} (1 - n) = e (\frac{l s}{D})

and obtain the claim of Lemma 15. □
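
The identity of Lemma 15 can be tested directly for small primes. The sketch below is an illustration under our own naming, with brute-force helpers that are only suitable for small odd primes p: it computes the index s of 1 - n with respect to a primitive root, evaluates \frac{1}{D} \sum_{l} e (l s / D), and compares the result with a direct search for c with p ∤ c and c^{K} \equiv 1 - n (mod p).

```python
import cmath
from math import gcd


def primitive_root(p):
    """Smallest primitive root modulo an odd prime p (brute force)."""
    for g in range(2, p):
        if len({pow(g, e, p) for e in range(1, p)}) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be an odd prime")


def indicator_via_characters(n, p, K):
    """(1/D) * sum_{l=0}^{D-1} chi_l(1 - n), with chi_l(rho^s) = e(l*s/D)."""
    D = gcd(p - 1, K)
    a = (1 - n) % p
    if a == 0:
        return 0.0                  # every character mod p vanishes at 0
    rho = primitive_root(p)
    s = next(e for e in range(p - 1) if pow(rho, e, p) == a)   # index of a
    total = sum(cmath.exp(2j * cmath.pi * l * s / D) for l in range(D))
    return (total / D).real


def solvable_directly(n, p, K):
    """True if n = 1 - c^K (mod p) has a solution c not divisible by p."""
    return any((1 - n - pow(c, K, p)) % p == 0 for c in range(1, p))


p, K = 13, 4                         # hypothetical small example, D = 4
for n in range(p):
    print(n, round(indicator_via_characters(n, p, K)), solvable_directly(n, p, K))
```

The two columns agree for every residue n, in line with the criterion D ∣ s for the linear congruence (88).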

Theorem 21.

Let

p, w^{*} (p, n), D

be as in Definition 18 and let

A : = Z

. Then, we have

\sum_{n \in A (x)} w^{*} (p, n) = (1 + O (\frac{1}{{(log x)}^{1 / 10}})) \frac{B^{k}}{ϕ {(B)}^{k}} G_{B} (L) # A (x) {(log R)}^{k} I_{k} (F) .

Proof.

By Lemma 15, we have

\sum_{n \in A (x)} w^{*} (p, n) = \sum_{l = 0}^{D - 1} \sum_{n \in A (x)} w_{n} χ_{l} (1 - n) .

The sum belonging to the principal character

χ_{0}

, namely

\sum_{n \in A (x)} w_{n} χ_{0} (1 - n)

, differs from the sum

\sum_{n \in A (x)} w_{n}

only by

O (x^{1 / 2})

, since there are only

\frac{| A (x) |}{p}

terms with

n \equiv 1 (mod p)

, each of size at most

x^{1 / 3}

. We therefore have

\sum_{n \in A (x)} w_{n} χ_{0} (1 - n) = \sum_{n \in A (x)} w_{n} + O (x^{1 / 2}) .

(89)

Let now

1 \leq l \leq D - 1

. Here, we closely follow the proof of Proposition 9.1 of [15]. We split the sum into residue classes

n \equiv v_{0} (mod W)

. We recall that

W = \prod_{\begin{matrix} p \leq 2 g^{2} \\ p ∤ B \end{matrix}} p < exp ({(log x)}^{2 / 5}) .

If

(\prod_{i = 1}^{g} L_{i} (v_{0}), W) \neq 1,

then we have

w_{n} = 0

and so we restrict our attention to

v_{0}

with

(\prod_{i = 1}^{g} L_{i} (v_{0}), W) = 1 .

We substitute the definition of

w_{n}

, expand the square and swap the order of summation. This gives

\sum_{n \in A (x)} w_{n} χ_{l} (1 - n) = \sum_{v_{0} (mod W)} \sum_{d, e \in D_{g}} λ_{d} λ_{e} \sum_{\begin{matrix} n \in A (x) \\ n \equiv v_{0} (mod W) \\ [d_{i}, e_{i}] ∣ L_{i} (n), \forall i \end{matrix}} χ_{l} (1 - n) .

The congruence conditions in the inner sum may be combined via the Chinese Remainder Theorem into a single congruence condition

1 - n \equiv c (mod v), where v = W [d, e],

where

[\cdot, \cdot]

stands for the least common multiple.

There are

w \leq v

Dirichlet characters

ψ_{1}, \dots, ψ_{w} (mod v)

such that

1 - n \equiv c (mod v) if and only if \frac{1}{w} \sum_{l = 1}^{w} \bar{ψ_{l} (c)} ψ_{l} (1 - n) = 1 .

We thus may write

|\sum_{\begin{matrix} n \in A (x) \\ n \equiv v_{0} (mod W) \\ [d_{i}, e_{i}] ∣ L_{i} (n), \forall i \end{matrix}} χ_{l} (1 - n)| \leq A \sum_{l = 1}^{z} |\sum_{n \in I} ξ_{l} (1 - n)|,

with a suitable absolute constant A, an interval I of length

| I | \leq x {(log x)}^{2}

and the

D (v)

non-principal Dirichlet characters

ξ_{j, l} = χ_{j} ψ_{l}

of conductor

\geq p

and modulus

\leq x v

.

By the Pólya–Vinogradov bound, we obtain:

\sum_{\begin{matrix} n \in A (x) \\ n \equiv v_{0} (mod W) \\ [d_{i}, e_{i}] ∣ L_{i} (n), \forall i \end{matrix}} χ_{l} (1 - n) ≪ x^{1 / 2} v .

(90)

The claim of Theorem 21 now follows from (89) and (90). □
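
The Pólya–Vinogradov bound invoked above states that an incomplete sum of a non-principal character modulo q is ≪ q^{1 / 2} log q. The following sketch illustrates the size of such sums in the simplest case; it is our own example and does not reproduce the characters ξ_{j, l} of the proof. We assume that χ is the Legendre symbol modulo an odd prime q, and the three moduli are arbitrary choices.

```python
import math


def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q, via Euler's criterion."""
    r = pow(a % q, (q - 1) // 2, q)      # r is 0, 1 or q - 1
    return -1 if r == q - 1 else r


def max_partial_sum(q):
    """max over N < q of |sum_{n <= N} (n/q)|."""
    best = running = 0
    for n in range(1, q):
        running += legendre(n, q)
        best = max(best, abs(running))
    return best


for q in (101, 1009, 10007):             # hypothetical prime moduli
    print(q, max_partial_sum(q), round(math.sqrt(q) * math.log(q), 1))
```

In each case the largest partial sum is far below the trivial bound q and is dominated by q^{1 / 2} log q, which is the saving exploited in (90).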

As a preparation for the proof of Theorem 22 which is a modification of Proposition 9.2 of [15], we state a lemma on character sums over shifted primes.

Lemma 16.

Let χ be a Dirichlet character

(mod q)

. Then, for

N \leq q^{16 / 9}

we have

|\sum_{n \leq N} Λ (n) χ (n + a)| \leq (N^{7 / 8} q^{1 / 9} + N^{33 / 32} q^{- 1 / 18}) q^{o (1)} .

Proof.

This is Theorem 1 of [33]. □
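
To make the object of Lemma 16 concrete, the sketch below evaluates a sum of the form \sum_{n \leq N} Λ (n) χ (n + a) numerically. It is an illustration only: we take χ to be the Legendre symbol modulo a prime q, and the values q = 101, a = 1 and N = 2000 (which satisfies N \leq q^{16 / 9}) are hypothetical. The code compares the sum with the trivial bound \sum_{n \leq N} Λ (n) and makes no claim about the precise exponents of the lemma.

```python
import math


def von_mangoldt_table(N):
    """Lambda(n) for 0..N: log p if n is a power of the prime p, else 0."""
    lam = [0.0] * (N + 1)
    composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if not composite[p]:
            for m in range(p * p, N + 1, p):
                composite[m] = True
            pk = p
            while pk <= N:               # mark p, p^2, p^3, ...
                lam[pk] = math.log(p)
                pk *= p
    return lam


def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q, via Euler's criterion."""
    r = pow(a % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r


def shifted_prime_sum(N, q, a):
    """sum_{n <= N} Lambda(n) * chi(n + a) with chi the Legendre symbol mod q."""
    lam = von_mangoldt_table(N)
    return sum(lam[n] * legendre(n + a, q) for n in range(1, N + 1))


N, q, a = 2000, 101, 1                   # hypothetical parameters
trivial = sum(von_mangoldt_table(N))     # Chebyshev's psi(N), close to N
print(shifted_prime_sum(N, q, a), trivial)
```

The cancellation visible in the output is what a bound of the type in Lemma 16 quantifies.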

Theorem 22.

Let

A = Z

,

L (n) = a_{m} n + b_{m} \in L

satisfy

L (n) > R

for

n \in [x, 2 x]

and

\sum_{\begin{matrix} q < x^{θ} \\ (q, B) = 1 \end{matrix}} max_{(L (a), q) = 1} |# P_{L, A} (x; q, a) - \frac{# P_{L, A} (x)}{ϕ_{L} (q)}| ≪ \frac{# P_{L, A} (x)}{{(log x)}^{100 g^{2}}} .

Then, we have for sufficiently small θ:

\begin{matrix} \sum_{n \in A (x)} 1_{P} (L (n)) w^{*} (p, n) & = (1 + O (\frac{1}{{(log x)}^{1 / 10}})) \frac{B^{g - 1}}{ϕ {(B)}^{g - 1}} S (L) \\ \times # P_{L, A} (x) {(log R)}^{g + 1} J_{g} (F) \prod_{\begin{matrix} p ∣ a_{m} \\ p ∤ B \end{matrix}} \frac{p - 1}{p} \\ + O (\frac{B^{g}}{ϕ {(B)}^{g}} S_{B} (L) # A (x) {(log R)}^{g - 1} I_{g} (F)) . \end{matrix}

Proof.

By Lemma 15, we have

\sum_{n \in A (x)} 1_{P} (L (n)) w^{*} (p, n) = \sum_{l = 0}^{D - 1} \sum_{n \in A (x)} 1_{P} (L (n)) w_{n} χ_{l} (1 - n) .

The sum belonging to the principal character

χ_{0}

differs from the sum

\sum_{n \in A (x)} 1_{P} (L (n)) w_{n}

only by

O (# A (x) p^{- 1})

and thus by [15], Proposition 9.2, we have

\begin{matrix} \sum_{n \in A (x)} 1_{P} (L (n)) w_{n} χ_{0} (1 - n) & = (1 + O (\frac{1}{{(log x)}^{1 / 10}})) \frac{B^{g - 1}}{ϕ {(B)}^{g - 1}} S_{B} (L) \\ & \times # P_{L, A} (x) {(log R)}^{g + 1} J_{g} (F) \prod_{\begin{matrix} p ∣ a_{m} \\ p ∤ B \end{matrix}} \frac{p - 1}{p} \\ & + O (\frac{B^{g}}{ϕ {(B)}^{g}} S_{B} (L) # A (x) {(log R)}^{g - 1} I_{g} (F)) . \end{matrix}

(91)

For

1 \leq l \leq D - 1

, we follow closely the proof of Proposition 9.2 in [15]. We again split the sum into residue classes

n \equiv v_{0} (mod W) .

If

(\prod_{i = 1}^{g} L_{i} (v_{0}), W) > 1,

then we have

w_{n} = 0

and so we restrict our attention to

v_{0}

with

(\prod_{i = 1}^{g} L_{i} (v_{0}), W) = 1 .

We substitute the definition of

w_{n}

, expand the square and swap the order of summation. Setting

\tilde{n} = n - 1

, we obtain

\sum_{\begin{matrix} n \in A (x) \\ n \equiv v_{0} (mod W) \end{matrix}} 1_{P} (L (n)) w_{n} χ_{l} (1 - n) = \sum_{d, e \in D_{g}} λ_{d} λ_{e} \sum_{\begin{matrix} n \in A (x) \\ n \equiv v_{0} (mod W) \\ [d_{i}, e_{i}] ∣ L_{i} (n), \forall i \end{matrix}} 1_{P} (L (n)) χ_{l} (1 - n) .

(92)

If

\tilde{n}

runs through the arithmetic progression

\tilde{n} = W h + v_{0} (h \in I_{0}),

then also

L (\tilde{n} + 1)

runs through an arithmetic progression

L (\tilde{n} + 1) = a_{m} W h + a_{m} (v_{0} + 1) + b .

Thus, we have

\begin{matrix} \sum_{\begin{matrix} n \in A (x) \\ n \equiv v_{0} (mod W) \end{matrix}} 1_{p} (L (n)) χ_{l} (1 - n) \\ = \sum_{\begin{matrix} \tilde{p} \equiv a_{m} (v_{0} + 1) + b (mod a_{m} W) \\ \tilde{p} prime, \tilde{p} \in I \end{matrix}} χ_{l} (\tilde{p} + a_{m} (v_{0} + 1) + b) . \end{matrix}

Also, the condition

\tilde{p} \equiv a_{m} (v_{0} + 1) + b (mod a_{m} W)

may be expressed with the help of Dirichlet characters

ω_{1}, \dots, ω_{ϕ (| a_{m} W |)} (mod | a_{m} W |),

using orthogonality relations.

Theorem 22 thus follows from (91) and Lemma 16. □

For the definition of the weight

w_{(K)} (p, n)

whose existence is claimed in Theorem 19, we now have to be more specific about the set

L

of linear forms.

Definition 19.

Let the tuple

(h_{1}, \dots, h_{r})

be given. For

p \in P

and

n \in Z

, let

L_{p}

be the (ordered) collection of linear forms

n \mapsto n + h_{i} p

for

i = 1, \dots, r

and set

w_{(K)} (p, n) : = \{\begin{matrix} w^{*} (p, n, L_{p}), & if n \in G (p), \\ 0, & otherwise . \end{matrix}

In the sequel, we now show that in the sums

\sum_{n \in Z} w_{(K)} (p, n) and \sum_{p \in P} w_{(K)} (p, q - h_{i} p)

appearing in (58) and (59) of Theorem 19, the function

w_{(K)} (p, \cdot)

may be replaced by the function

w^{*} (p, \cdot, L_{p})

with a negligible error.

Since these sums have been treated in Theorem 21 and Theorem 22, this will essentially conclude the proof of Theorem 19 and thus of Theorem 5. □

Lemma 17.

We have

\sum_{\begin{matrix} n \in A (x) \\ n \notin G (p) \end{matrix}} w^{*} (p, n, L_{p}) \leq \sum_{\begin{matrix} n \in A (x) \\ n \notin G (p) \end{matrix}} w_{n} (L_{p}) .

Definition 20.

Let

(h_{1}, \dots, h_{r})

be an admissible r-tuple,

p \in (x / 2, x)

. For

n \in Z

,

1 \leq i, l \leq r

, let

\begin{matrix} \tilde{n} = \tilde{n} (n, i, l, p) & = n + (h_{i} - h_{l}) p \\ A (i, l, p) & = \sum_{n : \tilde{n} \notin G} w_{n} (L_{p}) . \end{matrix}

Let

\begin{matrix} \sum (i, l, p) & : = \sum_{n \in A (x)} w_{n} (L_{p}) {(r (\tilde{n}, u) - r^{*} (u))}^{2} \\ \sum (i, l, p, j) & : = \sum_{n \in A (x)} w_{n} (L_{p}) r {(\tilde{n}, u)}^{j} (j \in N_{0}) . \end{matrix}

Lemma 18.

\sum_{\begin{matrix} n \in A (x) \\ n \notin G (p) \end{matrix}} w_{n} (L_{p}) = \sum_{1 \leq i, l \leq r} A (i, l, p) .

Proof.

This follows immediately from Definitions 5 and 20. □

Lemma 19.

Let

A

,

w_{n}

be as in (76),

L_{p}

as in Definition 19. Let

j \in {1, 2}

. Then

\sum (i, l, p, j) = (1 + O (\frac{1}{{(log x)}^{1 / 10}})) \frac{B^{g}}{ϕ {(B)}^{g}} G_{B} (L_{p}) # A (x) {(log R)}^{g} I_{g} (F) r^{*} {(u)}^{j} .

Proof.

We only give the proof for the hardest case

j = 2

and briefly indicate the proof for

j = 1

.

\begin{matrix} \sum (i, l, p, 2) & = \sum_{n \in A (x)} w_{n} (L_{p}) r {(\tilde{n}, u)}^{2} \\ & = \frac{1}{d {(u)}^{2}} \sum_{n \in A (x)} w_{n} (L_{p}) (\sum_{\begin{matrix} s_{1} \in S_{u}, c_{s_{1}} \in {0, 1, \dots, s_{1} - 2} \\ \tilde{n} \equiv 1 - {(c_{s_{1}} + 1)}^{K} (mod s_{1}) \end{matrix}} s_{1}^{- 1}) (\sum_{\begin{matrix} s_{2} \in S_{u}, c_{s_{2}} \in {0, 1, \dots, s_{2} - 2} \\ \tilde{n} \equiv 1 - {(c_{s_{2}} + 1)}^{K} (mod s_{2}) \end{matrix}} s_{2}^{- 1}) \\ & = \frac{1}{d {(u)}^{2}} \sum_{s_{1}, s_{2} \in S_{u}} s_{1}^{- 1} s_{2}^{- 1} \sum_{c_{s_{1}} = 1}^{s_{1} - 2} \sum_{c_{s_{2}} = 1}^{s_{2} - 2} \sum_{\begin{matrix} n \in A (x) \\ n \equiv 1 - {(c_{s_{1}} + 1)}^{K} + (h_{l} - h_{i}) p (mod s_{1}) \\ n \equiv 1 - {(c_{s_{2}} + 1)}^{K} + (h_{l} - h_{i}) p (mod s_{2}) \end{matrix}} w_{n} (L_{p}) . \end{matrix}

In the inner sum, we only deal with the case

s_{1} \neq s_{2}

; the case

s_{1} = s_{2}

has a negligible contribution. The inner sum is non-empty if and only if the system

\{\begin{matrix} \tilde{n} \equiv 1 - {(c_{s_{1}} + 1)}^{K} (mod s_{1}) \\ \tilde{n} \equiv 1 - {(c_{s_{2}} + 1)}^{K} (mod s_{2}) \end{matrix}

(93)

is solvable. In this case, (93) is equivalent to a single congruence

n \equiv e + (h_{l} - h_{i}) p (mod s_{1} s_{2}),

where

e = e (s_{1}, s_{2}, c_{1}, c_{2})

is uniquely determined by the system (93) and

0 \leq e \leq s_{1} s_{2} - 1 .

We apply Theorem 20 with B independent of

s_{1}, s_{2}

and with

A = A^{(s_{1}, s_{2})} = {n : x / 2 < n \leq x, n \equiv e + (h_{l} - h_{i}) p (mod s_{1} s_{2})} .

We have

# A^{(s_{1}, s_{2})} (x) = s_{1}^{- 1} s_{2}^{- 1} # A (x) + O (1)

and obtain

\begin{matrix} \sum_{n \in A (x)} w_{n} (L_{p}) r {(\tilde{n}, u)}^{2} \\ = (1 + O (\frac{1}{{(log x)}^{1 / 10}})) \frac{B^{g}}{ϕ {(B)}^{g}} G_{B} (L_{p}) # A (x) {(log R)}^{g} I_{g} (F) r^{*} {(u)}^{2} . \end{matrix}

(94)

This proves the claim for

j = 2

. The proof of the case

j = 1

is analogous but simpler, since there is only the single variable of summation

s_{1}

. □
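
The reduction of the system (93) to a single congruence modulo s_{1} s_{2}, and the count of the resulting progression in (x / 2, x], can be sketched as follows. This is a minimal illustration with hypothetical values of K, s_{1}, s_{2}, c_{s_{1}}, c_{s_{2}} and x; the function names are our own, and the shift (h_{l} - h_{i}) p is omitted since it only translates the residue class.

```python
def crt_pair(r1, s1, r2, s2):
    """Unique e in [0, s1*s2) with e = r1 (mod s1) and e = r2 (mod s2).

    Assumes gcd(s1, s2) = 1, as is the case for distinct primes s1, s2.
    """
    inv = pow(s1, -1, s2)                 # modular inverse (Python >= 3.8)
    t = ((r2 - r1) * inv) % s2
    return (r1 + s1 * t) % (s1 * s2)


def count_in_class(x, e, modulus):
    """#{n : x/2 < n <= x, n = e (mod modulus)} for an even integer x."""
    lo, hi = x // 2, x
    return (hi - e) // modulus - (lo - e) // modulus


K, s1, s2 = 3, 7, 11                      # hypothetical exponent and prime moduli
c1, c2 = 2, 4                             # hypothetical residues c_{s_1}, c_{s_2}
r1 = (1 - (c1 + 1) ** K) % s1             # right-hand side of the first congruence
r2 = (1 - (c2 + 1) ** K) % s2             # right-hand side of the second congruence
e = crt_pair(r1, s1, r2, s2)

x = 10_000
print(e, count_in_class(x, e, s1 * s2), (x // 2) / (s1 * s2))
```

The count differs from # A (x) / (s_{1} s_{2}) by at most 1, which is the O (1) term in the formula for # A^{(s_{1}, s_{2})} (x).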

Lemma 20.

Let the conditions be as in Lemma 19. Then, we have

\sum (i, l, p) ≪ \frac{B^{g}}{ϕ {(B)}^{g}} G_{B} (L_{p}) # A (x) {(log R)}^{g} I_{g} (F) r^{*} {(u)}^{2} {(log x)}^{1 / 8} .

Theorem 23.

Let the conditions be as in the previous lemmas. For sufficiently small

η_{0}

, we have

\sum_{\begin{matrix} n \in A (x) \\ n \notin G (p) \end{matrix}} w_{n} (L_{p}) ≪ \frac{B^{g}}{ϕ {(B)}^{g}} G_{B} (L_{p}) # A (x) {(log R)}^{g} I_{g} (F) {(log x)}^{- 1 / 10} .

Proof.

Let

1 \leq i, l \leq r

. By Definition 20, we have

\tilde{n} = n + (h_{i} - h_{l}) p \notin G

which yields

| r (\tilde{n}, u) - r^{*} (u) | \geq r^{*} (u) {(log x)}^{- 1 / 40} .

Thus,

\begin{matrix} r^{*} {(u)}^{2} {(log x)}^{- 1 / 20} \sum_{n \in A (x) : n + (h_{i} - h_{l}) p \notin G (p)} w_{n} (L_{p}) \\ \leq \sum_{n \in A (x)} w_{n} (L_{p}) {(r (\tilde{n}, u) - r^{*} (u))}^{2} \\ \leq \frac{B^{g}}{ϕ {(B)}^{g}} G_{B} (L_{p}) # A (x) {(log R)}^{g} I_{g} (F) {(log x)}^{- 1 / 20} r^{*} {(u)}^{2} \end{matrix}

and therefore

\sum_{n \in A (x) : n + (h_{i} - h_{l}) p \notin G (p)} w_{n} (L_{p}) \leq \frac{B^{g}}{ϕ {(B)}^{g}} G_{B} (L_{p}) # A (x) {(log R)}^{g} I_{g} (F) {(log x)}^{- 1 / 20} .

The claim of Theorem 23 follows by summation over all pairs

(i, l)

if

η_{0}

is sufficiently small. □
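
The first inequality in the proof of Theorem 23 is a weighted Chebyshev-type (second-moment) estimate: if | r (\tilde{n}, u) - r^{*} (u) | \geq t on the exceptional set, then t^{2} times the exceptional weight is at most the weighted second moment. A toy numerical sketch, with made-up weights and values standing in for w_{n} (L_{p}) and r (\tilde{n}, u), reads:

```python
def exceptional_weight(weights, values, mean, threshold):
    """Total weight of the indices where |value - mean| >= threshold."""
    return sum(w for w, r in zip(weights, values) if abs(r - mean) >= threshold)


def second_moment(weights, values, mean):
    """Weighted second moment sum_n w_n * (r_n - mean)^2."""
    return sum(w * (r - mean) ** 2 for w, r in zip(weights, values))


# Hypothetical non-negative weights w_n and values r_n; mean plays the role of r*(u).
weights = [1.0, 2.0, 0.5, 3.0, 1.5]
values = [0.9, 1.1, 2.0, 1.0, 0.2]
mean, threshold = 1.0, 0.5

lhs = threshold ** 2 * exceptional_weight(weights, values, mean, threshold)
rhs = second_moment(weights, values, mean)
print(lhs, rhs, lhs <= rhs)
```

The inequality holds for any non-negative weights, since each exceptional term contributes at least t^{2} w_{n} to the second moment; in the proof, t = r^{*} (u) {(log x)}^{- 1 / 40}.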

We now investigate the sum (60) of Theorem 19.

Definition 21.

Let

x / 2 < p \leq x

,

L (n) = n + h_{f} p

. Let

L \in L_{p}

:

1 \leq i, l \leq r

. Then, we define

\begin{matrix} C (i, l, p) & : = \sum_{n \in A (x) : \tilde{n} \notin G} 1_{p} (L (n)) w_{n} (L_{p}) \\ Ω (i, l, p) & : = \sum_{n \in A (x)} 1_{p} (L (n)) w_{n} (L_{p}) {(r (\tilde{n}, u) - r^{*} (u))}^{2} \\ Ω (i, l, p, j) & : = \sum_{n \in A (x)} 1_{p} (L (n)) w_{n} (L_{p}) r {(\tilde{n}, u)}^{j} \end{matrix}

Lemma 21.

Let

L, i, l, r, p

be as in Definition 21. Let

j \in {1, 2}

. Then, we have

\begin{matrix} Ω (i, l, p, j) & = \frac{B^{g - 1}}{ϕ {(B)}^{g - 1}} G_{B} (L_{p}) # P_{L, A} (x) {(log R)}^{g + 1} \\ & \times J_{g} (F) r^{*} {(u)}^{j} (1 + O ({(log x)}^{- 1 / 10})) \\ & + O (\frac{B^{g}}{ϕ {(B)}^{g}} G_{B} (L_{p}) # A (x) {(log R)}^{g - 1} J_{g} (F) r^{*} {(u)}^{j}) . \end{matrix}

Proof.

We only give the proof for the hardest case

j = 2

. The case

j = 1

is analogous but simpler. We have

\begin{matrix} Ω (i, l, p, 2) & = \sum_{n \in A (x)} 1_{p} (L (n)) w_{n} (L_{p}) r {(\tilde{n}, u)}^{2} \\ & = \frac{1}{d {(u)}^{2}} \sum_{n \in A (x)} 1_{p} (L (n)) w_{n} (L_{p}) (\sum_{\begin{matrix} s_{1} \in S_{u}, c_{s_{1}} \in {1, \dots, s_{1} - 2} \\ \tilde{n} \equiv 1 - {(c_{s_{1}} + 1)}^{K} (mod s_{1}) \end{matrix}} s_{1}^{- 1}) \\ & \times (\sum_{\begin{matrix} s_{2} \in S_{u}, c_{s_{2}} \in {1, \dots, s_{2} - 2} \\ \tilde{n} \equiv 1 - {(c_{s_{2}} + 1)}^{K} (mod s_{2}) \end{matrix}} s_{2}^{- 1}) \\ & = \frac{1}{d {(u)}^{2}} \sum_{s_{1}, s_{2} \in S_{u}} s_{1}^{- 1} s_{2}^{- 1} \sum_{c_{s_{1}} = 1}^{s_{1} - 2} \sum_{c_{s_{2}} = 1}^{s_{2} - 2} \sum_{\begin{matrix} n \in A (x) \\ n \equiv 1 - {(c_{s_{1}} + 1)}^{K} + (h_{l} - h_{i}) p (mod s_{1}) \\ n \equiv 1 - {(c_{s_{2}} + 1)}^{K} + (h_{l} - h_{i}) p (mod s_{2}) \end{matrix}} 1_{p} (L (n)) w_{n} (L_{p}) . \end{matrix}

We deal only with the case

s_{1} \neq s_{2}

for the inner sum, the case

s_{1} = s_{2}

having a negligible contribution. The inner sum is non-empty if and only if the system

\{\begin{matrix} n \equiv 1 - {(c_{s_{1}} + 1)}^{K} (mod s_{1}) \\ n \equiv 1 - {(c_{s_{2}} + 1)}^{K} (mod s_{2}) \end{matrix}

(95)

is solvable.

In this case, the system is equivalent to a single congruence

n \equiv e (mod s_{1} s_{2}),

where

e = e (s_{1}, s_{2}, c_{1}, c_{2})

is uniquely determined by the system (95) and

0 \leq e \leq s_{1} s_{2} - 1

. The inner sum then takes the form

\sum_{\begin{matrix} n \equiv e (mod s_{1} s_{2}) \\ n \in A (x) \end{matrix}} 1_{p} (L (n)) w_{n} (L_{p}) .

By the substitution

n = m s + e

with

s = s_{1} s_{2}

, we obtain

L (n) = L^{*} (m, s) = m s + e + h_{f} p .

The set

L_{p} = {L_{h_{i}}}

, where

L_{h_{i}} (n) = n + h_{i} p

, is replaced by the set

L_{p, s} = {L_{h_{i}, s}}

, where

L_{h_{i}, s} (m) = m s + e + (h_{i} + h_{f}) p .

We thus have

\begin{matrix} \sum (s) & : = \sum_{\begin{matrix} n \equiv e (mod s) \\ n \in A (x) \end{matrix}} 1_{p} (L (n)) w_{n} (L_{p}) \\ = \sum_{m \in A (\frac{x}{s})} w_{m} (L_{p, s}) 1_{p} (L^{*} (m, s)) + O (1) . \end{matrix}

We apply Theorem 22 with

A = N

,

x / s

instead of x,

L (\cdot) = L^{*} (\cdot, s)

,

L = L_{p, s}

. We have

G_{B} (L_{p}) = G_{B} (L_{p, s}) (1 + O (\frac{1}{log x})) .

From Bombieri’s Theorem, it can easily be seen that conditions (78) are satisfied for all s with the possible exception of

s \in E

,

E

being an exceptional set, satisfying

\sum_{s \in E} s^{- 1} ≪ {(log x)}^{- 4} .

For

s \in E

, we use the trivial bound

1_{p} (L^{*} (m, s)) = O (1)

. Thus, we obtain the claim of Lemma 21 for the case

j = 2

.

The proof for

j = 1

is analogous but simpler, since we have only to sum over the single variable

s_{1}

. □

Lemma 22.

Let

i, l, p

be as in Definition 21. We have

\begin{matrix} Ω (i, l, p) & = O (\frac{B^{g - 1}}{ϕ {(B)}^{g - 1}} | G_{B} (L_{p}) | # P_{L, A} (x) {(log R)}^{g + 1} J_{g} (F) r^{*} {(u)}^{2} {(log x)}^{- 1 / 10}) \\ + O (\frac{B^{g}}{ϕ {(B)}^{g}} | G_{B} (L_{p}) | # A (x) {(log R)}^{g - 1} J_{g} (F) r^{*} {(u)}^{2}) . \end{matrix}

Proof.

By Definition 21, we have

Ω (i, l, p) = Ω (i, l, p, 2) - 2 r^{*} (u) Ω (i, l, p, 1) + r^{*} {(u)}^{2} Ω (i, l, p, 0) .

□

Theorem 24.

Let

p, L (n)

be as in Definition 21. Then, we have

\sum_{\begin{matrix} n \in A (x) \\ n \notin G (p) \end{matrix}} 1_{p} (L (n)) w_{n} (L_{p}) ≪ \frac{B^{g - 1}}{ϕ {(B)}^{g - 1}} G_{B} (L_{p}) # P_{L, A} (x) {(log R)}^{g + 1} J_{g} (F) {(log x)}^{- 1 / 10} .

Proof.

Let

1 \leq i, l \leq r

. By Definition 20, we have

\tilde{n} = n + (h_{i} - h_{l}) p \notin G .

It follows that

| r (\tilde{n}, u) - r^{*} (u) | \geq r^{*} (u) {(log x)}^{- 1 / 40} .

Thus

\begin{matrix} r^{*} {(u)}^{2} {(log x)}^{- 1 / 20} \sum_{n \in A (x) : n \notin G (p)} 1_{p} (L (n)) w_{n} (L_{p}) \\ ≪ \frac{B^{g - 1}}{ϕ {(B)}^{g - 1}} G_{B} (L_{p}) # P_{L, A} (x) {(log R)}^{g + 1} J_{g} (F) r^{*} {(u)}^{2} {(log x)}^{- 1 / 10} \\ + \frac{B^{g}}{ϕ {(B)}^{g}} G_{B} (L_{p}) # A (x) {(log R)}^{g - 1} J_{g} (F) r^{*} {(u)}^{2} . \end{matrix}

The second term is absorbed in the first one, since by the definition

x^{θ / 10} \leq R \leq x^{θ / 3}

and thus

log R ≍ log x .

Therefore

C (i, l, p) ≪ \frac{B^{g - 1}}{ϕ {(B)}^{g - 1}} G_{B} (L_{p}) # P_{L, A} (x) {(log R)}^{g + 1} J_{g} (F) {(log x)}^{- 1 / 20} .

The claim of Theorem 24 now follows by summing over all pairs

(i, l)

. □

We now can conclude the proof of Theorem 19 and therefore also the proof of Theorem 1.1.

By Theorems 21–24, we have

\sum_{n \in A (x)} w_{(K)} (p, n) = (1 + O (\frac{1}{{(log x)}^{1 / 100}})) \sum_{n \in A (x)} w_{n} (L_{p})

(96)

and

\sum_{n \in A (x)} 1_{p} (L (n)) w_{(K)} (p, n) = (1 + O (\frac{1}{{(log x)}^{1 / 100}})) \sum_{n \in A (x)} 1_{p} (L (n)) w_{n} (L_{p}) .

(97)

Equations (58) and (59) of Theorem 19 can thus be deduced from the results on the sums on the right-hand sides of Equations (96) and (97).

6. The K-Version of Large Gap Results for Primes from Special Sequences

In joint work with Maier [34,35], the author of this paper established the K-version for special sequences of primes: Beatty primes and Piatetski–Shapiro primes.

We recall the following definitions:

Definition 22.

For two fixed real numbers α, β, the corresponding non-homogeneous Beatty sequence is the sequence of integers defined by

B_{α, β} : = {([α n + β])}_{n = 1}^{\infty} .
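
As a small illustration of Definition 22, the sketch below lists the first terms of a Beatty sequence and the primes among them. The choice α = \sqrt{2}, β = 0 is hypothetical, the function names are our own, and floating-point arithmetic is assumed to be accurate enough for the few terms computed.

```python
import math


def beatty(alpha, beta, count):
    """First `count` terms [alpha * n + beta], n = 1, 2, ..., of B_{alpha, beta}."""
    return [math.floor(alpha * n + beta) for n in range(1, count + 1)]


def is_prime(m):
    """Trial division; adequate for the small values used here."""
    if m < 2:
        return False
    return all(m % d for d in range(2, math.isqrt(m) + 1))


seq = beatty(math.sqrt(2), 0.0, 10)       # hypothetical alpha, beta
beatty_primes = [m for m in seq if is_prime(m)]
print(seq)            # [1, 2, 4, 5, 7, 8, 9, 11, 12, 14]
print(beatty_primes)  # [2, 5, 7, 11]
```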

Definition 23.

For an irrational number γ, we define its type τ by the relation

τ : = sup {ρ \in R : lim inf n^{ρ} ∥ γ n ∥ = 0} .

Definition 24.

Let

c > 1

be a fixed constant. A prime of the form

[l^{c}]

is called a Piatetski–Shapiro prime.
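
Definition 24 can be illustrated in the same way. In the sketch below the exponent c = 1.05 (inside the range (1, 18 / 17) of Theorem 26) and the cut-off for l are hypothetical choices, the helper names are our own, and the floor of l^{c} is computed in floating point, which we assume to be reliable for small l.

```python
import math


def is_prime(m):
    """Trial division; adequate for the small values used here."""
    if m < 2:
        return False
    return all(m % d for d in range(2, math.isqrt(m) + 1))


def piatetski_shapiro_primes(c, l_max):
    """Pairs (l, [l^c]) with 1 <= l <= l_max for which [l^c] is prime."""
    found = []
    for l in range(1, l_max + 1):
        m = math.floor(l ** c)
        if is_prime(m):
            found.append((l, m))
    return found


ps = piatetski_shapiro_primes(1.05, 30)   # hypothetical exponent and range
print(ps[:3])                             # [(2, 2), (3, 3), (5, 5)]
```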

In the paper [34], the following Theorem is proved:

Theorem 25.

(Theorem 1.3 of [34]). Let

K \geq 2

be an integer. Let

α, β

be fixed real numbers with α being a positive irrational and of finite type. Then, there is a constant

C > 0

, depending only on α and β, such that for infinitely many n we have:

p_{n + 1} - p_{n} \geq C \frac{log p_{n} {log}_{2} p_{n} {log}_{4} p_{n}}{{log}_{3} p_{n}}

and the interval

[p_{n}, p_{n + 1}]

contains the K-th power of a prime

\tilde{p} \in B_{α, β}

.

In the paper [35], the following theorem is proved:

Theorem 26.

(Theorem 2.1 of [35]). Let

c \in (1, 18 / 17)

be fixed,

K \in N

,

K \geq 2

. Then, there is a constant

C > 0

, depending only on K and c, such that for infinitely many n we have

p_{n + 1} - p_{n} \geq C \frac{log p_{n} {log}_{2} p_{n} {log}_{4} p_{n}}{{log}_{3} p_{n}}

and the interval

[p_{n}, p_{n + 1}]

contains the K-th power of a prime

\tilde{p} = [l^{c}]

.

We now give a short sketch of the proof of these theorems.

These proofs are modifications of the proofs of the K-versions of the large gap result in Section 4 and Section 5. One applies the matrix method.

The matrices

M

are defined in a manner similar to their definition in the deduction of Theorem 2.1 of [36]. Once again choose x, such that

P (C_{0} x)

is a good modulus.

The only major modification is that one does not count primes of the form

a_{r, 1} = (m_{0} + 1 + r) P (x)

in the first column C(1) of the matrix

M

but only such primes from

B_{α, β}

(Beatty primes) and from

P^{(c)} = {[l^{c}] prime}

(Piatetski–Shapiro primes).

For the count of Beatty and Piatetski–Shapiro primes in the column C(1), one uses the following two results.

Lemma 23

(Lemma 3.1 of [34]). Let α and β be fixed real numbers with α a positive irrational and of finite type. Then, there is a constant

κ > 0

, such that for all integers

0 \leq a < q \leq N^{κ}

(98)

with

(a, q) = 1

, we have

\sum_{\begin{matrix} n \leq N \\ [α n + β] \equiv a (mod q) \end{matrix}} Λ ([α n + β]) = α^{- 1} \sum_{\begin{matrix} m \leq [α N + β] \\ m \equiv a (mod q) \end{matrix}} Λ (m) + O (N^{1 - κ}),

where the implied constant depends only on α and β.

Theorem 27.

(Theorem 8 of [36]). Let a and d be coprime integers,

d \geq 1

. For fixed

c_{0} \in (1, 18 / 17)

, we have (with

γ = 1 / c_{0}

):

\begin{matrix} π_{c_{0}} (w; d, a) = & γ w^{γ - 1} π (w; d, a) \\ + γ (1 - γ) \int_{2}^{w} u^{γ - 2} π (u; d, a) d u + O (w^{17 / 39 + 7 γ / 13 + ϵ}) . \end{matrix}

Here, π_{c_{0}} (w; d, a) = # {p \in P^{(c_{0})} : p \leq w, p \equiv a (mod d)} .

7. Conclusions

In this paper, we mainly investigate recent results on large gaps between primes. The first of these results was obtained in the work [10] by Ford, Green, Konyagin and Tao. It was subsequently improved in the joint paper [19] of these four authors with Maynard. The main ingredients of these results include old methods due to Erdős and Rankin, as well as the breakthrough results of Goldston, Pintz and Yildirim [16,17,18] on small gaps between primes and their extension by Maynard. All these previous results are discussed briefly in the present paper. The results on the appearance of K-th powers of primes contained in those large gaps, obtained by the author in joint work with Maier [20,34,35], are based on a combination of the results just described with the matrix method of Maier.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The author wishes to express his gratitude to H. Maier for extensive discussions and close communication during the preparation of this paper. His support has been invaluable. The author wishes to also thank the anonymous referees for reading the manuscript in detail and for providing very constructive comments which helped improve the presentation of this work.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Westzynthius, E. Über die Verteilung der Zahlen, die zu den n ersten Primzahlen teilerfremd sind. In Commentationes Physico-Mathematicae; Societas Scientiarum Fennica: Helsingfors, Finland, 1931; pp. 1–37.
2. Backlund, R.J. Über die Differenzen zwischen den Zahlen, die zu den ersten n Primzahlen teilerfremd sind. Ann. Acad. Sci. Fenn. 1929, 32, 1–9.
3. Brauer, A.; Zeitz, H. Über eine zahlentheoretische Behauptung von Legendre. Jber. Berliner Math. Ges. 1930, 29, 116–125.
4. Erdős, P. On the difference of consecutive primes. Quart. J. Math. Oxford Ser. 1935, 6, 124–128.
5. Rankin, R.A. The difference between consecutive prime numbers. J. Lond. Math. Soc. 1938, 13, 242–247.
6. Maier, H.; Pomerance, C. Unusually large gaps between consecutive primes. Trans. Amer. Math. Soc. 1990, 322, 201–217.
7. Rankin, R.A. The difference between consecutive prime numbers V. Proc. Edinb. Math. Soc. 1962/1963, 13, 331–332.
8. Schönhage, A. Eine Bemerkung zur Konstruktion grosser Primzahllücken. Arch. Math. 1963, 14, 29–30.
9. Pintz, J. Very large gaps between consecutive primes. J. Number Theory 1997, 63, 286–301.
10. Ford, K.; Green, B.J.; Konyagin, S.; Tao, T. Large gaps between consecutive prime numbers. Ann. Math. 2016, 183, 935–974.
11. Maynard, J. Large gaps between primes. Ann. Math. 2016, 183, 915–933.
12. Green, B.; Tao, T. Linear equations in primes. Ann. Math. 2010, 171, 1753–1856.
13. Green, B.; Tao, T. The quantitative behaviour of polynomial orbits on nilmanifolds. Ann. Math. 2012, 175, 465–540.
14. Green, B.; Tao, T.; Ziegler, T. An inverse theorem for the Gowers U^{s+1}[N]-norm. Ann. Math. 2012, 176, 1231–1372.
15. Maynard, J. Dense clusters of primes in subsets. Compos. Math. 2016, 152, 1517–1554.
16. Goldston, D.A.; Pintz, J.; Yildirim, C.Y. Primes in Tuples I. Ann. Math. 2009, 170, 819–862.
17. Goldston, D.A.; Pintz, J.; Yildirim, C.Y. Primes in Tuples II. Acta Math. 2010, 204, 1–47.
18. Goldston, D.A.; Pintz, J.; Yildirim, C.Y. The path to recent progress on small gaps between primes. arXiv 2016, arXiv:math/0512436v2.
19. Ford, K.; Green, B.J.; Konyagin, S.; Maynard, J.; Tao, T. Large gaps between primes. arXiv 2016, arXiv:1412.5029v3.
20. Maier, H.; Rassias, M.T. Large gaps between consecutive prime numbers containing perfect k-th powers of prime numbers. J. Funct. Anal. 2017, 272, 2659–2696.
21. Ford, K.; Heath-Brown, D.R.; Konyagin, S. Large gaps between consecutive prime numbers containing perfect powers. In Analytic Number Theory: In Honor of Helmut Maier’s 60th Birthday; Springer: New York, NY, USA, 2015; pp. 83–92.
22. Erdős, P. The difference of consecutive primes. Duke Math. J. 1940, 6, 438–441.
23. De Bruijn, N.G. On the number of positive integers ≤x and free of prime factors ≥y. Indag. Math. 1951, 13, 50–60.
24. Pippenger, N.; Spencer, J. Asymptotic behavior of the chromatic index for hypergraphs. J. Combin. Theory Ser. A 1989, 51, 24–42.
25. Rödl, V. On a packing and covering problem. Eur. J. Comb. 1985, 6, 69–78.
26. Bombieri, E.; Davenport, H. Small differences between prime numbers. Proc. R. Soc. London. Ser. A Math. Phys. Sci. 1966, 293, 1–18.
27. Davenport, H. Multiplicative Number Theory, 3rd ed.; Graduate Texts in Mathematics; Springer: New York, NY, USA, 2000; Volume 74.
28. Broughan, K. Bounded Gaps Between Primes: The Epic Breakthroughs of the Early Twenty-First Century; Cambridge University Press: Cambridge, UK, 2021.
29. Maynard, J. Small gaps between primes. Ann. Math. 2015, 181, 383–413.
30. Zhang, Y. Bounded gaps between primes. Ann. Math. 2014, 179, 1121–1174.
31. Maier, H. Chains of large gaps between consecutive primes. Adv. Math. 1981, 39, 257–269.
32. Gallagher, P.X. A large sieve density estimate near σ=1. Invent. Math. 1970, 11, 329–339.
33. Friedlander, J.; Gong, K.; Shparlinski, I.E. Character sums over shifted primes. Math. Notes 2010, 88, 585–598.
34. Maier, H.; Rassias, M.T. Prime avoidance property of k-th powers of prime numbers with Beatty sequences. In Discrete Mathematics and Applications; Springer: Berlin/Heidelberg, Germany, 2020; pp. 397–404.
35. Maier, H.; Rassias, M.T. Prime Avoidance Property of k-th Powers of Piatetski–Shapiro Primes. arXiv 2023, arXiv:2306.16777.
36. Baker, R.C.; Banks, W.; Brüdern, J.; Shparlinski, I.E.; Weingartner, A. Piatetski–Shapiro sequences. Acta Arith. 2013, 157, 37–68.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

