Riemann Hypothesis and Random Walks: the Zeta case

In previous work it was shown that if certain series based on sums over primes of non-principal Dirichlet characters have a conjectured random walk behavior, then the Euler product formula for the corresponding $L$-function is valid to the right of the critical line, $\Re (s)>\tfrac{1}{2}$, and the Riemann Hypothesis for this class of $L$-functions follows. Building on this work, here we propose how to extend this line of reasoning to the Riemann zeta function and other principal Dirichlet $L$-functions. We apply these results to the study of the argument of the zeta function. In another application, we define and study a 1-point correlation function of the Riemann zeros, which leads to the construction of a probabilistic model for them. Based on these results we describe a new algorithm for computing very high Riemann zeros, and we calculate the googol-th zero, namely the $10^{100}$-th zero, to over 100 digits, far beyond what is currently known.


I. INTRODUCTION
There are many generalizations of Riemann's zeta function to other Dirichlet series, which are also believed to satisfy a Riemann Hypothesis. A common opinion, based largely on counterexamples, is that the $L$-functions for which the Riemann Hypothesis is true enjoy both an Euler product formula and a functional equation. However, a direct connection between these properties and the Riemann Hypothesis has not been formulated in a precise manner. In [1,2] a concrete proposal making such a connection was presented for Dirichlet $L$-functions, and those based on cusp forms, based on the validity of the Euler product formula to the right of the critical line. In contrast to the non-principal case, in this approach the case of principal Dirichlet $L$-functions, of which Riemann zeta is the simplest, turned out to be more delicate, and consequently it was more difficult to state precise results. In the present work we attempt to address this special case further, although, as we will explain, the results are not as conclusive as for the non-principal case. What is new here is a different way to understand the extent to which the truncated Euler product is a good approximation. We then use this to approximate the argument of the zeta function on the critical line. We also study 1-point statistics of the Riemann zeros, in contrast to the 2-point correlation functions that are widely studied.
Let $\chi(n)$ be a Dirichlet character modulo $k$ and $L(s,\chi)$ its $L$-function with $s = \sigma + it$. It satisfies the Euler product formula
$$L(s,\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^{s}} = \prod_{n=1}^{\infty} \left(1 - \frac{\chi(p_n)}{p_n^{s}}\right)^{-1} \tag{1}$$
where $p_n$ is the $n$-th prime. The above formula is valid for $\Re(s) > 1$ since both sides converge absolutely. The important distinction between principal versus non-principal characters is the following. For non-principal characters the $L$-function has no pole at $s = 1$, thus there exists the possibility that the Euler product is valid partway inside the strip, i.e. has abscissa of convergence $\sigma_c < 1$. It was proposed in [1,2] that $\sigma_c = \tfrac{1}{2}$ for this case. In contrast, now consider $L$-functions based on principal characters. The latter character is defined as $\chi(n) = 1$ if $n$ is coprime to $k$ and zero otherwise. The Riemann zeta function corresponds to the trivial principal character of modulus $k = 1$ with all $\chi(n) = 1$. $L$-functions based on principal characters do have a pole at $s = 1$, and therefore have abscissa of convergence $\sigma_c = 1$, which implies that the Euler product in the form given above strictly cannot be valid inside the critical strip $0 < \sigma < 1$. Nevertheless, in this paper we will show how a truncated version of the Euler product formula can be approximately valid for $\sigma > \tfrac{1}{2}$.

The primary aim of the work [1,2] was to determine what specific properties of the prime numbers would imply that the Riemann Hypothesis is true. This is the opposite of the more well-studied question of what the validity of the Riemann Hypothesis implies for the fluctuations in the distribution of primes. The answer proposed was simply based on the multiplicative independence of the primes, which to a large extent underlies their pseudo-random behavior. To be more specific, let $\chi(n) = e^{i\theta_n}$ for $\chi(n) \neq 0$. In [1,2] it was proven that if the series
$$B_N(t,\chi) = \sum_{n=1}^{N} \cos\left(t \log p_n + \theta_{p_n}\right) \tag{2}$$
is $O(\sqrt{N})$, then the Euler product converges for $\sigma > \tfrac{1}{2}$ and the formula (1) is valid to the right of the critical line.
In fact, we only need $B_N = O(\sqrt{N})$ up to logs (see Remark 1); when we write $O(\sqrt{N})$, it is implicit that this can be relaxed with logarithmic factors. For non-principal characters the allowed angles $\theta_n$ are equally spaced on the unit circle, and it was conjectured in [2] that the above series with $t = 0$ behaves like a random walk due to the multiplicative independence of the primes, and this is the origin of the $O(\sqrt{N})$ growth. Furthermore, this result extends to all $t$ since domains of convergence of Dirichlet series are always half-planes. Taking the logarithm of (1), one sees that $\log L$ is never infinite to the right of the critical line and thus $L$ has no zeros there. This, combined with the functional equation that relates $L(s)$ to $L(1 - s)$, implies there are also no zeros to the left of the critical line, so that all zeros are on the line. The same reasoning applies to cusp forms if one also uses a non-trivial result of Deligne [2].
In this article we reconsider the principal Dirichlet case, specializing to Riemann zeta itself since identical arguments apply to all other principal cases with $k > 1$. Here all angles $\theta_n = 0$, so one needs to consider the series
$$B_N(t) = \sum_{n=1}^{N} \cos(t \log p_n) \tag{3}$$
which now strongly depends on $t$. On the one hand, the case of principal Dirichlet $L$-functions is complicated by the existence of the pole, and, as we will see, one consequently needs to truncate the Euler product to make sense of it. On the other hand, $B_N$ can be estimated using the prime number theorem since it does not involve sums over non-trivial characters $\chi$, and this aids the analysis. This is in contrast to the non-principal case, where, however well-motivated, we had to conjecture the random walk behavior alluded to above, so in this respect the principal case is potentially simpler. To this end, a theorem of Kac (Theorem 1 below) nearly does the job: $B_N(t) = O(\sqrt{N})$ in the limit $t \to \infty$, which is also a consequence of the multiplicative independence of the primes. This suggests that one can also make sense of the Euler product formula in the limit $t \to \infty$. However, this is not enough for our main purpose, which is to have a similar result for finite $t$, which we will develop.
This article is mainly based on our previous work [1,2] but provides a more detailed analysis and extends it in several ways. It was suggested in [1] that one should truncate the series at an N that depends on t. First, in the next section we explain how a simple group structure underlies a finite Euler product which relates it to a generalized Dirichlet series which is a subseries of the Riemann zeta function. Subsequently we estimate the error under truncation, which shows explicitly how this error is related to the pole at s = 1, as expected. The remainder of the paper, sections IV-VI, presents various applications of these ideas. We use them to study the argument of the zeta function. We present an algorithm to calculate very high zeros, far beyond what is currently known. We also study the statistical fluctuations of individual zeros, in other words, a 1-point correlation function.
In many respects, our work is related to the work of Gonek et al. [5,6], which also considers a truncated Euler product. The important difference is that the starting point in [5] is a hybrid version of the Euler product which involves both primes and zeros of zeta. Only after assuming the Riemann Hypothesis can one explain in that approach why the truncated product over primes is a good approximation to zeta. In contrast, here we do not assume anything about the zeros of zeta, since the goal is to actually understand their location.
We are unable to provide fully rigorous proofs of some of the statements below; however, we do provide supporting calculations and numerical work. In order to be clear on this, below "Proposal" signifies the most important claims that we could not rigorously prove, and should not be taken as a "Proposition" in the usual formal mathematical sense.

II. ALGEBRAIC STRUCTURE OF FINITE EULER PRODUCTS
The aim of this section is to define properly the objects we will be dealing with. In particular we will place finite Euler products on the same footing as other generalized Dirichlet series. The results are straightforward and are mainly definitions.
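Definition 1. Fix a positive integer $N$ and let $\{p_1, p_2, \ldots, p_N\}$ denote the first $N$ primes. From this set one can generate an abelian group $Q_N$ of rank $N$ with elements
$$Q_N = \left\{ p_1^{n_1} p_2^{n_2} \cdots p_N^{n_N}, \quad n_i \in \mathbb{Z} \right\} \tag{4}$$
where the group operation is ordinary multiplication. Clearly $Q_N \subset \mathbb{Q}^+$, where $\mathbb{Q}^+$ are the positive rational numbers.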
There are an infinite number of integers in $Q_N$, which form a subset of the natural numbers $\mathbb{N} = \{1, 2, \ldots\}$. We will denote this set as $\mathbb{N}_N \subset \mathbb{N}$, and elements of this set simply as $n$.
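Definition 2. Fix a positive integer $N$. For every integer $n \in \mathbb{N}$ we can define the character $c(n)$:
$$c(n) = \begin{cases} 1 & \text{if } n \in \mathbb{N}_N, \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$
Since $\mathbb{N}_N$ is closed under multiplication and contains every divisor of each of its elements, $c$ is completely multiplicative.

Definition 3. Fix a positive integer $N$ and let $s$ be a complex number. Based on $Q_N$ we can define the infinite series
$$\zeta_N(s) = \sum_{n \in \mathbb{N}_N} \frac{1}{n^{s}} = \sum_{n=1}^{\infty} \frac{c(n)}{n^{s}} \tag{6}$$
which is a generalized Dirichlet series. There are an infinite number of terms in the above series since $\mathbb{N}_N$ is an infinite set. Because of the group structure of $Q_N$, $\zeta_N$ satisfies a finite Euler product formula:

Proposition 1. Let $\sigma_c$ be the abscissa of convergence of the series $\zeta_N(s)$ where $s = \sigma + it$, namely $\zeta_N(s)$ converges for $\Re(s) > \sigma_c$. Then in this region of convergence, $\zeta_N$ satisfies a finite Euler product formula:
$$\zeta_N(s) = \prod_{n=1}^{N} \left(1 - \frac{1}{p_n^{s}}\right)^{-1} \tag{7}$$

Proof. Based on the completely multiplicative property of the characters, one has
$$\zeta_N(s) = \prod_{n=1}^{\infty} \left(1 - \frac{c(p_n)}{p_n^{s}}\right)^{-1} \tag{8}$$
The result then follows from the fact that $c(p_n) = 0$ if $n > N$.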
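Example 1. Let $N = 1$, so that $\mathbb{N}_1 = \{1, 2, 4, 8, \ldots\}$ consists of the powers of 2. Then the above Euler product formula (7) is simply the standard formula for the sum of a geometric series:
$$\zeta_1(s) = \sum_{n=0}^{\infty} \frac{1}{2^{ns}} = \frac{1}{1 - 2^{-s}} \tag{9}$$
Here the abscissa of convergence is $\sigma_c = 0$.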
The series $\zeta_N(s)$ defined in (6) has some interesting properties:
(i) For finite $N$ the product is finite for $s \neq 0$, thus the infinite series $\zeta_N(s)$ converges for $\Re(s) > 0$ for any finite $N$.
(ii) Since the logarithm of the product is finite, for finite $N$, $\zeta_N(s)$ has neither zeros nor poles for $\Re(s) > 0$. Thus the Riemann zeros and the pole at $s = 1$ arise from the primes at infinity $p_\infty$, i.e. in the limit $N \to \infty$. In this limit all integers are included in the sum (6) that defines $\zeta_N$ since $\mathbb{N}_\infty = \mathbb{N}$. This is in accordance with the fact that the pole is a consequence of there being an infinite number of primes.
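As a small numerical illustration of the finite Euler product for the series over integers built from the first $N$ primes, the following Python sketch compares a partial sum of that series with the finite product. It is only a check under stated assumptions: the helper names (`smooth_numbers`, `zeta_N_series`, `zeta_N_product`) and the cutoff `limit` are hypothetical choices made here, and the series is necessarily truncated at `limit`, so agreement is approximate.

```python
def smooth_numbers(primes, limit):
    """Return all integers <= limit whose prime factors all lie in `primes`."""
    numbers = [1]
    for p in primes:
        extended = []
        for m in numbers:
            # Append m, m*p, m*p^2, ... while staying below the cutoff.
            while m <= limit:
                extended.append(m)
                m *= p
        numbers = extended
    return sorted(numbers)


def zeta_N_series(primes, s, limit):
    """Partial sum of sum_{n in N_N} n^(-s), truncated at n <= limit."""
    return sum(n ** (-s) for n in smooth_numbers(primes, limit))


def zeta_N_product(primes, s):
    """Finite Euler product prod_p (1 - p^(-s))^(-1) over the given primes."""
    result = 1.0
    for p in primes:
        result /= 1.0 - p ** (-s)
    return result


# Example with N = 3, i.e. the primes 2, 3, 5, at s = 2.
first_three = [2, 3, 5]
series_value = zeta_N_series(first_three, 2, 10**6)
product_value = zeta_N_product(first_three, 2)
print(series_value, product_value)
```

Both functions also accept a complex `s` with positive real part, although the partial sum converges more slowly as the real part of `s` decreases.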
The property (ii) implies that, in some sense, the Riemann zeros condense out of the primes at infinity $p_\infty$. Formally one has
$$\lim_{N \to \infty} \zeta_N(s) = \zeta(s). \tag{10}$$
However, since $N$ is going to infinity, the above is true only where the series formally converges as a Dirichlet series, which, as discussed in the Introduction, is $\Re(s) > 1$. Nevertheless, for very large but finite $N$, the function $\zeta_N$ can still be a good approximation to $\zeta(s)$ inside the critical strip, since for finite $N$ the series $\zeta_N(s)$ converges for $\Re(s) > 0$. This is the subject of the next section, where we show that a finite Euler product formula is valid for $\Re(s) > \tfrac{1}{2}$ in a manner that we will specify.

III. FINITE EULER PRODUCT FORMULA AT LARGE N TO THE RIGHT OF THE CRITICAL LINE
In this section we propose that the Euler product formula can be a very good approximation to $\zeta(s)$ for $\Re(s) > \tfrac{1}{2}$ and large $t$ if $N$ is chosen to depend on $t$ in a specific way which was already proposed in [1,2]. The new result presented here is an estimate of the error due to the truncation.
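The random walk property we will build upon is based on a central limit theorem of Kac [4], which largely follows from the multiplicative independence of the primes:

Theorem 1. (Kac) Let $u$ be a random variable uniformly distributed on the interval $[-T, T]$, and define the series
$$B_N(u) = \sqrt{2} \sum_{n=1}^{N} \cos(u \log p_n). \tag{11}$$
Then in the limit $N \to \infty$ and $T \to \infty$, $B_N/\sqrt{N}$ approaches the normal distribution $\mathcal{N}(0, 1)$, namely
$$\lim_{N \to \infty} \lim_{T \to \infty} P\left\{ x_1 < \frac{B_N(u)}{\sqrt{N}} < x_2 \right\} = \frac{1}{\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-x^2/2}\, dx \tag{12}$$
where $P$ denotes the probability for the set. The factor $\sqrt{2}$ normalizes each term to unit variance; it does not affect the $O(\sqrt{N})$ growth discussed below.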
We wish to use the above theorem to conclude something about $B_N(t)$ for a fixed, non-random $t$. Based on Theorem 1, we suggest the following for non-random but large $t$: for any $\epsilon > 0$,
$$B_N(t) = O\left(N^{1/2 + \epsilon}\right) \quad \text{as } t \to \infty. \tag{13}$$
We could not rigorously prove this statement; however, we can provide a heuristic argument. As $T \to \infty$, even though $u$ is random, the vast majority of its values tend to $\infty$. One then uses the normal distribution in Theorem 1. In the following we will provide indirect numerical evidence.

Remark 1. The proof of convergence of the Euler product in [2] is not spoiled if the bound on $B_N$ is relaxed up to logs. For instance, if in the limit $t \to \infty$, $B_N = O(\sqrt{N \log\log N})$, as suggested by the law of iterated logarithms relevant to central limit theorems, this is fine, as is $B_N = O(\sqrt{N} \log^a N)$ for any positive power $a$.

A consequence of Theorem 1 and the comments following it is that the Euler product formula is valid to the right of the critical line in the limit $t \to \infty$, at least formally. Namely, for $\sigma > \tfrac{1}{2}$,
$$\lim_{t \to \infty} \zeta(s) = \lim_{t \to \infty} \lim_{N \to \infty} \prod_{n=1}^{N} \left(1 - \frac{1}{p_n^{s}}\right)^{-1}. \tag{14}$$
As shown in [1,2] and discussed in the Introduction, this formally follows from the $\sqrt{N}$ growth of $B_N$. The problem with the above formula is that, due to the double limit on the RHS, it is not rigorously defined. For instance, it could depend on the order of limits. It is thus desirable to have a version of (14) where $N$ and $t$ are taken to infinity simultaneously. Namely, we wish to truncate the product at an $N(t)$ that depends on $t$ with the property that $\lim_{t \to \infty} N(t) = \infty$. One can then replace the double limit on the RHS of (14) with one limit $t \to \infty$, or equivalently $N(t) \to \infty$.

There is no unique choice for $N(t)$, but there is an optimal upper limit, $N(t) < N_{\max}(t) \equiv [t^2]$, with $[t^2]$ the integer part of $t^2$, which we now describe. We can use the prime number theorem to estimate $B_N(t)$:
$$B_N(t) \approx \int_2^{p_N} \frac{\cos(t \log x)}{\log x}\, dx \approx \Re\, \mathrm{Ei}\big((1 + it) \log p_N\big) \approx \frac{p_N}{t \log p_N} \sin(t \log p_N) \tag{15}$$
where Ei is the usual exponential-integral function, and we have used
$$\mathrm{Ei}(z) = \frac{e^{z}}{z} \left(1 + O\left(\frac{1}{z}\right)\right). \tag{16}$$
The prime number theorem implies $p_N \approx N \log N$. Using this in (15) and imposing $B_N(t) < \sqrt{N}$ leads to $N < [t^2]$.
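The growth of $B_N(t)$ relative to $\sqrt{N}$ and to the cut-off $[t^2]$ can be explored numerically. The Python sketch below computes $B_N(t) = \sum_{n \le N} \cos(t \log p_n)$ directly and prints it next to $\sqrt{N}$ and a smooth prime-number-theorem estimate $p_N \sin(t \log p_N)/(t \log p_N)$. The function names, the choice $t = 100$, and the sample values of $N$ are assumptions made for illustration only; the smooth estimate is a leading-order approximation and is not expected to track the fluctuations.

```python
import math


def first_primes(count):
    """Return the first `count` primes using a sieve of Eratosthenes."""
    limit = 16
    while True:
        sieve = bytearray([1]) * (limit + 1)
        sieve[0] = sieve[1] = 0
        for i in range(2, int(limit ** 0.5) + 1):
            if sieve[i]:
                sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
        primes = [i for i, flag in enumerate(sieve) if flag]
        if len(primes) >= count:
            return primes[:count]
        limit *= 2  # not enough primes yet: enlarge the sieve


def B(primes, t):
    """Direct evaluation of sum_n cos(t log p_n) over the given primes."""
    return sum(math.cos(t * math.log(p)) for p in primes)


def B_smooth(primes, t):
    """Leading-order smooth estimate p_N sin(t log p_N) / (t log p_N)."""
    p_N = primes[-1]
    return p_N * math.sin(t * math.log(p_N)) / (t * math.log(p_N))


t = 100.0
N_max = int(t * t)  # the cut-off [t^2]
all_primes = first_primes(40000)
for N in (100, 1000, 10000, 40000):
    primes = all_primes[:N]
    print(N, N < N_max, B(primes, t), math.sqrt(N), B_smooth(primes, t))
```

Comparing rows with $N$ below and above $[t^2]$ gives a rough picture of where the smooth term starts to compete with the $\sqrt{N}$ scale.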
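Based on the above, henceforth we will always assume the following properties of $N(t)$:
$$N(t) < [t^2], \qquad \lim_{t \to \infty} N(t) = \infty \tag{17}$$
and will not always display the $t$ dependence of $N$. Equation (14) then becomes
$$\lim_{t \to \infty} \zeta(s) = \lim_{t \to \infty} \prod_{n=1}^{N(t)} \left(1 - \frac{1}{p_n^{s}}\right)^{-1}, \qquad \Re(s) > \tfrac{1}{2}. \tag{18}$$
Extensive and compelling numerical evidence supporting the above formula was already presented in [1]. Based on the above results we are now in a position to study the following important question. If we fix a finite but large $t$, and truncate the Euler product at $N(t)$, which is finite, what is the error in the approximation to $\zeta$ to the right of the critical line? We estimate this error as follows:

Proposal 1. Let $N = N(t)$ satisfy (17). Then for $\Re(s) > \tfrac{1}{2}$ and large $t$,
$$\zeta(s) = \prod_{n=1}^{N(t)} \left(1 - \frac{1}{p_n^{s}}\right)^{-1} \exp\big(R_N(s)\big) \tag{19}$$
where $\zeta(s)$ is the actual $\zeta$ function defined by analytic continuation and $R_N$ is finite (except at the pole $s = 1$) and satisfies
$$R_N(s) \approx \frac{1}{s - 1}\, \frac{N^{1-s}}{\log^{s} N}, \tag{20}$$
$$\lim_{t \to \infty} R_N(s) = 0, \tag{21}$$
namely the error goes to zero as $t \to \infty$.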
We provide the following supporting argument, although not a rigorous proof, for this Proposal. From (18), one concludes that (19) must hold in the limit of large $t$ with $R_N$ satisfying (21). The logarithm of (19) reads
$$\log \zeta(s) = -\sum_{n=1}^{N(t)} \log\left(1 - \frac{1}{p_n^{s}}\right) + R_N(s). \tag{22}$$
First assume $\Re(s) > 1$. Then in the limit of large $t$, the error upon truncation is the part that is neglected in (18):
$$R_N(s) = -\sum_{n=N+1}^{\infty} \log\left(1 - \frac{1}{p_n^{s}}\right). \tag{23}$$
Expanding out the logarithm, one has
$$R_N(s) \approx \sum_{n=N+1}^{\infty} \frac{1}{p_n^{s}} \approx \int_{p_N}^{\infty} \frac{dx}{x^{s} \log x} = -\mathrm{Ei}\big((1 - s) \log p_N\big) \approx \frac{p_N^{1-s}}{(s - 1) \log p_N} \tag{24}$$
where in the second step we again used the prime number theorem to approximate the sum over primes. Next, using $p_N \approx N \log N$, one obtains (20). Finally, the above expression can be continued into the strip $\sigma > \tfrac{1}{2}$ if $N(t) < [t^2]$, since $|N(t)^{1-s}|/t < |N^{1/2 - s}|$, which goes to zero as $N \to \infty$ if $\Re(s) > \tfrac{1}{2}$. The latter also implies (21). Proposal 1 makes it clear that the need for a cut-off $N < N_{\max}$ originates from the pole at $s = 1$, since as long as $s \neq 1$, the error $R_N(s)$ in (20) is finite. The error becomes smaller and smaller the further one is from the pole, i.e. as $t \to \infty$. In Figure 1 we numerically illustrate Proposal 1 inside the critical strip.

Remark 2. For estimating errors at large $t$ the following formula is useful:

Theorem 2. Assuming Proposal 1, all non-trivial zeros of $\zeta(s)$ are on the critical line.
Proof. Taking the logarithm of the truncated Euler product, one obtains (22). If there were a zero $\rho$ with $\Re(\rho) > \tfrac{1}{2}$, then $\log \zeta(\rho) = -\infty$. However, the right hand side of (22) is always finite, thus there are no zeros to the right of the critical line. The functional equation relating $\zeta(s)$ to $\zeta(1 - s)$ shows there are also no zeros to the left of the critical line.
Remark 3. Interestingly, Proposal 1 and Theorem 2 imply that proving the validity of the Riemann Hypothesis is under better control the higher one moves up the critical line. For instance, it is known that all zeros are on the line up to $t \sim 10^{13}$, and beyond this, the error $R_N$ is too small to spoil the validity of the Riemann Hypothesis. Henceforth, we assume the RH.
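The size of the truncation error can be checked directly in the region $\Re(s) > 1$, where the supporting argument for Proposal 1 starts. The Python sketch below does this at $s = 2$, where $\zeta(2) = \pi^2/6$ is known in closed form, and compares the actual error $\log \zeta(2) + \sum_{n \le N} \log(1 - p_n^{-2})$ with the leading-order estimate $p_N^{1-s}/((s-1)\log p_N)$. The function names and the values of $N$ are assumptions for illustration, and the estimate is only the leading asymptotic term, so agreement to within roughly ten percent is all that should be expected.

```python
import math


def first_primes(count):
    """Return the first `count` primes using a sieve of Eratosthenes."""
    limit = 16
    while True:
        sieve = bytearray([1]) * (limit + 1)
        sieve[0] = sieve[1] = 0
        for i in range(2, int(limit ** 0.5) + 1):
            if sieve[i]:
                sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
        primes = [i for i, flag in enumerate(sieve) if flag]
        if len(primes) >= count:
            return primes[:count]
        limit *= 2


def truncation_error_at_2(primes):
    """Actual error R_N(2) = log zeta(2) - log of the truncated Euler product."""
    log_zeta_2 = math.log(math.pi ** 2 / 6.0)
    log_product = -sum(math.log1p(-p ** (-2.0)) for p in primes)
    return log_zeta_2 - log_product


def leading_estimate(p_N, s):
    """Leading-order estimate p_N^(1-s) / ((s-1) log p_N) of the error."""
    return p_N ** (1.0 - s) / ((s - 1.0) * math.log(p_N))


for N in (10, 100, 1000, 10000):
    primes = first_primes(N)
    actual = truncation_error_at_2(primes)
    estimate = leading_estimate(primes[-1], 2.0)
    print(N, primes[-1], actual, estimate, actual / estimate)
```

This check says nothing about the continuation into the strip $\tfrac{1}{2} < \sigma < 1$, which is the non-rigorous part of the argument.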

IV. 1-POINT CORRELATION FUNCTION OF THE RIEMANN ZEROS
Montgomery conjectured that the pair correlation function of ordinates of the Riemann zeros on the critical line satisfies GUE statistics [9]. Being a 2-point correlation function, it is a reasonably complicated statistic. In this section we propose a simpler 1-point correlation function that captures the statistical fluctuations of individual zeros.
Let $t_n$ be the exact ordinate of the $n$-th zero on the critical line, with $t_1 = 14.1347\ldots$ and so forth. The single equation $\zeta(\rho) = 0$ is known to have an infinite number of non-trivial solutions $\rho = \tfrac{1}{2} + i t_n$. In [7], by placing the zeros in one-to-one correspondence with the zeros of a cosine function, the single equation $\zeta(\rho) = 0$ was replaced by an infinite number of equations, one for each $t_n$, that depends only on $n$:
$$\vartheta(t_n) + \lim_{\delta \to 0^+} \arg \zeta\left(\tfrac{1}{2} + \delta + i t_n\right) = \left(n - \tfrac{3}{2}\right) \pi \tag{26}$$
where $\vartheta$ is the Riemann-Siegel function:
$$\vartheta(t) = \Im \log \Gamma\left(\tfrac{1}{4} + \tfrac{it}{2}\right) - t \log \sqrt{\pi}. \tag{27}$$
The equation (26) involves the important function
$$S(t) = \frac{1}{\pi} \lim_{\delta \to 0^+} \arg \zeta\left(\tfrac{1}{2} + \delta + it\right). \tag{28}$$
It is important that the limit $\delta \to 0^+$ approaches the critical line from the right, since this is where the Euler product formula is valid in the sense described above. This equation was used to calculate zeros very accurately in [7], up to thousands of digits. There is no need for a cut-off $N_{\max}$ in the above equation since the $\arg \zeta$ term is defined for arbitrarily high $t$ by standard analytic continuation. One aspect of this equation is the following theorem:

Theorem 3. If there is a unique solution to equation (26) for every positive integer $n$, then the Riemann Hypothesis is true and all zeros are simple.

Remark 4. Details of the proof are in [7]. The main idea is that if there is a unique solution, then the zeros are enumerated by the integer $n$ and can be counted along the critical line, and the resulting counting formula coincides with a well known result due to Backlund for the number of zeros in the entire critical strip. The zeros are simple because the zeros of the cosine are simple.

The above theorem is another approach towards proving the Riemann Hypothesis; however, it is not entirely independent of the above approach based on the Euler product formula, in particular Theorem 2. In [7], we were unable to prove there is a unique solution because we did not have sufficient control over the relevant properties of the argument of $\zeta$ on the critical line.
If the $\arg \zeta$ term is ignored, then there is indeed a unique solution for all $n$ since $\vartheta(t)$ is a monotonically increasing function of $t$. Using its asymptotic expansion for large $t$, equation (34) below, and dropping the $O(1/t)$ term, the solution is
$$\widetilde{t}_n = \frac{2\pi \left(n - \tfrac{11}{8}\right)}{W\left[e^{-1} \left(n - \tfrac{11}{8}\right)\right]} \tag{29}$$
where $W$ is the Lambert $W$-function. The only way there would fail to be a solution is if $S(t)$ is not well defined for all $t$. We point out that the Lambert function was used in connection with the Riemann zeros in [8]; however, the meaning does not seem to be the same as in this article.

The fluctuations in the zeros come from $\arg \zeta$ since $\widetilde{t}_n$ is a smooth function of $n$. These small fluctuations are shown in Figure 2. Let us define $\delta t_n = t_n - \widetilde{t}_n$. One needs to properly normalize $\delta t_n$, taking into account that the spacing between zeros decreases as $2\pi/\log n$. To this end we expand the equation (26) around $\widetilde{t}_n$. Using $\vartheta(\widetilde{t}_n) \approx (n - \tfrac{3}{2})\pi$, one obtains $\delta t_n \approx -\pi S(t_n)/\vartheta'(\widetilde{t}_n)$, where $\vartheta'(t)$ is the derivative with respect to $t$. Using $\vartheta'(t) \approx \tfrac{1}{2} \log(t/2\pi e)$, valid to leading order in $\log t$ at large $t$, this leads us to define
$$\delta_n \equiv \left(t_n - \widetilde{t}_n\right) \frac{1}{2\pi} \log\left(\frac{\widetilde{t}_n}{2\pi e}\right) \approx -S(t_n). \tag{30}$$
The probability distribution of the set
$$\Delta_M \equiv \{\delta_1, \delta_2, \ldots, \delta_M\} \tag{31}$$
for large $M$ is then an interesting property to study. Here "probability" is defined as frequency of occurrence. The origin of the statistical fluctuations of $\Delta_M$ is ultimately the fluctuations in the primes. In Figure 3 we plot the distribution of $\Delta_M$ for $M = 10^5$. It closely resembles a normal distribution.

Let us suppose $\Delta_M$ does indeed satisfy a normal distribution $\mathcal{N}(\mu, \sigma_1)$. Using some known properties of $S(t_n)$, together with the equation (30), we can then propose the following. First, one expects that the average of $\delta_n$ is zero since it is known that the average of $S(t)$ is zero, thus $\mu = 0$. Up to the height $t$ that we have studied, $S(t)$ is nearly always on the principal branch, i.e. $-1 < S(t) < 1$, up to some reasonably high $t$ on the order of $t = 10^6$ or more. Then at each jump by 1 at $t_n$, on average $S(t_n)$ passes through zero. This implies that the average $|S(t_n)| \approx 1/4$.
For a normal distribution, the average of $|S(t_n)|$ equals $\sqrt{2/\pi}\, \sigma_1$. Thus one expects the standard deviation $\sigma_1$ of $\Delta_M$ to be $\sigma_1 \approx \sqrt{\pi/32} = 0.313\ldots$ In Figure 3 we present results for the first $10^5$ known exact zeros. The distribution function fits a normal distribution with $\sigma_1 = \sqrt{\pi/32}$ rather well. Performing a fit, one finds $\sigma_1 \approx 0.27$. For higher values of $M$ around $10^6$, a fit gives $\sigma_1 \approx 0.3$, which is closer to the predicted value. We emphasize, however, that this approximate prediction for $\sigma_1$ assumes $S(t)$ is on the principal branch, which is not expected to hold for arbitrarily high $t$.
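The smooth approximation to the $n$-th zero in terms of the Lambert $W$-function is easy to evaluate numerically. The Python sketch below assumes the form $2\pi(n - 11/8)/W[(n - 11/8)/e]$, which is what one obtains by solving $\tfrac{t}{2}\log\tfrac{t}{2\pi e} - \tfrac{\pi}{8} = (n - \tfrac{3}{2})\pi$ for $t$. Since the Python standard library has no Lambert function, `lambert_w` below is a hypothetical helper implemented with Newton's iteration on the principal branch; it is a sketch, not a production-quality implementation.

```python
import math


def lambert_w(x, tol=1e-14, max_iter=100):
    """Principal branch W(x) for x > -1/e, by Newton iteration on w*exp(w) = x."""
    if x <= -1.0 / math.e:
        raise ValueError("x must be greater than -1/e")
    # Asymptotic starting guess for large x avoids overflow in exp(w).
    w = math.log(x) - math.log(math.log(x)) if x > math.e else 0.0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (1.0 + w))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def t_tilde(n):
    """Smooth (Lambert) approximation to the ordinate of the n-th zero."""
    a = n - 11.0 / 8.0
    return 2.0 * math.pi * a / lambert_w(a / math.e)


# Compare with the first zero, t_1 = 14.1347..., quoted in the text.
print(t_tilde(1))
# The same formula at n = 10^100 gives the leading digits of a very high zero.
print(t_tilde(1e100))
```

In double precision only the leading 15 or so digits of `t_tilde(1e100)` are meaningful; resolving the fluctuation of an individual zero at that height requires arbitrary-precision arithmetic.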
If we approximate the distribution of $\Delta_M$ as normal, then we can construct a simple probabilistic model of the Riemann zeros:

Definition 4.
A probabilistic model of the Riemann zeros.
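Let $r$ be a random variable with normal distribution $\mathcal{N}(0, \sigma_1)$. Then a probabilistic model of the zeros $t_n$ can be defined as the set $\{\widehat{t}_n\}$, where
$$\widehat{t}_n = \widetilde{t}_n + \frac{2\pi r}{\log\left(\widetilde{t}_n / 2\pi e\right)} \tag{32}$$
and $\widetilde{t}_n$ is defined in (29). In the above formula $r$ is chosen at random independently for each $n$.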
The statistical model (32) is rather simplistic since it is just based on a normal distribution for $r$, and $\widetilde{t}_n$ is smooth and completely deterministic. A natural question then arises: does the pair correlation function of $\{\widehat{t}_n\}$ satisfy GUE statistics, as do the actual zeros $\{t_n\}$? It is certainly interesting to study the 2-point correlation function of $\{\widehat{t}_n\}$. Montgomery's pair correlation conjecture can be stated as follows. Let $N(T)$ denote the number of zeros up to height $T$, where $N(T) \approx \frac{T}{2\pi} \log \frac{T}{2\pi e}$. Let $t, t'$ denote zeros in the range $[0, T]$. Then in the limit of large $T$:
$$\frac{1}{N(T)} \sum_{\alpha < d(t, t') \le \beta} 1 \;\sim\; \int_{\alpha}^{\beta} du \left(1 - \frac{\sin^2(\pi u)}{\pi^2 u^2}\right) \tag{33}$$
where $d(t, t')$ is a normalized distance between zeros, $d(t, t') = \frac{1}{2\pi} \log\left(\frac{T}{2\pi e}\right) (t - t')$. In Figure 4 we plot the pair correlation function for the first $10^5$ values of $\widehat{t}_n$. We chose $\sigma_1 = 0.274$ since in this range of $n$ this gives a better fit to the normal distribution of the 1-point function. The results are reasonably close to the GUE prediction (33), especially considering that for just the first $10^5$ true zeros the fit to the GUE prediction is not perfect; for much higher zeros it is significantly better [10].

V. COMPUTING VERY HIGH ZEROS FROM THE PRIMES
This section can be viewed as providing additional numerical evidence for some of the previous results. We will be calculating $S(t)$ from the primes using the truncated Euler product. Since this requires $\Re(s) \to \tfrac{1}{2}^+$, this is pushing the limit of the validity of the truncated Euler product formula; nevertheless, we will obtain reasonable results. We emphasize that this method has nothing to do with the random model for the zeros in Definition 4, but rather relies on the Euler product formula to calculate $S(t)$.
Many very high zeros of $\zeta$ have been computed numerically, beginning with the work of Odlyzko. All zeros up to the $10^{13}$-th have been computed and are all on the critical line [11]. Beyond this the computation of zeros remains a challenging open problem. However, some zeros around the $10^{21}$-st and $10^{22}$-nd are known [12]. In this section we describe a new and simple algorithm for computing very high zeros based on the above reasoning. It will allow us to go much higher than the known zeros since it does not require numerical implementation of the $\zeta$ function itself, but rather only requires knowledge of some of the lower primes.
Let us first discuss the numerical challenges involved in computing high zeros from the equation (26) based on the standard Mathematica package. The main difficulty is that one needs to implement the $\arg \zeta$ term. Mathematica computes $\mathrm{Arg}\, \zeta$, i.e. on the principal branch; however, near a zero this is likely to be valid based on the discussion in section IV. The main problem is that Mathematica can only compute $\zeta$ for $t$ below some maximum value around $t = 10^{10}$. This was sufficient to calculate up to the $n = 10^9$-th zero from (26) in [7]. The $\log \Gamma$ term must also be implemented to very high $t$, which is also limited in Mathematica. We deal with these difficulties first by computing $\arg \zeta$ from the Euler product formula involving a finite sum over primes. Then, the $\log \Gamma$ term can be accurately computed using corrections to Stirling's formula:
$$\vartheta(t) = \frac{t}{2} \log\left(\frac{t}{2\pi e}\right) - \frac{\pi}{8} + \frac{1}{48 t} + O\left(\frac{1}{t^3}\right). \tag{34}$$
Let $t_{n;N}$ denote the ordinate of the $n$-th zero computed using the first $N$ primes based on (26). For high zeros, it is approximately the solution to the following equation
$$\frac{t_{n;N}}{2} \log\left(\frac{t_{n;N}}{2\pi e}\right) - \frac{\pi}{8} + \frac{1}{48\, t_{n;N}} - \lim_{\delta \to 0^+} \Im \sum_{k=1}^{N} \log\left(1 - \frac{1}{p_k^{1/2 + \delta + i t_{n;N}}}\right) = \left(n - \tfrac{3}{2}\right) \pi \tag{35}$$
where it is implicit that $N < N_{\max}(t) = [t^2]$. The important property of this equation is that it no longer makes any reference to $\zeta$ itself. It is straightforward to solve the above equation with standard root-finder software, such as FindRoot in Mathematica.

One can view the computation of $t_n$ as a kind of Markov process. If one includes no primes, i.e. $N = 0$, and drops the next to leading $1/t$ corrections, then the solution is unique and explicitly given by $t_{n;0} = \widetilde{t}_n$ in terms of the Lambert $W$-function in (29). One then goes from $t_{n;0}$ to $t_{n;1}$ by finding the root of the equation for $t_{n;1}$ in the vicinity of $t_{n;0}$, then similarly $t_{n;2}$ is calculated based on $t_{n;1}$, and so forth. At each step in the process one includes one additional prime, and this slowly approaches $t_n$, so long as $N(t) < N_{\max}(t)$. In practice we did not follow this iterative procedure, but rather fixed $N$ and simply solved (35) in the vicinity of $\widetilde{t}_n$.
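The procedure just described can be sketched in Python for a low zero, where the answer is easy to cross-check. The sketch assumes the equation to be solved has the form $\vartheta(t) - \Im \sum_{k \le N} \log(1 - p_k^{-1/2-\delta-it}) = (n - \tfrac{3}{2})\pi$, with $\vartheta$ replaced by its Stirling expansion $\tfrac{t}{2}\log\tfrac{t}{2\pi e} - \tfrac{\pi}{8} + \tfrac{1}{48t}$. All function names, the grid-scan-plus-bisection root search (used here in place of Mathematica's FindRoot), and the choices $n = 100$, $N = 1000$ are assumptions made for illustration; this is not the authors' code.

```python
import cmath
import math


def first_primes(count):
    """Return the first `count` primes using a sieve of Eratosthenes."""
    limit = 16
    while True:
        sieve = bytearray([1]) * (limit + 1)
        sieve[0] = sieve[1] = 0
        for i in range(2, int(limit ** 0.5) + 1):
            if sieve[i]:
                sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
        primes = [i for i, flag in enumerate(sieve) if flag]
        if len(primes) >= count:
            return primes[:count]
        limit *= 2


def theta_stirling(t):
    """Riemann-Siegel theta from the first terms of its Stirling expansion."""
    return 0.5 * t * math.log(t / (2 * math.pi * math.e)) - math.pi / 8 + 1 / (48 * t)


def arg_zeta_from_primes(t, primes, delta=1e-6):
    """Approximate arg zeta(1/2 + delta + it) by -Im sum log(1 - p^(-s))."""
    total = 0.0
    for p in primes:
        term = p ** (-(0.5 + delta)) * cmath.exp(-1j * t * math.log(p))
        total -= cmath.log(1 - term).imag
    return total


def residual(t, n, primes):
    """Left side minus right side of the transcendental equation."""
    return theta_stirling(t) + arg_zeta_from_primes(t, primes) - (n - 1.5) * math.pi


def smooth_zero(n):
    """Solution with no primes included (N = 0), found by bisection."""
    target = (n - 1.5) * math.pi
    lo, hi = 10.0, 20.0
    while theta_stirling(hi) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if theta_stirling(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def zero_from_primes(n, primes, grid=400):
    """Search for a root within half a mean spacing of the smooth solution."""
    t0 = smooth_zero(n)
    half_width = math.pi / math.log(t0 / (2 * math.pi))
    ts = [t0 - half_width + 2 * half_width * k / grid for k in range(grid + 1)]
    fs = [residual(t, n, primes) for t in ts]
    for k in range(grid):
        if fs[k] < 0.0 <= fs[k + 1]:  # upward crossing: refine by bisection
            lo, hi = ts[k], ts[k + 1]
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if residual(mid, n, primes) < 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
    # No crossing found: fall back to the grid point with the smallest residual.
    best = min(range(grid + 1), key=lambda k: abs(fs[k]))
    return ts[best]


primes = first_primes(1000)  # N = 1000 is well below t^2 for t near 236
print(smooth_zero(100), zero_from_primes(100, primes))
```

For comparison, the tabulated ordinate of the 100th zero is $t_{100} \approx 236.524$. The same structure applies at very large $t$, but there the arithmetic must be carried out in arbitrary precision, which this double-precision sketch does not attempt.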
We can estimate the error in computing the zero $t_n$ from the primes using equation (35) as follows. As in Section IV, we expand the equation (26), now around $t_{n;N}$ rather than $\widetilde{t}_n$. One obtains $t_n - t_{n;N} = -\pi\, dS_N/\vartheta'(t_{n;N})$, where $dS_N$ is the error in computing $S(t)$ from the primes. Using (24), we have
$$dS_N = \frac{1}{\pi} \Im R_N\left(\tfrac{1}{2} + it\right) \approx -\frac{1}{\pi}\, \frac{\sqrt{p_N}}{t \log p_N} \cos(t \log p_N).$$
Now from the prime number theorem, $p_N \approx N \log N$. Recall $N$ is cut off at $N_{\max} = [t^2]$, which cancels the $1/t$ in the previous formula. Finally, it is meaningful to normalize the error by the mean spacing $2\pi/\log n$. The result is
$$\frac{t_n - t_{n;N}}{2\pi/\log n} \approx \frac{1}{\pi \sqrt{\log N}} \cos(t_n \log p_N) \tag{36}$$
where we have used $t_{n;N} \approx t_n \approx 2\pi n/\log n$. The left hand side represents the ratio of the error to the mean spacing between zeros at that height. Again, it is implicit that $N < [t_n^2]$. The interesting aspect of the above formula is that the relative error decreases with $N$, although rather slowly. The cosine factor also implies there are large scale oscillations around the actual $t_n$.

[Table I: zeros computed from (35) with $N = 5 \times 10^6$ primes; $\delta = 10^{-6}$ fixed; $\sim$ denotes the integer part of the second column.]
For very high $t$, $N_{\max}(t) = [t^2]$ is extremely large and it is not possible in practice to work with such a large number of primes. This is the primary limitation to the accuracy we can obtain. We will limit ourselves to the relatively small $N = 5 \times 10^6$ primes. Let us verify the method by comparing with some known zeros around $n = 10^{21}$ and $10^{22}$. The results are shown in Table I. Equation (36) predicts $t_n - t_{n;N} \approx 0.01$ for these $n$ and $N$, and inspection of the table shows this is a good estimate. Odlyzko was of course able to calculate more digits; our accuracy can in principle be improved by increasing $N$. We also checked some zeros around the $n = 10^{33}$-rd computed by Hiary [13], again with favorable results.
Having made this check, let us now go far beyond this and compute the $n = 10^{100}$-th zero by the same method. Again using only $N = 5 \times 10^6$ primes, we found the following $t_n$:

$n = 10^{100}$-th zero: $t_n$ = 280690383842894069903195445838256400084548030162846045192360059224930922349073043060335653109252473.244....

Obtaining this number took only a few minutes on a laptop using Mathematica. We are confident that the last 3 digits $\sim .244$ are correct since we checked that they did not change between $N = 10^6$ and $5 \times 10^6$. Furthermore, 3 digits is consistent with (36), which predicts that for these $n$ and $N$, $t_n - t_{n;N} \approx 0.002$. We calculated the next zero to be $\sim .273$.
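The two error estimates quoted above follow from simple arithmetic on the amplitude of (36), namely the mean spacing $2\pi/\log n$ multiplied by $1/(\pi\sqrt{\log N})$. The Python sketch below reproduces them; the function name and the convention of passing $\log_{10} n$ (to avoid floating-point overflow for very large $n$) are choices made here for illustration.

```python
import math


def error_amplitude(log10_n, num_primes):
    """Amplitude of t_n - t_{n;N} from (36): mean spacing / (pi * sqrt(log N))."""
    log_n = log10_n * math.log(10.0)
    mean_spacing = 2.0 * math.pi / log_n
    return mean_spacing / (math.pi * math.sqrt(math.log(num_primes)))


# n = 10^21 and n = 10^100, both with N = 5 * 10^6 primes.
print(error_amplitude(21, 5e6))
print(error_amplitude(100, 5e6))
```

The cosine factor in (36) only reduces the magnitude, so these values are upper envelopes of the estimated error rather than the error itself.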
We were able to extend this calculation to the $10^{1000}$-th zero without much difficulty. As equation (36) shows, the relative error only decreases as one increases $t$. It is also straightforward to extend this method to all primitive Dirichlet $L$-functions and those based on cusp forms using the transcendental equations in [7] and the results in [2].