Article

Potential Well in Poincaré Recurrence

1
Instituto de Matemática e Estatística, Universidade de São Paulo-USP, Rua do Matão, 1010 Butantã, São Paulo 05508-090, Brazil
2
Instituto Federal de Educação, Ciência e Tecnologia de São Paulo-IFSP, Câmpus Araraquara, Rua Doutor Aldo Benedito Pierri, 250 Jardim Paulo Freire, Araraquara 14804-296, Brazil
3
Departamento de Estatística, Universidade Federal de São Carlos-UFSCar, Rua dos Bem-te-vis, São Carlos 13565-905, Brazil
*
Authors to whom correspondence should be addressed.
Entropy 2021, 23(3), 379; https://doi.org/10.3390/e23030379
Submission received: 6 February 2021 / Revised: 17 March 2021 / Accepted: 18 March 2021 / Published: 23 March 2021
(This article belongs to the Special Issue Extreme Value Theory)

Abstract

From a physical/dynamical system perspective, the potential well represents the proportional mass of points that escape the neighbourhood of a given point. In the last 20 years, several works have shown the importance of this quantity for obtaining precise approximations of several recurrence time distributions in mixing stochastic processes and dynamical systems. Besides providing a review of the different scaling factors used in the literature on recurrence times, the present work contributes two new results: (1) For ϕ-mixing and ψ-mixing processes, we give a new exponential approximation for hitting and return times using the potential well as the scaling parameter. The error terms are explicit and sharp. (2) We analyse the uniform positivity of the potential well. Our results apply to processes on countable alphabets and do not assume a complete grammar.

1. Introduction

The close relation between Extreme Value Theory (EVT) and the statistical properties of Poincaré recurrence has recently been explored in depth. The starting point is that the exceedances of a stochastic process over a sequence of barrier values a_n > 0, n ∈ ℕ, can be considered as hits of a sequence of nested sets. More precisely, if one defines the semi-infinite intervals:
A_n = (a_n, ∞),
and considers a sequence of random variables X_1, X_2, …, one has the equivalence:
max{X_1, …, X_t} > a_n if and only if T_{A_n} ≤ t,
where, for any measurable set A, T_A denotes the smallest k ≥ 1 such that X_k ∈ A. As the sequence of levels a_n increases, the sets A_n are nested. This equivalence allows building a bridge between two historically independent theories: Extreme Value Theory (EVT) and Poincaré Recurrence Theory (PRT). While EVT focuses on the existence (and identification) of the limiting distribution of the partial maxima and of the k-maxima, among other quantities [1,2,3,4,5], the aim of recent works on PRT is to understand the statistical properties of the different notions of return times.
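This equivalence is purely set-theoretic and can be checked numerically. The following Python sketch (an illustration only; the Gaussian sample and the barrier value are arbitrary choices) verifies that the events {max{X_1, …, X_t} > a} and {T_A ≤ t} coincide along a simulated path:

```python
import random

def hitting_time(xs, in_target):
    """Smallest k >= 1 with xs[k-1] in the target set; None if no hit.
    Index 0 of the list plays the role of X_1 here."""
    for k, x in enumerate(xs, start=1):
        if in_target(x):
            return k
    return None

random.seed(42)
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]
a = 1.5  # barrier value a_n; target set A = (a, infinity)

for t in (10, 100, 1000):
    exceeds = max(sample[:t]) > a
    T = hitting_time(sample, lambda x: x > a)
    hits_by_t = T is not None and T <= t
    assert exceeds == hits_by_t  # max{X_1,...,X_t} > a  iff  T_A <= t
```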
The present paper takes the PRT approach. Our interest is the statistics of visits of a random process (X_t)_{t∈ℕ} to a given measurable target set. Asymptotic statistics are obtained by studying sequences of target sets A_n, n ≥ 1, usually with measure shrinking to zero. In this context, and for certain classes of processes, hitting and return times with respect to a given sequence of target sets converge to the exponential distribution, modelling the unpredictability of rare events. However, this rough statement is full of nuances that need to be established in very precise terms. It turns out that these details carry much information about the system.
For instance, for two observables having the same probability, the Ergodic theorem says that, macroscopically, their numbers of occurrences are about the same. However, these occurrences can appear scattered along time in very different ways. Under some strong mixing assumptions, it is a well-known fact in the literature that for nested sequences of observables with the same probability, the asymptotic observation of one of them can be distributed as a Poisson process while the other one follows a compound Poisson process. Thus, the dichotomy Poisson/compound Poisson in the same system is determined also by the intrinsic properties of the target sets considered.
In the setting of the present paper, the target sets are finite strings of symbols (patterns). In this case, even if the process is a sequence of independent random variables, the successive occurrences of a string are not independent, because the structure of the pattern itself enters the game, allowing or preventing consecutive observations due to possible overlaps with itself. This leads to a dichotomy between aperiodic and periodic patterns that yields, in the limit of long patterns, the dichotomy Poisson/compound Poisson mentioned before. In passing, let us mention that this dichotomy also exists in EVT, where it is referred to as the phenomenon of clustering/non-clustering of maxima, and it has generated a great deal of research over the last two decades [3].
Let us now be more specific about what we do here. First, we operate in the context of discrete-time stochastic processes on a countable alphabet enjoying ϕ-mixing. Fix any point x, that is, any right-infinite sequence of symbols taken from the alphabet, and consider the nested sequence of neighbourhoods corresponding to the first n symbols of x, namely A_n = (x_0, …, x_{n−1}), n ≥ 1. The main theorem of the paper, Theorem 1, gives explicit and computable error terms for the approximation of the hitting time distribution μ(T_{A_n} > t) and the return time distribution μ_{A_n}(T_{A_n} > t) by exponential distributions whose parameter is explicit and depends on A_n.
The first main advantage of Theorem 1 is that it uses the potential well as the scaling parameter. The potential well is the probability, conditioned on starting from A_n, that the pattern A_n does not reappear at the first possible moment it could reappear. The use of this simple and well-defined quantity as the scaling parameter contrasts with previous works using parameters whose expressions are rarely explicit and even harder to compute. Another advantage of Theorem 1 is that, unlike a whole body of literature obtaining almost sure results, our results hold for all x. This allows distinguishing different limiting distributions, as for example in the periodic/aperiodic dichotomy described above, which almost sure results cannot detect. Finally, the error terms of our approximations are not in total variation distance, but in the stronger point-wise form with respect to the time scale.
The closest result in the literature, but restricted to return times, is the main theorem of [6]. Our Theorem 1, besides also considering the case of hitting times, extends the class of processes to which the approximations apply. First, we do not assume the finiteness of the alphabet, allowing countably infinite alphabets. Second, our processes are not assumed to have a complete grammar, as was the case in [6]. In ergodic theory terminology, this means that we do not restrict to the full shift. Yet another contribution here is that we specify a tighter and simpler error term under the stronger assumption of ψ-mixing, in both the hitting and return time approximations.
Last but not least, Theorem 1 corrects the exponential approximation obtained by [6] for return times. Indeed, the error term of their Theorem 4.1 contains a mistake for small t’s.
The other main novelty of the present work is Theorem 2, stating that (1) the potential well almost surely converges to one as the size of the pattern diverges under ϕ-mixing and (2) the potential well is uniformly bounded away from zero when we have ψ-mixing or ϕ-mixing with summable function ϕ(n). Naturally, as a conditional probability, the potential well belongs to the interval [0, 1] for any n ≥ 1 and any pattern A_n. However, it has been proven that the potential well can be arbitrarily close to zero for β-mixing processes, a slightly weaker mixing assumption than ϕ-mixing. Indeed, it was shown in [7] that for the binary renewal process, with specific choices of transition probabilities and target sets A_n, n ≥ 1, the potential well of A_n vanishes as n diverges. Note that the border is thin between this β-mixing example and our Theorem 2, which holds for ψ-mixing and ϕ-mixing with summable ϕ (see the review [8] on the distinct mixing assumptions). We conjecture that the assumption of summability of the ϕ rates can be dropped.
A fundamental quantity is the shortest possible return of the pattern A_n, denoted τ(A_n), which is in particular used to define the potential well. It is known [9,10] that under the assumptions of specification and positive entropy, τ(A_n)/n converges almost surely to one. Since we do not assume a complete grammar, τ(A_n) is not necessarily bounded above by n and can be strictly larger than n in general. Here, we prove an important technical result (Lemma 2) giving explicit uniform upper bounds, which hold for sufficiently large n and under either ϕ- or ψ-mixing.
To conclude about the importance of the present work as a whole, let us mention that our results are fundamental for the study of further recurrence quantities, such as the return time function [11,12] and the waiting time function [13,14], establishing a link with information theory. These random variables are known to satisfy a counterpart of the famous Shannon–McMillan–Breiman theorem (asymptotic equipartition property). In order to study the fluctuations in these limit theorems, for instance a large deviation principle, we need to control the return/hitting time exponential approximations for any point and any t > 0. This was particularly clear in [15,16], which study the fluctuations of the waiting time and return time, respectively. It is also interesting to notice that [17] was the first to point out the importance of seeking exponential approximations for any point x, precisely in order to study the small and large fluctuations of the return time function.
The paper is organized as follows. We describe in Section 2 the setting of the paper in the context of PRT, defining carefully the types of exponential approximations we are interested in and explaining, with the support of an extensive bibliography, the role of the potential well as the scaling parameter. Section 3 contains the main results, and Section 4 is dedicated to their proofs.

2. Poincaré Recurrence Theory for Mixing Processes

2.1. The Framework of Mixing Processes

Consider a countable set 𝒜 that we call the alphabet. By ℕ, we denote the set of nonnegative integers and, by X := 𝒜^ℕ, the set of right-infinite sequences x = (x_0, x_1, …) of symbols taken from 𝒜. Given a point x ∈ X and any finite set I ⊂ ℕ, the cylinder set with base I is defined as A_I(x) := { y ∈ X : y_i = x_i, i ∈ I }. In the particular case where I = {0, …, n − 1}, we write A_n(x) and sometimes abuse notation, writing x_0^{n−1}. We endow X with the σ-algebra F generated by the class of cylinder sets { A_I : I ⊂ ℕ, |I| < ∞ }. Furthermore, F_I denotes the σ-algebra generated by A_I(x), x ∈ X. For the special case in which I = {i, …, j}, 0 ≤ i ≤ j, we use the notation F_i^j. We use the shorthand notation a_i^j := (a_i, a_{i+1}, …, a_j), 0 ≤ i ≤ j < ∞, for finite strings of consecutive symbols of 𝒜. When necessary, A_n(x) will be naturally identified with the sequence x_0^{n−1}.
The shift operator σ : X → X shifts the point x = (x_0, x_1, x_2, …) to the left by one coordinate: (σx)_i = x_{i+1}, i ≥ 0.
We consider a shift-invariant (or stationary) probability measure μ on (X, F). For any A ∈ F of positive measure, μ_A(·) := μ(· ∩ A)/μ(A) is the measure μ conditioned on A.
Our results are stated under two mixing conditions that we now define. For all n 1 , define:
ϕ(n) := sup | μ(A ∩ B)/μ(A) − μ(B) |,   ψ(n) := sup | μ(A ∩ B)/(μ(A)μ(B)) − 1 |,
where both suprema are taken over i ∈ ℕ, A ∈ F_0^i with μ(A) > 0, and B ∈ F_{i+n}^∞ (with μ(B) > 0 in the case of ψ).
Note that ψ(n) and ϕ(n) are nonincreasing sequences, since F_{i+n+1}^∞ ⊆ F_{i+n}^∞ for every i ≥ 0.
Definition 1.
We say that the measure μ on (X, F) is ϕ-mixing (resp. ψ-mixing) if ϕ(n) (resp. ψ(n)) goes to zero as n diverges. We say that μ is “summable ϕ-mixing” if it is ϕ-mixing with Σ_n ϕ(n) < ∞.
We refer to [8] for an exhaustive review of mixing properties and examples.
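For intuition, the coefficients above can be computed in closed form in simple cases. For a stationary finite-state Markov chain, the Markov property collapses the suprema over σ-algebras to a supremum over pairs of states, so that (up to the exact indexing convention) ψ(n) = max_{a,b} |P^n(a,b)/π(b) − 1|, where P is the transition matrix and π the stationary distribution; this reduction is standard but stated here without proof. A minimal Python sketch with an arbitrary two-state chain:

```python
import numpy as np

# Two-state stationary Markov chain (illustrative parameters).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# Stationary distribution: left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()

def psi(n):
    """psi-mixing coefficient of the chain: worst relative deviation
    of the n-step transition kernel from stationarity."""
    Pn = np.linalg.matrix_power(P, n)
    return np.max(np.abs(Pn / pi[None, :] - 1.0))

vals = [psi(n) for n in range(1, 8)]
# psi(n) decays geometrically at the rate of the second eigenvalue (0.5 here).
assert all(v2 < v1 for v1, v2 in zip(vals, vals[1:]))
assert abs(vals[1] / vals[0] - 0.5) < 1e-6
```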

2.2. Recurrence Times and Exponential Approximations

The hitting time of a point y in a set A ∈ F is defined by:
T_A(y) = inf{ k ≥ 1 : σ^k(y) ∈ A }.
For sets A of small measure (rare events) and under mixing conditions such as the ones introduced in the preceding subsection, it is expected that μ(T_A > t) is approximately exponentially distributed. This is what we call the hitting time exponential approximation. Similarly, when we refer to the return time, we mean that we study the approximation of μ_A(T_A > t), that is, the measure of the same event, conditioned on the points starting in A.
In this paper, we are interested in the case where we fix any point x and consider A n ( x ) as the target set. When n diverges, the measure of A n ( x ) vanishes, leading to rare events. The scaling parameter of the exponential approximation depends on the point x.
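In symbolic terms, T_A is easy to evaluate along a finite sample path. The Python sketch below (illustrative; the strings are arbitrary) computes the hitting time of a cylinder given by a pattern, the same functional that, applied to points of A, gives the return time:

```python
def occurrences(seq, pattern):
    """Positions k >= 1 at which the pattern starts in seq (seq[0] is X_0)."""
    n = len(pattern)
    return [k for k in range(1, len(seq) - n + 1) if seq[k:k + n] == pattern]

def hitting_time(seq, pattern):
    """T_A(seq): smallest k >= 1 with sigma^k(seq) in the cylinder A = pattern."""
    occ = occurrences(seq, pattern)
    return occ[0] if occ else None

# Hitting time of the 2-cylinder '11' in a short sample path.
assert hitting_time("0110111", "11") == 1
# A point starting in A: the same functional gives its return time.
assert hitting_time("1100011", "11") == 5
```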
The two main types of approximations that appeared in the literature when approximating the hitting/return time distributions around any point x of the phase space are a total variation distance type and a pointwise type.
  • Type 1: Total variation distance. For any x ∈ X,
    -
    Hitting times:
    sup_{t>0} | μ(T_A > t) − e^{−μ(A)θ(A)t} | ≤ ϵ(A),
    -
    Return times:
    sup_{t>0} | μ_A(T_A > t) − θ̄(A) e^{−μ(A)θ(A)t} | ≤ ϵ(A).
  • Type 2: Pointwise. For any x ∈ X and any t > 0,
    -
    Hitting times:
    | μ(T_A > t) − e^{−μ(A)θ(A)t} | ≤ ϵ(A, t),
    -
    Return times:
    | μ_A(T_A > t) − θ̄(A) e^{−μ(A)θ(A)t} | ≤ ϵ(A, t).
Note that in the return time approximation, the parameters θ and θ̄ need not be equal. However, such an approximation leads to:
E_A(T_A) ≈ θ̄(A) / (μ(A)θ(A)).
In view of the Kac lemma, which, we recall, states that E_A(T_A) = 1/μ(A), the last display suggests that θ and θ̄ must be close.
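The Kac lemma itself can be checked by simulation. In the sketch below (Python; the value p = 0.3 and the sample size are arbitrary choices), A is the 1-cylinder {x_0 = 1} of an i.i.d. Bernoulli(p) process, so that, conditioned on starting in A, T_A is geometric with mean 1/p = 1/μ(A):

```python
import random

random.seed(1)
p = 0.3          # mu(A) for the 1-cylinder A = {x_0 = 1}
n_samples = 20000

def return_time():
    """Sample T_A conditioned on X_0 = 1: draw symbols until the next 1."""
    k = 1
    while random.random() >= p:  # X_k = 1 with probability p
        k += 1
    return k

mean_T = sum(return_time() for _ in range(n_samples)) / n_samples
# Kac lemma: E_A(T_A) = 1/mu(A) = 1/p
assert abs(mean_T - 1 / p) < 0.15
```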

2.3. Potential Well: Definition and Genealogy in PRT

As already explained, the potential well will be used as the scaling parameter in the exponential approximations of Types 1 and 2 defined above. In order to define it, we first need to define the shortest possible return of a set A ∈ F (to itself):
τ(A) := inf_{y∈A} { T_A(y) : μ_A( σ^{−T_A(y)}(A) ) > 0 },
or, equivalently:
τ(A) := inf{ k ≥ 1 : μ_A( σ^{−k}(A) ) > 0 }.
In the case where A = A_n(x), we define τ_n(x) = τ(A_n(x)). The functions τ_n : X → ℕ constitute a sequence of simple functions. Notice that the above alternative definitions of τ(A) involve the measure μ_A, while the traditional definition is purely topological. This accounts for the case where the measure does not have a complete grammar (we say that μ has a complete grammar if μ(A_n(x)) > 0 for any x ∈ X and n ≥ 1).
The first possible return time τ_n(x) is an object of independent interest, which has been studied from several perspectives in the literature. Let us mention that its asymptotic concentration was proven in [9,10], large deviations in [18,19,20], and fluctuations in [21,22].
Obviously, by definition, μ_A(T_A ≥ τ(A)) = 1. If, for a point x ∈ A, we have T_A(x) > τ(A), we say that x escapes from A. The potential well of order n at x is precisely the proportional measure of points of A that escape from A:
ρ(A) := μ_A(T_A > τ(A)).
Since we are interested in the case where A = A n ( x ) , we use the alternative notation ρ ( x 0 n 1 ) instead of ρ ( A n ( x ) ) . Besides being explicitly computable in many situations, the potential well is physically meaningful and, as the scaling parameter, provides precise exponential approximations for recurrence times under suitable mixing assumptions.
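In the i.i.d., complete-grammar case, τ(A) and ρ(A) are explicitly computable from the self-overlap structure of the string: the conditional probability of returning exactly at the shift τ(A) is the measure of the last τ(A) symbols of A. The Python sketch below (valid only under these simplifying assumptions) computes both quantities:

```python
def tau(A):
    """Shortest possible return: smallest shift k >= 1 at which the string
    overlaps itself (k = n always works under a complete grammar)."""
    n = len(A)
    for k in range(1, n):
        if A[k:] == A[:n - k]:
            return k
    return n

def rho(A, p):
    """Potential well of the cylinder A for an i.i.d. process with symbol
    probabilities p: 1 - mu_A(return exactly at tau(A))."""
    t = tau(A)
    prob = 1.0
    for s in A[len(A) - t:]:   # the t fresh symbols needed to return at tau
        prob *= p[s]
    return 1.0 - prob

p = {"a": 0.5, "b": 0.5}
# Constant string: tau = 1 and rho = 1 - p(b).
assert tau("bbbb") == 1
assert abs(rho("bbbb", p) - 0.5) < 1e-12
# String with no short self-overlap: tau = n and rho = 1 - mu(A).
assert tau("baaa") == 4
assert abs(rho("baaa", p) - (1 - 0.5 ** 4)) < 1e-12
```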
We give below a small genealogy of scaling parameters that appeared in the literature that consider results holding for all points to get approximations for hitting/return times.
  • As far as we know, the first paper to prove exponential approximations for hitting time statistics for all points is due to Aldous and Brown [23]. They obtained Type 1 approximations in the case of reversible Markov chains. The parameter used there was just the inverse of the expectation, which is mandatory to use when the approximating law is the exponential distribution. However, this does not bring information about the value of the expectation.
  • Galves and Schmitt [24] obtained Type 1 approximations for hitting times in ψ-mixing processes. The major breakthrough there was that the authors provided an explicit formula for the parameter (denoted by λ(A)). This quantity can be viewed as the grandfather of ρ. Nonetheless, its explicit significance was not evident.
  • References [25,26] gave exponential approximations (Type 1 and Type 2, respectively) of the distribution of the hitting time around any point using a scaling parameter λ. In [26], however, only its existence and necessity were proven, the calculation of λ being intractable in general. The main problem is that λ(A) depends on the recurrence properties of the cylinder A up to large time scales (usually of the order of μ(A)^{−1}).
  • In order to circumvent this issue, Reference [25] also provided, in the context of approximations of Type 1, another scaling parameter, easier to compute, but with a slightly larger error term as a price to pay. It is defined as follows:
    ζ_s(x_0^{n−1}) := μ_{x_0^{n−1}}( T_{x_0^{n−1}} > n/s ).
    This quantity depends on, at most, the first 2n coordinates of the process. ζ_s(x_0^{n−1}) can be seen as the father of the potential well. Both works [25,26] dealt with processes enjoying ψ-mixing or summable ϕ-mixing.
  • The use of the potential well ρ as the scaling parameter was first proposed by Abadi in [27], still in the context of an approximation of Type 1 for hitting and return times. More specifically, it was proven that, for exponentially α-mixing processes, λ and ζ (grandfather and father of ρ) can be well approximated by ρ.
  • The first paper to directly use ρ as the scaling parameter was [6], in which a Type 2 approximation for return times was obtained, with θ̄ = θ = ρ. The process is assumed to be ϕ-mixing.
  • Focusing on proving exponential approximations for hitting and return times for the largest possible class of systems, and still for all points, Abadi and Saussol [28] returned to the approach of Galves and Schmitt. Their results hold under the α-mixing condition, which is the weakest hypothesis used to date, but the scaling parameter is not explicit.
  • Focusing on the specific class of binary renewal processes, Reference [7] proved a Type 1 approximation for hitting and return times using the potential well ρ. One interesting aspect of this work lies in the fact that the renewal process is β-mixing (weaker than the ϕ-mixing assumed by [6]). Moreover, the authors managed to use the renewal property to compute the limit of ρ(A_n(x)) for any point x. In other words, the approximating asymptotic law for hitting and return times was explicitly computed as a function of the parameters of the process. This result shows the usefulness of the potential well, an “easy to compute” scaling parameter.

3. Main Results

Theorem 1 below presents Type 2 approximations for hitting and return times under ϕ- and ψ-mixing conditions, with the potential well as the scaling parameter and an explicit error term.
Before stating this result, we first need to define the second order periodicity of the string A_n(x), which plays a crucial role in the size of the error term.

3.1. Second Order Periodicity

The short returns that we define here are precisely those that are difficult to treat as (almost) independent. They depend not only on the correlation decay of the system, but also on the particular properties of the string itself. Technically, for an n-cylinder, short means returning within a number of steps of order n.
Consider the cylinder A, and suppose τ(A) = k. Write n = qk + r, where q ∈ ℕ and 0 ≤ r < k, and note that the cylinder overlaps itself at all multiples of k smaller than n. The set P(A) := { mk : 1 ≤ m ≤ q } is the set of indexes of possible returns at multiples of τ(A), but returns can also occur at other time indexes after them. Let:
R(A) = { j ∈ { qk + 1, …, qk + r − 1 } : μ_A( σ^{−j}(A) ) > 0 }.
A point y ∈ A can only return to A before time n at indexes in P(A) ∪ R(A), but there is a crucial difference between the two sets. A point that escapes from A cannot return in P(A), but it can return in R(A). Namely,
μ_A( σ^{−τ(A)}(A^c) ∩ σ^{−j}(A) ) = 0 for all j ∈ P(A),
while:
μ_A( σ^{−τ(A)}(A^c) ∩ σ^{−j}(A) ) > 0 is possible for j ∈ R(A).
We set n_A as the first possible return to A among the points y ∈ A that escape from A at τ(A):
n_A = min{ j : μ_A( σ^{−τ(A)}(A^c) ∩ σ^{−j}(A) ) > 0 }.
We refer to [6] for an example that illustrates these ideas under a complete grammar and a finite alphabet. In the general case, notice that n_A can be strictly larger than n if one is not considering the full shift. For a concrete example, consider the house of cards Markov chain. It has transition matrix Q on ℕ with entries Q(i, 0) = q_i = 1 − Q(i, i + 1), where (q_i)_{i≥0} is a sequence of real numbers taking values in the interval (0, 1). Therefore, it is defined on an infinite alphabet and does not have a complete grammar, since several transitions are forbidden due to the sparse nature of Q. Consider the strings A = 00010001000 and B = (n + 1)(n + 2)⋯(2n) (for some n ≥ 1) generated by this Markov chain under the stationary measure. Then, we see that τ(A) = 4, R(A) = {9, 10}, and n_A = 9, while, on the other hand, τ(B) = 2n + 1 > n, R(B) = ∅, and n_B = 3n + 2.
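For the string A = 00010001000 of the example above, the sets P(A) and R(A) and the index n_A can be recovered from the purely combinatorial self-overlap structure, since every transition used by this particular string has positive probability under the chain. A Python sketch (combinatorial computation only; it does not handle the measure-theoretic constraints needed for the string B):

```python
def overlap_shifts(A):
    """Shifts k in {1, ..., n-1} at which the string overlaps itself."""
    n = len(A)
    return [k for k in range(1, n) if A[k:] == A[:n - k]]

A = "00010001000"
shifts = overlap_shifts(A)
tau = shifts[0]                 # shortest possible return
q = len(A) // tau
P = [m * tau for m in range(1, q + 1) if m * tau in shifts]
R = [k for k in shifts if k not in P]
n_A = min(R)                    # first possible return after an escape

assert tau == 4 and P == [4, 8] and R == [9, 10] and n_A == 9
```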

3.2. Type 2 Approximations Scaled by the Potential Well

For any finite string A, let us denote by A^{(k)} the suffix of A of size k. That is, if A = x_0^{n−1}, then A^{(k)} = x_{n−k}^{n−1}. When A is an n-cylinder, we use the convention μ(A^{(j)}) = μ(A^{(n)}) = μ(A) for j ≥ n.
By definition, ϕ(g) is finite for all g ≥ 1. This is not the case for ψ(g). Thus, for ψ-mixing measures, we define:
g_0 = g_0(ψ) := inf{ g ≥ 1 : ψ(g) < ∞ } − 1.   (1)
Now, for the error term, define:
(a) ϵ_ψ(A) := n μ( A^{(n_A − g_0)} ) + ψ(n),
(b) ϵ_ϕ(A) := inf_{1 ≤ w ≤ n*} { (n + τ(A)) μ( A^{(w)} ) + ϕ( min{ n_A − w + 1, n } ) },
where n* := min{ n, n_A }.
Note that cylinders A of size n satisfy n_A ≥ n/2, so that ϵ_ψ is well defined for all n > 2g_0.
We use ϵ to denote either ϵ_ψ or ϵ_ϕ when the argument/statement is general.
Theorem 1.
Consider a stationary measure μ on (X, F) enjoying either ϕ-mixing with sup_{A ∈ 𝒜^n} μ(A)τ(A) → 0 as n → ∞, or simply ψ-mixing. There exist five positive constants C_i, i = 1, …, 5, and n_0 ∈ ℕ such that for all n ≥ n_0 and all A ∈ 𝒜^n, the following inequalities hold.
  • For all t ≥ 0:
| μ(T_A > t) − e^{−ρ(A)μ(A)t} | ≤ C_1 ( τ(A)μ(A) + t μ(A) ϵ(A) ) if t ≤ [2μ(A)]^{−1}, and
| μ(T_A > t) − e^{−ρ(A)μ(A)t} | ≤ C_2 μ(A) t ϵ(A) e^{−μ(A)t( ρ(A) − C_3 ϵ(A) )} if t > [2μ(A)]^{−1}.
  • For all t ≥ τ(A):
| μ_A(T_A > t) − ρ(A) e^{−ρ(A)μ(A)(t − τ(A))} | ≤ C_4 ϵ(A) if t ≤ [2μ(A)]^{−1}, and
| μ_A(T_A > t) − ρ(A) e^{−ρ(A)μ(A)(t − τ(A))} | ≤ C_5 μ(A) t ϵ(A) e^{−μ(A)t( ρ(A) − C_3 ϵ(A) )} if t > [2μ(A)]^{−1}.
Theorem 1 (and its proof) was definitely inspired by [6] and their Theorem 4.1. However, let us first observe that our result provides the first statement in the literature of a Type 2 hitting time approximation with the potential well as the scaling parameter. Moreover, contrary to [6], we assume neither a complete grammar nor a finite alphabet.
Let us make some further important observations concerning this theorem.
Remark 1.
Under ϕ-mixing, the assumption sup_{A ∈ 𝒜^n} μ(A)τ(A) → 0 can be dropped under certain circumstances. For instance, if the measure has a complete grammar, we have τ(A) ≤ n, and the assumption is granted by Lemma 1. Another way is to assume that μ is summable ϕ-mixing, as mentioned after Lemma 2 in Section 4.
Remark 2.
According to Lemma 1, if μ is ϕ-mixing (and, a fortiori, if it is ψ-mixing), there exist constants C and c such that μ(A) ≤ C e^{−cn} for all n ≥ 1 and A ∈ 𝒜^n. On the other hand, since n_A ≥ n/2, we get μ( A^{(n_A − g_0)} ) ≤ C e^{−c(n/2 − g_0)} for all n > 2g_0. Therefore, ϵ_ψ(A) → 0 uniformly as n → ∞. Further, if τ(A) ≤ 2n, it is enough to take w = n/4 to obtain ϕ( min{ n_A − w + 1, n } ) ≤ ϕ(n/4) and (n + τ(A)) μ( A^{(w)} ) ≤ 3Cn e^{−cn/4}, which ensures that ϵ_ϕ(A) → 0 uniformly. This is the case, for instance, if one has a complete grammar. On the other hand, notice that τ(A) < n_A. Hence, if τ(A) > 2n, we take w = n and get ϕ( min{ n_A − w + 1, n } ) = ϕ(n). Therefore, since τ(A)μ(A) → 0 as n → ∞, we also have ϵ_ϕ → 0 in this case.
Remark 3.
Naturally, the statements under ψ-mixing are less general, but have smaller error terms. The error term is the same for t > [2μ(A)]^{−1} in both the hitting and return time approximations. The difference is for small t's, due to the correlation arising from the conditional measure.
Remark 4.
For application purposes involving data, it is essential to control all the constants involved in the statements. These constants can be read off from the proof presented in Section 4.2, where we also make explicit the integer n_0 from which Theorem 1 holds (see (23)). If μ is ψ-mixing, we define M := ψ(g_0 + 1) + 1. In this case, C_1 = 8M + 9, C_2 = 194M + 206, C_3 = 66M + 89, C_4 = 12M + 15, and C_5 = 197M + 220. On the other hand, in the ϕ-mixing case, we have C_1 = 9, C_2 = 143, C_3 = 61, C_4 = 14, and C_5 = 170.
Remark 5.
We show the sharpness of the error term in the return time approximation given by Theorem 1 with a simple example. Consider an i.i.d. process (X_m)_{m∈ℕ} with alphabet 𝒜. Take b ∈ 𝒜 such that μ(b) = p, and let x = (b, b, …) ∈ X = 𝒜^ℕ. Thus, A_n(x) = x_0^{n−1} = (b, b, …, b). Direct calculations give:
  • μ(A_n) = p^n;
  • τ(A_n) = 1;
  • ρ(A_n) = 1 − μ_{A_n}(T_{A_n} = τ(A_n)) = 1 − μ_{A_n}(X_n = b) = 1 − p;
  • n_{A_n} = n;
  • μ_{A_n}(T_{A_n} > n − 1) = ρ(A_n) = 1 − p.
An i.i.d. process is trivially ψ-mixing with function ψ identically zero. Thus, Theorem 1 states that the error for small t's is ϵ_ψ(A_n) = n p^n. On the other hand, by direct substitution using the above facts, we have for each n ≥ 2:
| μ_{A_n}(T_{A_n} > n − 1) − ρ(A_n) e^{−ρ(A_n)μ(A_n)((n−1) − τ(A_n))} | = (1 − p) | 1 − e^{−(1−p)p^n((n−1)−1)} | = (1 − p)( 1 − e^{−(1−p)p^n(n−2)} ),
which implies that the exact error in the approximation for the return time at n − 1 is of order n p^n, just as stated by Theorem 1.
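The order of the error in this example can be confirmed numerically: expanding the exponential to first order, the exact error (1 − p)(1 − e^{−(1−p)p^n(n−2)}) behaves like (1 − p)² p^n (n − 2), so its ratio to n p^n approaches (1 − p)²(n − 2)/n. A Python check (with the arbitrary choice p = 0.5):

```python
import math

p = 0.5
for n in (20, 30, 40):
    exact = (1 - p) * (1 - math.exp(-(1 - p) * p ** n * (n - 2)))
    bound = n * p ** n                      # epsilon_psi(A_n), constants aside
    ratio = exact / bound
    predicted = (1 - p) ** 2 * (n - 2) / n  # first-order expansion
    assert abs(ratio - predicted) < 1e-5
```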
Remark 6.
The reader may notice a difference between Theorem 1 and Theorem 4.1 of [6] concerning the error term for small t’s for return time approximation. Indeed, their statement is incorrect as shown by the preceding example. We recall that the error term for small t’s plays a fundamental role when studying the return time spectrum, as was done by [15]. Theorem 1, besides correcting [6], is also fundamental to correct [15], which was based on the exponential approximations given by [6].
Remark 7.
As an example where Theorem 1 can be applied while Theorem 4.1 of [6] cannot, let us consider the house of cards Markov chain defined in the previous subsection. We refer to [7], where it was explained that, if q_i ≥ ϵ for all i ≥ 1 and some ϵ > 0, there exists a stationary process (X_m)_{m∈ℕ} with matrix Q, and it is ϕ-mixing with exponentially decaying ϕ(n). Therefore, according to Remark 1, it fits the conditions of Theorem 1. However, since it is defined on an infinite alphabet and does not have a complete grammar, such processes are not covered by [6].

3.3. Uniform Positivity of the Potential Well

Theorem 1 says that the potential well can be used as the scaling parameter to obtain approximations for recurrence times around any point. We now ask about the possible values of this scaling parameter within its range [0, 1].
Abadi and Saussol [29], in the most general case known up to now, proved that for α-mixing processes with at least polynomially decaying α rates, the distribution of the hitting and return times converges, almost surely, to an exponential with parameter one. We refer to [8] for the precise definition of α-mixing; the only important point for us is to know that summable ϕ-mixing implies α-mixing with at least polynomially decaying α rates. This fact, combined with Theorem 1, proves, indirectly, that for summable ϕ-mixing processes, the potential well converges almost surely to one, since both theorems must agree on the limiting distribution under these conditions. Theorem 2, Item (a) below, states that the same holds for ϕ-mixing without any assumption on the rate.
On the other hand, for renewal processes with a certain tail distribution of the inter-arrival times, Abadi, Cardeño, and Gallo [7] proved that for the point x = (0000…), the sequence of potential wells ρ(x_0^{n−1}) converges to zero. In this case, the scaling parameter has a predominant role, indicating the drastic change of scale of the occurrence of events. For instance, in this case, the mean hitting time is much larger than the mean return time:
E(T_{x_0^{n−1}}) ≈ 1/( ρ(x_0^{n−1}) μ(x_0^{n−1}) ) ≫ 1/μ(x_0^{n−1}) = E_{x_0^{n−1}}(T_{x_0^{n−1}}).
Such renewal processes are β-mixing (see [8] for the definition). Theorem 2, Item (b) below, states that this cannot happen for ψ-mixing processes or summable ϕ-mixing processes.
Theorem 2.
Let μ be a stationary ϕ-mixing measure. Then:
(a) 
ρ(x_0^{n−1}) → 1 as n → ∞, almost surely.
(b) 
If μ is ψ-mixing or summable ϕ-mixing, there exists n_1 ≥ 1 such that:
inf_{n ≥ n_1, x_0^{n−1} ∈ 𝒜^n} ρ(x_0^{n−1}) =: ρ > 0.
If the alphabet 𝒜 is finite, the set { ρ(A) : A ∈ 𝒜^n, n < n_1 } is finite and has a strictly positive infimum, which implies that the infimum above can be taken over all n ≥ 1.
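Item (a) can be illustrated in the i.i.d. case, where the potential well of a cylinder is explicit: ρ(A_n(x)) = 1 − 2^{−τ(A_n(x))} for uniform binary symbols, with τ computed from self-overlaps (and τ = n when no shorter overlap exists). Since a typical long prefix has no short self-overlap, ρ approaches one, in line with the theorem. A Python sketch (i.i.d. uniform case only):

```python
import random

random.seed(7)
x = [random.randint(0, 1) for _ in range(60)]  # one sampled point of the process

def tau(A):
    """Smallest self-overlap shift of the string A (n if none below n)."""
    n = len(A)
    return next((k for k in range(1, n) if A[k:] == A[:n - k]), n)

def rho_iid_uniform(A):
    """Potential well of the cylinder A under i.i.d. uniform bits:
    1 minus the probability of returning exactly at tau(A)."""
    return 1.0 - 0.5 ** tau(A)

rhos = [rho_iid_uniform(x[:n]) for n in range(1, 61)]
# Along a typical point, rho(x_0^{n-1}) tends to one (Theorem 2(a)).
assert rhos[-1] > 0.9
assert rhos[-1] >= rhos[0]
```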

4. Proofs of the Results

The statement of Theorem 1 covers ϕ- and ψ-mixing and both hitting and return times. The case of return times under ϕ-mixing was already treated by [6], and our proof follows their method. In particular, some of the auxiliary results listed in the next subsection are stated without proof.

4.1. Preliminary Results

The following lemma plays a fundamental role in Theorems 1 and 2. It was originally proven in [25] under the assumption of summability of the function ϕ, an assumption that can be dropped.
Lemma 1.
Let μ be a ϕ-mixing measure. Then, there exist positive constants C and c such that for all n ≥ 1 and all A ∈ 𝒜^n, one has:
μ(A) ≤ C e^{−cn}.
Proof. 
We denote λ := sup{ μ(a) : a ∈ 𝒜 } < 1. Consider a positive integer k_0, and for all n ≥ k_0, write n = k_0 q + r, with q ≥ 1 and 0 ≤ r < k_0. Suppose A = a_0^{n−1}, and apply the ϕ-mixing property to obtain:
μ(A) ≤ μ( ⋂_{j=0}^{q−1} σ^{−jk_0}(a_{jk_0}) ) ≤ μ( ⋂_{j=0}^{q−2} σ^{−jk_0}(a_{jk_0}) ) ( ϕ(k_0) + μ(a_{(q−1)k_0}) ) ≤ μ( ⋂_{j=0}^{q−2} σ^{−jk_0}(a_{jk_0}) ) ( ϕ(k_0) + λ ).
Iterating this argument, one concludes:
μ(A) ≤ ( ϕ(k_0) + λ )^q.
Since ϕ(k) → 0 as k → ∞, there exists k_0 ∈ ℕ such that ϕ(k_0) + λ < 1. Thus, for n ≥ k_0, observing that q = (n − r)/k_0 ≥ n/k_0 − (k_0 − 1)/k_0:
μ(A) ≤ ( ϕ(k_0) + λ )^{−(k_0 − 1)/k_0} [ ( ϕ(k_0) + λ )^{1/k_0} ]^n,
which gives the claim with C = ( ϕ(k_0) + λ )^{−(k_0 − 1)/k_0} and e^{−c} = ( ϕ(k_0) + λ )^{1/k_0}. This covers the case n ≥ k_0. By enlarging the constant C if necessary, one covers the case n < k_0. This ends the proof. □
For a complete grammar, one has $\tau(A_n(x)) \le n$. Since we do not assume this, we need the following lemma, which provides upper bounds for $\tau(A)$ when μ is ψ-mixing or summable ϕ-mixing.
Lemma 2.
Consider μ a ψ-mixing or summable ϕ-mixing measure. Then, there exists $n_2 \in \mathbb{N}$ such that for all $n \ge n_2$ and $A \in \mathcal{A}^n$,
  • $\tau(A) \le 2n$, for ψ;
  • $\tau(A) \le -\frac{2}{\mu(A)\ln\mu(A)} + n$, for summable ϕ.
Proof. 
We start with the case ψ. For n large enough, we have $\psi(n) < 1$, which implies:
$\mu\big(A \cap \sigma^{-2n}(A)\big) \ge \mu(A)^2\,(1 - \psi(n)) > 0.$
Since $\tau(A)$ is the smallest positive integer such that $\mu\big(A \cap \sigma^{-\tau(A)}(A)\big) > 0$, one has $\tau(A) \le 2n$.
Now consider the ϕ-mixing case. The summability of ϕ ensures that for g large enough, we have $\phi(g) \le 1/(g \ln g)$. Thus:
$\mu\big(A \cap \sigma^{-(g+n)}(A)\big) \ge \mu(A)\big(\mu(A) - \phi(g)\big) \ge \mu(A)\left(\mu(A) - \frac{1}{g \ln g}\right).$
Take $g = -\frac{2}{\mu(A)\ln\mu(A)}$. The rightmost parenthesis above becomes:
$\mu(A)\left(1 - \frac{-\ln\mu(A)}{2\big(\ln 2 - \ln\mu(A) - \ln(-\ln\mu(A))\big)}\right),$
which is positive for n large enough. □
The multiplicative constant 2 in both cases is technical and was chosen for the simplicity of the proof; it can actually be replaced by any constant strictly larger than one. An irreducible aperiodic finite-state Markov chain with some entry equal to zero shows that this constant cannot be taken equal to one in the ψ-mixing case. Whether this bound is optimal in the ϕ-mixing case is an open question. Note that Lemmas 1 and 2 imply that $\tau(A)\mu(A) \to 0$ uniformly as $n \to \infty$.
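The Markov chain remark above is easy to check by hand; the sketch below (our own illustrative example, assuming the transition probability p(0→0) = 0, so the word "00" is forbidden) computes τ(A) combinatorially and exhibits a word with τ(A) = 2n:

```python
# tau(A): smallest t >= 1 such that A can reoccur t steps after itself with
# positive probability, for a hypothetical chain in which "00" is forbidden.
ALLOWED = {(0, 1), (1, 0), (1, 1)}      # transitions with positive probability

def word_allowed(w):
    return all(pair in ALLOWED for pair in zip(w, w[1:]))

def tau(word):
    n = len(word)
    t = 1
    while True:
        if t < n:
            # overlapping copies must agree and the merged word must be allowed
            if word[t:] == word[:n - t] and word_allowed(word + word[n - t:]):
                return t
        else:
            # gap of t - n symbols; bridging with 1s is always allowed here
            if word_allowed(word + (1,) * (t - n) + word):
                return t
        t += 1

assert tau((1,)) == 1
assert tau((0,)) == 2        # n = 1 but tau = 2n: the constant 2 is attained
assert tau((0, 1)) == 2      # tau(A) <= 2n, as in Lemma 2
```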
The remaining results of this subsection hold for $n \ge n^*$, where $n^* = 1$ in the ϕ-mixing case and:
$n^* := \inf\{n > 2g_0 : \psi(n) < 1\}$
in the ψ-mixing case (see (1) for the definition of $g_0$).
Let us define:
$M := \psi(g_0 + 1) + 1.$
Proposition 1.
Let μ be a ψ-mixing measure. Then, for all $n \ge n^*$, $A \in \mathcal{A}^n$ and $k \ge n_A$, the following inequality holds:
$\mu_A\big(\tau(A) < T_A \le k\big) \le M\,(k - n_A + 1)\,\mu\big(A^{(n_A - g_0)}\big).$
Proof. 
By definition of $n_A$, we first note that $\mu_A(\tau(A) < T_A \le k) = \mu_A(n_A \le T_A \le k)$. Consider the case in which $n_A \le n + g_0$. In this case, for $j \ge n_A$, one trivially has:
$\{T_A = j\} \subseteq \sigma^{-j}(A) \subseteq \sigma^{-(j + n - (n_A - g_0))}\big(A^{(n_A - g_0)}\big).$
Thus:
$\mu_A(n_A \le T_A \le k) \le \mu_A\Big(\bigcup_{n_A \le j \le k} \sigma^{-(j + n - (n_A - g_0))}\big(A^{(n_A - g_0)}\big)\Big).$
Note that A and the union on the right-hand side of the above inequality are separated by a gap of length $g_0 + 1$. By ψ-mixing, one concludes that the left-hand side is bounded by:
$(\psi(g_0 + 1) + 1)\,\mu\Big(\bigcup_{n_A \le j \le k} \sigma^{-(j + n - (n_A - g_0))}\big(A^{(n_A - g_0)}\big)\Big) \le M\,(k - n_A + 1)\,\mu\big(A^{(n_A - g_0)}\big).$
For $n_A > n + g_0$, recall first the convention in Section 3.2, which states that $\mu\big(A^{(n_A - g_0)}\big) = \mu(A)$. In a similar way to the first case,
$\mu_A(n_A \le T_A \le k) \le \mu_A\Big(\bigcup_{n_A \le j \le k} \sigma^{-j}(A)\Big) \le M\,(k - n_A + 1)\,\mu(A).$
For the next proposition, recall that ϵ stands either for $\epsilon_\psi$ (2) or for $\epsilon_\phi$ (3), according to the mixing property of the measure under consideration. Further, let us use the notation $T_A^{[i]} := T_A \circ \sigma^i$.
Proposition 2.
Let μ be a ϕ- or ψ-mixing measure. Then, for all $n \ge n^*$, $A \in \mathcal{A}^n$ and $t \ge \tau(A)$:
$\big|\mu_A(T_A > t) - \rho(A)\,\mu(T_A > t)\big| \le C\,\epsilon(A),$
where C = 4 for ϵ ϕ and C = 4 ( M + 1 ) for ϵ ψ .
Proof. 
The proof for $\epsilon_\phi$ can be found in Proposition 4.1 Item (b) of [6], and it remains valid even for a non-complete grammar and an infinite alphabet. We observe that the error term defined therein is:
$\epsilon(A) = \inf_{1 \le w \le n_A}\left\{(2n + \tau(A))\,\mu\big(A^{(w)}\big) + \phi(n_A - w + 1)\right\} \le 2\,\epsilon_\phi(A),$
which justifies C = 4 for this case.
Here, we prove the case $\epsilon_\psi$ in the same way. We start by assuming that $t \ge \tau(A) + 2n$. By the triangle inequality:
$\big|\mu_A(T_A > t) - \rho(A)\,\mu(T_A > t)\big| \le \Big|\mu_A\big(T_A > \tau(A);\ T_A^{[\tau(A)]} > t - \tau(A)\big)$
$\qquad - \mu_A\big(T_A > \tau(A);\ T_A^{[\tau(A)+2n]} > t - \tau(A) - 2n\big)\Big|$ (5)
$\quad + \Big|\mu_A\big(T_A > \tau(A);\ T_A^{[\tau(A)+2n]} > t - \tau(A) - 2n\big) - \rho(A)\,\mu\big(T_A > t - \tau(A) - 2n\big)\Big|$ (6)
$\quad + \Big|\rho(A)\,\mu\big(T_A > t - \tau(A) - 2n\big) - \rho(A)\,\mu(T_A > t)\Big|.$ (7)
For the first modulus, by inclusion of sets:
$(5) \le \mu_A\big(T_A > \tau(A);\ T_A^{[\tau(A)]} \le 2n\big) = \mu_A\big(\tau(A) < T_A \le \tau(A) + 2n\big).$
If $n_A > \tau(A) + 2n$, the last term is equal to zero. Otherwise, we apply Proposition 1 to obtain:
$\mu_A\big(\tau(A) < T_A \le \tau(A) + 2n\big) \le M\,\big(\tau(A) + 2n - n_A + 1\big)\,\mu\big(A^{(n_A - g_0)}\big) \le 4Mn\,\mu\big(A^{(n_A - g_0)}\big),$
where the last inequality follows from Lemma 2.
By ψ-mixing, the modulus (6) is bounded by:
$\rho(A)\,\mu\big(T_A > t - \tau(A) - 2n\big)\,\psi(n) \le \psi(n).$
Note that the modulus is not needed for (7), and by inclusion:
$(7) \le \rho(A)\,\mu\big(T_A^{[t - \tau(A) - 2n]} \le \tau(A) + 2n\big) = \rho(A)\,\mu\big(T_A \le \tau(A) + 2n\big) \le (2n + \tau(A))\,\mu(A) \le 4n\,\mu\big(A^{(n_A - g_0)}\big),$
where the equality and the second inequality follow from the stationarity of μ.
Therefore, for $t \ge \tau(A) + 2n$, the sum of (5), (6) and (7) is bounded by:
$4Mn\,\mu\big(A^{(n_A - g_0)}\big) + \psi(n) + 4n\,\mu\big(A^{(n_A - g_0)}\big) \le 4(M+1)\,\epsilon_\psi(A).$ (8)
We now consider the case where $\tau(A) \le t < \tau(A) + 2n$. We have:
$\big|\mu_A(T_A > t) - \rho(A)\,\mu(T_A > t)\big| \le \big|\mu_A(T_A > t) - \rho(A)\big| + \big|\rho(A) - \rho(A)\,\mu(T_A > t)\big| \le \mu_A\big(\tau(A) < T_A \le \tau(A) + 2n\big) + t\,\mu(A) \le 4Mn\,\mu\big(A^{(n_A - g_0)}\big) + (\tau(A) + 2n)\,\mu\big(A^{(n_A - g_0)}\big) \le 4(M+1)\,\epsilon_\psi(A).$
The last inequality follows from (8). The other inequalities are straightforward. This ends the proof. □
The next lemma establishes upper bounds for the tail distribution at the scale given by Kac’s lemma, namely 1 / μ ( A ) . For technical reasons, we actually choose the scale:
$f_A := \frac{1}{2\mu(A)}.$
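The choice of scale is natural because, by Kac's lemma, the expected return time to A under $\mu_A$ is exactly $1/\mu(A)$. A minimal exact check on a hypothetical two-state chain (the matrix and stationary vector are our own assumptions, not from the paper):

```python
# Kac's lemma E_A(T_A) = 1/mu(A), checked exactly for A = {x_0 = 0} in a
# hypothetical 2-state Markov chain, using rational arithmetic.
from fractions import Fraction as F

P = [[F(1, 5), F(4, 5)],
     [F(3, 5), F(2, 5)]]
pi = [F(3, 7), F(4, 7)]             # stationary distribution: pi P = pi

h1 = 1 / (1 - P[1][1])              # expected hitting time of state 0 from 1
ET = 1 + P[0][1] * h1               # expected return time to 0

assert ET == 1 / pi[0] == F(7, 3)
```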
Lemma 3.
Let μ be a stationary measure. Then, for all $n \ge 1$, $A \in \mathcal{A}^n$, every positive integer k, and $B \in \mathcal{F}_{k f_A}$, the following inequalities hold:
(a) 
$\mu(T_A > k f_A;\ B) \le (\psi(n)+1)^{k}\,\mu(T_A > f_A - 2n)^{k}\,\mu(B)$,
(b) 
$\mu(T_A > k f_A;\ B) \le \big(\mu(T_A > f_A - 2n) + \phi(n)\big)^{k}\,\big(\mu(B) + \phi(n)\big)$,
(c) 
$\mu_A(T_A > k f_A;\ B) \le (\psi(n)+1)^{k}\,\mu(T_A > f_A - 2n)^{k-1}\,\mu(B)$.
Proof. 
We start by observing that $\{T_A > k f_A\} \subseteq \{T_A > k f_A - 2n\} \in \mathcal{F}_0^{k f_A - n}$. Thus, applying the ψ-mixing property, we get:
$\mu(T_A > k f_A;\ B) \le \mu(T_A > k f_A - 2n;\ B) \le (\psi(n)+1)\,\mu(T_A > k f_A - 2n)\,\mu(B).$ (9)
Furthermore:
$\{T_A > k f_A - 2n\} = \big\{T_A > (k-1) f_A\,;\ T_A^{[(k-1) f_A]} > f_A - 2n\big\}.$
Now, one can take in particular $B = \big\{T_A^{[(k-1) f_A]} > f_A - 2n\big\} \in \mathcal{F}_{(k-1) f_A}$ and then apply (9) with $k-1$ instead of $k$ to get:
$\mu(T_A > k f_A - 2n) \le (\psi(n)+1)\,\mu\big(T_A > (k-1) f_A - 2n\big)\,\mu\big(T_A^{[(k-1) f_A]} > f_A - 2n\big) = (\psi(n)+1)\,\mu\big(T_A > (k-1) f_A - 2n\big)\,\mu\big(T_A > f_A - 2n\big).$
The equality follows by stationarity. Iterating this argument, one concludes that:
$\mu(T_A > k f_A - 2n) \le (\psi(n)+1)^{k-1}\,\mu(T_A > f_A - 2n)^{k}.$ (10)
Applying the resulting inequality in (9), we get Statement (a). In a similar way, ϕ-mixing gives:
$\mu(T_A > k f_A;\ B) \le \mu(T_A > k f_A - 2n)\,\big(\mu(B) + \phi(n)\big).$
Thus:
$\mu(T_A > k f_A - 2n) \le \mu\big(T_A > (k-1) f_A - 2n\big)\Big(\mu\big(T_A^{[(k-1) f_A]} > f_A - 2n\big) + \phi(n)\Big) \le \cdots \le \big(\mu(T_A > f_A - 2n) + \phi(n)\big)^{k},$ (11)
which ends the proof of (b).
The proof of (c) follows the same lines as Item (a), by observing that for $A, B \in \mathcal{F}_0^i$ and $C \in \mathcal{F}_{i+n}$, the ψ-mixing property implies $\mu_A(B;\ C) \le \mu_A(B)\,\mu(C)\,(\psi(n)+1)$. □
The next proposition is the key to the proof of Theorem 1; the idea is the following. We work at the time scale $f_A$. When $t = k f_A$, $k \in \mathbb{N}$, we simply cut t into k pieces of equal size $f_A$. The case of general $t = k f_A + r$, $r < f_A$, is then approximated by its integer multiple $k f_A$. Technically, this is done in Items (b) and (a), respectively.
Proposition 3.
Let μ be a ϕ- or ψ-mixing measure. Then, for all $n \ge n^*$, $A \in \mathcal{A}^n$ and every positive integer k, the following inequalities hold:
(a) 
For $0 \le r \le f_A$:
1. 
$\big|\mu(T_A > k f_A + r) - \mu(T_A > k f_A)\,\mu(T_A > r)\big| \le C\,(\psi(n)+1)^{k-1}\,\mu(T_A > f_A - 2n)^{k}\,\epsilon_\psi(A)$
2. 
$\big|\mu(T_A > k f_A + r) - \mu(T_A > k f_A)\,\mu(T_A > r)\big| \le C\,\big(\mu(T_A > f_A - 2n) + \phi(n)\big)^{k}\,\epsilon_\phi(A)$
3. 
$\big|\mu_A(T_A > k f_A + r) - \mu_A(T_A > k f_A)\,\mu(T_A > r)\big| \le C\,\big((\psi(n)+1)\,\mu(T_A > f_A - 2n)\big)^{k-1}\,\epsilon_\psi(A).$
(b) 
For $k \ge 1$:
1. 
$\big|\mu(T_A > k f_A) - \mu(T_A > f_A)^{k}\big| \le C\,\epsilon_\psi(A)\,(k-1)\,(\psi(n)+1)^{k-2}\,\mu(T_A > f_A - 2n)^{k-1}$
2. 
$\big|\mu(T_A > k f_A) - \mu(T_A > f_A)^{k}\big| \le C\,\epsilon_\phi(A)\,(k-1)\,\big(\mu(T_A > f_A - 2n) + \phi(n)\big)^{k-1}$
3. 
$\big|\mu_A(T_A > k f_A) - \mu_A(T_A > f_A)\,\mu(T_A > f_A)^{k-1}\big| \le C\,\epsilon_\psi(A)\,(k-1)\,\big((\psi(n)+1)\,\mu(T_A > f_A - 2n)\big)^{k-2}$
where C = 2 ( M + 1 ) for the cases involving ψ and C = 4 for ϕ.
Proof. 
We will prove Items (a)-1 and (a)-2 together. Initially, consider the case in which $r < 2n$. In this case, for all $n \ge n^*$, we have:
| μ ( T A > k f A + r ) μ ( T A > k f A ) μ ( T A > r ) | μ T A > k f A , T A [ k f A ] > r μ ( T A > k f A ) + μ ( T A > k f A ) | 1 μ ( T A > r ) | μ T A > k f A , T A [ k f A ] r + μ ( T A > k f A ) μ ( T A r ) .
By Lemma 3-(a) and (10), the last sum is bounded by:
( ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) ) k μ T A [ k f A ] r + μ ( T A > k f A 2 n ) r μ ( A ) ( ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) ) k μ T A r + ( ψ ( n ) + 1 ) k 1 μ ( T A > f A 2 n ) k r μ ( A ) ( ψ ( n ) + 1 ) k 1 μ ( T A > f A 2 n ) k r μ ( A ) ( M + 1 ) 2 ( M + 1 ) ( ψ ( n ) + 1 ) k 1 μ ( T A > f A 2 n ) k ϵ ψ ( A )
which gives us (a)-1. To get (a)-2 for r < 2 n , we apply Lemma 3-(b) and (11) in a similar way. Thus, (12) is bounded by:
( μ ( T A > f A 2 n ) + ϕ ( n ) ) k ( μ ( T A r ) + ϕ ( n ) ) + ( μ ( T A > f A 2 n ) + ϕ ( n ) ) k μ ( T A r ) 4 ( μ ( T A > f A 2 n ) + ϕ ( n ) ) k ϵ ϕ ( A ) .
We now consider the case $r \ge 2n$. The triangle inequality gives us:
$\big|\mu(T_A > k f_A + r) - \mu(T_A > k f_A)\,\mu(T_A > r)\big|$
$\le \Big|\mu\big(T_A > k f_A;\ T_A^{[k f_A]} > r\big) - \mu\big(T_A > k f_A;\ T_A^{[k f_A + 2n]} > r - 2n\big)\Big|$ (14)
$\quad + \Big|\mu\big(T_A > k f_A;\ T_A^{[k f_A + 2n]} > r - 2n\big) - \mu\big(T_A > k f_A\big)\,\mu\big(T_A^{[k f_A + 2n]} > r - 2n\big)\Big|$ (15)
$\quad + \Big|\mu\big(T_A > k f_A\big)\,\mu\big(T_A^{[k f_A + 2n]} > r - 2n\big) - \mu\big(T_A > k f_A\big)\,\mu\big(T_A > r\big)\Big|.$ (16)
We proceed as in (5) and use Lemma 3-(a) to get:
$(14) \le \mu\big(T_A > k f_A;\ T_A^{[k f_A]} \le 2n\big) \le \big((\psi(n)+1)\,\mu(T_A > f_A - 2n)\big)^{k}\,\mu(T_A \le 2n) \le 2n\,\mu(A)\,\big((\psi(n)+1)\,\mu(T_A > f_A - 2n)\big)^{k}.$ (17)
For the case ϕ, we apply Lemma 3-(b) and get:
$(14) \le \big(\mu(T_A > f_A - 2n) + \phi(n)\big)^{k}\,\big(2n\,\mu(A) + \phi(n)\big).$ (18)
By ψ-mixing and (10):
$(15) \le \mu(T_A > k f_A - 2n)\,\psi(n) \le (\psi(n)+1)^{k-1}\,\mu(T_A > f_A - 2n)^{k}\,\psi(n).$ (19)
Applying ϕ-mixing and (11):
$(15) \le \big(\mu(T_A > f_A - 2n) + \phi(n)\big)^{k}\,\phi(n).$ (20)
Finally, using the shift-invariance and the same arguments as above:
$(16) = \mu(T_A > k f_A)\,\mu\big(r - 2n < T_A \le r\big) \le 2n\,\mu(A)\,\mu(T_A > k f_A - 2n).$ (21)
Therefore, (10), (17), (19), and (21) give us:
$(14) + (15) + (16) \le 2(M+1)\,(\psi(n)+1)^{k-1}\,\mu(T_A > f_A - 2n)^{k}\,\epsilon_\psi(A)$
and from (11), (18), (20), and (21), we get:
$(14) + (15) + (16) \le 4\,\big(\mu(T_A > f_A - 2n) + \phi(n)\big)^{k}\,\epsilon_\phi(A),$
which ends the proof of (a)-1 and (a)-2.
For the proof of (a)-3, we write a similar triangle inequality as above:
| μ A ( T A > k f A + r ) μ A ( T A > k f A ) μ ( T A > r ) | μ A T A > k f A ; T A [ k f A ] > r μ A T A > k f A ; T A [ k f A + 2 n ] > r 2 n + μ A T A > k f A ; T A [ k f A + 2 n ] > r 2 n μ A T A > k f A μ T A [ k f A + 2 n ] > r 2 n + μ A T A > k f A μ T A [ k f A + 2 n ] > r 2 n μ T A > r .
Then, we follow the same as we did for (a)-1, but applying Item (c) of Lemma 3 and using the ψ -mixing property:
| μ A ( B ; C ) μ A ( B ) μ ( C ) | μ A ( B ) μ ( C ) ψ ( n )
where A , B F 0 i and C F i + n . For the case r < 2 n , we use:
| μ A ( T A > k f A + r ) μ A ( T A > k f A ) μ ( T A > r ) | μ A T A > k f A , T A [ k f A ] > r μ A ( T A > k f A ) + μ A ( T A > k f A ) | 1 μ ( T A > r ) |
and proceed as we did in (13), applying again Lemma 3-(c). This ends Item (a).
We now come to the proof of Items (b)-1 and (b)-2. For $k = 1$, we have an equality. For $k \ge 2$, we get:
$\big|\mu(T_A > k f_A) - \mu(T_A > f_A)^{k}\big| = \Big|\sum_{j=2}^{k} \big[\mu(T_A > j f_A) - \mu(T_A > (j-1) f_A)\,\mu(T_A > f_A)\big]\,\mu(T_A > f_A)^{k-j}\Big| \le \sum_{j=2}^{k} \big|\mu(T_A > j f_A) - \mu(T_A > (j-1) f_A)\,\mu(T_A > f_A)\big|\,\mu(T_A > f_A)^{k-j}.$ (22)
We put r = f A in Item (a)-1 to obtain (b)-1:
( 22 ) 2 ( M + 1 ) ϵ ψ ( A ) j = 2 k ( ψ ( n ) + 1 ) j 2 μ ( T A > f A 2 n ) j 1 μ ( T A > f A ) k j 2 ( M + 1 ) ϵ ψ ( A ) ( k 1 ) ( ψ ( n ) + 1 ) k 2 μ ( T A > f A 2 n ) k 1 .
Furthermore, we get the inequality (b)-2, under ϕ -mixing, proceeding similarly as above:
( 22 ) 4 ϵ ϕ ( A ) j = 2 k ( μ ( T A > f A 2 n ) + ϕ ( n ) ) j 1 ( μ ( T A > f A 2 n ) + ϕ ( n ) ) k j = 4 ϵ ϕ ( A ) ( k 1 ) ( μ ( T A > f A 2 n ) + ϕ ( n ) ) k 1
Finally, we prove (b)-3 applying (a)-3 as follows:
μ A ( T A > k f A ) μ A ( T A > f A ) μ ( T A > f A ) k 1 j = 2 k μ A ( T A > j f A ) μ A ( T A > ( j 1 ) f A ) μ ( T A > f A ) μ ( T A > f A ) k j 2 ( M + 1 ) ϵ ψ ( A ) j = 2 k ( ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) ) j 2 μ ( T A > f A ) k j 2 ( M + 1 ) ϵ ψ ( A ) ( k 1 ) ( ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) ) k 2 .
The next two lemmas are classical results and are stated without proof. The first establishes the reversibility of certain sets under stationary measures, and the second is a discrete version of the mean value theorem, which follows by a straightforward computation.
Lemma 4.
Let μ be a stationary measure. For all positive $i \in \mathbb{N}$, $n \ge 1$ and $A \in \mathcal{A}^n$:
$\mu(T_A = i) = \mu(T_A > i - 1;\ A).$
Lemma 5.
Given real numbers $a_1, \dots, a_n, b_1, \dots, b_n$ such that $0 \le a_i, b_i \le 1$, the following inequality holds:
$\left|\prod_{i=1}^{n} a_i - \prod_{i=1}^{n} b_i\right| \le \Big(\max_{1 \le i \le n}\{a_i, b_i\}\Big)^{n-1} \sum_{i=1}^{n} |a_i - b_i|.$
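Lemma 5 is elementary and can be stress-tested directly; the sketch below checks the product inequality on random tuples in [0, 1] (the function name and the seed are our own choices):

```python
# Numerical check of Lemma 5: |prod(a) - prod(b)| is bounded by
# max{a_i, b_i}^(n-1) * sum(|a_i - b_i|) for numbers in [0, 1].
import math
import random

def lemma5_gap(a, b):
    """Right-hand side minus left-hand side; nonnegative if the lemma holds."""
    n = len(a)
    lhs = abs(math.prod(a) - math.prod(b))
    rhs = max(a + b) ** (n - 1) * sum(abs(x - y) for x, y in zip(a, b))
    return rhs - lhs

random.seed(1)
for _ in range(10_000):
    n = random.randint(1, 6)
    a = [random.random() for _ in range(n)]
    b = [random.random() for _ in range(n)]
    assert lemma5_gap(a, b) >= -1e-12
```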

4.2. Proof of Theorem 1

Theorem 1 contains eight statements, each corresponding to a choice of:
  • recurrence time: hitting or return,
  • mixing property: ψ or ϕ ,
  • amplitude of t: smaller or larger than f A .
Recall the definition of $n^*$ in (4). The proof of Theorem 1 holds for all $n \ge n_0$, where $n_0$ is explicitly given by:
$n_0 := \inf\Big\{m \ge n^* : \sup_{A \in \mathcal{A}^n} \mu(A)\,\tau(A) < 1/2,\ \forall n \ge m\Big\},$
which is finite since $\sup_{A \in \mathcal{A}^n} \mu(A)\,\tau(A) \to 0$ as $n \to \infty$. Then, in particular, we have $\tau(A) < f_A$ for all $n \ge n_0$ and $A \in \mathcal{A}^n$.

4.2.1. Proofs of the Statements for Small t’s

Here, we assume that $1 \le t \le f_A := [2\mu(A)]^{-1}$.
Proof of hitting time, ϕ and ψ together. 
Recall that ϵ ( A ) denotes ϵ ϕ ( A ) or ϵ ψ ( A ) , depending on whether the measure is ϕ or ψ -mixing.
Applying the inequality $1 - e^{-x} \le x$ for $x \ge 0$, we obtain the statement for $1 \le t \le \tau(A)$ as follows:
$\big|\mu(T_A > t) - e^{-\rho(A)\mu(A)t}\big| \le \big|\mu(T_A > t) - 1\big| + \big|1 - e^{-\rho(A)\mu(A)t}\big| \le 2\,\tau(A)\,\mu(A).$ (24)
We consider now the case $\tau(A) < t \le f_A$. For positive $i \in \mathbb{N}$, define:
$p_i = \frac{\mu_A(T_A > i-1)}{\mu(T_A > i-1)}.$
Then:
$\frac{\mu(T_A > t)}{\mu(T_A > \tau(A))} = \prod_{i=\tau(A)+1}^{t} \frac{\mu(T_A > i)}{\mu(T_A > i-1)} = \prod_{i=\tau(A)+1}^{t} \big(1 - \mu(T_A = i \mid T_A > i-1)\big) = \prod_{i=\tau(A)+1}^{t} \big(1 - \mu(A)\,p_i\big),$ (25)
where we used Lemma 4 in the last equality.
Thus, for τ ( A ) < t f A , we apply (25) to obtain:
μ ( T A > t ) e ρ ( A ) μ ( A ) t = μ ( T A > τ ( A ) ) i = τ ( A ) + 1 t 1 μ ( A ) p i e ρ ( A ) μ ( A ) τ ( A ) i = τ ( A ) + 1 t e ρ ( A ) μ ( A ) μ ( T A > τ ( A ) ) e ρ ( A ) μ ( A ) τ ( A ) + i = τ ( A ) + 1 t 1 μ ( A ) p i i = τ ( A ) + 1 t e ρ ( A ) μ ( A ) 2 τ ( A ) μ ( A ) + i = τ ( A ) + 1 t 1 μ ( A ) p i e ρ ( A ) μ ( A )
where the two inequalities follow from Lemma 5 and (24).
On the other hand, by the triangle inequality:
$\big|1 - p_i\,\mu(A) - e^{-\rho(A)\mu(A)}\big| \le |p_i - \rho(A)|\,\mu(A) + \big|1 - \rho(A)\,\mu(A) - e^{-\rho(A)\mu(A)}\big|.$
Since $|1 - x - e^{-x}| \le x^2/2$ for all $0 \le x \le 1$, taking $x = \rho(A)\mu(A)$, we get:
$\big|1 - \rho(A)\,\mu(A) - e^{-\rho(A)\mu(A)}\big| \le \frac{\rho(A)^2\,\mu(A)^2}{2} \le \frac{\epsilon(A)\,\mu(A)}{2}.$
Furthermore, still for $\tau(A) + 1 \le i \le f_A + 1$, Proposition 2 gives us:
$|p_i - \rho(A)| = \frac{\big|\mu_A(T_A > i-1) - \rho(A)\,\mu(T_A > i-1)\big|}{\mu(T_A > i-1)} \le \frac{C\,\epsilon(A)}{\mu(T_A > i-1)} \le 2C\,\epsilon(A),$ (27)
where, for the last inequality, we used:
$\mu(T_A > i-1) = 1 - \mu(T_A \le i-1) \ge 1 - (i-1)\,\mu(A) \ge 1 - f_A\,\mu(A) = \frac{1}{2}.$
Thus, applying (27), we obtain for $\tau(A) + 1 \le i \le f_A + 1$:
$\big|1 - p_i\,\mu(A) - e^{-\rho(A)\mu(A)}\big| \le \big(2C + 1/2\big)\,\epsilon(A)\,\mu(A).$ (28)
Therefore, (26) and (28) give us:
$\big|\mu(T_A > t) - e^{-\rho(A)\mu(A)t}\big| \le 2\,\tau(A)\,\mu(A) + \big(2C + 1/2\big)(t - \tau(A))\,\epsilon(A)\,\mu(A) \le \big(2C + 1/2\big)\big[\tau(A)\,\mu(A) + t\,\mu(A)\,\epsilon(A)\big],$ (29)
which concludes the statement of Theorem 1 for hitting time at small t’s (with either ϕ or ψ ). □
Proof for return time, ϕ and ψ together. 
We first note that the statement is trivial for $t = \tau(A)$; we therefore consider $t > \tau(A)$. By definition, we have $\mu_A(T_A > t) = p_{t+1}\,\mu(T_A > t)$. Then, we use the triangle inequality again to write:
$\big|\mu_A(T_A > t) - \rho(A)\,e^{-\rho(A)\mu(A)(t - \tau(A))}\big| \le \mu(T_A > t)\,\big|p_{t+1} - \rho(A)\big| + \rho(A)\,\big|\mu(T_A > t) - e^{-\rho(A)\mu(A)(t - \tau(A))}\big|.$ (30)
As we saw before, the first modulus above is bounded by 2 C ϵ ( A ) . On the other hand, we apply (25) to obtain for τ ( A ) < t f A :
μ ( T A > t ) e ρ ( A ) μ ( A ) ( t τ ( A ) ) = μ ( T A > τ ( A ) ) i = τ ( A ) + 1 t ( 1 μ ( A ) p i ) i = τ ( A ) + 1 t e ρ ( A ) μ ( A ) .
This is bounded, applying Lemma 5, by:
| μ ( T A > τ ( A ) ) 1 | + i = τ ( A ) + 1 t ( 1 μ ( A ) p i ) i = τ ( A ) + 1 t e ρ ( A ) μ ( A ) τ ( A ) μ ( A ) + 2 C + 1 / 2 t μ ( A ) ϵ ( A )
where the last inequality follows from (26) and (28). Finally, notice that $t\,\mu(A) \le f_A\,\mu(A) = 1/2$ and $\tau(A)\,\mu(A) \le 2\,\epsilon(A)$ (use Lemma 2 for ψ). Therefore, we obtain from (30):
$\big|\mu_A(T_A > t) - \rho(A)\,e^{-\rho(A)\mu(A)(t - \tau(A))}\big| \le \big(3C + 9/4\big)\,\epsilon(A).$ (31)
This concludes the statement of Theorem 1 for the return time at small t’s (with either ϕ or ψ ). □

4.2.2. Proof of the Statements for Large t’s

The proof for the return time for t > f A was given in [6] under ϕ -mixing, a finite alphabet, and a complete grammar. The proof still holds if one just assumes a countable alphabet and an incomplete grammar (recall Remark 2 for the uniform convergence to zero of the error term ϵ ϕ ). Thus, we focus on the hitting time under each mixing assumption and the return time only under ψ -mixing.
Proof of Theorem 1 for hitting times, for t > f A . 
Write $t = k f_A + r$ with integer $k \ge 1$ and $0 \le r < f_A$. Thus, we have:
$\big|\mu(T_A > t) - e^{-\rho(A)\mu(A)t}\big| \le \big|\mu(T_A > k f_A + r) - \mu(T_A > k f_A)\,\mu(T_A > r)\big|$ (32)
$\quad + \big|\mu(T_A > k f_A) - \mu(T_A > f_A)^{k}\big|\,\mu(T_A > r)$ (33)
$\quad + \big|\mu(T_A > f_A)^{k} - e^{-\rho(A)k/2}\big|\,\mu(T_A > r)$ (34)
$\quad + \big|e^{-\rho(A)k/2}\,\mu(T_A > r) - e^{-\rho(A)\mu(A)t}\big|.$ (35)
In order to get an upper bound for the sum of (32) and (33), we analyse the ψ and ϕ cases separately and start with the ψ -mixing. Applying Items (a)-1 and (b)-1 of Proposition 3, that sum is bounded by:
C ϵ ψ ( A ) ( ψ ( n ) + 1 ) k 1 μ ( T A > f A 2 n ) k 1 + ( k 1 ) ( ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) ) 1 2 ( M + 1 ) ϵ ψ ( A ) ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) k 2 k 8 ( M + 1 ) ϵ ψ ( A ) μ ( A ) t ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) k .
where the last two inequalities are justified by μ ( T A > f A 2 n ) 1 μ ( T A > f A ) 1 2 and k 2 μ ( A ) t .
On the other hand, applying (29) with t = f A 2 n , we get:
μ ( T A > f A 2 n ) e ρ ( A ) μ ( A ) ( f A 2 n ) = μ ( T A > f A 2 n ) e ρ ( A ) 2 + 2 n ρ ( A ) μ ( A ) 2 C + 1 / 2 ( τ ( A ) μ ( A ) + ( f A 2 n ) μ ( A ) ϵ ψ ( A ) ) 5 C + 5 / 4 ϵ ψ ( A )
where we use τ ( A ) μ ( A ) 2 ϵ ψ ( A ) .
Furthermore, by the Mean Value Theorem (MVT):
e ρ ( A ) 2 + 2 n ρ ( A ) μ ( A ) e ρ ( A ) 2 2 n ρ ( A ) μ ( A ) e ρ ( A ) 2 + 2 n ρ ( A ) μ ( A ) 2 n μ ( A ) e 2 n μ ( A ) 11 2 n μ ( A )
since for n n 0 , we have 2 n μ ( A ) 2 sup μ ( A ) τ ( A ) 1 .
Thus, it follows that:
( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) e ρ ( A ) 2 ψ ( n ) + μ ( T A > f A 2 n ) e ρ ( A ) 2 + 2 n ρ ( A ) μ ( A ) + e ρ ( A ) 2 + 2 n ρ ( A ) μ ( A ) e ρ ( A ) 2 5 C + 27 / 4 ϵ ψ ( A ) .
Therefore:
( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) k e ρ ( A ) 2 + 5 C + 27 / 4 ϵ ψ ( A ) k .
Since $e^{-x} \ge 1 - x$ for all $x \in \mathbb{R}$, taking $K = (5C + 27/4)\,e^{1/2}$, we get:
e K ϵ ψ ( A ) 1 K ϵ ψ ( A ) 5 C + 27 / 4 ϵ ψ ( A ) e ρ ( A ) 2 e ρ ( A ) 2 e K ϵ ψ ( A ) 1 5 C + 27 / 4 ϵ ψ ( A ) e ρ ( A ) 2 + K ϵ ψ ( A ) 5 C + 27 / 4 ϵ ψ ( A ) + e ρ ( A ) 2 .
Now, using that $k = 2\,\mu(A)\,(t - r)$, we have:
( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) k e ρ ( A ) 2 + K ϵ ψ ( A ) k = e ρ ( A ) μ ( A ) t + ρ ( A ) μ ( A ) r + 2 K ϵ ψ ( A ) μ ( A ) t 2 K ϵ ψ ( A ) μ ( A ) r e μ ( A ) t ρ ( A ) 2 K ϵ ψ ( A ) e μ ( A ) r e 1 / 2 e μ ( A ) t ρ ( A ) C 3 ϵ ψ ( A )
where the last inequality follows from e μ ( A ) r e μ ( A ) f A .
Therefore, it follows from (36) that the sum of (32) and (33) is bounded by:
$14\,(M+1)\,\epsilon_\psi(A)\,\mu(A)\,t\;e^{-\mu(A)\,t\,\left(\rho(A) - C_3\,\epsilon_\psi(A)\right)}.$
We now turn to the case of ϕ -mixing. We apply Items (a)-2 and (b)-2 of Proposition 3 to get an upper bound for the sum of (32) and (33):
μ ( T A > k f A + r ) μ ( T A > k f A ) μ ( T A > r ) + μ ( T A > k f A ) μ ( T A > f A ) k μ ( T A > r ) 4 ϵ ϕ ( A ) μ ( T A > f A 2 n ) + ϕ ( n ) k 1 + ( k 1 ) μ ( T A > f A 2 n ) + ϕ ( n ) 1 4 ϵ ϕ ( A ) μ ( T A > f A 2 n ) + ϕ ( n ) k 2 k 16 ϵ ϕ ( A ) μ ( A ) t μ ( T A > f A 2 n ) + ϕ ( n ) k .
Similarly to the ψ-mixing case, one obtains:
$\big(\mu(T_A > f_A - 2n) + \phi(n)\big)^{k} \le e^{1/2}\,e^{-\mu(A)\,t\,\left(\rho(A) - C_3\,\epsilon_\phi(A)\right)},$
which implies in the ϕ-mixing case that the sum of (32) and (33) is bounded by:
$27\,\epsilon_\phi(A)\,\mu(A)\,t\;e^{-\mu(A)\,t\,\left(\rho(A) - C_3\,\epsilon_\phi(A)\right)}.$
Now, we will treat the cases ψ and ϕ together to obtain upper bounds for (34) and (35). In order to get an upper bound for (34), we apply (29) with t = f A :
μ ( T A > f A ) e ρ ( A ) μ ( A ) f A = μ ( T A > f A ) e ρ ( A ) 2 2 C + 1 / 2 τ ( A ) μ ( A ) + f A μ ( A ) ϵ ( A ) 5 C + 5 / 4 ϵ ( A ) .
Thus, applying Lemma 5, we have:
μ ( T A > f A ) k e ρ ( A ) k 2 i = 1 k μ ( T A > f A ) e ρ ( A ) 2 max μ ( T A > f A ) , e ρ ( A ) 2 k 1 .
The max is bounded using (39) by:
e ρ ( A ) 2 + 5 C + 5 / 4 ϵ ( A ) .
Naturally, the absolute value is also bounded by using (39), and we get that the above sum is bounded above by:
k 5 C + 5 / 4 ϵ ( A ) e ρ ( A ) 2 + 5 C + 5 / 4 ϵ ( A ) k 1 .
Recalling that k = 2 μ ( A ) ( t r ) and proceeding as we did for (37) and (38), we get the following upper bound for (34):
2 5 C + 5 / 4 ϵ ( A ) μ ( A ) t e μ ( A ) t ρ ( A ) C 3 ϵ ( A ) e 1 7 4 C + 1 ϵ ( A ) μ ( A ) t e μ ( A ) t ρ ( A ) C 3 ϵ ( A ) .
To conclude the proof for the hitting time, we apply (29) with t = r to bound (35) as follows:
e ρ ( A ) k 2 μ ( T A > r ) e ρ ( A ) μ ( A ) t = e ρ ( A ) μ ( A ) t + ρ ( A ) μ ( A ) r μ ( T A > r ) e ρ ( A ) μ ( A ) r 2 C + 1 / 2 e ρ ( A ) μ ( A ) t + μ ( A ) f A τ ( A ) μ ( A ) + r μ ( A ) ϵ ( A ) 2 C + 1 / 2 τ ( A ) μ ( A ) + f A μ ( A ) ϵ ( A ) e ρ ( A ) μ ( A ) t e 1 / 2 ( 17 C + 5 ) ϵ ( A ) μ ( A ) t e μ ( A ) t ρ ( A ) C 3 ϵ ( A )
where the term μ ( A ) t follows from 1 = 2 μ ( A ) f A 2 μ ( A ) t . □
Proof of Theorem 1 for the return time, for t > f A and under ψ -mixing. 
We use again the triangle inequality to write:
$\big|\mu_A(T_A > t) - \rho(A)\,e^{-\rho(A)\mu(A)(t - \tau(A))}\big|$
$\le \big|\mu_A(T_A > k f_A + r) - \mu_A(T_A > k f_A)\,\mu(T_A > r)\big|$ (41)
$\quad + \big|\mu_A(T_A > k f_A) - \mu_A(T_A > f_A)\,\mu(T_A > f_A)^{k-1}\big|\,\mu(T_A > r)$ (42)
$\quad + \big|\mu_A(T_A > f_A)\,\mu(T_A > f_A)^{k-1} - \rho(A)\,e^{-\rho(A)k/2}\big|\,\mu(T_A > r)$ (43)
$\quad + \rho(A)\,e^{-\rho(A)k/2}\,\big|\mu(T_A > r) - e^{-\rho(A)\mu(A)(r - \tau(A))}\big|.$ (44)
Applying Items (a)-3 and (b)-3 of Proposition 3, the sum of (41) and (42) is bounded by:
2 ( M + 1 ) ϵ ψ ( A ) ( ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) ) k 1 ( 1 + ( k 1 ) ( ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) ) 1 ) 2 ( M + 1 ) ϵ ψ ( A ) ( ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) ) k 1 2 k 8 ( M + 1 ) ϵ ψ ( A ) μ ( A ) t ( ( ψ ( n ) + 1 ) μ ( T A > f A 2 n ) ) k 1 .
Replacing k by k 1 in (38), the last term is bounded above by:
8 ( M + 1 ) ϵ ψ ( A ) μ ( A ) t e 1 e μ ( A ) t ρ ( A ) C 3 ϵ ψ ( A ) 22 ( M + 1 ) ϵ ψ ( A ) μ ( A ) t e μ ( A ) t ρ ( A ) C 3 ϵ ψ ( A ) .
On the other hand, Lemma 5 gives us:
( 43 ) max μ A ( T A > f A ) , μ ( T A > f A ) , e ρ ( A ) / 2 k 1 μ A ( T A > f A ) ρ ( A ) e ρ ( A ) / 2 + i = 1 k 1 μ ( T A > f A ) e ρ ( A ) / 2
The last sum is bounded by ( 5 C + 5 / 4 ) ( k 1 ) ϵ ψ ( A ) using (39). On the other hand, applying (31) with t = f A and the MVT, we obtain:
μ A ( T A > f A ) ρ ( A ) e ρ ( A ) / 2 μ A ( T A > f A ) ρ ( A ) e ρ ( A ) / 2 + ρ ( A ) μ ( A ) τ ( A ) + ρ ( A ) e ρ ( A ) / 2 + ρ ( A ) μ ( A ) τ ( A ) e ρ ( A ) / 2 ( 3 C + 9 / 4 ) ϵ ψ ( A ) + ρ ( A ) μ ( A ) τ ( A ) e ρ ( A ) ( 1 / 2 μ ( A ) τ ( A ) ) ( 3 C + 17 / 4 ) ϵ ψ ( A )
since e ρ ( A ) ( 1 / 2 μ ( A ) τ ( A ) ) 1 and μ ( A ) τ ( A ) 2 ϵ ψ ( A ) for n n 0 .
Furthermore, the last inequality implies:
μ A ( T A > f A ) ρ ( A ) e ρ ( A ) / 2 + ( 3 C + 17 / 4 ) ϵ ψ ( A ) e ρ ( A ) / 2 + ( 5 C + 5 / 4 ) ϵ ψ ( A )
and by (39), we get:
max μ A ( T A > f A ) , μ ( T A > f A ) , e ρ ( A ) / 2 e ρ ( A ) / 2 + ( 5 C + 5 / 4 ) ϵ ψ ( A ) .
Therefore, as we saw in (40), we have:
( 43 ) ( 5 C + 5 / 4 ) ϵ ψ ( A ) k e ρ ( A ) / 2 + ( 5 C + 5 / 4 ) ϵ ψ ( A ) k 1 2 ( 5 C + 5 / 4 ) ϵ ψ ( A ) μ ( A ) t e 1 e μ ( A ) t ρ ( A ) C 3 ϵ ψ ( A ) ( 109 M + 116 ) ϵ ψ ( A ) μ ( A ) t e μ ( A ) t ρ ( A ) C 3 ϵ ψ ( A ) .
Finally, setting $t = r$ in (29) and applying the MVT once again, we get:
μ ( T A > r ) e ρ ( A ) μ ( A ) ( r τ ( A ) ) μ ( τ A > r ) e ρ ( A ) μ ( A ) r + e ρ ( A ) μ ( A ) r e ρ ( A ) μ ( A ) ( r τ ( A ) ) ( 2 C + 1 / 2 ) ( τ ( A ) μ ( A ) + r μ ( A ) ϵ ψ ( A ) ) + ρ ( A ) μ ( A ) τ ( A ) e ρ ( A ) μ ( A ) ( r τ ( A ) ) ( 2 C + 1 / 2 ) ( 2 ϵ ψ ( A ) + f A μ ( A ) ϵ ψ ( A ) ) + ( 7 / 2 ) ϵ ψ ( A ) ( 5 C + 19 / 4 ) ϵ ψ ( A ) .
The third inequality follows from e ρ ( A ) μ ( A ) ( r τ ( A ) ) e ρ ( A ) μ ( A ) τ ( A ) e 1 / 2 , since n n 0 . Now, just note that ρ ( A ) μ ( A ) τ ( A ) 2 ϵ ψ ( A ) .
Therefore, we finish the proof by obtaining the following upper bound:
( 44 ) ( 5 C + 19 / 4 ) ϵ ψ ( A ) e ρ ( A ) μ ( A ) r e ρ ( A ) μ ( A ) t ( 5 C + 19 / 4 ) ϵ ψ ( A ) 2 μ ( A ) t e f A μ ( A ) e μ ( A ) t ( ρ ( A ) C 3 ϵ ψ ( A ) ) ( 66 M + 82 ) ϵ ψ ( A ) μ ( A ) t e μ ( A ) t ( ρ ( A ) C 3 ϵ ψ ( A ) ) .

4.3. Proof of Theorem 2

Proof of Statement (a).
For each x X , we define:
$\tau(x) := \sup\{\tau(x_0^{n-1}) : n \ge 1\}.$
Let $B = \{x \in X : \tau(x) = \infty\}$ be the set of aperiodic points of $X$. For $x \in B$, denote $A_n = A_n(x)$, and consider the case $\tau(A_n) < n$. Then, we have:
1 ρ ( A n ) = μ A n T A n = τ ( A n ) = μ A n σ n A n ( τ ( A n ) μ A n σ n τ ( A n ) / 2 A n ( τ ( A n ) / 2 μ A n ( τ ( A n ) / 2 + ϕ τ ( A n ) / 2 + 1 .
Since $x \in B$, we have $\tau(A_n) \to \infty$ as $n \to \infty$, which implies that the last expression converges to zero. For the case $\tau(A_n) \ge n$, we use the same argument:
1 ρ ( A n ) = μ A n σ τ ( A n ) ( A n ) μ A n σ τ ( A n ) n / 2 A n ( n / 2 μ A n ( n / 2 ) + ϕ n / 2 + 1
which also converges to zero. Therefore, $\rho(A_n) \to 1$ as $n \to \infty$. We conclude the proof by noting that $X \setminus B$ is a countable set, and thus, $\mu(B) = 1$. □
Proof of Statement (b).
By Lemma 2, for ψ-mixing or summable ϕ-mixing measures, there exists $n_0 \ge 1$ such that:
$\forall n \ge n_0,\ \forall A \in \mathcal{A}^n,\quad \mu(A)^{-1} > \tau(A).$ (45)
Now, since $\big(\mu_A(T_A > j)\big)_{j \ge 1}$ is a nonincreasing sequence, the potential well is larger than or equal to the arithmetic mean of the subsequent $\mu(A)^{-1} - \tau(A)$ elements:
$\rho(A) = \mu_A\big(T_A > \tau(A)\big) \ge \frac{1}{\mu(A)^{-1} - \tau(A)} \sum_{j=\tau(A)}^{\mu(A)^{-1} - 1} \mu_A(T_A > j) \ge \frac{1}{\mu(A)^{-1}} \sum_{j=\tau(A)}^{\mu(A)^{-1} - 1} \mu_A(T_A > j) = \sum_{j=\tau(A)}^{\mu(A)^{-1} - 1} \mu(A;\ T_A > j) = \sum_{j=\tau(A)}^{\mu(A)^{-1} - 1} \mu(T_A = j + 1).$
In the last equality, we used Lemma 4. By (45), one obtains:
$\rho(A) \ge \mu\big(T_A \le \mu(A)^{-1}\big) - \mu\big(T_A \le \tau(A)\big) = \mu\big(T_A \le \mu(A)^{-1}\big) - \tau(A)\,\mu(A),$
where the equality follows by stationarity and the definition of $\tau(A)$.
By Lemmas 1 and 2, we know that $\tau(A)\,\mu(A) \to 0$ uniformly as $n \to \infty$. Thus, it is enough to find a strictly positive lower bound for $\mu(T_A \le \mu(A)^{-1})$. Let:
$N := \sum_{j=1}^{\mu(A)^{-1}} \mathbb{1}_A \circ \sigma^j,$
which counts the number of occurrences of A up to time $\mu(A)^{-1}$. By the so-called second moment method,
$\mu\big(T_A \le \mu(A)^{-1}\big) = \mu(N \ge 1) \ge \frac{\mathbb{E}(N)^2}{\mathbb{E}(N^2)}.$
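The second moment method invoked here is the Cauchy–Schwarz bound $P(N \ge 1) \ge \mathbb{E}(N)^2/\mathbb{E}(N^2)$, valid for any nonnegative random variable. A minimal exact check on toy distributions (the pmfs are our own illustrative choices, not the process of the paper):

```python
# Second moment method: P(N >= 1) >= E[N]^2 / E[N^2], obtained by applying
# Cauchy-Schwarz to N = N * 1_{N >= 1}. Exact check with rationals.
from fractions import Fraction as F

def bound_holds(pmf):
    """pmf maps nonnegative integer values to probabilities summing to 1."""
    EN = sum(p * v for v, p in pmf.items())
    EN2 = sum(p * v * v for v, p in pmf.items())
    P_pos = sum(p for v, p in pmf.items() if v >= 1)
    return P_pos >= EN * EN / EN2

assert bound_holds({0: F(1, 2), 1: F(1, 4), 2: F(1, 4)})
assert bound_holds({0: F(9, 10), 10: F(1, 10)})   # equality: N constant on {N > 0}
```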
Stationarity gives $\mathbb{E}(N) = 1$. It remains to prove that $\mathbb{E}(N^2)$ is bounded above by a constant. Expanding $N^2$, using stationarity and $\mathbb{E}(N) = 1$, we obtain:
$\mathbb{E}(N^2) = 1 + 2 \sum_{j=1}^{\mu(A)^{-1}} \big(\mu(A)^{-1} - j\big)\,\mu\big(A \cap \sigma^{-j}(A)\big).$ (48)
Let us first consider the ϕ-mixing case. For $j \ge n$, mixing gives $\mu\big(A \cap \sigma^{-j}(A)\big) \le \mu(A)^2 + \mu(A)\,\phi(j - n + 1)$. Thus,
$\sum_{j=n}^{\mu(A)^{-1}} \big(\mu(A)^{-1} - j\big)\,\mu\big(A \cap \sigma^{-j}(A)\big) \le \frac{1}{2} + \sum_{\ell=0}^{\mu(A)^{-1} - n} \phi(\ell + 1),$ (49)
where we used $\mu(A)^{-1} - j \le \mu(A)^{-1}$ to get the last term.
For 1 j n 1 , as before A ( j ) A ( j / 2 ) ; thus:
μ A σ j A = μ A σ n A ( j ) μ A σ n j / 2 A ( j / 2 ) μ ( A ) μ A j / 2 + ϕ ( j / 2 + 1 ) μ ( A ) C e c j / 2 + ϕ ( j / 2 + 1 ) .
Therefore,
$\sum_{j=1}^{n-1} \big(\mu(A)^{-1} - j\big)\,\mu\big(A \cap \sigma^{-j}(A)\big) \le \sum_{j=1}^{n-1} \Big(C e^{-c \lfloor j/2 \rfloor} + \phi\big(\lfloor j/2 \rfloor + 1\big)\Big).$ (50)
Hence, by (49) and (50), the summability of ϕ concludes the proof for the ϕ-mixing case.
If μ is ψ-mixing, we separate the sum in (48) into three parts. First, recall the definition of $g_0$ in Section 3.2. For $1 \le j \le g_0$, we bound the sum as follows:
$\sum_{j=1}^{g_0} \big(\mu(A)^{-1} - j\big)\,\mu\big(A \cap \sigma^{-j}(A)\big) \le \sum_{j=1}^{g_0} \mu(A)^{-1}\,\mu(A) = g_0.$
For $g_0 + 1 \le j \le g_0 + n - 1$, we have by ψ-mixing:
$\big(\mu(A)^{-1} - j\big)\,\mu\big(A \cap \sigma^{-j}(A)\big) \le \mu(A)^{-1}\,\mu\big(A \cap \sigma^{-(n + g_0)}(A^{(j - g_0)})\big) \le M\,\mu(A)^{-1}\,\mu(A)\,\mu\big(A^{(\ell)}\big),$
where we denoted $\ell = j - g_0$. Thus:
$\sum_{j=g_0+1}^{g_0+n-1} \big(\mu(A)^{-1} - j\big)\,\mu\big(A \cap \sigma^{-j}(A)\big) \le M \sum_{\ell=1}^{n-1} C e^{-c\ell}.$
Finally, applying ψ-mixing again,
$\sum_{j=g_0+n}^{\mu(A)^{-1}} \big(\mu(A)^{-1} - j\big)\,\mu\big(A \cap \sigma^{-j}(A)\big) \le M \sum_{j=n+g_0}^{\mu(A)^{-1}} \mu(A)^{-1}\,\mu(A)^2 \le M,$
concluding the proof of the ψ -mixing case. □

Author Contributions

Conceptualization, M.A., V.A., and S.G.; formal analysis, M.A., V.A., and S.G.; investigation, M.A., V.A., and S.G.; methodology, M.A., V.A., and S.G.; writing, original draft, M.A., V.A., and S.G.; writing, review and editing, M.A., V.A., and S.G. All authors read and agreed to the published version of the manuscript.

Funding

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq, Chamada Universal 439422/2018-3 and Bolsa de Produtividade 305096/2019-2), Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP, Auxílio Regular 2019/23439-4).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

This work is part of the PhD Thesis of V.A. under the supervision of M.A. and S.G. at the PIPGEs (Programa Interinstitucional de Pós-Graduação em Estatística, UFSCar-USP). V.A. and S.G. thank IME-USP for hospitality during several stays, and M.A. thanks DEs-UFSCar for hospitality during several stays.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Freitas, A.C.M.; Freitas, J.M.; Todd, M. Hitting time statistics and extreme value theory. Probab. Theory Relat. Fields 2010, 147, 675–710.
  2. Freitas, J.M. Extremal behaviour of chaotic dynamics. Dyn. Syst. 2013, 28, 302–332.
  3. Lucarini, V.; Faranda, D.; de Freitas, J.M.M.; Holland, M.; Kuna, T.; Nicol, M.; Todd, M.; Vaienti, S. Extremes and Recurrence in Dynamical Systems; John Wiley & Sons: Hoboken, NJ, USA, 2016.
  4. Leadbetter, M.R.; Lindgren, G.; Rootzén, H. Extremes and Related Properties of Random Sequences and Processes; Springer: Berlin/Heidelberg, Germany, 2012.
  5. Resnick, S.I. Extreme Values, Regular Variation and Point Processes; Springer: Berlin/Heidelberg, Germany, 2013.
  6. Abadi, M.; Vergne, N. Sharp error terms for return time statistics under mixing conditions. J. Theor. Probab. 2009, 22, 18–37.
  7. Abadi, M.; Cardeno, L.; Gallo, S. Potential well spectrum and hitting time in renewal processes. J. Stat. Phys. 2015, 159, 1087–1106.
  8. Bradley, R.C. Basic properties of strong mixing conditions. A survey and some open questions. Probab. Surv. 2005, 2, 107–144, update of, and a supplement to, the 1986 original.
  9. Afraimovich, V.; Chazottes, J.R.; Saussol, B. Pointwise dimensions for Poincaré recurrences associated with maps and special flows. Discrete Contin. Dyn. Syst. 2003, 9, 263–280.
  10. Saussol, B.; Troubetzkoy, S.; Vaienti, S. Recurrence, dimensions, and Lyapunov exponents. J. Stat. Phys. 2002, 106, 623–634.
  11. Ornstein, D.S.; Weiss, B. Entropy and data compression schemes. IEEE Trans. Inf. Theory 1993, 39, 78–83.
  12. Wyner, A.D.; Ziv, J. Some asymptotic properties of the entropy of a stationary ergodic data source with applications to data compression. IEEE Trans. Inf. Theory 1989, 35, 1250–1258.
  13. Marton, K.; Shields, P. Almost sure waiting time results for weak and very weak Bernoulli processes. In Proceedings of the 1994 IEEE International Symposium on Information Theory, Trondheim, Norway, 27 June–1 July 1994.
  14. Shields, P.C. Waiting times: Positive and negative results on the Wyner-Ziv problem. J. Theor. Probab. 1993, 6, 499–519.
  15. Abadi, M.; Chazottes, J.; Gallo, S. The complete ℓq-spectrum and large deviations for return times for equilibrium states with summable potentials. arXiv 2019, arXiv:1902.03441.
  16. Chazottes, J.; Ugalde, E. Entropy estimation and fluctuations of hitting and recurrence times for Gibbsian sources. Discrete Contin. Dyn. Syst. Ser. B 2005, 5, 565–586.
  17. Collet, P.; Galves, A.; Schmitt, B. Repetition times for Gibbsian sources. Nonlinearity 1999, 12, 1225–1237.
  18. Abadi, M.; Cardeno, L. Rényi entropies and large deviations for the first match function. IEEE Trans. Inf. Theory 2015, 61, 1629–1639.
  19. Abadi, M.; Vaienti, S. Large deviations for short recurrence. Discrete Contin. Dyn. Syst. Ser. A 2008, 21, 729–747.
  20. Haydn, N.; Vaienti, S. The Rényi entropy function and the large deviation of short return times. Ergod. Theory Dyn. Syst. 2010, 30, 159–179.
  21. Abadi, M.; Gallo, S.; Rada-Mora, E.A. The shortest possible return time of β-mixing processes. IEEE Trans. Inf. Theory 2017, 64, 4895–4906.
  22. Abadi, M.; Lambert, R. The distribution of the short-return function. Nonlinearity 2013, 26, 1143–1162.
  23. Aldous, D.J.; Brown, M. Inequalities for rare events in time-reversible Markov chains II. Stoch. Process. Their Appl. 1993, 44, 15–25.
  24. Galves, A.; Schmitt, B. Inequalities for hitting times in mixing dynamical systems. Random Comput. Dyn. 1997, 5, 337–347.
  25. Abadi, M. Exponential approximation for hitting times in mixing processes. Math. Phys. Electron. J. 2001, 7, 1–19.
  26. Abadi, M. Sharp error terms and necessary conditions for exponential hitting times in mixing processes. Ann. Probab. 2004, 32, 243–264.
  27. Abadi, M. Hitting, returning and the short correlation function. Bull. Braz. Math. Soc. 2006, 37, 593–609.
  28. Abadi, M.; Saussol, B. Hitting and returning to rare events for all alpha-mixing processes. Stoch. Process. Their Appl. 2011, 121, 314–323.
  29. Abadi, M.; Saussol, B. Almost sure convergence of the clustering factor in α-mixing processes. Stoch. Dyn. 2016, 16, 1660016.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite as: Abadi, M.; Amorim, V.; Gallo, S. Potential Well in Poincaré Recurrence. Entropy 2021, 23, 379. https://doi.org/10.3390/e23030379
