Abstract
Consider a coin-tossing sequence, i.e., a sequence of independent variables, taking values 0 and 1 with probability $1/2$ each. The famous Erdős-Rényi law of large numbers implies that the longest run of ones in the first n observations has a length that behaves like $\log n$, as n tends to infinity (throughout this article, log denotes logarithm with base 2). Erdős and Révész refined this result by providing a description of the Lévy upper and lower classes of the longest-run process. In another direction, Arratia and Waterman extended the Erdős-Rényi result to the longest matching subsequence (with shifts) of two coin-tossing sequences, finding that it behaves asymptotically like $2 \log n$. The present paper provides some Erdős-Révész type results in this situation, obtaining a complete description of the upper classes and a partial result on the lower ones.
1. Introduction
Consider a coin-tossing sequence $(X_n)_{n \ge 1}$, i.e., a sequence of independent random variables satisfying $P(X_n = 0) = P(X_n = 1) = 1/2$. Let $R_n$ be the length of the longest head-run, i.e., the largest integer r for which there is an i, $0 \le i \le n - r$, for which $X_{i+j} = 1$ for $j = 1, \dots, r$. A result of Erdős and Rényi [1] implies that
$$\lim_{n \to \infty} \frac{R_n}{\log n} = 1 \quad \text{with probability 1} \tag{1}$$
(throughout this paper, log will denote base 2 logarithms. The notation $\log_k$ will be used for its iterates: $\log_1 n = \log n$, $\log_{k+1} n = \log(\log_k n)$. Also, C and c, with or without an index, are used to denote generic constants that may have different values at each occurrence). The simple result (1) has seen a number of improvements. Erdős and Révész [2] provided a detailed description of the asymptotic behavior of $R_n$. In order to formulate their result, let us recall
Definition 1
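(Lévy classes). Let $(Z_n)$ be a sequence of random variables. We say that a sequence $(a_n)$ of real numbers belongs to
- The upper-upper class of $(Z_n)$ ($\mathrm{UUC}(Z_n)$), if, with probability 1 as $n \to \infty$, $Z_n < a_n$ eventually.
- The upper-lower class of $(Z_n)$ ($\mathrm{ULC}(Z_n)$), if, with probability 1 as $n \to \infty$, $Z_n \ge a_n$ for infinitely many n.
- The lower-upper class of $(Z_n)$ ($\mathrm{LUC}(Z_n)$), if, with probability 1 as $n \to \infty$, $Z_n < a_n$ for infinitely many n.
- The lower-lower class of $(Z_n)$ ($\mathrm{LLC}(Z_n)$), if, with probability 1 as $n \to \infty$, $Z_n \ge a_n$ eventually.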
Of course, these definitions work best if the sequence obeys some zero-one law.
Their result is as follows:
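Let $(a_n)$ be a nondecreasing integer sequence. Then
- $(a_n) \in \mathrm{UUC}(R_n)$ if $\sum_{n} 2^{-a_n} < \infty$,
- $(a_n) \in \mathrm{ULC}(R_n)$ if $\sum_{n} 2^{-a_n} = \infty$,
- for any $\varepsilon > 0$, $\lfloor \log n - \log_3 n + \log\log e - 2 - \varepsilon \rfloor \in \mathrm{LLC}(R_n)$, where $\log_3 n = \log\log\log n$,
- for any $\varepsilon > 0$, $\lfloor \log n - \log_3 n + \log\log e - 1 + \varepsilon \rfloor \in \mathrm{LUC}(R_n)$.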
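As a rough numerical companion to the result above, the following Python sketch simulates fair coin tosses, computes the longest head-run, and evaluates the two lower-class thresholds. It assumes the thresholds have the classical form $\lfloor \log n - \log\log\log n + \log\log e - 2 - \varepsilon \rfloor$ and $\lfloor \log n - \log\log\log n + \log\log e - 1 + \varepsilon \rfloor$ with base 2 logarithms. The function names and the use of Python's `random` module are illustrative choices and not part of the paper.

```python
import math
import random

def longest_head_run(bits):
    """Length of the longest run of consecutive 1s in an iterable of 0/1 values."""
    best = current = 0
    for b in bits:
        current = current + 1 if b == 1 else 0
        best = max(best, current)
    return best

def lower_class_thresholds(n, eps=0.1):
    """Assumed classical lower-lower and lower-upper thresholds (base 2 logs, n >= 3)."""
    base = (math.log2(n)
            - math.log2(math.log2(math.log2(n)))
            + math.log2(math.log2(math.e)))
    return math.floor(base - 2 - eps), math.floor(base - 1 + eps)

def simulate_longest_run(n, seed=0):
    """Longest head-run among n simulated fair coin tosses (seeded for repeatability)."""
    rng = random.Random(seed)
    return longest_head_run(rng.getrandbits(1) for _ in range(n))
```

A single simulated path at a finite n can only illustrate the order of magnitude; the class statements are almost-sure asymptotic results and cannot be confirmed by simulation.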
Arratia and Waterman [3] extended Erdős and Rényi’s result in another direction: they consider two independent coin-tossing sequences, $(X_n)$ and $(Y_n)$, and look for the longest matching subsequences when shifting is allowed. Formally, let $M_n$ be the largest integer m for which there are $i, j$ with $0 \le i, j \le n - m$ and $X_{i+k} = Y_{j+k}$ for all $k = 1, \dots, m$. They prove that, with probability 1,
$$\lim_{n \to \infty} \frac{M_n}{\log n} = 2.$$
In the present paper, we will make this more precise by providing a description for the upper classes of $M_n$ and also some results on its lower classes:
Theorem 1.
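Let $(a_n)$ be a nondecreasing integer sequence. We have
- $(a_n) \in \mathrm{UUC}(M_n)$ if $\sum_{n} n 2^{-a_n} < \infty$.
- $(a_n) \in \mathrm{ULC}(M_n)$ if $\sum_{n} n 2^{-a_n} = \infty$.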
- for some c, .
- for some c, .
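For experimentation, the longest match with shifts can be computed by the standard dynamic program for the longest common contiguous block of two sequences. The sketch below is a minimal Python version; the names `longest_match` and `match_ratio` are hypothetical, and the comparison with $2 \log n$ (base 2) reflects the Arratia-Waterman asymptotics quoted above.

```python
import math
import random

def longest_match(x, y):
    """Length of the longest contiguous block common to x and y (shifts allowed)."""
    best = 0
    prev = [0] * (len(y) + 1)  # prev[j]: match length ending at previous x and y[j-1]
    for xi in x:
        cur = [0] * (len(y) + 1)
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def match_ratio(n, seed=0):
    """Longest match of two simulated fair sequences of length n, divided by 2*log2(n)."""
    rng = random.Random(seed)
    x = [rng.getrandbits(1) for _ in range(n)]
    y = [rng.getrandbits(1) for _ in range(n)]
    return longest_match(x, y) / (2 * math.log2(n))
```

The dynamic program needs on the order of $n^2$ steps, so in pure Python it is practical only for moderate n (a few thousand).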
2. Discussion
We leave the proof of Theorem 1 for later and first discuss some of the concepts that are connected to this problem. One of them is the so-called independence principle: in many, although not all, situations, one may pretend that the waiting times until a given pattern of length l is observed have an exponential distribution with parameter $2^{-l}$, and that the waiting times for different patterns are independent. Móri [4] and Móri and Székely [5] provide an account of this principle and its limitations. In our case, all results but the lower-lower class one are more or less in tune with this principle.
Another closely related question is that of the number of different length l subsequences of $(X_1, \dots, X_n)$. This question does not seem to have received much attention in the literature; there is one remarkable result by Móri [6]: in the remark following the statement of Theorem 3 in that paper, he mentions that with probability one for large n, the largest l for which all possible patterns occur as subsequences of $(X_1, \dots, X_n)$ is either or for any . The independence principle would suggest that is bounded away from 0 with probability one, and this or even the less stringent for large n would be an important step towards removing the double log term from the result, as we conjecture that, for some , . Unfortunately, we are only able to obtain , which is also implied by Móri’s result.
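The two quantities discussed here are easy to compute for a concrete sequence. The following Python sketch counts the distinct contiguous blocks of length l (what the text calls subsequences) and finds the largest l for which all $2^l$ patterns occur. The function names are hypothetical and chosen only for this illustration.

```python
def distinct_blocks(bits, l):
    """Number of distinct contiguous blocks of length l in the sequence bits."""
    return len({tuple(bits[i:i + l]) for i in range(len(bits) - l + 1)})

def largest_complete_length(bits):
    """Largest l such that all 2**l binary patterns of length l occur as blocks."""
    l = 0
    # Stops as soon as some pattern of length l + 1 is missing
    # (this always happens once l + 1 exceeds len(bits)).
    while distinct_blocks(bits, l + 1) == 2 ** (l + 1):
        l += 1
    return l
```

For example, the sequence 0, 0, 1, 1, 0 contains all four patterns of length 2 but only three of the eight patterns of length 3.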
3. Proofs
Proof of the upper-upper class result.
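Both upper class statements are fairly easy to prove. First, observe that under our assumptions, the convergence of
$$\sum_{n} n 2^{-a_n}$$
is equivalent to that of
$$\sum_{k} n_k^2 2^{-a_{n_k}}$$
with $n_k = 2^k$.
Now, define events
$$A_k = \{ M_{n_{k+1}} \ge a_{n_k} \}.$$
$A_k$ occurs if in one of the pairs of sequences
$$\big( (X_{i+1}, \dots, X_{i+a_{n_k}}), (Y_{j+1}, \dots, Y_{j+a_{n_k}}) \big), \quad 0 \le i, j \le n_{k+1} - a_{n_k},$$
both sequences agree. That provides the trivial upper bound
$$P(A_k) \le n_{k+1}^2 2^{-a_{n_k}} = 4 n_k^2 2^{-a_{n_k}},$$
so, by our assumptions, $\sum_k P(A_k) < \infty$, and the Borel-Cantelli Lemma implies that, with probability 1, only finitely many events $A_k$ occur. Thus, for sufficiently large k, $M_{n_{k+1}} < a_{n_k}$, and for $n_k \le n \le n_{k+1}$, we have
$$M_n \le M_{n_{k+1}} < a_{n_k} \le a_n.$$
This shows that $(a_n) \in \mathrm{UUC}(M_n)$, as claimed. □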
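The condensation step at the start of the proof above can be checked numerically. The sketch below assumes that the two series are $\sum_n n 2^{-a_n}$ and $\sum_k n_k^2 2^{-a_{n_k}}$ with $n_k = 2^k$, and that $(a_n)$ is nondecreasing. Under these assumptions, the sum over each dyadic block $2^k \le n < 2^{k+1}$ lies between a quarter of the condensed term with index k + 1 and twice the condensed term with index k. The sequence `example_sequence` is a hypothetical choice made only for this illustration.

```python
import math

def block_sum(a, k):
    """Sum of n * 2**(-a(n)) over the dyadic block 2**k <= n < 2**(k+1)."""
    return sum(n * 2.0 ** (-a(n)) for n in range(2 ** k, 2 ** (k + 1)))

def condensed_term(a, k):
    """Condensed term n_k**2 * 2**(-a(n_k)) with n_k = 2**k."""
    n_k = 2 ** k
    return n_k ** 2 * 2.0 ** (-a(n_k))

def example_sequence(n):
    """Hypothetical nondecreasing integer sequence, defined for n >= 2."""
    return math.floor(2 * math.log2(n) + 2 * math.log2(math.log2(n)))

def sandwich_holds(a, k):
    """Check condensed_term(k+1)/4 <= block_sum(k) <= 2*condensed_term(k)."""
    s = block_sum(a, k)
    return condensed_term(a, k + 1) / 4 <= s <= 2 * condensed_term(a, k)
```

Since both bounds are constant multiples of condensed terms, the two series converge or diverge together, which is all the proof needs.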
Proof of the upper-lower class result.
We may assume without loss of generality that .
Again, let $n_k = 2^k$. We want to use the second Borel-Cantelli Lemma, so we define independent events
This is the union of the events
with . We endow the set of pairs with the lexicographic order. For a subset I of the integers, Bonferroni’s inequality provides
Let . If , then , otherwise .
Let . Using this in (11) yields the value for the first sum. In the second sum, for given and , there are no more than pairs with . The number of pairs of pairs with is trivially bounded by . Putting these together, we arrive at the upper estimate
In total,
keeping in mind that .
and . Borel-Cantelli implies that, with probability 1, infinitely many events occur. Thus, for infinitely many k, , so . □
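The Bonferroni step in the proof above bounds the probability of a union from below by the sum of the individual probabilities minus the sum of the probabilities of the pairwise intersections. The Python sketch below verifies this by exact enumeration in a toy instance: two fair sequences of length n, with one event per pair (i, j) of starting positions stating that the length-l block of the first sequence at i agrees with the block of the second at j. The parameters n = 5 and l = 4 and the function name are assumptions made for this illustration, not the choices made in the proof.

```python
from fractions import Fraction
from itertools import combinations, product

def bonferroni_check(n=5, l=4):
    """Return (Bonferroni lower bound, exact union probability, union bound)."""
    starts = range(n - l + 1)
    pairs = [(i, j) for i in starts for j in starts]  # lexicographic order
    outcomes = list(product((0, 1), repeat=2 * n))    # all (x, y) with equal weight
    total = len(outcomes)

    def occurs(outcome, pair):
        x, y = outcome[:n], outcome[n:]
        i, j = pair
        return x[i:i + l] == y[j:j + l]

    # For each outcome, the set of matching events that occur.
    hits = [{p for p in pairs if occurs(w, p)} for w in outcomes]
    p_union = Fraction(sum(1 for h in hits if h), total)
    first = sum(Fraction(sum(1 for h in hits if p in h), total) for p in pairs)
    second = sum(Fraction(sum(1 for h in hits if p in h and q in h), total)
                 for p, q in combinations(pairs, 2))
    return first - second, p_union, first
```

The lower bound is only informative when the pairwise intersections are small compared with the individual probabilities, which is why the proof has to count overlapping pairs carefully.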
For the lower class results, we first prove some lemmas:
Lemma 1.
With probability 1 for n sufficiently large
Proof of Lemma 1.
The lower part is a direct consequence of Móri’s result: as for sufficiently large n and, obviously , as extending two different sequences from length to keeps them different; it can only happen that some of them are extended beyond index n, but this can affect at most of them. □
Lemma 2.
Let S be a set of sequences of length , and let A be the event that none of the sequences in S occurs as a subsequence of . For , let B be the event that some sequence from S is a subsequence of . Then
Proof of Lemma 2, upper part.
Consider the sequences for . Each of these has probability that it does not contain a subsequence that lies in S, and by independence
□
Proof of Lemma 2, lower part.
Assume for simplicity that n is a multiple of , say . Again, we split into N blocks of length , and the probability that none of those contains a sequence from S is
It can still happen that there is a sequence from S that crosses one of the boundaries between the blocks. There are boundaries, and for each of those, there are possible subsequences of length l crossing it. The probability that this one is from S but none of the N blocks contains one can be estimated above by the probability that it is from S and none of the blocks not adjacent to it contains one from S. This provides the upper bound
for the probability that there is a subsequence from S in but none in any of the N blocks. Subtracting this from (17), we obtain the lower bound
and the general case is obtained by observing that the probability for a given n is bounded below by the one that we get for . □
In this lemma, the lower and upper bounds are rather close. Its applicability, however, depends on the availability of good estimates for the probability . There is the trivial upper bound and an almost as simple lower bound (given ). Bridging the gap between these requires deeper insight into the structure of S.
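The block argument behind the upper part of Lemma 2 can be checked exactly for small parameters. The sketch below assumes the bound takes the form: the probability that no sequence from S occurs in the first n tosses is at most the probability that none occurs in a block of length m, raised to the power $\lfloor n/m \rfloor$. The names `avoid_probability` and `block_upper_bound` and the parameter m are hypothetical, and the enumeration is exponential in n, so it is usable only for toy sizes.

```python
from fractions import Fraction
from itertools import product

def contains_any(bits, patterns, l):
    """True if some contiguous block of length l in bits belongs to patterns."""
    return any(tuple(bits[i:i + l]) in patterns
               for i in range(len(bits) - l + 1))

def avoid_probability(n, patterns, l):
    """Exact probability that n fair tosses contain no block from patterns."""
    avoiding = sum(1 for w in product((0, 1), repeat=n)
                   if not contains_any(w, patterns, l))
    return Fraction(avoiding, 2 ** n)

def block_upper_bound(n, m, patterns, l):
    """Bound from floor(n/m) disjoint, hence independent, blocks of length m."""
    return avoid_probability(m, patterns, l) ** (n // m)
```

For instance, with S consisting of the two patterns 1, 1, 0 and 0, 1, 0, one can compare `avoid_probability(12, S, 3)` with `block_upper_bound(12, 4, S, 3)`; the gap between them comes from occurrences that cross block boundaries, which is what the lower part of the lemma has to control.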
Proof of the lower-lower class result.
In both the lower-lower and lower-upper parts, we consider the asymptotics of the longest match found between and under the condition that the sequence is given. Doing so, we may assume that is a “typical” coin-tossing sequence, in the sense that it possesses some property that holds with probability 1. In the sequel, all probabilities are understood as conditional with respect to such a typical sequence . For the lower-lower class result, we let , , and in Lemma 2. Clearly, as only has one length l subsequence, equals , where is the number of different sequences of length l in . For sufficiently large l, by Lemma 1, and we obtain an upper estimate
for the probability (conditional on ) that there is no match of length l between and . For , the series converges, so with probability 1, we have for all but finitely many l. Thus, for , . Inverting the relationship between and l yields , so, for some constant c and l large enough, we obtain , which proves the lower-lower class result. □
Proof of the lower-upper class result.
This time, we need to make our choice of the parameters in Lemma 2 with a little more sophistication. We start with for . Then, . As the set S, we choose the set of all sequences of length contained in . is chosen in such a way that and , is a possible choice.
We define the events
(this last is just the event B from Lemma 2).
Lemma 2 gives us
and
The trivial estimate yields
which diverges if we choose .
We are going to use the Borel-Cantelli Lemma in the usual form for dependent events:
Lemma 3
(Borel-Cantelli II). If the sequence satisfies
and
then
To this end, we need an upper bound for for . We have
This means that for any , there is a number such that, for and , the inequality
holds. Plugging this into
yields the estimate
As , we get
As is arbitrary, the sequence satisfies the assumptions of Lemma 3, so, with probability 1, infinitely many of the events occur. Thus, with probability 1 for infinitely many k, . Observing that , we obtain our lower-upper class result. □
Funding
This research was funded by TU Wien Bibliothek through its Open Access Funding Programme.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable.
Acknowledgments
The author acknowledges TU Wien Bibliothek for financial support through its Open Access Funding Programme.
Conflicts of Interest
The author declares no conflicts of interest.
References
- Erdős, P.; Rényi, A. On a new law of large numbers. J. Anal. Math. 1970, 23, 103–111.
- Erdős, P.; Révész, P. On the length of the longest head-run. In Topics in Information Theory; Colloquia Mathematica Societatis János Bolyai, Volume 16; Csiszár, I., Elias, P., Eds.; North-Holland: Amsterdam, The Netherlands, 1977; pp. 219–228.
- Arratia, R.; Waterman, M.S. An Erdős-Rényi law with shifts. Adv. Math. 1985, 55, 13–23.
- Móri, T. Large deviation results for waiting times in repeated experiments. Acta Math. Hung. 1985, 45, 213–221.
- Móri, T.; Székely, G. Asymptotic independence of pure head stopping times. Stat. Probab. Lett. 1984, 2, 5–8.
- Móri, T. On the waiting time till each of some given patterns occurs as a run. Probab. Theory Relat. Fields 1991, 87, 313–323.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).