Abstract
In this paper, we consider a two-dimensional Moran model of random walks consisting of two queues evolving in parallel, each of which can (at each unit of time) either increase by one or reset to 0. We analyze the joint law of their final altitudes and prove that the asymptotic distribution of each component is a shifted geometric distribution. We also analyze the maximum of the two components and give closed forms for its mean and variance.
1. Introduction
The concept of a random walk is important and is used in many fields, such as biology (random walks are applied to problems related to genetic diseases such as diabetes and cancer, or to the movement of an animal in search of food), physics (the movement of a particle in a gas or liquid), chemistry (a chemical reagent in a solution), economics (the stock market price), mechanics (the maximum age of a machine), and artificial intelligence (the exploration of a territory by a robot).
In this article, we consider discrete random walks. Many scientific studies used probability generating functions to obtain the mean, the variance, and the limiting distribution of some statistics such as the final altitude and the height of discrete random walks, e.g., via the kernel method and singularity analysis (see [1,2]), via some links with continued fractions [3], or via more probabilistic approaches (see [4], which also illustrates links with random trees).
For example, in dimension one, focusing on articles that play a role in our analysis, Banderier and Flajolet showed in [1] that the final altitude of a random meander of length n converges asymptotically to a Rayleigh distribution when the drift is zero and to a normal distribution when the drift is positive. Furthermore, in [5], Banderier and Nicodème analyzed the height of discrete bridges/meanders/excursions for bounded discrete walks. Similar extremal parameters were also studied for trees in [4,6], and by Gafni [7] for the asymptotic distribution of the length of the longest run of consecutive equal parts. Finally, some models of walks with resets were considered by Banderier and Wallner, who treated several statistics for lattice paths with catastrophes, such as the number of catastrophes of a random excursion of size n, which converges to a Gaussian, Rayleigh, or discrete distribution depending on the drift (see Theorem 4.12 in [8]).
In higher dimensions, and still related to our model, we can mention the stochastic model for solitons investigated by Itoh, Mahmoud, and Takahashi in [9], where they proved that the wavelength converges to a convolution of geometric random variables. Itoh and Mahmoud [10] also proved that the distribution of the lifetime of an individual converges in law to a (shifted) geometric distribution. Other articles related to the Moran process (in biology and population genetics) involve similar higher-dimensional walk models; see, e.g., [11,12,13].
In this paper, we study the two-dimensional Moran walk (TDMW), defined as follows. At time 0, the process starts from the origin. At each unit of time, there are four possibilities: (a) both components return to zero, so that the process is at (0,0), with probability ; (b) the first component increases by one unit while the second component returns to zero, with probability ; (c) the second component increases by one unit while the first component returns to zero, with probability ; or (d) both components increase by one unit, with probability p. We now provide four examples of the evolution of the TDMW process until time :
In these four examples, the final altitudes of components one and two are 0, 5, 8, 3 and 3, 1, 0, 7, respectively. The maxima of the final altitudes are 3, 5, 8, and 7, and the heights of the four walks are 4, 5, 8, and 7, respectively.
The main result of this work is an explicit expression for the probability generating function of the maximum of the two final altitudes, together with the corresponding mean and variance. Furthermore, we prove that the final altitude of each of the two components converges in law to a shifted geometric distribution, and we give closed forms for the corresponding mean and variance.
This paper is organized as follows. In Section 2, we define some statistics, introduce our model in detail, and use some examples for different lengths of Moran walk according to the probability distribution. In Section 3, we state our main results on the probability generating function of the final altitude and their mean and variance. In Section 4, we use R software to simulate Moran walks of lengths: 10, 100, 500, and 1000, for different values of the probabilities , , , and p. In Section 5, we establish recursive equations for the sequence of multivariate polynomials and the probability generating function associated with the model. In Section 6, we prove that the limiting distributions of two Moran random walks and converge to shifted geometric distributions and find their means and variances. In Section 7, we give the proofs of our theorems. In Section 8, we present another method for obtaining the distribution of the final altitude . Finally, we present some conclusions and perspectives.
2. Definitions and Presentation of the Model
In this section, we present some definitions concerning the statistical properties of discrete processes, such as the mean, the variance, and the probability generating function. Finally, we present our model and define some statistics, such as the final altitude and height.
Let us consider stochastic processes with discrete time and a finite state space, without assuming that they are Markov chains. For brevity, we call such a process a discrete stochastic process. We start with the following classical definitions (see [14]).
2.1. Definitions
1. Let be a discrete stochastic process starting from time , with being the initial state.
2. The state space of the process is assumed to be finite and is denoted by S. Each takes its values in S.
3. Let be a two-dimensional discrete stochastic process starting from time , with being the initial state.
4. The state space of the process is assumed to be finite and is denoted by . Each takes its values in .
5. The probability distribution of is denoted by , which is identified with a vector in . We call these distributions the one-dimensional distributions of the process.
6. We define the probability associated with the discrete process as follows: for all ,
7. Let U be a discrete random variable with distribution , . Then, the expected value and variance of U, denoted by and , respectively, are defined as
8. Let U be a discrete random variable with distribution , ; then, the probability generating function, denoted by G, of the variable U is defined by: for all such that .
Probability generating functions are an elegant and versatile tool. In particular, the probability mass function of a discrete random variable and its moments can be obtained from the derivatives of the probability generating function. Indeed, the mean and the variance of U (its first moment and second central moment) are related to the derivatives of the probability generating function evaluated at 1. More precisely, the next folklore lemma explains this link.
Lemma 1
([2]). Let G be the probability generating function of the discrete random variable U; then, for , the kth factorial moment of U is given by
In addition, if the limits of the first and second derivatives of G exist at 1, then the mean and variance of U are given by the following two equations:
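As a hedged illustration of Lemma 1, the following sketch uses the standard identities E[U] = G'(1) and Var(U) = G''(1) + G'(1) - G'(1)^2 for a distribution with finite support. The function name `pgf_moments` and the sample distribution are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def pgf_moments(pmf):
    """Return (mean, variance) of a finite distribution on {0, 1, 2, ...}.

    pmf[k] is P(U = k). The PGF is G(x) = sum_k pmf[k] * x**k, so
    G'(1) = sum_k k * pmf[k] and G''(1) = sum_k k * (k - 1) * pmf[k].
    """
    g1 = sum(k * q for k, q in enumerate(pmf))            # G'(1)
    g2 = sum(k * (k - 1) * q for k, q in enumerate(pmf))  # G''(1)
    return g1, g2 + g1 - g1 * g1


# Hypothetical example distribution: P(U=0)=1/2, P(U=1)=1/4, P(U=2)=1/4.
pmf = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
mean, variance = pgf_moments(pmf)
print(mean, variance)
```

Exact rational arithmetic is used so that the result can be compared directly with a hand computation of E[U^2] - E[U]^2.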
Consider a two-dimensional Moran model with two components ; it starts at time 0 with two components having an age 0 (i.e., ), and it is parameterized by four probabilities, denoted by , , , and p. The Moran model with two dimensions is given by the following system:
where and p are non-negative given values such that .
Additionally, we define the final altitude and height statistics by:
Applying the first four definitions, we can consider and as random walks (one-dimensional discrete stochastic processes) in the state space , started from the initial states and , respectively, whereas is considered a two-dimensional stochastic process in the state space , started from the initial state . Moreover, we can represent , , , and as random walks in , where the x-axis represents the time n (the length of the Moran walk) and the y-axis represents the age of each component, the maximum age, or the height. In the next section, we denote by:
, , and for the means and variances of and for and , respectively.
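The model and the statistics defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parameter names `p_reset`, `p_first`, `p_second`, and `p_both` are hypothetical labels for the probabilities of the four possible moves (both components reset; only the first grows; only the second grows; both grow), and the height is taken to be the largest age reached by either component over the whole walk.

```python
import random


def simulate_tdmw(n, p_reset, p_first, p_second, p_both, seed=None):
    """Simulate n steps of a two-dimensional Moran walk started at (0, 0)."""
    rng = random.Random(seed)
    weights = [p_reset, p_first, p_second, p_both]
    x1 = x2 = 0
    path = [(0, 0)]
    for _ in range(n):
        move = rng.choices(range(4), weights=weights)[0]
        # A component grows by one if its move is selected, otherwise it resets.
        x1 = x1 + 1 if move in (1, 3) else 0
        x2 = x2 + 1 if move in (2, 3) else 0
        path.append((x1, x2))
    return path


def walk_statistics(path):
    """Final altitudes, their maximum, and the height of a simulated walk."""
    x1, x2 = path[-1]
    height = max(max(a, b) for a, b in path)
    return {"X1": x1, "X2": x2, "Y": max(x1, x2), "H": height}


path = simulate_tdmw(10, 0.25, 0.25, 0.25, 0.25, seed=1)
print(walk_statistics(path))
```

By construction, the height is never smaller than the maximum of the two final altitudes, and neither can exceed the length of the walk.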
2.2. Example 1: Two-Dimensional Moran Walk of Length 5 for Initial Probabilities and
Figure 1, generated with R, shows that the final altitudes of the two components are equal to two and one, respectively; thus, the final maximum age of the two components is equal to two. In addition, the height is equal to two when and for the Moran walk of length 5.
Figure 1.
The evolution of the age of each component, the final altitude, and the height of two-dimensional Moran walk of length 5 starting from .
2.3. Example 2: Two-Dimensional Moran Walk of Length 10 with Initial Probabilities and
Figure 2 shows that the final altitudes of the two components are equal to one and two, respectively; thus, the final maximum age of the two components is equal to two. In addition, the height is equal to 4 and is reached at time 7, when and for the Moran walk of length 10.
Figure 2.
The evolution of the age of each component, the final altitude, and the height of the two-dimensional Moran walk of length 10 starting from .
3. Main Result
In this section, we study the final altitude of the maximum age, , between the two components at time n. More precisely, we want to find the probability generating function of , denoted by , and determine the mean and variance of this maximum for the Moran walk of length n. To this end, we define the probability generating function as
The first result concerns the probability generating function , which is presented in the next theorem.
Theorem 1.
The probability generating function, , of the final altitude between the two components is given by the following: for all , ,
such that , where , , , and .
Using the probability generating function and its first and second derivatives, we can calculate the mean and variance of . This second main result is presented in the following theorem.
Theorem 2.
The mean and variance of the final altitude are given by
and
where and .
Remark 1.
Because , then , for . So, we deduce from Theorem 2 and Corollary 1 that .
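The mean and variance of the maximum can be cross-checked numerically. The sketch below does not reproduce the closed forms of Theorem 2; it relies instead on an elementary inclusion-exclusion argument derived here from the one-step dynamics: a component has age at least k exactly when its last k moves were all growth moves, so for 1 <= k <= n the tail of the maximum is (p_first + p_both)^k + (p_second + p_both)^k - p_both^k. The parameter names are hypothetical labels for the four move probabilities.

```python
from fractions import Fraction
from itertools import product


def max_mean_var(n, p_first, p_second, p_both):
    """Mean and variance of the maximum of the two final altitudes at time n."""
    def tail(k):
        # P(max >= k) by inclusion-exclusion over the two components.
        return (p_first + p_both) ** k + (p_second + p_both) ** k - p_both ** k

    mean = sum(tail(k) for k in range(1, n + 1))
    second = sum((2 * k - 1) * tail(k) for k in range(1, n + 1))  # E[max^2]
    return mean, second - mean * mean


def max_mean_var_bruteforce(n, p_reset, p_first, p_second, p_both):
    """Same quantities by enumerating all 4**n move sequences (small n only)."""
    probs = (p_reset, p_first, p_second, p_both)
    mean = second = Fraction(0)
    for moves in product(range(4), repeat=n):
        weight, x1, x2 = Fraction(1), 0, 0
        for m in moves:
            weight *= probs[m]
            x1 = x1 + 1 if m in (1, 3) else 0
            x2 = x2 + 1 if m in (2, 3) else 0
        y = max(x1, x2)
        mean += weight * y
        second += weight * y * y
    return mean, second - mean * mean


a, b, c, p = Fraction(1, 10), Fraction(1, 5), Fraction(3, 10), Fraction(2, 5)
print(max_mean_var(4, b, c, p) == max_mean_var_bruteforce(4, a, b, c, p))
```

The brute-force enumeration is exponential in n and serves only as an independent check for short walks.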
4. Simulation
In this section, using R software, we simulate the final altitude of each component , , the maximum Y, and the height H for two cases. In the first case, we take and in the system (6); in the second case, we take and for four different lengths of the TDMW, which are equal to 10, 100, 500, and 1000, respectively.
Figure 3 shows that the final altitudes of , , Y, and the height H are equal to 0, 2, 2, and 3 for the Moran walk of length 10. When the length is increased from 10 to 100, we observe that , , Y, and H are equal to 2, 0, 2, and 8. In this sample, the height does not exceed eight, so a tenfold increase in the length raises the height only to 8. This small increase can be explained by a logarithmic growth of the height over time, similar to what is proven in [15].
Figure 3.
Final altitude of (in red), (in green), their maximum (, in pink) and height H (in orange) of a TDMW of lengths 10 and 100 for and .
Figure 4 shows that if the length is equal to 500, the final altitudes of , , and Y are at most 4, and the height H is equal to 10. If we increase the length of the walk from 500 to 1000, the final altitudes of , , and Y do not change and remain at most 4; however, the height H changes slightly, from 10 to 11. Thus, the height is at most 11 for both lengths.
Figure 4.
Final altitude , , and Y and height H of a TDMW with lengths 500 and 1000 for and .
Figure 5 shows that if the length of the random walk is equal to 10, the final altitudes of and are equal to 3 and 0; then, the maximum altitude Y is 3, and the height H is 5. When the length of the walk is increased from 10 to 100, the final altitudes of , , and Y change slightly and are at most 2, and the height H is equal to 3. The height thus varies little between the two lengths of the walk, ranging from three to five.
Figure 5.
Final altitude , , and Y and height H of the Moran walk with lengths 10 and 100 for and .
Figure 6 shows that the final altitudes of , , and Y are at most 4, and the height H is equal to 10 for the Moran walk of length 500. When the length is increased from 500 to 1000, we observe that the final altitudes of , , and Y change slightly and are at most 6, and the height H is equal to 11. The height is thus bounded by eleven, so the maximum age of the two components is at most 11.
Figure 6.
Final altitude , , and Y and height H of a TDMW with lengths 500 and 1000 for and .
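The slow growth of the height observed in these simulations can be explored with a short Python sketch. It is not the authors' R code: it follows a single walk and records the running height at the lengths 10, 100, 500, and 1000, whereas the figures may come from separate runs. The uniform probabilities used below are a hypothetical choice, and `probs` lists the probabilities of the four moves in the order (both reset, first grows, second grows, both grow).

```python
import random


def height_at_checkpoints(checkpoints, probs, seed=0):
    """Running height of one simulated walk, recorded at the given lengths."""
    rng = random.Random(seed)
    wanted = set(checkpoints)
    x1 = x2 = height = 0
    result = {}
    for step in range(1, max(checkpoints) + 1):
        move = rng.choices(range(4), weights=probs)[0]
        x1 = x1 + 1 if move in (1, 3) else 0  # first component grows or resets
        x2 = x2 + 1 if move in (2, 3) else 0  # second component grows or resets
        height = max(height, x1, x2)
        if step in wanted:
            result[step] = height
    return result


heights = height_at_checkpoints([10, 100, 500, 1000], (0.25, 0.25, 0.25, 0.25))
print(heights)
```

Because the height is a running maximum along one walk, the recorded values are non-decreasing in the length; changing `seed` gives other sample paths.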
5. Recursive Equations
In this section, we first calculate the probability of the process defined in (6), denoted by at time given that we know it at time n. Secondly, we study the sequence of multivariate polynomials, denoted by , and find the recursive equation related to this sequence between two consecutive times n and . Next, we investigate the probability generating function, denoted by , for our model. Finally, we prove that the limiting distribution of each component converges to the shifted geometric distribution and calculate their means and variances. Applying the fifth definition in Section 2, we then have the following definition: for all , ,
where . We start this section with a technical lemma. It provides a recursive equation relating the probabilities of our model at two consecutive times n and n + 1, and it will be used in Theorem 3. It is based on conditional probability: for two events A and B, the probability that A occurs given that B has occurred is given by
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$
where the probability of B is strictly positive. The recursion is stated in the following lemma:
Lemma 2.
We have the following recursion of probabilities:
Proof.
The proof uses the conditional probability that the Moran walks and have ages and at time n + 1, given that they have ages l and q at time n. For and , we have
For and , we have:
For and , we have by symmetry:
For , we have:
□
Remark 2.
We offer some comments on the four cases for the ages of the two components at time n + 1:
1. In the first case, both components have age at least one at time n + 1. The probability of this configuration is equal to p multiplied by the probability that, at time n, each component was one unit younger.
2. In the second case, the first component has age zero and the second component has age at least one at time n + 1. The probability of this configuration is equal to the probability of possibility (c) multiplied by the sum, over all ages l from 0 to n of the first component at time n, of the probabilities that the second component was one unit younger at time n.
3. In the third case, the second component has age zero and the first component has age at least one at time n + 1. The probability of this configuration is equal to the probability of possibility (b) multiplied by the sum, over all ages q from 0 to n of the second component at time n, of the probabilities that the first component was one unit younger at time n.
4. In the fourth case, both components have age zero at time n + 1. The probability of this configuration is equal to the probability of possibility (a) multiplied by the double sum, over all ages l and q from 0 to n, of the probabilities that the two components have ages l and q at time n.
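The four cases of the recursion can be turned into a small dynamic program that computes the exact joint law of the two ages. The sketch below pushes probability mass forward from time n to time n + 1, which is equivalent to the sums over l and q described above. The parameter names are hypothetical labels for the probabilities of possibilities (a), (b), (c), and (d), in that order.

```python
from fractions import Fraction


def joint_law(n, p_reset, p_first, p_second, p_both):
    """Exact joint law {(age1, age2): probability} at time n, from (0, 0)."""
    law = {(0, 0): Fraction(1)}
    for _ in range(n):
        nxt = {}
        for (l, q), mass in law.items():
            transitions = (
                ((l + 1, q + 1), p_both),  # case 1: both components grow
                ((0, q + 1), p_second),    # case 2: first resets, second grows
                ((l + 1, 0), p_first),     # case 3: second resets, first grows
                ((0, 0), p_reset),         # case 4: both components reset
            )
            for state, prob in transitions:
                if prob:
                    nxt[state] = nxt.get(state, 0) + prob * mass
        law = nxt
    return law


quarter = Fraction(1, 4)
law = joint_law(3, quarter, quarter, quarter, quarter)
print(sum(law.values()), law[(0, 0)], law[(3, 3)])
```

With exact fractions, the total mass stays equal to one at every step, which gives a quick sanity check on the recursion.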
Next, we define the sequence of multivariate polynomials (for ) by requiring that the coefficient of in be the probability that, at time n, the i-th individual has age (for ); that is,
Starting from Equation (10) and using the preceding lemma, we can find a recursive equation relating , , , and ; it is stated in the next proposition.
Proposition 1.
For all , the sequence of multivariate polynomials satisfies the recursive equation:
with initial condition .
6. Distributions of and
In this section, we study some statistical characteristics of the final altitude of each component and at time n, namely the probability generating function, the asymptotic distribution, the mean, and the variance. More precisely, we first find the probability generating function of each component, denoted by and . In the next theorem, we prove that the final altitude of each component converges in law to a shifted geometric distribution. This partially generalizes the results of Itoh and Mahmoud [10] (which were restricted to the uniform case, i.e., , but held in any dimension). Finally, we finish this section by calculating the mean and variance of each component. The following theorem gives the probability generating functions and the asymptotic limits of and .
Theorem 3.
The distributions of and follow the same discrete distribution, with probability generating functions being given by:
Asymptotically, and converge, in law, to some shifted geometric (Geo) random variables:
where , , and .
Proof.
Using the recursive equation defined in (11) for , we get
By symmetry, we have
Passing to the limit of , we then have
Suppose that Z follows the geometric distribution with parameter . Then, its probability is given by
and the probability generating function of Z is equal to
From Theorem 3, we directly get the following corollary concerning the explicit expressions of the means and variances of and , which depend on probability generating functions.
Corollary 1.
The mean and variance of the final altitude of each component of Moran walk are given by:
where , , and .
Proof.
By symmetry, the mean of is given by
Computing the second derivative of with respect to , one has
and evaluating at , one has
By symmetry, we have
□
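The convergence to a geometric-type law can be checked numerically with the following sketch. It rests on a fact derived here from the one-step dynamics rather than quoted from the theorem: a single component grows with some probability r at each step (the sum of the probabilities of the two moves in which it grows) and resets otherwise, so its age at time n equals k < n with probability (1 - r) r^k and equals n with probability r^n. The name `r` and the helper functions are illustrative assumptions.

```python
from fractions import Fraction


def marginal_law(n, r):
    """Exact law of one component at time n; r is its one-step growth probability."""
    law = [(1 - r) * r ** k for k in range(n)]  # last reset happened k steps ago
    law.append(r ** n)                          # no reset during the n steps
    return law


def mean_var(law):
    """Mean and variance of a distribution given as a list of probabilities."""
    mean = sum(k * q for k, q in enumerate(law))
    second = sum(k * k * q for k, q in enumerate(law))
    return mean, second - mean * mean


r = Fraction(1, 2)
for n in (3, 10, 60):
    mean, var = mean_var(marginal_law(n, r))
    print(n, float(mean), float(var))

# Limiting geometric law on {0, 1, 2, ...}: mean r/(1-r), variance r/(1-r)^2.
print(float(r / (1 - r)), float(r / (1 - r) ** 2))
```

For finite n, the mean computed this way equals r (1 - r^n) / (1 - r), which approaches the limiting value r / (1 - r) geometrically fast.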
7. Proofs
In this section, we prove our main results, Theorems 1 and 2. For this purpose, consider the three sequences of polynomials, , , and , which are defined by
The following lemmas are technical and will be used in the proof of Theorem 1. The first lemma is used to calculate when the final age of the first component can be equal to zero but the final age of the second component is increased by one unit more than the first. It reads as follows:
Lemma 3.
The sequence satisfies the recursive equation
for all and for all , such that .
Proof.
We can write by
and using Lemma 2, we obtain
Iterating n times , we get
□
We use the previous lemma to find a recursive equation for the sequence . It depends on the probability generating function of the walk and sequence . The second lemma in this section is used to calculate when the final ages of the two components can be equal and can be started from but the age of the first component does not pass the age of the first; it is presented by:
Lemma 4.
The sequence satisfies the recursive equation
for all and for all , such that .
Proof.
Using Lemma 2, we get
□
We will use Lemma 4 to find a recursive equation for the sequence ; it depends on the probability generating function of the walk and sequence . The following lemma is used to calculate when the final ages of the two components can be equal and can be started from but the age of the second component does not pass the age of the first; it is presented in the following lemma.
Lemma 5.
The sequence satisfies the recursive equation:
and is given exactly by
for all such that and , where is defined in Theorem 3.
Proof.
Remark 3.
Consider the sequence given by the following: for all ,
calculate the sum of under the condition that the final ages of the two components can be equal and can be started from but the age of the first component does not pass the age of the second.
We observe that the two sequences and are symmetric; thus, the expression of is directly given by the following: for all , for all such that and ,
where is defined in Theorem 3.
Consider the sequence given by: for all , for all ,
The following lemma is used to calculate when the final ages of two components are equal and are started from ; it is presented by the following:
Lemma 6.
The sequence satisfies the recursive equation: for all such that ,
Proof.
Iterating n times and getting , such that , one has
□
The following corollary gives the probability that the two components have age .
Corollary 2.
For all , we have
Here, we present the proof of Theorem 1, which is based on the four previous lemmas.
7.1. Proof of Theorem 1
We have
Using the results in Lemma 5, Remark 3, and Lemma 6, we get
Simplifying , we then get
In the next section, we prove Theorem 2; for this purpose, we need to compute the first two derivatives of the probability generating function defined in the previous theorem and evaluate them at to obtain the mean and variance of using Equation (5).
7.2. Proof of Theorem 2
Calculating the first derivative of defined in Theorem 1,
Developing , we have
and evaluating at ,
where is a sequence defined by
Taking the common factor, one has
Simplifying , we then obtain
We define the following sequences of functions:
We calculate the first derivative of the function defined in (36), and we have
evaluating at , we then get
The first derivative of is given by
evaluating at ,
and we get
By symmetry, we have
Computing the first derivative of , we have
taking , we obtain
Defining as
and developing and simplifying thanks to the following equations
we have
Replacing it in (41), we then get
We define the following sequences and by
and
developing, for , we have
Simplifying ,
Using Corollary 1,
and developing , we have
By simplifying , we then get
8. Another Method for Obtaining the Distribution of the Final Altitude
In this section, we present another method, from [15], to study the final altitude . This method is based on the generating function of . More precisely, we define it as follows: for all , the function F is
where the time is encoded by the exponent of t.
Our goal in the first step is to obtain an explicit form of F; in the second step, we expand F as a power series in the time variable t and extract the coefficient of from F. This coefficient gives the explicit form of . Finally, we obtain from all the information about the joint distribution of and thus the distribution of .
Starting from the functional equation defined in (11) and using the kernel method (for more details, see [1,16]), we obtain the explicit form of F in the following theorem:
Theorem 4.
An explicit form of the generating function of the final altitude of the Moran walk is given by the following: for all such that , and ,
where
Proof.
For all , using Equation (11), we have
Summing the previous equation and performing a simple calculation, we have
Then, the probability generating function satisfies the recursive equation
Remark 4.
The term in Equation (46) is called the kernel factor.
Remark 5.
From Theorem 4, we can obtain the generating function of and .
Corollary 3.
The probability generating functions, denoted by and , of and are given by:
and
for all , for all , such that , , where .
Proof.
The proof is a direct consequence of the previous theorem: taking in (46), we obtain the expression of the probability generating function of . The same argument applies to . □
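The idea of encoding time by the exponent of t can be illustrated on a single component. The sketch below is a hedged cross-check derived here from the one-step rule, not a transcription of Corollary 3: if a component grows with probability r at each step and resets otherwise, then g_n(x) = E[x^(age at time n)] satisfies g_{n+1}(x) = (1 - r) + r x g_n(x) with g_0 = 1, and summing over n gives the rational function (1 - r t) / ((1 - t)(1 - r x t)). The names `r`, `marginal_pgf_values`, and `series_coefficients` are illustrative assumptions.

```python
from fractions import Fraction


def marginal_pgf_values(n_max, r, x):
    """g_n(x) for n = 0..n_max, via g_{n+1} = (1 - r) + r * x * g_n."""
    values = [Fraction(1)]
    for _ in range(n_max):
        values.append((1 - r) + r * x * values[-1])
    return values


def series_coefficients(n_max, r, x):
    """Taylor coefficients in t of (1 - r*t) / ((1 - t) * (1 - r*x*t)).

    The denominator expands to 1 - (1 + r*x)*t + r*x*t**2, which yields
    the linear recurrence used below.
    """
    numerator = [Fraction(1), -r]
    coeffs = []
    for n in range(n_max + 1):
        c = numerator[n] if n < 2 else Fraction(0)
        if n >= 1:
            c += (1 + r * x) * coeffs[n - 1]
        if n >= 2:
            c -= r * x * coeffs[n - 2]
        coeffs.append(c)
    return coeffs


r, x = Fraction(1, 3), Fraction(2)
print(marginal_pgf_values(8, r, x) == series_coefficients(8, r, x))
```

Setting x = 1 reduces the rational function to 1 / (1 - t), whose coefficients are all equal to one, as expected for a probability generating function evaluated at 1.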
9. Conclusions and Perspectives
In this paper, we determine the limiting distribution of the ages of the two components, denoted by and : using the probability generating function method for a TDMW of length n, we show that they converge to geometric distributions with parameters and . In addition, the means and variances of and depend explicitly on the four probabilities , , , and p. Furthermore, we determine the probability generating function of the maximum age of the two components, denoted by ; it depends on the probability generating function of each component and is controlled by the term . This function also depends on the four probabilities , , , and p. Finally, the mean and variance of can be expressed in terms of the means and variances of and and are controlled by the two terms given by and
respectively.
In future work, we will study the statistical properties of the height, denoted by , starting from to for a TDMW of length n. We will use the generating function of , which is defined by
where encodes the possible values of (bounded by h) and is given by
and
where is the probability generating function of a bounded two-dimensional Moran walk ending at altitude k (for more details, see [15]).
In Section 4, for the two choices of the four probabilities, we observe the following heights for walks of lengths 10, 100, 500, and 1000: 3, 8, 10, and 11 when and , and 5, 3, 10, and 11 when and .
In the future, we plan to work on the following challenging questions:
1. Can we find the probability generating function of the height?
2. Can we explicitly calculate the mean and variance of the height?
3. What is the limiting distribution of the height?
Author Contributions
Conceptualization, A.A. and M.A.; writing (original draft preparation), A.A. and M.A.; writing (review and editing), A.A. and M.A. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
The random samples used in this study were generated with R.
Acknowledgments
The research of Mohamed Abdelkader has been supported in part by King Saud University, the Deanship of Scientific Research, and the College of Science Research Center.
Conflicts of Interest
The authors declare no conflict of interest.
Abbreviations
The following abbreviation is used in this manuscript:
| TDMW | two-dimensional Moran walk |
References
1. Banderier, C.; Flajolet, P. Basic analytic combinatorics of directed lattice paths. Theor. Comput. Sci. 2002, 281, 37–80.
2. Flajolet, P.; Sedgewick, R. Analytic Combinatorics; Cambridge University Press: Cambridge, UK, 2009.
3. Flajolet, P.; Guillemin, F. The formal theory of birth-and-death processes, lattice path combinatorics and continued fractions. Adv. Appl. Probab. 2000, 32, 750–778.
4. Drmota, M. Random Trees: An Interplay between Combinatorics and Probability; Springer: Berlin/Heidelberg, Germany, 2009.
5. Banderier, C.; Nicodème, P. Bounded discrete walks. Discret. Math. Theor. Comput. Sci. 2010, 35–48.
6. de Bruijn, N.G.; Knuth, D.E.; Rice, S.O. The average height of planted plane trees. Graph. Theory Comput. 1972, 15–22.
7. Gafni, A. Longest run of equal parts in a random integer composition. Discret. Math. 2015, 338, 236–247.
8. Banderier, C.; Wallner, M. Lattice paths with catastrophes. Discret. Math. Theor. Comput. Sci. 2017, 19, 32.
9. Itoh, Y.; Mahmoud, H.M.; Takahashi, D. A stochastic model for solitons. Random Struct. Algorithms 2004, 24, 51–64.
10. Itoh, Y.; Mahmoud, H.M. Age statistics in the Moran population model. Stat. Probab. Lett. 2005, 74, 21–30.
11. Huillet, T.; Möhle, M. Duality and asymptotics for a class of nonneutral discrete Moran models. J. Appl. Probab. 2009, 46, 866–893.
12. Cordero, F.; Véchambre, G. Moran models and Wright-Fisher diffusions with selection and mutation in a one-sided random environment. Adv. Appl. Probab. 2023, 1–67.
13. Huillet, T.E. Random walk Green kernels in the neutral Moran model conditioned on survivors at a random time to origin. Math. Popul. Stud. 2016, 23, 164–200.
14. Faragó, A. Approximating general discrete stochastic processes by Markov chains. J. Stat. Comput. Sci. 2022, 1, 135–145.
15. Althagafi, A.; Aguech, R.; Banderier, C. Height of Walks with Resets and the Moran Model. 2023. submitted.
16. Banderier, C. Combinatoire Analytique des Chemins et des Cartes. Ph.D. Thesis, Université Paris VI, Paris, France, 2001.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).