Abstract
In this paper, we introduce and analyze a novel two-dimensional Moran random walk model in which, at each time step, each component either increases by one unit or resets to zero, with transition probabilities that depend on time. The primary contribution of this work is an explicit closed-form expression for the probability distribution of the final maximum altitude. Additionally, we analyze the height statistics of the model and their cumulative distribution. Using probabilistic techniques, we establish the exact distribution of the final altitude and its moments. Furthermore, we conduct numerical simulations to illustrate the behavior of the model for various parameter values. These results offer new insights into the statistical properties of two-dimensional Moran processes and have potential applications in population genetics, nuclear physics, and related fields.
1. Introduction
Moran processes model population evolution (or mutation transmission) in a fixed-size population, where any individual that dies is immediately replaced by a new one. Various adaptations of the original process, introduced by Moran [1,2], have been explored in the literature, including more recent extensions that incorporate spatially structured populations [3]. The Moran model is widely used in population genetics; in its two-type form, it describes the evolution of allele frequencies in a population with two distinct types (alleles), where the population size remains constant and each individual has an equal probability of reproducing. Motivated by this literature, we define in this paper a new two-dimensional Moran random walk model. At each time step n, each component of the walk jumps to the next level with a probability that depends on n and resets to 0 with the complementary probability.
In the literature, the statistical properties of discrete random walks in one or more dimensions, including the limiting distribution, the mean, the variance, and the cumulative distribution function, have been studied by various methods. We can mention the kernel method and singularity analysis (see [4,5]), links with continued fractions [6], and more probabilistic approaches (see [7], which also illustrates links with random trees).
In dimension one, several papers are particularly relevant to our analysis. Abdelkader [8] analyzed the height of a one-dimensional random walk and proved that the random walk converges to a shifted geometric distribution with parameter asymptotically. Banderier and Flajolet [4] showed that the limiting distribution of the final altitude of a random meander of length n is a Rayleigh distribution (zero drift) or a normal distribution (positive drift). In addition, in [9], Banderier and Nicodème studied the height of discrete bridges, meanders, and excursions for discrete bounded walks. Similar extremal parameters were also studied for trees in [7,10].
In higher dimensions, and closer to our model, the two-dimensional Moran model was investigated by Aguech and Abdelkader in [11], where they showed that each walk converges to a shifted geometric distribution and found explicit expressions for the mean and the variance. In addition, they determined the exact distribution of the number of resets and explicit expressions for the mean and variance of the final altitude. Althagafi and Abdelkader [12] also studied the two-dimensional Moran model and proved that the age of each component converges in distribution to a shifted geometric distribution. Furthermore, they determined closed-form expressions for the mean and the variance of the final altitude (the maximum age between the two walks) via the probability generating function. Other articles related to the Moran process (in biology and population genetics) involve similar higher-dimensional walk models; see [13,14].
The novelty of this work compared to the existing literature is that the transition probability depends on n. To the best of our knowledge, despite its significance, this model has never been studied in the literature.
This paper is organized as follows. In Section 2, we mathematically present the two-dimensional Moran model, define the final altitude and the height statistics, denoted by and , and discuss some applications. In Section 3, we state our main results concerning the probability distribution of the final altitude and the cumulative distribution function of the height statistics associated with our model. In Section 4, we use the program R to compute the probability density function of the final altitude at time with three different values of the probability : 0.1, 0.5, and 0.75. Additionally, we simulate the distribution of the final altitude at three different times, 10, 50, and 100, using values of 0.1, 0.5, and 0.75. In Section 5, we derive a closed-form expression for the distribution of the two components of our model. In Section 6, we present the joint probability distribution of the two components at two positive times, r and s. In Section 7, we prove our main results. In Section 8, we present conclusions regarding our results and discuss future perspectives.
2. Presentation of the Model and Its Applications
In this section, we begin by introducing our two-dimensional Moran model, which is generated by a sequence of probabilities. We then present some basic definitions related to discrete random walks in one dimension, including the final statistics and height statistics. Additionally, we define the mean and variance of a discrete process.
The Moran model with two components and starts from 0 (i.e., ) at time 0 and is defined mathematically by
where is a probability sequence defined by the following: for all
Remark 1.
We can write , where f is defined from to by . It is clear that f is an increasing function from to and that the sequence converges to , which is a fixed point of f. If we assume , the law of will almost converge to the law of a Moran process with .
We define the final altitude , the height of the component and the height of all the walk by
, , and are considered as one-dimensional discrete random walks in the state space , started from the initial states , , and , respectively, whereas is considered as a two-dimensional stochastic process in the state space , started from the initial state .
In this work, we study the probability distribution of the final altitude, denoted by , representing the maximum age between the two components and , based on the joint probability distribution of and . Additionally, in this paper, we analyze the height statistics, denoted by of the first component .
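As an illustration of the definitions above, the following Python sketch simulates a two-component walk of this kind and records the final altitude (the larger of the two components at time n) and the height of the first component (its running maximum up to time n). It is a minimal sketch, not the simulation code used in this paper: the four-outcome step rule, the names `simulate_walk`, `empirical_pmf`, and `example_probs`, and the particular time-dependent probabilities in `example_probs` are assumptions made for illustration only.

```python
import random


def example_probs(t):
    """Hypothetical step probabilities at time t (an assumption, not the paper's sequence).

    Returns (both components rise, only the first rises, only the second rises);
    the remaining probability resets both components to 0.
    """
    q = 0.5 + 0.25 / (t + 1)  # illustrative time-dependent rise probability
    return q * q, q * (1.0 - q), (1.0 - q) * q


def simulate_walk(n, step_probs, rng):
    """Run n steps from (0, 0); return (final altitude, height of first component)."""
    x = y = 0
    height_x = 0
    for t in range(1, n + 1):
        p_both, p_x_only, p_y_only = step_probs(t)
        u = rng.random()
        if u < p_both:
            x, y = x + 1, y + 1
        elif u < p_both + p_x_only:
            x, y = x + 1, 0
        elif u < p_both + p_x_only + p_y_only:
            x, y = 0, y + 1
        else:
            x, y = 0, 0
        height_x = max(height_x, x)  # running maximum of the first component
    return max(x, y), height_x


def empirical_pmf(n, step_probs, trials, seed=0):
    """Monte Carlo estimate of the distribution of the final altitude on 0..n."""
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        final_altitude, _height = simulate_walk(n, step_probs, rng)
        counts[final_altitude] += 1
    return [c / trials for c in counts]


if __name__ == "__main__":
    for k, p in enumerate(empirical_pmf(10, example_probs, trials=20000)):
        print(k, round(p, 4))
```

Passing the step probabilities as a function of t keeps the time dependence explicit, so a different probability sequence can be substituted without changing the simulation loop.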
Our contribution is the derivation of a closed-form expression for the probability distribution of the final altitude , given by the following: for all
Additionally, we use the R programming language to simulate the probability distribution described earlier for specific values of n (10, 50, and 100) and for different values (0.1, 0.5, and 0.75) of the probability , where represents the first term of the sequence which generates our two-dimensional Moran model. Furthermore, we calculate the cumulative distribution function of the height statistics of the first component , given by the following: for all
where , and
The Moran random walk is a stochastic process that models the evolution of a population across discrete generations, where individuals interact and reproduce. It is a Markov chain used in population genetics to study the changes in allele frequencies in finite populations. Some applications of the Moran random walk include the following:
- (1) Population Genetics, Allele Frequency Dynamics: The Moran model is frequently used to study the stochastic fluctuations in the frequency of genetic alleles in finite populations. It helps in understanding how genetic diversity changes over time due to random events such as genetic drift.
- (2) Evolutionary Biology, Neutral Evolution: The Moran model is a tool for studying neutral evolution, where genetic changes are not influenced by natural selection but rather by random processes such as genetic drift.
- (3) Ecology, Species Coexistence: In the context of ecological systems, the Moran model can be adapted to study the dynamics of competing species in a finite habitat. It provides insights into the factors influencing species coexistence or extinction.
- (4) Game Theory, Evolutionary Game Dynamics: The Moran process has been used to model evolutionary game dynamics, where strategies or traits in a population compete for reproduction success. It helps analyze the emergence and persistence of strategies in a population.
- (5) Application in Population Genetics: The two-dimensional Moran walk can be used to model the stochastic evolution of allele frequencies in a population where mutations and genetic drift play a role. The time-dependent transition probabilities allow for a more realistic representation of populations with variable mutation rates or selection pressures over time. The maximum altitude statistics derived in this work could be used to estimate the likelihood of a particular allele reaching fixation or extinction within a given number of generations. This approach is particularly relevant for studying populations undergoing rapid environmental changes, where mutation rates or selection pressures may change dynamically rather than remain constant.
3. Main Result
In this section, we analyze the final altitude, that is, the maximum age , at time n. We also study the height statistics of each component , for . Precisely, we find a closed form for the probability density function of , denoted by , for all positive integers based on the joint probability, denoted by , for all , of the two random walks and . We also find the cumulative distribution function of the height statistics of the first component, denoted by , for all positive integers h. We define the probability density function of by
Equation (3) can be rewritten as follows:
The first main result of this section concerns the distribution of the final altitude . It is based on Theorem 3, which gives the joint distribution of . We are now able to give the exact distribution of for all .
Theorem 1.
The distribution of is given by the following: for all
As a consequence of Theorem 1, we deduce the expressions of the mean and the variance of the final altitude . The two expressions are very long and not easy to simplify, so we evaluate them numerically for different values of n in Section 4.
Corollary 1.
The mean and the variance of are given by
The second main result of this section concerns the distribution of the height statistics of the first component , which is presented in the next theorem. This result is based on Lemma 4. Denote, for all and for all ,
The probability represents the distribution of the height of the first component . It is clear that
Theorem 2.
The cumulative distribution of is given by, for all
where , and
4. Simulation
In this section, we compute and simulate the probability of the final altitude for different values of the time n and for different values of the probability using the program R. Firstly, we use R to compute the probability of , denoted by , for all integers , for three special cases of the probability : a small value 0.1, a medium value 0.5, and a high value 0.75. Secondly, we simulate the probability density function of for three special cases of the time n (10, 50, and 100), with the values 0.1, 0.5, and 0.75 of for each case. Finally, we compute the mean and the variance at different times n for the values 0.1, 0.5, and 0.75 of .
Figure 1 shows that the probability of the final altitude increases from 0.1985929, 0.191448, and 0.1914062 for to its maximum 0.2758942 at for the three values (0.1, 0.5, and 0.75) of , and starts decreasing from at time , respectively. Also, we observe that the density function of decays faster for than for the other values (0.5 and 0.75). Finally, we note some differences between the three tails. When and , the density function of approaches 0 at and 9, respectively. But when , the probability of approaches 0 at and increases at .
Figure 1.
Simulation of the distribution of the final altitude with different values of , and .
Figure 2 shows that the probability of the final altitude increases from 0.1914062 for to its maximum 0.2758942 at for the three values (0.1, 0.5, and 0.75) of , then decreases and approaches 0 at time , respectively. Also, we observe that the density function of decays faster for than for the other values (0.5 and 0.75). Finally, we remark that the density function of is approximately 0 from to for the three values (0.1, 0.5, and 0.75) of . More precisely, the probability equals and decreases to , , and from time to 50 for the values (0.1, 0.5, and 0.75) of , respectively.
Figure 2.
Simulation of the distribution of the final altitude with different values of , and .
From Figure 3, we observe that the probability of the final altitude increases from 0.1914062 for to its maximum 0.2758942 at for the three values (0.1, 0.5, and 0.75) of , then decreases and approaches 0 at time , respectively. Also, we observe that the density function of decays faster for than for the other values (0.5 and 0.75). Finally, we remark that the density function of is approximately 0 from to for the three values (0.1, 0.5, and 0.75) of . More precisely, the probability equals and decreases to , , and from time to 100 for the values (0.1, 0.5, and 0.75) of , respectively.
Figure 3.
Simulation of the distribution of the final altitude with different values of , and .
All the curves show that the distribution of the final altitude is primarily concentrated on small values, while the probability of taking large values tends to zero.
Table 1 lists the probability of the final altitude , for with different values of (0.1, 0.5, and 0.75). In the first case, when the sequence starts from a small value equal to 0.1 at time 1 and increases to 0.5625, which is the fixed point of the sequence defined in Equation (2), we remark that the probability of the random walk decreases from 0.2758942 at time 1 to 0.2731112 at time 25. In the second case, when the sequence starts from a medium value equal to 0.5 at time 1 and increases to 0.5625, we observe that the probability of the random walk decreases from 0.2758942 at time 1 to at time 25. In the last case, when the sequence starts from a high value equal to 0.75 at time 1 and decreases to 0.5625, we observe that the probability of the random walk decreases from 0.2758942 at time 1 to at time 25. Also, we remark that the fixed point of the sequence is attained at time 25 in the first case, but at time 19 in the second and third cases.
Table 1.
Computation of the probability distribution of the final altitude for different values of (=0.1, 0.5, 0.75).
From Table 2, we remark that the mean of the final altitude increases from time 2 to time 34 for the three values of (0.1, 0.5, and 0.75) and is stable for any time n greater than 34. The maximum mean of the final altitude is attained at time 35 and equals 2.108571 for the three values of . The maximum variance of the final altitude is attained at time 40 and equals 3.846269 for the three values of ; it is stable for any time n greater than or equal to 40.
Table 2.
Computation of the mean and the variance of the final altitude at different times n for different values of (=0.1, 0.5, and 0.75).
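The mean and the variance reported in Table 2 are moments of a probability mass function supported on the integers 0, ..., n. The following Python sketch shows how such moments can be computed once the probabilities are available as a list. The function name `mean_and_variance` and the example values are hypothetical; they are not taken from Table 1 or Table 2.

```python
def mean_and_variance(pmf, tol=1e-9):
    """Return (mean, variance) of a distribution on 0, 1, ..., len(pmf) - 1.

    pmf[k] is the probability of the value k; the list must sum to 1.
    """
    total = sum(pmf)
    if abs(total - 1.0) > tol:
        raise ValueError(f"probabilities sum to {total}, expected 1")
    mean = sum(k * p for k, p in enumerate(pmf))
    second_moment = sum(k * k * p for k, p in enumerate(pmf))
    return mean, second_moment - mean * mean


if __name__ == "__main__":
    # Illustrative distribution on {0, 1, 2}; the values are made up.
    print(mean_and_variance([0.25, 0.5, 0.25]))
```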
5. Marginal Distribution of
In this section, we analyze the distribution of the two random walks and . Precisely, we find a closed form of the distribution of and, by a similar argument, we determine the expression of the distribution of . We introduce, for , the random walk : for all , given , the values of are given by
From System (4), we deduce the following lemma about the distribution of and, by symmetry, we also obtain the distribution of .
Lemma 1.
For all , the distribution of is given by
Proof.
For we argue by induction. We take and , and it is clear that
and supposing that
we prove that the preceding hypothesis is true for
using (6), then we obtain
and we finish the proof. □
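The induction above can be checked numerically on a simplified one-dimensional walk. The sketch below assumes a single component that rises by one with probability p(t) at step t and resets to 0 otherwise; this step rule, the function names, and the sequence `p_example` are assumptions for illustration and need not coincide with System (4) or with the exact statement of Lemma 1. Under this assumption, the value k < n at time n requires a reset at step n - k followed by k consecutive rises, which gives a product form that can be compared with a step-by-step recursion.

```python
from math import prod


def p_example(t):
    """Hypothetical rise probability at step t (not the paper's sequence)."""
    return 0.5 + 0.25 / (t + 1)


def marginal_by_recursion(n, p):
    """Distribution of the walk at time n, computed one step at a time."""
    dist = [1.0]  # at time 0 the walk sits at 0
    for t in range(1, n + 1):
        up = p(t)
        new = [0.0] * (t + 1)
        new[0] = (1.0 - up) * sum(dist)  # a reset from any level lands on 0
        for k, mass in enumerate(dist):
            new[k + 1] = up * mass       # a rise moves level k to k + 1
        dist = new
    return dist


def marginal_by_product(n, p):
    """Same distribution from the product form: reset at step n - k, then k rises."""
    dist = []
    for k in range(n + 1):
        rises = prod(p(j) for j in range(n - k + 1, n + 1))
        last_reset = 1.0 if k == n else 1.0 - p(n - k)  # k = n means no reset at all
        dist.append(last_reset * rises)
    return dist


if __name__ == "__main__":
    a = marginal_by_recursion(8, p_example)
    b = marginal_by_product(8, p_example)
    print(max(abs(u - v) for u, v in zip(a, b)))
```

Agreement of the two computations up to rounding error is a quick sanity check of the product form under the stated assumption.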
Remark 2.
By symmetry, we have that for all , the distribution of is given by
We observe that for and n big enough,
where C is a strictly positive constant.
6. Recursive Equations
In this section, we find a closed form of the probability of the process defined in (1), denoted by at the time n. Precisely, we find a relationship between and , for all , using conditional probability. We study three cases for , , or , where and . We start with the following notation: for all r,
We start this section with a key theorem, which gives a closed form of the probability for all . It will be used to prove the first result, Theorem 1 in Section 3.
Theorem 3.
The joint probability distribution of is given by
Remark 3.
By symmetry, we have that for all and
When n is very large, we have the following estimation:
We will need the following lemmas to prove Theorem 3.
Lemma 2.
We have the following recursion: for all ,
Proof.
For and , we have
Applying Lemma 1, we obtain
For the special case , we have
and applying Lemma 1, we obtain
and the proof is finished. □
Remark 4.
By symmetry, we have the following recursion:
Lemma 3.
We have the following recursion: for all , and
Proof.
It is clear that, for , we have
For all integers r, c and n such that , we iterate , r times
For all , we have
Applying Lemma 1, we have
We use the preceding Lemmas 2 and 3 to prove Theorem 3.
Proof.
It is clear that, for we have
For the rest of the proof, we can deduce directly from Lemmas 2 and 3. □
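A closed form such as the one in Theorem 3 can be cross-checked against a direct computation of the joint distribution. The following Python sketch propagates the joint probabilities of the two components one step at a time. It assumes a four-outcome step rule (both rise, only the first rises, only the second rises, both reset) with probabilities supplied by a user-defined function of time; this rule and the names `joint_pmf` and `independent_probs` are assumptions for illustration, not the definition in (1).

```python
from collections import defaultdict


def independent_probs(t):
    """Hypothetical step probabilities: each component rises independently with q(t)."""
    q = 0.5 + 0.25 / (t + 1)
    return q * q, q * (1.0 - q), (1.0 - q) * q


def joint_pmf(n, step_probs):
    """Joint distribution of the two components at time n, as {(x, y): probability}."""
    joint = {(0, 0): 1.0}
    for t in range(1, n + 1):
        p_both, p_x_only, p_y_only = step_probs(t)
        p_none = 1.0 - p_both - p_x_only - p_y_only
        nxt = defaultdict(float)
        for (x, y), mass in joint.items():
            nxt[(x + 1, y + 1)] += p_both * mass   # both components rise
            nxt[(x + 1, 0)] += p_x_only * mass     # second component resets
            nxt[(0, y + 1)] += p_y_only * mass     # first component resets
            nxt[(0, 0)] += p_none * mass           # both components reset
        joint = dict(nxt)
    return joint


if __name__ == "__main__":
    joint = joint_pmf(5, independent_probs)
    print(len(joint), sum(joint.values()))
```

The dictionary representation stores only the states that have been reached, which keeps the recursion short at the cost of some speed for large n.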
Corollary 2.
We have the following recursion: for all ,
Proof.
The proof is deduced directly from Lemma 3. □
7. Proofs
In this section, we prove the results of Section 3. We use Lemmas 2 and 3 and Corollary 2 to prove Theorem 1. For the proof of Theorem 2, we use Lemma 4. We start with the proof of Theorem 1.
Proof.
We have the following: for all and
applying Lemmas 2 and 3 and Corollary 2, for all
we obtain
In the case , using the following equalities in Lemmas 2, 3 and Corollary 2,
then, we obtain
□
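The passage from a joint distribution to the distribution of a maximum, as used in this proof, can be illustrated numerically. The sketch below takes a joint distribution given as a dictionary {(x, y): probability} and computes the distribution of max(x, y) in two ways: by direct aggregation, and by splitting the event that the maximum equals k into three disjoint cases (first component equal to k and strictly larger, second component equal to k and strictly larger, both equal to k). The function names and the example joint distribution are hypothetical.

```python
def max_pmf(joint):
    """Distribution of max(x, y) from a joint distribution {(x, y): probability}."""
    size = max(max(x, y) for x, y in joint) + 1
    pmf = [0.0] * size
    for (x, y), mass in joint.items():
        pmf[max(x, y)] += mass
    return pmf


def max_prob_by_cases(joint, k):
    """Probability that max(x, y) = k, split into three disjoint cases."""
    first_larger = sum(p for (x, y), p in joint.items() if x == k and y < k)
    second_larger = sum(p for (x, y), p in joint.items() if y == k and x < k)
    tie = joint.get((k, k), 0.0)
    return first_larger + second_larger + tie


if __name__ == "__main__":
    # Made-up joint distribution on {0, 1, 2} x {0, 1, 2}.
    example = {(0, 0): 0.1, (1, 0): 0.2, (0, 1): 0.2, (1, 1): 0.1,
               (2, 0): 0.15, (0, 2): 0.15, (2, 2): 0.1}
    print(max_pmf(example))
    print([max_prob_by_cases(example, k) for k in range(3)])
```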
In the following, we prove Theorem 2. We introduce the following technical lemma concerning the relationship between , and , for all , for all , where the probability represents the distribution of the height of the first component .
Lemma 4.
We have the following recursion, for all :
Proof.
We have, for all ,
The last equality is due to the fact that, if
then
□
We now use Lemma 4 to prove Theorem 2.
Proof.
Denote, for all n by
We have from Lemma 4,
then we obtain recursively
□
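The cumulative distribution of a height statistic can also be computed numerically, which gives an independent check of a recursive formula of this type. The following Python sketch assumes a single component that rises by one with probability p(t) at step t and resets to 0 otherwise; the step rule and the name `height_cdf` are assumptions for illustration, not the recursion of Lemma 4. The height up to time n is at most h exactly when the walk never reaches level h + 1, so the sketch tracks only the probability mass of paths that have stayed in 0, ..., h and discards any mass that would step above h.

```python
def height_cdf(n, h, p):
    """Probability that the running maximum of the walk up to time n is at most h."""
    alive = [0.0] * (h + 1)  # alive[k]: mass at level k that never exceeded h
    alive[0] = 1.0
    for t in range(1, n + 1):
        up = p(t)
        new = [0.0] * (h + 1)
        new[0] = (1.0 - up) * sum(alive)  # a reset keeps the path below the barrier
        for k in range(h):
            new[k + 1] = up * alive[k]    # a rise from k < h stays within 0..h
        # mass at level h that rises would reach h + 1 and is dropped
        alive = new
    return sum(alive)


if __name__ == "__main__":
    rise = lambda t: 0.5 + 0.25 / (t + 1)  # hypothetical sequence
    print([round(height_cdf(10, h, rise), 6) for h in range(11)])
```

Differences of consecutive values, `height_cdf(n, h, p) - height_cdf(n, h - 1, p)`, give the probability that the height equals h.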
8. Conclusions and Perspective
In this work, we have introduced and analyzed a novel two-dimensional Moran walk model, characterized by a time-dependent probability distribution. Our main contributions include deriving explicit closed-form expressions for the probability distribution of the final maximum altitude, as well as providing a thorough statistical analysis of the model’s behavior through simulations and probabilistic techniques. These results shed new light on the statistical properties of the two-dimensional Moran process and contribute to a broader understanding of discrete stochastic processes in both mathematical and applied contexts. From a theoretical perspective, our findings highlight the significance of the transition probability dependence on time, which introduces additional complexity compared to traditional Moran processes. The obtained probability distribution functions, moment expressions, and cumulative distributions offer new insights into the extremal statistics of such stochastic processes. Furthermore, our simulations confirm the theoretical predictions and provide a deeper understanding of how different parameter choices affect the evolution of the Moran walk. As a perspective, several promising directions for future research emerge from our study:
- Extension to Higher Dimensions: While this work focuses on a two-dimensional setting, a natural extension would be to analyze the Moran walk in three or more dimensions. This generalization could provide a more comprehensive understanding of multidimensional population dynamics and other related stochastic processes.
- Derivation of Recurrence Relations: A key challenge is to establish explicit recurrence relations between the functions and that describe the evolution of height statistics. Extracting the coefficient would provide further insights into the distribution of and its limiting behavior.
- Asymptotic Analysis of Maximum Height: Determining the limiting behavior of as would be an important theoretical advancement. This could be achieved through generating function techniques, kernel methods, and singularity analysis.
- Moment Generating Function for Height Statistics: Using moment generating functions, we could obtain refined information about the mean and variance of . This approach would allow for a more comprehensive characterization of the distribution of the maximum altitude.
- Applications in Population Genetics and Evolutionary Biology: Given its origins in modeling genetic drift and allele frequency evolution, our Moran walk framework could be further adapted to incorporate selection effects, mutations, and varying population sizes. Such extensions would enhance the model’s applicability to real-world biological systems.
- Numerical Approximations and Machine Learning Approaches: The complexity of the recurrence relations and probability distributions suggests the potential benefit of employing numerical methods and machine learning techniques to approximate the behavior of the Moran walk under different parameter regimes.
Overall, this study provides a strong foundation for future explorations of time-dependent stochastic processes and their applications. We anticipate that the techniques developed here will prove valuable for both theoretical research and practical applications in various scientific fields.
Funding
The author extends his appreciation to the Ongoing Research Funding Program (ORF-2025-1068), King Saud University, Riyadh, Saudi Arabia.
Data Availability Statement
The random walks were generated using the RStudio-2023.09.0 program.
Acknowledgments
The author extends his appreciation to the Ongoing Research Funding Program (ORF-2025-1068), King Saud University, Riyadh, Saudi Arabia.
Conflicts of Interest
The author declares no conflict of interest.
References
- Moran, P.A.P. Random processes in genetics. Proc. Camb. Philos. Soc. 1958, 54, 60–71.
- Moran, P.A.P. The Statistical Processes of Evolutionary Theory; Oxford University Press: Oxford, UK, 1962.
- Lieberman, E.; Hauert, C.; Nowak, M.A. Evolutionary dynamics on graphs. Nature 2005, 433, 312–316.
- Banderier, C.; Flajolet, P. Basic analytic combinatorics of directed lattice paths. Theor. Comput. Sci. 2002, 281, 37–80.
- Flajolet, P.; Sedgewick, R. Analytic Combinatorics; Cambridge University Press: Cambridge, UK, 2009.
- Flajolet, P.; Guillemin, F. The formal theory of birth-and-death processes, lattice path combinatorics and continued fractions. Adv. Appl. Probab. 2000, 32, 750–778.
- Drmota, M. Random Trees: An Interplay Between Combinatorics and Probability; Springer: Vienna, Austria, 2009.
- Abdelkader, M. On the Height of One-Dimensional Random Walk. Mathematics 2023, 11, 4513.
- Banderier, C.; Nicodème, P. Bounded discrete walks. Discret. Math. Theor. Comput. Sci. 2010, AM, 35–48.
- de Bruijn, N.G.; Knuth, D.E.; Rice, S.O. The average height of planted plane trees. In Graph Theory and Computing; Academic Press: Cambridge, MA, USA, 1972; pp. 15–22.
- Aguech, R.; Abdelkader, M. Two-Dimensional Moran Model: Final Altitude and Number of Resets. Mathematics 2023, 11, 3774.
- Althagafi, A.; Abdelkader, M. Two-Dimensional Moran Model. Symmetry 2023, 15, 1046.
- Huillet, T.E. Random walk Green kernels in the neutral Moran model conditioned on survivors at a random time to origin. Math. Popul. Stud. 2016, 23, 164–200.
- Huillet, T.; Möhle, M. Duality and asymptotics for a class of nonneutral discrete Moran models. J. Appl. Probab. 2009, 46, 866–893.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).