Abstract
Consider the one-dimensional random walk : at each unit of time, it either increases by one with probability p or resets to 0 with probability 1 − p. In the present paper, we analyze the law of the height statistics , corresponding to our model . We also prove that the limiting distribution of the walk is a shifted geometric distribution with parameter and find closed forms of the mean and the variance of using the probability-generating function.
MSC:
60C05; 60F99; 60E05; 60G40
1. Introduction
Let be a one-dimensional discrete random walk defined as follows: the walk starts from the origin at time 0. At each unit of time, the process moves up by one unit with probability p or resets to 0 with probability 1 − p. We provide three examples of the evolution of our random walk until time n = 10:
In the above examples, the heights of the walks are equal to 7, 4, and 2, respectively.
In this work, we are interested in analyzing the height statistics, denoted by , of the random walk . Our analysis of the height is based on the combinatorial analysis of the coefficient , representing the number of ways to choose r distinct integers satisfying the following conditions:
such that and and based on the probability distribution of the return time, denoted by , of the random walk , given by
Our contribution in this current paper is finding a closed form of the distribution of the height statistics using a combinatorial analysis of the coefficient and the distribution of the return time of the random walk . The closed form of the probability distribution of is given by:
Furthermore, we study the statistical properties of the random walk , like the mean, the variance, and the limiting distribution of . Precisely, we prove that the limiting distribution of the random walk is a shifted geometric distribution with parameter , and we give the closed forms of the mean and the variance of .
This analysis of the height statistics is applicable to several aspects of renewable energy. For example, electricity plays an essential role in daily activities such as transport, education, healthcare, and many other sectors. For this reason, controlling electricity consumption is necessary, and this is done by estimating the maximum amount of electricity consumed. Electricity consumption is estimated via statistical methods such as time series models [1,2], regression models [3], and ARIMA models [4]. This maximum amount plays the role of the height of the electricity consumption over a given period of time.
In the literature, the statistical properties of the height statistics are studied in one dimension via the kernel method and singularity analysis (see [5,6]). For example, Pitman and Yor investigated the distribution of the ranked heights of the excursions of a Brownian bridge in [7]. Similarly, Csaki and Hu analyzed the asymptotic properties of ranked heights in Brownian excursions in [8] and the lengths and heights of random walk excursions in [9]. Furthermore, Katzenbeisser and Panny revisited the maximal height of simple random walks in [10]. In addition, Banderier and Nicodème [11] studied the height of discrete bridges, meanders, and excursions for bounded discrete walks. Finally, Aguech, Althagafi, and Banderier [12] analyzed the height of walks with resets and the Moran model.
This paper is organized as follows. In Section 2, we introduce our model in detail and define the return time and the height statistics, denoted by , of our random walk . In Section 3, we present our main result concerning the distribution of the height statistics of the random walk . In Section 4, we use R to find all possibilities of the integers satisfying the conditions defined in Equation (4) and compute the combinatorial coefficient for different values of n, r, and k. In Section 5, we prove that the limiting distribution of the random walk is a shifted geometric distribution with parameter . Also, we use the probability-generating function of the random walk to obtain its mean and variance. In Section 6, we present some conclusions concerning our results and some perspectives.
2. Definitions and Presentation of the Model
In this section, we define an elegant tool called the probability-generating function, which plays an important role in finding the mean and variance of the random walk. Next, we present our model: a one-dimensional random walk. Finally, we finish this section by providing definitions of some statistics like the return time and the height.
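Let U be a discrete random variable taking values in the non-negative integers, with distribution $P(U = j)$, $j \geq 0$. The probability-generating function, denoted by G, of the variable U is defined by:
$$G(x) = E\left[x^{U}\right] = \sum_{j \geq 0} P(U = j)\, x^{j},$$
for all x such that $|x| \leq 1$.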
The probability-generating function is a convenient tool for studying the statistical characteristics of a random walk. Precisely, the probability mass function of a discrete stochastic process and its moments can be obtained from the derivatives of the probability-generating function. In particular, we obtain closed forms of the mean and the variance of the process by differentiating the probability-generating function and evaluating at x = 1. For more details, see [6,13,14].
Furthermore, we introduce the following important equations, which are related to the mean and variance of U and :
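As a reminder, the standard identities linking the derivatives of a probability-generating function to the first two moments can be written as follows. The notation is an assumption made here (G is the probability-generating function of U, and primes denote derivatives with respect to x); the numbered equation of the paper may be stated in a different but equivalent form.

```latex
\begin{align*}
E(U) &= G'(1), \\
E\big(U(U-1)\big) &= G''(1), \\
\operatorname{Var}(U) &= G''(1) + G'(1) - \big(G'(1)\big)^{2}.
\end{align*}
```

For example, if $G(x) = (1-q)/(1-qx)$ with $0 < q < 1$, then $G'(1) = q/(1-q)$ and $G''(1) = 2q^{2}/(1-q)^{2}$, so that $\operatorname{Var}(U) = q/(1-q)^{2}$.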
Consider the one-dimensional random walk . It starts from 0 at Time 0 (i.e., ), parameterized by a probability . It is given by the following system:
where . We denote by the statistics the number of return times of the random walk to 0 up to time n and the height of the random walk :
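To make these definitions concrete, the following Python sketch simulates one path of the walk and evaluates the two statistics on it. It assumes the one-step rule described in words above (up by one with probability p, reset to 0 otherwise). The function names `simulate_walk`, `return_count`, and `height` are illustrative and do not come from the paper, and Python is used here only for illustration in place of the R program mentioned in Section 4.

```python
import random


def simulate_walk(n, p, rng):
    """Return the positions at times 1..n of a walk started at 0."""
    path, s = [], 0
    for _ in range(n):
        # Step up with probability p, otherwise reset to 0.
        s = s + 1 if rng.random() < p else 0
        path.append(s)
    return path


def return_count(path):
    """Number of times in 1..n at which the walk sits at 0."""
    return sum(1 for s in path if s == 0)


def height(path):
    """Largest value reached by the walk over times 1..n."""
    return max(path)


rng = random.Random(0)  # fixed seed so that the run is repeatable
path = simulate_walk(10, 0.7, rng)
print(path, return_count(path), height(path))
```

Averaging `height` over many simulated paths gives a Monte Carlo estimate of the law of the height, which can be compared with any exact formula.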
3. Main Result
The goal of this section is to obtain the distribution of . To reach this goal, we first apply an important result concerning the distribution of the return time of the random walk (see Theorem 3 in [15]). In the second step, we analyze the joint distribution of using the conditional probability and the marginal distribution of the return time . Finally, we deduce the marginal distribution of .
We now recall this result concerning the distribution of the return time, , of the random walk .
Lemma 1
([15]). The exact distribution of is given by
Consider the following event representing the height statistics , bounded by k, given that the return time equals r of the random walk :
where are i.i.d. geometric random variables with parameter and
such that and . We define the combinatorial coefficient :
representing the number of ways to choose r distinct integers satisfying the conditions in Equation (4).
Remark 1.
The combinatorial coefficient depends on the parameters n, r, and k, where n represents the length of the random walk and k is an integer less than n.
We present a closed form of the combinatorial coefficient in the next lemma.
Lemma 2.
The coefficient is given by
where stands for the coefficient of in the power series .
Proof.
For all , let and . It is obvious that identifying is equivalent to identifying , and then:
□
Remark 2.
From Equation (6), the combinatorial coefficient is the coefficient of in the power series .
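Reading a count off as the coefficient of a power of x can be illustrated numerically. The Python sketch below is written under an assumed reading of the conditions in Equation (4), which are not reproduced in this text: the integers satisfy 1 ≤ i_1 < … < i_r ≤ n with i_1 − 1 ≤ k, every consecutive difference at most k, and n − i_r ≤ k. Under that assumption the count is the coefficient of x^(n−1) in (1 + x + … + x^k)^2 (x + … + x^k)^(r−1). This product is derived from the assumed conditions and need not coincide with the power series of Lemma 2; the function names are illustrative.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = degree)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def coefficient_by_series(n, r, k):
    """Coefficient of x^(n-1) in (1+...+x^k)^2 (x+...+x^k)^(r-1), for n, r >= 1."""
    end_gap = [1] * (k + 1)      # 1 + x + ... + x^k: first and last gaps (assumed)
    inner_gap = [0] + [1] * k    # x + ... + x^k: gaps between consecutive integers
    poly = poly_mul(end_gap, end_gap)
    for _ in range(r - 1):
        poly = poly_mul(poly, inner_gap)
    return poly[n - 1] if n - 1 < len(poly) else 0


print(coefficient_by_series(5, 3, 2))
```

Under this reading, the call above returns 8 for n = 5, r = 3, k = 2, which agrees with the value quoted for these parameters in Section 4.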
Next, we give some results about the height of the random walk . It represents the maximal height attained by the walk over the times 1 to n. This means that, for all n and for all , the values of are between 0 and k. To this end, we proceed in steps. First, for all and for all , we find the conditional probability that the height is bounded by the integer k given that the return time equals r. Second, we determine the probability of the intersection of the events and , that is, the joint distribution of the return time and the height of the random walk . Finally, we deduce the marginal distribution of .
The next theorem leads to the conditional probability that the height of the random walk is bounded by k given that the return time equals r.
Theorem 1.
The conditional distribution of , given , is given by
From Theorem 1, we deduce the joint distribution of the following events and .
Corollary 1.
The joint distribution of satisfies the following relation:
where is defined in Lemma 1.
Proof.
One has
Applying Lemma 1 and Theorem 1, we obtain
where is defined in Lemma 1. □
We deduce here some information about the distribution of . By summing over r in Equation (7), we obtain the marginal distribution of , as follows:
Corollary 2.
The probability distribution of the height statistics of the random walk is given by the following equation:
where is defined in Equation (4).
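As an independent numerical check on the law of the height, the probability that the walk stays at or below level k up to time n can also be computed by a short dynamic program. The Python sketch below is not the closed form of Corollary 2; it only assumes the one-step rule of the model (up by one with probability p, reset to 0 with probability 1 − p), and the function names are illustrative.

```python
def prob_height_at_most(n, k, p):
    """Probability that the walk started at 0 never exceeds level k up to time n."""
    # alive[s]: probability of being at level s without having exceeded k so far.
    alive = [0.0] * (k + 1)
    alive[0] = 1.0
    for _ in range(n):
        nxt = [0.0] * (k + 1)
        nxt[0] = (1.0 - p) * sum(alive)   # reset to 0 from any level
        for s in range(k):
            nxt[s + 1] = p * alive[s]     # step up while staying at or below k
        alive = nxt                       # mass that would exceed k is discarded
    return sum(alive)


def height_pmf(n, p):
    """Probabilities that the height equals k, for k = 0..n."""
    cdf = [prob_height_at_most(n, k, p) for k in range(n + 1)]
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, n + 1)]


print(height_pmf(4, 0.5))
```

Two sanity checks follow directly from the model: the height is 0 only if every step is a reset, which has probability (1 − p)^n, and the probabilities returned by `height_pmf` sum to 1.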
4. Simulation of the Combinatorial Coefficient
In this section, we use the R program to compute the combinatorial coefficient for different values of n, r, and k. In the first case, we find the value of the coefficient and count all the possibilities of the integers for , and . In the second case, we determine the possibilities of the integers for , and . Also, we list the values of the combinatorial coefficient for different values of n (4, 5, 6, and 7), r (2, 3, 4, and 5), and k (2, 3, 4, 5, and 6).
In Table 1, we find all the possibilities of the integers under the conditions defined in Equation (4) for different values of r and k when n equals 7 and compute the corresponding combinatorial coefficient . Precisely, in the first case, when n and k are fixed at 7 and 2, respectively, and r takes values of 4, 5, and 6, the combinatorial coefficient takes the values 17, 12, and 7. This means that the coefficient decreases when r increases. In the second case, if n is fixed at 7, r is fixed, and k increases from 2 to 3, then the coefficient increases from 12 to 18. This means that the coefficient increases when k increases. Also, from Table 1, we observe that is fixed at 7 when r is close to n and k is at least 2 ( and ).
Table 1.
All possibilities of the integers for .
Table 2 lists all the possibilities of the integers under the conditions defined in Equation (4) for different values of r and k when n equals 5. Also, we deduce the value of the combinatorial coefficient for each list. Furthermore, Table 2 shows two ways in which the combinatorial coefficient varies with the parameters r (the number of integers ) and k (the bound on the height of the random walk ). Precisely, in the first case, when n and k are fixed at 5 and 2 and r increases from 3 to 4, the combinatorial coefficient decreases from 8 to 5. This means that the coefficient decreases when r increases. In the second case, if n and r are equal to 5 and 3 and k increases from 2 to 3, then the coefficient increases from 8 to 10. This means that the coefficient increases when k increases for fixed n and r.
Table 2.
All possibilities of the integers for .
Table 3 shows that the combinatorial coefficient depends on the three parameters n, r, and k. Precisely, this coefficient increases or decreases as the parameters n, r, and k change. From Table 3, we distinguish three cases concerning the computation of :
Table 3.
The combinatorial coefficient for different values of n.
In the first case, when n increases and r and k are fixed, we observe that the coefficient typically increases. For example, if n takes values of 4, 5, 6, and 7 and r and k equal 2 and 4, then takes values of 6, 10, 14, and 16, respectively. Sometimes, this increase of is very quick: it takes values of 4, 5, 15, and 32 when n takes values of 4, 5, 6, and 7 and r and k equal 4 and 3, respectively. However, sometimes decreases under the same conditions. For example, takes values of 5, 5, 3, and 1 when n equals 4, 5, 6, and 7 and r and k both equal 2.
In the second case, the combinatorial coefficient decreases or increases when n and k are fixed but r increases. This means that there exists a maximal coefficient for special values of n, r, and k. Firstly, if n equals 5, k equals 3, and r takes values of 2, 3, and 4, then the coefficient increases from 9 to 10 and decreases to 5, respectively. Secondly, the coefficient increases from 3 to 11 and decreases to 6 when n equals 6, k equals 2, and r takes values of 2, 3, 4, and 5. Finally, the coefficient increases from 16 to 35 and decreases to 21 when n equals 7, k equals 4, and r takes values of 2, 3, 4, and 5.
In the third case, the combinatorial coefficient increases when n and r are fixed, but k increases. This means that is an increasing function of k, up to a maximal value. Firstly, if n equals 5, r equals 3, but k takes values of 1, 2, 3, and 4, then the coefficient equals 1, 8, 10, and 10, respectively. Secondly, if n equals 6, r equals 4, but k takes values of 2, 3, and 4, then the coefficient equals 11, 15, and 10, respectively. Finally, if n equals 7, r equals 3, but k takes values of 2, 3, 4, 5, and 6, then the coefficient equals 6, 25, 33, 35, and 35, respectively.
Furthermore, Table 3 shows that there exists a maximal combinatorial coefficient for special values of n, r, and k. Firstly, equals 6 when , , and k increases from 3 to n. Secondly, equals 10 when , , and k increases from 4 to n or and k increases from 3 to n. Next, equals 20 when , , and k increases from 4 to n. Finally, equals 35 when , , and k increases from 5 to n or and k increases from 4 to n.
Finally, we observe a notable property of the combinatorial coefficient . This property depends on the parity of n, the length of the random walk . Precisely, if n is an even number, r equals , and k takes any value from to n, then is maximal. For example, when , , and or , , and , the combinatorial coefficient is maximal and equals 10 and 20, respectively. But, if n is an odd number, r equals and k takes any value from to n or r equals and k takes any value from r to n, then is maximal. For the first example, when , , and or and , the combinatorial coefficient is maximal and equals 10. For the second example, when , , and or and , the combinatorial coefficient is maximal and equals 35.
We compute the combinatorial coefficient for different values of the parameters n, r, and k by the following steps:
First step:
1. We fix the three parameters n, r, and k;
2. We initialize the combinatorial coefficient to 0;
3. We fix the integers to (1, …, r), then we guarantee that the difference between two consecutive integers is less than k;
4. We change by a value from to n, and we stop if ;
5. When , then .
Second step:
1. We start with the integers , which equal (1, …, , r, ) such that the difference between two consecutive integers is less than or equal to k;
2. We change by a value from to n, and we stop if ;
3. When , then ;
4. .
Third step:
1. We repeat the same procedure from the first and second steps;
2. The last choice of the integers is , then we guarantee that the difference between two consecutive integers is less than k;
3. . If , we stop the procedure in the third step.
Final step:
1. We repeat the preceding steps for from 2 to an integer c such that ;
2. .
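The enumeration described above can be sketched as a brute-force count. The Python code below is an illustration rather than the R program used for the tables, and it rests on an assumed reading of the conditions in Equation (4), which are not reproduced in this text: 1 ≤ i_1 < … < i_r ≤ n with i_1 − 1 ≤ k, every consecutive difference at most k, and n − i_r ≤ k. The function name `coefficient` is illustrative.

```python
from itertools import combinations


def coefficient(n, r, k):
    """Count r-subsets of {1..n} satisfying the assumed gap conditions (r >= 1)."""
    count = 0
    for idx in combinations(range(1, n + 1), r):
        # Differences between consecutive chosen integers must not exceed k.
        gaps_ok = all(b - a <= k for a, b in zip(idx, idx[1:]))
        # Assumed boundary conditions on the first and last chosen integers.
        if idx[0] - 1 <= k and gaps_ok and n - idx[-1] <= k:
            count += 1
    return count


for n in (4, 5, 6, 7):
    print(n, coefficient(n, 2, 2), coefficient(n, 2, 4))
```

Under this reading, the count reproduces several of the values quoted in this section: 5, 5, 3, 1 for r = k = 2 and 6, 10, 14, 16 for r = 2, k = 4 (n = 4, …, 7), and 1, 8, 10, 10 for n = 5, r = 3, k = 1, …, 4. Some other quoted entries differ under this reading (it gives 8 rather than 6 for n = 7, r = 3, k = 2), so the assumed conditions should be checked against Equation (4).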
5. Distribution of the Random Walk
In this section, we analyze some statistical properties of the random walk , namely the limiting distribution, the mean, and the variance, using the probability-generating function. First, we find the relation between the probabilities of the random walk at two consecutive times n and using the conditional probability. Second, we determine a recursive equation between and , where represents the probability-generating function of . Next, we use to prove that the random walk converges asymptotically to a shifted geometric distribution with parameter . Also, we differentiate to obtain the mean and the variance of the random walk . We start with the definition of the probability mass function of . Denote, for all ,
The following lemma presents the recursion of the probabilities.
Lemma 3.
For all , we have
Proof.
The proof conditions on the position of the walk at time n: we use the conditional probability that the walk equals r at time given that it equals l at time n. Then:
1. For , we have
2. For , we have
□
Next, we define the sequence of polynomials (for ) by the fact that the coefficient of in is the probability that, at time n, the position of the process is at level r, that is
From Equation (9) and Lemma 3, we deduce a recursive equation relating , , and . It is presented in the next proposition.
Proposition 1.
For all , the sequence of polynomials satisfies the following recurrence:
with the initial condition .
Proof.
Using Equation (9), for all , the function can be expanded as:
Due to Lemma 3, we have:
□
Now, we use Equation (10) to show that the random walk converges asymptotically to a shifted geometric distribution with parameter . This is stated in the next theorem.
Theorem 2.
The process converges in distribution to a shifted geometric distribution with parameter , whose probability-generating function is given by the following: for all ,
for all , such that .
Proof.
Iterating the recursive equation defined in (10) n times, we obtain
and, passing to the limit as n → ∞, we have
which is exactly the probability-generating function of a shifted geometric distribution with parameter . □
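The convergence can be observed numerically by iterating the one-step recursion on the probability mass function. The Python sketch below assumes only the rule of the model (from any level, reset to 0 with probability 1 − p, otherwise move up by one with probability p); the function names are illustrative. Under this rule the mass at level r after n steps is (1 − p)p^r for r < n and p^n at r = n, so the limiting mass at r is (1 − p)p^r for r ≥ 0.

```python
def step_pmf(pmf, p):
    """Advance the probability mass function of the position by one time step."""
    nxt = [0.0] * (len(pmf) + 1)
    nxt[0] = (1.0 - p) * sum(pmf)   # reset to 0 from any level
    for r, mass in enumerate(pmf):
        nxt[r + 1] = p * mass       # move up from level r to level r + 1
    return nxt


def pmf_at(n, p):
    """Probability mass function of the position at time n, starting from 0."""
    pmf = [1.0]
    for _ in range(n):
        pmf = step_pmf(pmf, p)
    return pmf


p = 0.6
pmf = pmf_at(40, p)
print([round(x, 6) for x in pmf[:5]])
print([round((1 - p) * p ** r, 6) for r in range(5)])
```

The two printed lists agree, which illustrates that the law of the position at a large time is already very close to the geometric limit.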
By differentiating the probability-generating function given in Theorem 2, we deduce closed-form expressions for the mean and the variance of the random walk .
Corollary 3.
The mean and the variance of the random walk are given by
Proof.
The first derivative of defined in Equation (11) with respect to x is
and evaluating at x = 1 gives
Using Equation (1), we obtain
To obtain the variance of , we need to define the following sequences of functions:
Observe that the first and second derivatives of are given by
The first derivatives of and with respect to x at x = 1 are given by
Combining Equations (14)–(16), we obtain
Applying Equations (1), (12), and (17), we obtain
□
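The mean and the variance can also be checked numerically without generating functions. The Python sketch below is an independent check, not the derivation of Corollary 3. It uses two moment recursions that follow from the one-step rule of the model: if m and q denote the first and second moments at time j, then the moments at time j + 1 are p(m + 1) and p(q + 2m + 1). These recursions give the mean p(1 − p^n)/(1 − p) at time n, and the limiting mean and variance p/(1 − p) and p/(1 − p)^2. The function name `moments` is illustrative.

```python
def moments(n, p):
    """Mean and variance of the position at time n, starting from 0."""
    mean, second = 0.0, 0.0
    for _ in range(n):
        # Both right-hand sides use the values from the previous time step.
        mean, second = p * (mean + 1.0), p * (second + 2.0 * mean + 1.0)
    return mean, second - mean * mean


p = 0.5
print(moments(3, p), p * (1 - p ** 3) / (1 - p))
print(moments(200, p), p / (1 - p), p / (1 - p) ** 2)
```

For p = 0.5, the mean at time 3 equals 0.875, and for large n the mean and variance approach 1 and 2, in line with the limiting geometric law.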
6. Conclusions and Perspectives
In this paper, we stated our main result concerning the height of the random walk . Precisely, we found the joint distribution of the height and the return time statistics. This is given by the following formula:
where is a combinatorial coefficient. Also, we analyzed this coefficient numerically using R and observed the following properties:
1. If n increases and r and k are fixed, then the combinatorial coefficient typically increases;
2. The combinatorial coefficient decreases or increases when n and k are fixed, but r increases. This means that there exists a maximal coefficient for special values of n, r, and k;
3. The combinatorial coefficient increases when n and r are fixed, but k increases. This means that is an increasing function of k, up to a maximal value.
Also, we observe from Table 3 a very nice property of the combinatorial coefficient . This property depends on the parity of n and special values of r and k. Precisely, we mention that, if n is an even number, r equals , and k takes any value from to n, then is maximal. But, if n is an odd number, r equals , and k takes any value from to n or r equals and k takes any value from r to n, then is maximal.
Furthermore, we studied the statistical properties of the random walk , such as the limiting distribution, the mean, and the variance. First, we found the closed form of the probability-generating function of the random walk from the recursive equation defined in Equation (10). Next, we proved that the limiting distribution of is a shifted geometric distribution with parameter . Finally, we differentiated the probability-generating function of to obtain the closed forms of the mean and the variance of .
In future work, we plan to address the following questions:
1. Can we find a closed form of the probability-generating function of the height?
2. Can we explicitly calculate the mean and variance of the height statistics using the probability-generating function of ?
Funding
We thank the Deputyship for Research and Innovation, the “Ministry of Education”, in Saudi Arabia for funding this research (IFKSUOR3-331-1).
Data Availability Statement
The random samples were generated using the RStudio-2023.09.0 program.
Acknowledgments
The author extends appreciation to the Deputyship for Research and Innovation, the “Ministry of Education”, in Saudi Arabia for funding this research (IFKSUOR3-331-1).
Conflicts of Interest
The author declares no conflict of interest.
References
- Sarkodie, S.A. Estimating Ghana’s electricity consumption by 2030: An ARIMA forecast. Energy Sources Part B Econ. Plan. Policy 2017, 12, 936–944.
- Chavez, S.G.; Bernat, J.X.; Coalla, H.L. Forecasting of energy production and consumption in Asturias (northern Spain). Energy 1999, 24, 183–198.
- Kankal, M.; Akpınar, A.; Kömürcü, M.; Özşahin, I. Modeling and forecasting of Turkey’s energy consumption using socio-economic and demographic variables. Appl. Energy 2011, 88, 1927–1939.
- Koutroumanidis, T.; Ioannou, K.; Arabatzis, G. Predicting fuelwood prices in Greece with the use of ARIMA models, artificial neural networks and a hybrid ARIMA–ANN model. Energy Policy 2009, 37, 3627–3634.
- Banderier, C.; Flajolet, P. Basic analytic combinatorics of directed lattice paths. Theor. Comput. Sci. 2002, 281, 37–80.
- Flajolet, P.; Sedgewick, R. Analytic Combinatorics; Cambridge University Press: Cambridge, UK, 2009.
- Pitman, J.; Yor, M. On the distribution of ranked heights of excursions of a Brownian bridge. Ann. Probab. 2001, 29, 361–384.
- Csaki, E.; Hu, Y. Asymptotic properties of ranked heights in Brownian excursions. J. Theor. Probab. 2001, 14, 77–96.
- Csaki, E.; Hu, Y. Lengths and heights of random walk excursions. In Discrete Mathematics and Theoretical Computer Science; Discrete Random Walks: Paris, France, 2003; pp. 45–52.
- Katzenbeisser, W.; Panny, W. The maximal height of simple random walks revisited. J. Stat. Plan. Inference 2002, 101, 149–161.
- Banderier, C.; Nicodème, P. Bounded discrete walks. In Discrete Mathematics and Theoretical Computer Science; Discrete Random Walks: Paris, France, 2010; pp. 35–48.
- Althagafi, A.; Aguech, R.; Banderier, C. Height of walks with resets and the Moran model. Sémin. Lothar. Comb. 2023; submitted.
- Gao, K.; Yan, X.; Peng, R.; Xing, L. Economic design of a linear consecutively connected system considering cost and signal loss. IEEE Trans. Syst. Man Cybern. Syst. 2021, 51, 5116–5128.
- Gao, K.; Peng, R.; Qu, C.L.; Xing, L.; Wang, S.; Wu, F. Linear system design with application in wireless sensor networks. J. Ind. Inf. Integr. 2022, 27, 100279.
- Aguech, R.; Abdelkader, M. Two-Dimensional Moran Model: Final Altitude and Number of Resets. Mathematics 2023, 11, 3774.
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).