Article

A Stochastic Lomax Diffusion Process: Statistical Inference and Application

by Ahmed Nafidi 1,†, Ilyasse Makroz 1,*,† and Ramón Gutiérrez Sánchez 2,†
1 Department of Mathematics and Informatics, National School of Applied Sciences, Hassan First University of Settat, LAMSAD, Avenue de l’université, Berrechid BP 280, Morocco
2 Department of Statistics and Operational Research, Facultad de Ciencias, Campus de Fuentenueva, University of Granada, 18071 Granada, Spain
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2021, 9(1), 100; https://doi.org/10.3390/math9010100
Submission received: 9 November 2020 / Revised: 18 December 2020 / Accepted: 21 December 2020 / Published: 5 January 2021
(This article belongs to the Section Probability and Statistics)

Abstract

In this paper, we discuss a new stochastic diffusion process in which the trend function is proportional to the Lomax density function. This distribution arises naturally in the studies of the frequency of extremely rare events. We first consider the probabilistic characteristics of the proposed model, including its analytic expression as the unique solution to a stochastic differential equation, the transition probability density function together with the conditional and unconditional trend functions. Then, we present a method to address the problem of parameter estimation using maximum likelihood with discrete sampling. This estimation requires the solution of a non-linear equation, which is achieved via the simulated annealing method. Finally, we apply the proposed model to a real-world example concerning adolescent fertility rate in Morocco.

1. Introduction

Stochastic diffusion models are used to analyze the evolution of phenomena in multiple fields of science, including biology, finance, energy consumption and physics. In addition to traditional applications, stochastic diffusion processes (SDPs) have attracted considerable attention as analytical tools in areas such as cell growth, population growth and environmental studies. In this respect, see for example: Lognormal [1]; Gompertz [2]; Logistic [3]; Hyperbolic [4]; Rayleigh [5]; Pearson [6]; Weibull [7] and Brennan–Schwartz [8].
However, in most of these studies the processes considered are time-homogeneous; in other words, the present state of the process depends only on the previous states and not on time. In contrast, observations from many fields, such as neuroscience, finance and biology, suggest otherwise. Various non-homogeneous SDPs have been proposed to reflect this time-dependent behavior; see for example the Lognormal [9], Gompertz [10], Vasicek [11], Brennan–Schwartz [12] and Gamma [13] processes.
In most of the aforementioned studies, statistical inference is based on the likelihood function, which is the product of the transition densities. However, in some cases the closed form of the transition density is unknown or has a complicated expression, so the maximum likelihood method remains difficult to implement. Therefore, many methods based on an approximation of the likelihood have been developed, such as those of Prakasa Rao [14], Kloeden et al. [15] and Bibby et al. [16], among others.
The Pareto type (II) distribution or Pearson type (IV) distribution, also called the Lomax distribution, was introduced and studied by Lomax [17]. This distribution is commonly used in reliability and in many lifetime testing studies. It is also used to analyze business data.
The density function of a Lomax distribution on $[0, +\infty)$ with scale parameter $\beta > 0$ and shape parameter $\alpha > 0$ is given by:
$$f(t) = \frac{\alpha}{\beta}\left(1 + \frac{t}{\beta}\right)^{-(\alpha+1)}. \tag{1}$$
This distribution is a special case of a more general one, the Generalized Pareto distribution, whose density function has the following form:
$$g(t) = \frac{1}{\sigma}\left(1 + \frac{k(t-\mu)}{\sigma}\right)^{-\frac{1}{k}-1},$$
where $\mu$ and $k$ are real parameters and $\sigma > 0$. This distribution encompasses the Lomax (Pareto type II) distribution as a special case: setting $\mu = 0$, $k = 1/\alpha$ and $\sigma = \beta/\alpha$ yields Equation (1).
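The relation between the two densities can be checked numerically. The following is a minimal sketch, not part of the original study: the function names are illustrative, the parameter values are arbitrary, and it assumes that the scale is matched through $\sigma = \beta/\alpha$ in addition to $\mu = 0$ and $k = 1/\alpha$.

```python
import math

def lomax_pdf(t, alpha, beta):
    """Lomax density of Equation (1): (alpha/beta) * (1 + t/beta)^-(alpha+1)."""
    return (alpha / beta) * (1.0 + t / beta) ** (-(alpha + 1.0))

def gpd_pdf(t, mu, sigma, k):
    """Generalized Pareto density: (1/sigma) * (1 + k*(t-mu)/sigma)^(-1/k - 1)."""
    return (1.0 / sigma) * (1.0 + k * (t - mu) / sigma) ** (-1.0 / k - 1.0)

alpha, beta = 1.5, 90.0  # illustrative values only
for t in (0.0, 10.0, 250.0):
    # Assumed mapping: mu = 0, k = 1/alpha, sigma = beta/alpha
    same = math.isclose(lomax_pdf(t, alpha, beta),
                        gpd_pdf(t, 0.0, beta / alpha, 1.0 / alpha),
                        rel_tol=1e-9)
    print(t, same)
```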
In the present paper, we introduce a new Stochastic Lomax Diffusion Process (SLDP) as a non-homogeneous extension of the lognormal process, whose trend function is proportional to the Lomax density function. The model is formulated and analyzed by means of stochastic calculus. We present a detailed study of the Lomax model, organized as follows. In Section 2, we define the model in terms of a stochastic differential equation (SDE) and give the analytical expression of its solution; we then determine the Transition Probability Density Function (TPDF) and the trend functions. In Section 3, we deal with the problem of parameter estimation using Maximum Likelihood (ML) on the basis of discrete sampling. In this case, the system of likelihood equations does not have an explicit solution, so the ML estimators cannot be given in closed form and numerical methods are required. In Section 4, we propose the simulated annealing method to approximate the ML estimators, and in Section 5 we show the results of a simulation of the process. In Section 6, we illustrate the method with real data, namely the adolescent fertility rate in Morocco. Finally, we summarize the main conclusions drawn from this work.

2. The Model and Its Characteristics

2.1. The Model

The proposed model is the one-dimensional non-homogeneous SDP $\{x(t),\ t \in [t_1, T]\}$, $t_1 \geq 0$, taking values on $[0, \infty)$ and with drift and diffusion coefficients:
$$A_1(x) = -\frac{\alpha}{t+\beta}\,x, \qquad A_2(x) = \sigma^2 x^2,$$
where $\sigma > 0$, $\beta > -t_1$ and $\alpha$ are real parameters.
Alternatively, the process defined above can be considered as the unique solution to the following SDE:
$$dx(t) = -\frac{\alpha}{t+\beta}\,x(t)\,dt + \sigma\,x(t)\,dw(t), \qquad x(t_1) = x_{t_1}, \tag{3}$$
where $w(t)$ is the one-dimensional standard Wiener process and $x_{t_1}$ is fixed in $\mathbb{R}^+$.
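To make the dynamics of Equation (3) concrete, one step-by-step approximation is the Euler–Maruyama scheme. The sketch below is an illustration only and is not the simulation method used in this article. It assumes the drift $-\alpha x/(t+\beta)$, which is the sign implied by the decreasing trend function of the model; the function name and the parameter values are hypothetical.

```python
import math
import random

def euler_maruyama_path(x0, t1, T, n_steps, alpha, beta, sigma, rng):
    """Approximate one path of dx = -alpha/(t+beta) x dt + sigma x dw on [t1, T]."""
    h = (T - t1) / n_steps
    t, x = t1, x0
    path = [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))  # Wiener increment, distributed N(0, h)
        x = x + (-alpha / (t + beta)) * x * h + sigma * x * dw
        t += h
        path.append(x)
    return path

rng = random.Random(0)  # fixed seed so the run is reproducible
path = euler_maruyama_path(100.0, 0.0, 1000.0, 2000, 1.5, 90.0, 0.01, rng)
print(len(path), path[0])
```

With $\sigma = 0$ the scheme reduces to the explicit Euler method for the deterministic equation, which gives a simple way to check the step size against the noise-free solution.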

2.2. Distribution of the Process

The SDE in Equation (3) has a unique solution (see Kloeden et al. [15]). In order to obtain this solution, we consider the transformation $y(t) = \log(x(t))$; then, by means of the Itô formula, Equation (3) becomes:
$$dy(t) = \left(-\frac{\alpha}{t+\beta} - \frac{\sigma^2}{2}\right)dt + \sigma\,dw(t), \qquad y(t_1) = \log(x_{t_1}).$$
The solution to which is:
$$y(t) = y(s) + \int_s^t \left(-\frac{\alpha}{\theta+\beta} - \frac{\sigma^2}{2}\right)d\theta + \sigma\,\bigl(w(t) - w(s)\bigr),$$
for $s \in [t_1, t]$. Hence, we deduce the expression of the solution to the SDE in Equation (3):
$$x(t) = x_s\left(\frac{s+\beta}{t+\beta}\right)^{\alpha}\exp\left(-\frac{\sigma^2}{2}(t-s) + \sigma\,\bigl(w(t) - w(s)\bigr)\right),$$
so that $x(t) \mid x(s) = x_s$ follows a lognormal distribution:
$$x(t) \mid x(s) = x_s \sim \Lambda\left(\log(x_s) + \alpha\log\left(\frac{s+\beta}{t+\beta}\right) - \frac{\sigma^2}{2}(t-s),\ \sigma^2(t-s)\right),$$
where $\Lambda$ denotes the lognormal distribution. As a result, the TPDF of this process is found to be:
$$f(x, t \mid x_s, s) = \frac{1}{x\sqrt{2\pi(t-s)\sigma^2}}\exp\left(-\frac{\left[\log\left(\frac{x}{x_s}\right) + \alpha\log\left(\frac{t+\beta}{s+\beta}\right) + \frac{\sigma^2}{2}(t-s)\right]^2}{2\sigma^2(t-s)}\right).$$
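As a sanity check on the transition density, it can be coded directly and integrated numerically. This is a minimal sketch under illustrative parameter values; `tpdf` and `integrate_trapezoid` are hypothetical helper names, and the trapezoid rule is just one simple choice of quadrature.

```python
import math

def tpdf(x, t, xs, s, alpha, beta, sigma):
    """Lognormal transition density f(x, t | xs, s) of the process."""
    var = sigma ** 2 * (t - s)
    # Mean of log x(t) given x(s) = xs
    mean = math.log(xs) + alpha * math.log((s + beta) / (t + beta)) - 0.5 * var
    z = math.log(x) - mean
    return math.exp(-z * z / (2.0 * var)) / (x * math.sqrt(2.0 * math.pi * var))

def integrate_trapezoid(f, a, b, n):
    """Composite trapezoid rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

# Illustrative parameters (not estimates from the article)
args = dict(t=10.0, xs=100.0, s=0.0, alpha=1.5, beta=90.0, sigma=0.05)
mass = integrate_trapezoid(lambda x: tpdf(x, **args), 1.0, 400.0, 20000)
mean = integrate_trapezoid(lambda x: x * tpdf(x, **args), 1.0, 400.0, 20000)
print(mass, mean)
```

The total mass should be close to one, and the first moment should be close to $x_s\left(\frac{s+\beta}{t+\beta}\right)^{\alpha}$, in agreement with the lognormal law stated above.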

2.3. Trend Functions

From the properties of the lognormal distribution, the main characteristics of the process can be determined. In particular, the r-th conditional moment of the process is given by:
$$E\bigl(x^r(t) \mid x(s) = x_s\bigr) = \exp\left(r\left[\log(x_s) + \alpha\log\left(\frac{s+\beta}{t+\beta}\right) - \frac{\sigma^2}{2}(t-s)\right] + \frac{r^2}{2}\sigma^2(t-s)\right).$$
Then, by considering the case $r = 1$ in the previous expression, the conditional trend function of the process is:
$$E\bigl(x(t) \mid x(s) = x_s\bigr) = x_s\left(\frac{s+\beta}{t+\beta}\right)^{\alpha}.$$
In addition, taking into account the initial condition $P(x(t_1) = x_{t_1}) = 1$, the trend function of the process is:
$$E\bigl(x(t)\bigr) = x_{t_1}\left(\frac{t_1+\beta}{t+\beta}\right)^{\alpha}. \tag{6}$$
We note here that:
  • The trend function defined in Equation (6) is proportional to the Lomax density function in Equation (1).
  • In the absence of white noise (i.e., $\sigma = 0$), the solution to Equation (3) is $x(t) = x_s\left(\frac{s+\beta}{t+\beta}\right)^{\alpha}$, which in this case is proportional to the Lomax density function [18] with shape parameter $\alpha$ and scale parameter $\beta$, which can be denoted $P(II)(\beta, \alpha, \mu = 0)$.
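The reduction of the r-th conditional moment to the conditional trend function at $r = 1$ can be verified numerically. The following sketch uses hypothetical function names and arbitrary parameter values; it only restates the two formulas above in code.

```python
import math

def conditional_moment(r, t, xs, s, alpha, beta, sigma):
    """r-th conditional moment E[x(t)^r | x(s) = xs] from the lognormal law."""
    mu = (math.log(xs) + alpha * math.log((s + beta) / (t + beta))
          - 0.5 * sigma ** 2 * (t - s))
    return math.exp(r * mu + 0.5 * r ** 2 * sigma ** 2 * (t - s))

def conditional_trend(t, xs, s, alpha, beta):
    """Conditional trend E[x(t) | x(s) = xs] = xs * ((s + beta)/(t + beta))**alpha."""
    return xs * ((s + beta) / (t + beta)) ** alpha

# For r = 1 the sigma-dependent terms cancel, whatever the noise level
m1 = conditional_moment(1, 50.0, 100.0, 0.0, 1.5, 90.0, 0.2)
print(m1, conditional_trend(50.0, 100.0, 0.0, 1.5, 90.0))
```

Setting `s` to the initial time and `xs` to the initial value gives the unconditional trend function of Equation (6).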

3. Maximum Likelihood Estimation

We consider a discrete sample of $n$ observations of the process $x(t)$, denoted $(x_i)_{i=1,\dots,n}$. Let $t_1 < t_2 < \dots < t_n$ denote the times at which the process was observed, with $x_i = x(t_i)$; moreover, we set $t_i - t_{i-1} = h$, and $\theta = (\alpha, \beta, \sigma^2)$ is the parameter vector.
The likelihood function $l(x, \theta)$ is the product of the transition densities:
$$l(x, \theta) = \prod_{i=2}^{n} f(x_i, t_i \mid x_{i-1}, t_{i-1}) = \prod_{i=2}^{n}\frac{1}{x_i\sqrt{2\pi h\sigma^2}}\exp\left(-\frac{\left[\log\left(\frac{x_i}{x_{i-1}}\right) + \alpha\log\left(\frac{t_i+\beta}{t_{i-1}+\beta}\right) + \frac{\sigma^2}{2}h\right]^2}{2\sigma^2 h}\right).$$
The log-likelihood is given by:
$$\begin{aligned} L(x, \theta) = \log\bigl(l(x, \theta)\bigr) &= \sum_{i=2}^{n}\left[-\log(x_i) - \frac{1}{2}\log(2\pi h) - \frac{1}{2}\log(\sigma^2) - \frac{1}{2\sigma^2 h}\left(\log\left(\frac{x_i}{x_{i-1}}\right) + \alpha\log\left(\frac{t_i+\beta}{t_{i-1}+\beta}\right) + \frac{\sigma^2}{2}h\right)^2\right] \\ &= -\frac{n-1}{2}\log(\sigma^2) - \frac{n-1}{2}\log(2\pi h) - \sum_{i=2}^{n}\left[\log(x_i) + \frac{1}{2\sigma^2 h}\left(C_{i,\alpha,\beta} + \frac{\sigma^2}{2}h\right)^2\right], \end{aligned}$$
where $C_{i,\alpha,\beta} = \log\left(\frac{x_i}{x_{i-1}}\right) + \alpha\log\left(\frac{t_i+\beta}{t_{i-1}+\beta}\right)$.
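The log-likelihood can be evaluated directly from a sample. The sketch below is an illustration, not the authors' implementation: it uses zero-based lists, assumes equally spaced observation times, and the short series `x_obs` is made up for the example.

```python
import math

def log_likelihood(x, t, alpha, beta, sigma2):
    """Log-likelihood of observations x at equally spaced times t (same length)."""
    h = t[1] - t[0]
    total = 0.0
    for i in range(1, len(x)):
        # C_i = log(x_i / x_{i-1}) + alpha * log((t_i + beta) / (t_{i-1} + beta))
        c = (math.log(x[i] / x[i - 1])
             + alpha * math.log((t[i] + beta) / (t[i - 1] + beta)))
        total += (-math.log(x[i]) - 0.5 * math.log(2.0 * math.pi * h)
                  - 0.5 * math.log(sigma2)
                  - (c + 0.5 * sigma2 * h) ** 2 / (2.0 * sigma2 * h))
    return total

# Hypothetical data, for illustration only
x_obs = [100.0, 98.1, 97.0, 95.2]
t_obs = [0.0, 1.0, 2.0, 3.0]
print(log_likelihood(x_obs, t_obs, alpha=1.5, beta=90.0, sigma2=1e-4))
```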
We differentiate this function with respect to the elements of the vector $\theta$ to obtain the following equations:
$$\sum_{i=2}^{n}\left(C_{i,\alpha,\beta} + \frac{\sigma^2 h}{2}\right)\log\left(\frac{t_i+\beta}{t_{i-1}+\beta}\right) = 0, \tag{7}$$
$$\sum_{i=2}^{n} C_{i,\alpha,\beta}^2 - (n-1)\sigma^2 h - \frac{n-1}{4}\sigma^4 h^2 = 0, \tag{8}$$
$$\sum_{i=2}^{n}\frac{C_{i,\alpha,\beta} + \frac{\sigma^2 h}{2}}{(t_i+\beta)(t_{i-1}+\beta)} = 0. \tag{9}$$
Equation (8) is a second-degree equation in $\sigma^2$, which admits two real solutions, since its discriminant is $\delta = (n-1)^2 h^2 + (n-1)h^2\sum_{i=2}^{n} C_{i,\alpha,\beta}^2 > 0$. Taking the non-negative solution, the estimator $\hat{\sigma}^2$ is given by:
$$\hat{\sigma}^2 = \frac{2}{h}\left[\left(1 + \frac{1}{n-1}\sum_{i=2}^{n} C_{i,\alpha,\beta}^2\right)^{1/2} - 1\right]. \tag{10}$$
Replacing $\sigma^2$ by $\hat{\sigma}^2$ in Equation (7), the estimator of $\alpha$ satisfies the following non-linear equation:
$$\sum_{i=2}^{n}\left[C_{i,\alpha,\beta} + \left(1 + \frac{1}{n-1}\sum_{j=2}^{n} C_{j,\alpha,\beta}^2\right)^{1/2} - 1\right]\log\left(\frac{t_i+\beta}{t_{i-1}+\beta}\right) = 0. \tag{11}$$
Similarly, replacing $\sigma^2$ by $\hat{\sigma}^2$ in Equation (9) gives:
$$\sum_{i=2}^{n}\frac{C_{i,\alpha,\beta} + \left(1 + \frac{1}{n-1}\sum_{j=2}^{n} C_{j,\alpha,\beta}^2\right)^{1/2} - 1}{(t_i+\beta)(t_{i-1}+\beta)} = 0. \tag{12}$$
Clearly, this is a system of non-linear equations whose solution may be difficult to find analytically. To address this problem we use numerical methods.
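For given values of $\alpha$ and $\beta$, the closed-form expression for $\hat{\sigma}^2$ and the left-hand sides of the two remaining non-linear equations are straightforward to compute. The sketch below is one possible arrangement, with hypothetical function names and made-up data; a numerical solver would then search for the pair $(\alpha, \beta)$ that drives both residuals to zero.

```python
import math

def c_terms(x, t, alpha, beta):
    """C_i = log(x_i/x_{i-1}) + alpha*log((t_i+beta)/(t_{i-1}+beta)), i = 2..n."""
    return [math.log(x[i] / x[i - 1])
            + alpha * math.log((t[i] + beta) / (t[i - 1] + beta))
            for i in range(1, len(x))]

def sigma2_hat(x, t, alpha, beta):
    """Closed-form estimator of sigma^2 for given alpha, beta (equally spaced t)."""
    h = t[1] - t[0]
    c = c_terms(x, t, alpha, beta)
    mean_c2 = sum(ci * ci for ci in c) / len(c)  # (1/(n-1)) * sum of C_i^2
    return (2.0 / h) * (math.sqrt(1.0 + mean_c2) - 1.0)

def score_residuals(x, t, alpha, beta):
    """Left-hand sides of the two non-linear equations in alpha and beta."""
    h = t[1] - t[0]
    c = c_terms(x, t, alpha, beta)
    shift = 0.5 * h * sigma2_hat(x, t, alpha, beta)  # sqrt(1 + mean C^2) - 1
    r_alpha = sum((ci + shift) * math.log((t[j + 1] + beta) / (t[j] + beta))
                  for j, ci in enumerate(c))
    r_beta = sum((ci + shift) / ((t[j + 1] + beta) * (t[j] + beta))
                 for j, ci in enumerate(c))
    return r_alpha, r_beta

# Hypothetical data, for illustration only
x_obs = [100.0, 97.0, 95.5, 93.0, 92.0]
t_obs = [0.0, 1.0, 2.0, 3.0, 4.0]
print(sigma2_hat(x_obs, t_obs, 1.5, 90.0), score_residuals(x_obs, t_obs, 1.5, 90.0))
```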

4. Computational Aspects

In this paper we suggest the simulated annealing (SA) method for solving Equations (11) and (12). The method is described below.
Simulated annealing is a stochastic optimisation algorithm, introduced in 1983 by Kirkpatrick et al. [19], which approaches the global optimum of a given cost function by means of a random search. The fundamental idea of the algorithm is inspired by the annealing of metals in metallurgy. At each step of the algorithm a new point is randomly generated; if the new point improves the cost function it is accepted, otherwise it is accepted with probability $\exp(-\Delta f / T)$, where $f$ is the cost function, $\Delta f$ is the resulting change in cost and $T$ is the temperature. Accepting points that do not improve the cost function allows the algorithm to escape local optima. The main disadvantage of this method is that the choice of its parameters (initial temperature, minimum temperature, cooling schedule, stopping conditions, etc.) considerably affects the time required to reach the extremum.
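The acceptance rule described above can be sketched in a few lines. This is a generic illustration and not the configuration used in this study: the Gaussian proposal, the geometric cooling schedule, all default values and the toy cost function are assumptions. For the estimation problem, the cost could be, for instance, the sum of squared left-hand sides of Equations (11) and (12), or the negative log-likelihood.

```python
import math
import random

def simulated_annealing(cost, x0, step, t_init=1.0, t_min=1e-4, cooling=0.95,
                        iters_per_temp=50, seed=0):
    """Minimise `cost` by random search with the exp(-delta/T) acceptance rule."""
    rng = random.Random(seed)
    current, f_current = list(x0), cost(x0)
    best, f_best = list(current), f_current
    temp = t_init
    while temp > t_min:
        for _ in range(iters_per_temp):
            candidate = [xi + rng.gauss(0.0, step) for xi in current]
            f_candidate = cost(candidate)
            delta = f_candidate - f_current
            # Always accept improvements; accept worse points with prob exp(-delta/T)
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, f_current = candidate, f_candidate
                if f_current < f_best:
                    best, f_best = list(current), f_current
        temp *= cooling  # geometric cooling (an assumed schedule)
    return best, f_best

def toy_cost(p):
    """Toy cost with several local minima; its global minimum is at the origin."""
    return sum(pi * pi + 2.0 * (1.0 - math.cos(3.0 * pi)) for pi in p)

best, f_best = simulated_annealing(toy_cost, [2.5, -1.5], step=0.3)
print(best, f_best)
```

Keeping track of the best point visited, rather than returning the last accepted one, guards against a late uphill move degrading the reported solution.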

5. Simulation

To illustrate the process described by Equation (3), let us consider an equidistant discretisation of the interval $[s, T]$ with $t_1 = s$ and $t_i = t_1 + (i-1)h$ for $i = 2, \dots, N$, where the discretisation step is $h = \frac{T-s}{N}$ and $N$ denotes the size of the sample. A total of 25 trajectories of the process were simulated, with $s = 0$, $T = 1000$, $N = 2000$ and $x_s = 100$.
The results of the simulation, together with the Estimated Trend Function (ETF) of the process, are illustrated in Figure 1.
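A simulation of this kind can be sketched using the explicit solution of the SDE, which makes each step exact in distribution. The code below is an illustration rather than the authors' script: the parameter values $\alpha = 1.5$, $\beta = 90$ and $\sigma = 0.01$ are those quoted in the caption of Figure 1, the seed is arbitrary, and the grid is indexed from zero.

```python
import math
import random

def simulate_path(x_s, s, T, N, alpha, beta, sigma, rng):
    """Exact simulation of the process on the grid t_i = s + i*h, i = 0..N."""
    h = (T - s) / N
    times = [s + i * h for i in range(N + 1)]
    path = [x_s]
    for i in range(1, N + 1):
        dw = rng.gauss(0.0, math.sqrt(h))  # w(t_i) - w(t_{i-1})
        ratio = (times[i - 1] + beta) / (times[i] + beta)
        path.append(path[-1] * ratio ** alpha
                    * math.exp(-0.5 * sigma ** 2 * h + sigma * dw))
    return times, path

def trend(t, x_s, s, alpha, beta):
    """Trend function x_s * ((s + beta)/(t + beta))**alpha."""
    return x_s * ((s + beta) / (t + beta)) ** alpha

rng = random.Random(2021)  # arbitrary seed for reproducibility
runs = [simulate_path(100.0, 0.0, 1000.0, 2000, 1.5, 90.0, 0.01, rng)
        for _ in range(25)]
times = runs[0][0]
paths = [p for _, p in runs]
# Pointwise mean of the 25 simulated trajectories
mean_path = [sum(p[i] for p in paths) / len(paths) for i in range(len(times))]
print(mean_path[-1], trend(times[-1], 100.0, 0.0, 1.5, 90.0))
```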
Using the simulated annealing method to solve Equations (11) and (12), we obtained the estimators $\hat{\alpha}_i$, $\hat{\beta}_i$ and $\hat{\sigma}_i^2$ for each trajectory, and then considered their mean values $\bar{\alpha} = \frac{1}{25}\sum_{i=1}^{25}\hat{\alpha}_i$, $\bar{\beta} = \frac{1}{25}\sum_{i=1}^{25}\hat{\beta}_i$ and $\bar{\sigma}^2 = \frac{1}{25}\sum_{i=1}^{25}\hat{\sigma}_i^2$.
The values obtained using this method are $\bar{\alpha} = 1.511872058$, $\bar{\beta} = 90.002739556$ and $\bar{\sigma}^2 = 0.000099480$. We then calculated the mean value of the simulated paths at each time step, namely $\bar{x}(t_i) = \frac{1}{m}\sum_{j=1}^{m} x_j(t_i)$, where $x_j$ is sample path $j$, $t_i$ is time step $i$ and $m$ is the number of simulated trajectories. We also obtained the estimated trend functions of the process; the results are plotted in Figure 2.

6. Application

6.1. Data Description

In this application, we examined the variable $x(t)$ defined as the adolescent fertility rate, that is, the number of births per 1000 women aged 15 to 19. These data are annual and are available at https://data.worldbank.org/. The average value for Morocco over the period from 1979 to 2018 is 42.395655, with a minimum value of 30.6810 in 2018 and a maximum value of 92.9376 in 1979. Table 1 shows the observed values, together with the ETF and the Estimated Conditional Trend Function (ECTF), over this period.
In Figure 3, the real data are plotted against the trend functions (conditional and unconditional). The unconditional trend function provides good estimates of the real values, and the estimates become more accurate if we consider the conditional trend function.
Table 2 shows the forecasts for the years 2016 to 2018, which were not used in fitting the model, together with the actual data.

6.2. Goodness of Fit of the Model

The mean absolute percentage error (MAPE) is the average of the absolute deviations relative to the observed values. It is a practical indicator for comparison, which makes it possible to evaluate the forecasts obtained from the models. We denote by $y_i$, $\hat{y}_i$ and $n$ the real values, the values predicted by the model and the number of predictions, respectively, so that:
$$MAPE = \frac{1}{n}\sum_{i=1}^{n}\frac{|\hat{y}_i - y_i|}{y_i} \times 100.$$
The symmetric mean absolute percentage error (SMAPE) is a measure of precision based on relative errors and is defined as follows:
$$SMAPE = \frac{100}{n}\sum_{i=1}^{n}\frac{|\hat{y}_i - y_i|}{\left(|\hat{y}_i| + |y_i|\right)/2}.$$
The values obtained for the MAPE and SMAPE are 1.756265 and 1.759680, respectively. The MAPE value is less than 10, so according to Lewis [20] the forecasts obtained by this model are “very precise”.
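Both measures are simple to compute. The sketch below applies the two formulas to the three forecast years listed in Table 2, purely as sample input. It is not meant to reproduce the MAPE and SMAPE values reported above, since the text does not state over which observations and which trend function those were computed.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 / len(actual) * sum(abs(p - a) / a
                                     for a, p in zip(actual, predicted))

def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent."""
    return 100.0 / len(actual) * sum(abs(p - a) / ((abs(p) + abs(a)) / 2.0)
                                     for a, p in zip(actual, predicted))

# Values for 2016-2018 as listed in Table 2
data = [31.5508, 31.034, 30.681]
tf = [30.1582, 29.8982, 29.6468]   # trend function forecasts
ctf = [31.7841, 31.279, 30.7733]   # conditional trend function forecasts

print(mape(data, tf), smape(data, tf))
print(mape(data, ctf), smape(data, ctf))
```

When forecast errors are small relative to the observations, the two measures are nearly equal, because the SMAPE denominator is then close to the observed value itself.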

7. Conclusions

In this study of the stochastic Lomax diffusion process, from a theoretical point of view, we determined the basic probabilistic characteristics of the model and obtained its parameter estimators. Using the maximum likelihood method on the basis of discrete sampling, we obtained a system of non-linear equations, which was solved numerically by means of the simulated annealing method. The statistical results obtained show that the proposed process can be applied to real data.
The Lomax model was applied to fit data on the adolescent fertility rate in Morocco. The ETF provided a good description of the changing levels of the fertility rate. Furthermore, the model produced good forecasts for the period from 2016 to 2018. The MAPE and SMAPE values were then calculated and showed good results. Taking these points into account, we conclude that the methodology applied in the study of this new model is efficient and presents a high degree of accuracy.

Author Contributions

Conceptualization, A.N., I.M. and R.G.S.; Data curation, A.N., I.M. and R.G.S.; Formal analysis, A.N., I.M. and R.G.S.; Funding acquisition, A.N. and R.G.S.; Investigation, A.N., I.M. and R.G.S.; Methodology, A.N., I.M. and R.G.S.; Project administration, A.N., I.M. and R.G.S.; Resources, A.N., I.M. and R.G.S.; Software, A.N., I.M. and R.G.S.; Supervision, A.N., I.M. and R.G.S.; Validation, A.N., I.M. and R.G.S.; Visualization, A.N., I.M. and R.G.S.; Writing–original draft, A.N., I.M. and R.G.S.; Writing–review–editing, A.N., I.M. and R.G.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by “FEDER/Junta de Andalucía-Consejería de Economía y Conocimiento/ Proyecto A-FQM-228-UGR18”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors are very grateful to the Editor and referees for their constructive comments and suggestions. This research has been funded by “FEDER/Junta de Andalucía-Consejería de Economía y Conocimiento/ Proyecto A-FQM-228-UGR18”.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ramos-Ábalos, E.M.; Gutiérrez-Sánchez, R.; Nafidi, A. Powers of the Stochastic Gompertz and Lognormal Diffusion Processes, Statistical Inference and Simulation. Mathematics 2020, 8, 588.
  2. Gutiérrez, R.; Gutiérrez-Sánchez, R.; Nafidi, A. Modelling and forecasting vehicle stocks using the trends of stochastic Gompertz diffusion models: The case of Spain. Appl. Stoch. Model. Bus. Ind. 2009, 25, 385–405.
  3. Giovanis, A.N.; Skiadas, C.H. A stochastic logistic innovation diffusion model studying the electricity consumption in Greece and the United States. Technol. Forecast. Soc. Chang. 1999, 61, 235–246.
  4. Bibby, B.M.; Sørensen, M. A hyperbolic diffusion model for stock prices. Financ. Stoch. 1996, 1, 25–41.
  5. Gutiérrez, R.; Gutiérrez-Sánchez, R.; Nafidi, A. The Stochastic Rayleigh diffusion model: Statistical inference and computational aspects. Applications to modelling of real cases. Appl. Math. Comput. 2006, 175, 628–644.
  6. Forman, J.L.; Sørensen, M. The Pearson diffusions: A class of statistically tractable diffusion processes. Scand. J. Stat. 2008, 35, 438–465.
  7. Nafidi, A.; Bahij, M.; Gutiérrez-Sánchez, R.; Achchab, B. Two-Parameter Stochastic Weibull Diffusion Model: Statistical Inference and Application to Real Modeling Example. Mathematics 2020, 8, 160.
  8. Nafidi, A.; Moutabir, G.; Gutiérrez-Sánchez, R. Stochastic Brennan–Schwartz Diffusion Process: Statistical Computation and Application. Mathematics 2019, 7, 1062.
  9. Gutiérrez, R.; Angulo, J.M.; González, A.; Pérez, R. Inference in lognormal multidimensional diffusion processes with exogenous factors: Application to modelling in economics. Appl. Stoch. Model. Data Anal. 1991, 7, 295–316.
  10. Gutiérrez, R.; Gutiérrez-Sánchez, R.; Nafidi, A. Electricity consumption in Morocco: Stochastic Gompertz exogenous factors diffusion analysis. Appl. Energy 2006, 83, 1139–1151.
  11. Gutiérrez, R.; Gutiérrez-Sánchez, R.; Nafidi, A.; Pascual, A. Detection, modelling and estimation of non-linear trends by using a non-homogeneous Vasicek stochastic diffusion. Application to CO2 emissions in Morocco. Stoch. Environ. Res. Risk Assess. 2012, 26, 533–543.
  12. Picchini, U.; Ditlevsen, S.; De Gaetano, A. Maximum likelihood estimation of a time-inhomogeneous stochastic differential model of glucose dynamics. Math. Med. Biol. 2008, 25, 141–155.
  13. Nafidi, A.; Gutiérrez, R.; Gutiérrez-Sánchez, R.; Ramos-Ábalos, E.; El Hachimi, S. Modelling and predicting electricity consumption in Spain using the stochastic Gamma diffusion process with exogenous factors. Energy 2016, 113, 309–318.
  14. Prakasa Rao, B.L.S. Statistical Inference for Diffusion Type Processes; Arnold: London, UK, 1999.
  15. Kloeden, P.E.; Platen, E. Numerical Solution of Stochastic Differential Equations; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013; Volume 23.
  16. Bibby, B.M.; Sørensen, M. Martingale estimation functions for discretely observed diffusion processes. Bernoulli 1995, 17–39.
  17. Lomax, K.S. Business failures: Another example of the analysis of failure data. J. Am. Stat. Assoc. 1954, 49, 847–852.
  18. Meynial, E. Recueil Publié par la Faculté de Droit, à L’occasion de L’exposition Nationale Suisse de Genève; Editions Dalloz: Paris, France, 1898.
  19. Kirkpatrick, S.; Gelatt, C.D.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680.
  20. Lewis, C.D. A Radical Guide to Exponential Smoothing and Curve Fitting; Butterworth-Heinemann: London, UK; Boston, MA, USA, 1982.
Figure 1. Simulated sample paths vs. the Estimated Trend Function (ETF) for α = 1.5, β = 90 and σ = 0.01.
Figure 2. ETFs vs. the mean values of simulated data.
Figure 3. Real data vs. trend function and conditional trend function.
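Table 1. Adolescent fertility rate data by year, with the trend function (TF) and conditional trend function (CTF).

| Year | Data | TF | CTF | Year | Data | TF | CTF |
|------|------|----|-----|------|------|----|-----|
| 1979 | 92.9376 | 92.9376 | 92.9376 | 1998 | 35.0552 | 37.3231 | 34.2368 |
| 1980 | 85.9524 | 77.139 | 77.4169 | 1999 | 35.2894 | 36.7257 | 34.495 |
| 1981 | 78.9672 | 68.4703 | 76.4049 | 2000 | 35.5236 | 36.1646 | 34.751 |
| 1982 | 71.982 | 62.7125 | 72.3839 | 2001 | 35.7578 | 35.636 | 35.0052 |
| 1983 | 66.6672 | 58.4969 | 67.1766 | 2002 | 35.992 | 35.1368 | 35.2577 |
| 1984 | 61.3524 | 55.2209 | 62.9552 | 2003 | 35.3362 | 34.6643 | 35.5087 |
| 1985 | 56.0376 | 52.5705 | 58.4222 | 2004 | 34.6804 | 34.2161 | 34.8799 |
| 1986 | 50.7228 | 50.3627 | 53.6944 | 2005 | 34.0246 | 33.7901 | 34.2491 |
| 1987 | 45.408 | 48.4826 | 48.8365 | 2006 | 33.3688 | 33.3844 | 33.6166 |
| 1988 | 43.7456 | 46.8537 | 43.8876 | 2007 | 32.713 | 32.9973 | 32.9824 |
| 1989 | 42.0832 | 45.4225 | 42.4135 | 2008 | 32.894 | 32.6275 | 32.3468 |
| 1990 | 40.4208 | 44.1505 | 40.9082 | 2009 | 33.075 | 32.2737 | 32.5376 |
| 1991 | 38.7584 | 43.0093 | 39.3787 | 2010 | 33.256 | 31.9346 | 32.7279 |
| 1992 | 37.096 | 41.9769 | 37.8303 | 2011 | 33.437 | 31.6092 | 32.9175 |
| 1993 | 36.641 | 41.0364 | 36.2668 | 2012 | 33.618 | 31.2966 | 33.1067 |
| 1994 | 36.186 | 40.1744 | 35.873 | 2013 | 33.1012 | 30.996 | 33.2954 |
| 1995 | 35.731 | 39.3801 | 35.472 | 2014 | 32.5844 | 30.7065 | 32.7923 |
| 1996 | 35.276 | 38.6448 | 35.0651 | 2015 | 32.0676 | 30.4274 | 32.2886 |
| 1997 | 34.821 | 37.9611 | 34.6531 | | | | |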
Table 2. Forecasted values by year.

| Year | Data | TF | CTF |
|------|------|----|-----|
| 2016 | 31.5508 | 30.1582 | 31.7841 |
| 2017 | 31.034 | 29.8982 | 31.279 |
| 2018 | 30.681 | 29.6468 | 30.7733 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
