Article

Some Notes about Inference for the Lognormal Diffusion Process with Exogenous Factors

by Patricia Román-Román, Juan José Serrano-Pérez and Francisco Torres-Ruiz *,†
Departamento de Estadística e Investigación Operativa, Facultad de Ciencias, Universidad de Granada, Avenida Fuente Nueva, 18071 Granada, Spain
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2018, 6(5), 85; https://doi.org/10.3390/math6050085
Submission received: 16 April 2018 / Revised: 14 May 2018 / Accepted: 15 May 2018 / Published: 21 May 2018
(This article belongs to the Special Issue Stochastic Processes with Applications)

Abstract
Different versions of the lognormal diffusion process with exogenous factors have been used in recent years to model and study the behavior of phenomena following a given growth curve. In each case considered, the estimation of the model has been addressed, generally by maximum likelihood (ML), as has been the study of several characteristics associated with the type of curve considered. For this process, a unified version of the ML estimation problem is presented, including how to obtain estimation errors and asymptotic confidence intervals for parametric functions when no explicit expression is available for the estimators of the parameters of the model. The Gompertz-type diffusion process is used here to illustrate the application of the methodology.

1. Introduction

The lognormal diffusion process has been widely used as a probabilistic model in several scientific fields in which the variable under consideration exhibits an exponential trend. Originally, the lognormal diffusion process was mainly applied to modeling dynamic variables in the fields of economics and finance. Important contributions have been made in this direction by Cox and Ross [1], Marcus and Shaked [2], and Merton [3], showing the theoretical and practical importance of the process in that environment. For example, this process is associated with the Black and Scholes model [4] and appears in later extensions such as terminal swap-rate models (Hunt and Kennedy [5], Lamberton and Lapeyre [6]).
In 1972, Tintner and Sengupta [7] introduced a modification of the process by including a linear combination of time functions in the infinitesimal mean of the process. The motivation for this was the introduction of external influences on the interest variable (endogenous variable), influences that could contribute to a better explanation of the phenomenon under study. For this reason, these time functions are known as exogenous factors, whose time behavior is assumed to be known or partially known. By using these time functions we can model situations wherein the observed trend shows deviations from the theoretical shape of the trend during certain time intervals, and can therefore use them to help describe the evolution of the process. Furthermore, a suitable choice of the exogenous factors can contribute to the external control of the process for forecasting purposes. Note that the methodology derived from the inclusion of exogenous factors has been applied to several contexts other than the lognormal process (see, for example, Buonocore et al. [8]).
The lognormal diffusion process with exogenous factors has been widely studied in relation to some aspects of inference and first-passage times. It has been applied to the modeling of time variables in several fields (see, for example [9,10]). On occasion, the endogenous variable itself helps identify the exogenous factors. However, there are situations in which external variables to the process that have an influence on the system are not available, or situations in which their functional expressions are unknown. In such cases, Gutiérrez et al. [11] suggested approaching the exogenous factors by means of polynomial functions.
The ability to control the endogenous variable using exogenous factors makes this process particularly useful for forecasting purposes. Some of its main features, such as the mean, mode and quantile functions (that can be expressed as parametric functions of the parameters of the process), can be used for prediction purposes. Therefore, the inference of these functions has been the subject of considerable study, both from the perspective of point estimation and of estimation by confidence intervals. With respect to the former, in [10] a more general study was carried out to obtain maximum likelihood (ML) estimators. In that case, the exact distribution of the estimators was found, and then used to obtain the uniformly minimum variance unbiased (UMVU) estimators. In addition, expressions for the relative efficiency of ML estimators, with respect to UMVU estimators, were obtained. This last study was extended for a class of parametric functions which include the mean and mode functions (together with their conditional versions) as special cases. Concerning estimation by confidence bands, in this paper the authors extended the results obtained by Land [12] on exact confidence intervals for the mean of a lognormal distribution, thus obtaining confidence bands for the mean and mode functions of the lognormal process with exogenous factors and expressing these functions in a more general form.
In most of the works cited, inference has been approached from the ML point of view, considering discrete sampling of the trajectories. To this end, it is essential to have the exact form of the transition density functions from which the likelihood function associated with the sample is constructed. However, alternatives are available for a range of situations. One example is to approximate the transition density function using Euler-type schemes derived from the discretization of the stochastic differential equation that models the behavior of the phenomenon under study (this is sometimes known as the naive ML approach). Other possible alternatives to ML are those derived, for example, from the use of estimating functions (Bibby et al. [13]) and the generalized method of moments (Hansen [14]). Fuchs [15] presents a good review of these and other procedures. The Bayesian approach is also present in the study of diffusion processes, as suggested by Tang and Heron [16].
On the other hand, considering particular choices of the time functions that define the exogenous factors has enabled researchers to define diffusion processes associated to alternative expressions of already-known growth curves. Along these lines, we may cite a Gompertz-type process [17] (applied to the study of rabbit growth), a generalized Von Bertalanffy diffusion process [18] (with an application to the growth of fish species), a logistic-type process [19] (applied to the growth of a microorganism culture), and a Richards-type diffusion process [20]. In [21], a joint analysis of the procedure for obtaining these processes is shown. More recently, Da Luz-Sant’Ana et al. [22] have established, following a similar methodology, a Hubbert diffusion process for studying oil production, while Barrera et al. [23] introduced a process linked to the hyperbolastic type-I curve and applied it in the context of the quantitative polymerase chain reaction (qPCR) technique.
In these last cases, obtaining the ML estimators was a rather laborious task. In fact, the resulting system of equations is exceedingly complex and does not have an explicit solution, so numerical procedures must be employed instead, with the subsequent problem of finding initial solutions (see, for instance, [18,19,22]). However, it is impossible to carry out a general study of the system of equations in order to check the conditions of convergence of the chosen numerical method, since it is dependent on sample data. One alternative is then to use stochastic optimization procedures like simulated annealing, variable neighborhood search, and the firefly algorithm [20,23,24]. In any case, the exact distribution of the estimators cannot be obtained. Recently, the asymptotic distribution of the ML estimators and the delta method have been used in order to obtain estimation errors, as well as confidence intervals, for the parameters and parametric functions in the context of the Hubbert diffusion model [25].
The main objective of this paper is to provide a unified view of the estimation problem by means of discrete sampling of trajectories, and to cover all the diffusion processes mentioned above. To this end, we will consider the generic expression of the lognormal diffusion process with exogenous factors. In Section 2, a brief summary of the main characteristics of the process is presented. Section 3 and Section 4 address the problem of estimation by ML by using discrete sampling. In Section 3, the distribution of the sample is obtained, while in Section 4 the generic form adopted by the system of likelihood equations is derived in terms of the exogenous factor included in the model. Section 5 deals with obtaining the asymptotic distribution of the estimators, after calculating the Fisher information matrix, for which the results of Section 3 are fundamental. Finally, and as an application of the previous developments, Section 6 deals with the particular case of the Gompertz-type process introduced in [17].

2. The Lognormal Diffusion Process With Exogenous Factors

Let $I=[t_0,+\infty)$ be a real interval ($t_0\ge 0$), $\Theta\subseteq\mathbb{R}^k$ an open set, and $h_\theta(t)$ a continuous, bounded and differentiable function on $I$ depending on $\theta\in\Theta$.
The univariate lognormal diffusion process with exogenous factors is a diffusion process $\{X(t);\,t\in I\}$, taking values on $\mathbb{R}^+$, with infinitesimal moments
$$A_1(x,t)=h_\theta(t)\,x,\qquad A_2(x)=\sigma^2x^2,\quad \sigma>0 \tag{1}$$
and with a lognormal or degenerate initial distribution. This process is the solution to the stochastic differential equation
$$dX(t)=h_\theta(t)X(t)\,dt+\sigma X(t)\,dW(t),\qquad X(t_0)=X_0,$$
where $W(t)$ is a standard Wiener process independent of $X_0=X(t_0)$, $t\ge t_0$. The solution is
$$X(t)=X_0\exp\left(H_\xi(t_0,t)+\sigma\left(W(t)-W(t_0)\right)\right),\quad t\ge t_0,$$
with
$$H_\xi(t_0,t)=\int_{t_0}^{t}h_\theta(u)\,du-\frac{\sigma^2}{2}(t-t_0),\qquad \xi=(\theta^T,\sigma^2)^T.$$
An explanation of the main features of the process can be found in [21], where the authors carried out a detailed theoretical analysis. As regards the distribution of the process, if $X_0$ is distributed according to a lognormal distribution $\Lambda_1\left[\mu_0;\sigma_0^2\right]$, or $X_0$ is a degenerate variable ($P[X_0=x_0]=1$), all the finite-dimensional distributions of the process are lognormal. Concretely, $\forall n\in\mathbb{N}$ and $t_1<\dots<t_n$, the vector $(X(t_1),\dots,X(t_n))^T$ has an $n$-dimensional lognormal distribution $\Lambda_n[\varepsilon,\Sigma]$, where the components of vector $\varepsilon$ and matrix $\Sigma$ are
$$\varepsilon_i=\mu_0+H_\xi(t_0,t_i),\quad i=1,\dots,n$$
and
$$\sigma_{ij}=\sigma_0^2+\sigma^2\left(\min(t_i,t_j)-t_0\right),\quad i,j=1,\dots,n,$$
respectively. The transition probability density function can be obtained from the distribution of $(X(s),X(t))^T$, $s<t$, and is
$$f(x,t\,|\,y,s)=\frac{1}{x\sqrt{2\pi\sigma^2(t-s)}}\exp\left(-\frac{\left[\ln(x/y)-H_\xi(s,t)\right]^2}{2\sigma^2(t-s)}\right), \tag{2}$$
that is, $X(t)\,|\,X(s)=y$ follows a lognormal distribution
$$X(t)\,|\,X(s)=y\;\sim\;\Lambda_1\left[\ln y+H_\xi(s,t),\;\sigma^2(t-s)\right],\quad s<t.$$
From the previous distributions, one can obtain the characteristics most commonly employed for practical fitting and forecasting purposes. These characteristics can be expressed jointly as
$$G_\xi^\lambda(t\,|\,y,\tau)=M_\xi(t\,|\,y,\tau)^{\lambda_1}\exp\left(\lambda_2\left(\lambda_3\sigma_0^2+\sigma^2(t-\tau)\right)^{\lambda_4}\right), \tag{3}$$
with $\lambda=(\lambda_1,\lambda_2,\lambda_3,\lambda_4)^T$ and where $M_\xi(t\,|\,y,\tau)=\exp\left(y+H_\xi(\tau,t)\right)$. Table 1 includes some of these characteristics (the $n$-th moment, and the mode and quantile functions as well as their conditional versions) according to the values of $\lambda$, $\tau$ and $y$.
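The exact transition law above can be turned directly into a simulator. The following is a minimal sketch, not the authors' code: it assumes the user supplies a function `int_h(s, t)` returning the integral of $h_\theta$ over $[s,t]$, and the particular exogenous factor used in the example ($h(t)=m e^{-\beta t}$ with made-up values of `m` and `beta`) is purely illustrative.

```python
import math
import random

def H(int_h, s, t, sigma2):
    """H_xi(s, t): integral of h_theta over [s, t] minus sigma^2 (t - s) / 2."""
    return int_h(s, t) - 0.5 * sigma2 * (t - s)

def simulate_path(x0, times, int_h, sigma2, rng=None):
    """Simulate X at increasing times using the exact lognormal transition:
    ln X(t) | X(s) = y  ~  Normal(ln y + H(s, t), sigma^2 (t - s))."""
    rng = rng or random.Random()
    path = [x0]
    for s, t in zip(times[:-1], times[1:]):
        mean = math.log(path[-1]) + H(int_h, s, t, sigma2)
        sd = math.sqrt(sigma2 * (t - s))
        path.append(math.exp(rng.gauss(mean, sd)))
    return path

def conditional_mean(y, s, t, int_h):
    """E[X(t) | X(s) = y] = y * exp(integral of h_theta over [s, t])."""
    return y * math.exp(int_h(s, t))

# Illustrative exogenous factor (an assumption): h(t) = m * exp(-beta * t).
m, beta = 1.0, 0.5

def int_h(s, t):
    # Closed-form integral of m * exp(-beta * u) over [s, t].
    return (m / beta) * (math.exp(-beta * s) - math.exp(-beta * t))

times = [0.0, 1.0, 2.0, 3.0]
path = simulate_path(10.0, times, int_h, sigma2=0.01, rng=random.Random(1))
```

Because the transition density is known in closed form, no Euler discretization error is involved; setting `sigma2=0.0` reduces the simulated path to the deterministic curve given by `conditional_mean`.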

3. Joint Distribution of d Sample-Paths of the Process

Let us consider a discrete sampling of the process, based on $d$ sample paths, at times $t_{ij}$ ($i=1,\dots,d$; $j=1,\dots,n_i$) with $t_{i1}=t_0$, $i=1,\dots,d$. Denote by $\mathbf{X}=\left(\mathbf{X}_1^T|\cdots|\mathbf{X}_d^T\right)^T$ the vector containing the random variables of the sample, where $\mathbf{X}_i$ includes the variables of the $i$-th sample path, that is, $\mathbf{X}_i=(X(t_{i1}),\dots,X(t_{i,n_i}))^T$, $i=1,\dots,d$.
From Equation (2), and if the distribution of $X(t_1)$ is assumed lognormal $\Lambda_1(\mu_1,\sigma_1^2)$, the probability density function of $\mathbf{X}$ is
$$f_{\mathbf{X}}(\mathbf{x})=\prod_{i=1}^{d}\frac{\exp\left(-\dfrac{[\ln x_{i1}-\mu_1]^2}{2\sigma_1^2}\right)}{x_{i1}\,\sigma_1\sqrt{2\pi}}\prod_{j=1}^{n_i-1}\frac{\exp\left(-\dfrac{\left[\ln(x_{i,j+1}/x_{ij})-m_\xi^{i,j+1,j}\right]^2}{2\sigma^2\Delta_i^{j+1,j}}\right)}{x_{i,j+1}\,\sigma\sqrt{2\pi\Delta_i^{j+1,j}}}$$
where $m_\xi^{i,j+1,j}=H_\xi(t_{ij},t_{i,j+1})$ and $\Delta_i^{j+1,j}=t_{i,j+1}-t_{ij}$. Each transition factor carries $x_{i,j+1}$ in its denominator, in agreement with the transition density of Equation (2).
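Now, we consider the vector $\mathbf{V}=\left(\mathbf{V}_0^T|\mathbf{V}_1^T|\cdots|\mathbf{V}_d^T\right)^T=\left(\mathbf{V}_0^T|\mathbf{V}_{(1)}^T\right)^T$, built from $\mathbf{X}$ by means of the following change of variables:
$$V_{0i}=X_{i1},\quad i=1,\dots,d;\qquad V_{ij}=\left(\Delta_i^{j+1,j}\right)^{-1/2}\ln\frac{X_{i,j+1}}{X_{ij}},\quad i=1,\dots,d;\ j=1,\dots,n_i-1. \tag{4}$$
Taking into account this change of variables, the density of $\mathbf{V}$ becomes
$$f_{\mathbf{V}}(\mathbf{v})=\frac{\exp\left(-\dfrac{1}{2\sigma_1^2}\left(\ln\mathbf{v}_0-\mu_1\mathbf{1}_d\right)^T\left(\ln\mathbf{v}_0-\mu_1\mathbf{1}_d\right)\right)}{\prod_{i=1}^{d}v_{0i}\,\left(2\pi\sigma_1^2\right)^{d/2}}\cdot\frac{\exp\left(-\dfrac{1}{2\sigma^2}\left(\mathbf{v}_{(1)}-\gamma^\xi\right)^T\left(\mathbf{v}_{(1)}-\gamma^\xi\right)\right)}{\left(2\pi\sigma^2\right)^{n/2}} \tag{5}$$
with $\ln\mathbf{v}_0=(\ln v_{01},\dots,\ln v_{0d})^T$ and $n=\sum_{i=1}^{d}(n_i-1)$. Here, $\mathbf{1}_d$ represents the $d$-dimensional vector whose components are all equal to one, while $\gamma^\xi$ is a vector of dimension $n$ with components $\gamma_{ij}^\xi=\left(\Delta_i^{j+1,j}\right)^{-1/2}m_\xi^{i,j+1,j}$, $i=1,\dots,d$; $j=1,\dots,n_i-1$.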
From Equation (5) it is deduced that:
  • $\mathbf{V}_0$ and $\mathbf{V}_{(1)}$ are independent,
  • the distribution of $\mathbf{V}_0$ is lognormal $\Lambda_d\left[\mu_1\mathbf{1}_d;\sigma_1^2 I_d\right]$,
  • $\mathbf{V}_{(1)}$ is distributed as an $n$-variate normal distribution $N_n\left[\gamma^\xi;\sigma^2 I_n\right]$,
where $I_d$ and $I_n$ are the identity matrices of order $d$ and $n$, respectively.
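The change of variables of this section is simple to apply to observed data. The sketch below is only an illustration under assumed conventions: `times[i][j]` and `paths[i][j]` are hypothetical containers holding $t_{ij}$ and the observed value $x_{ij}$ (0-based indices), and the function name `to_v` is not from the paper.

```python
import math

def to_v(times, paths):
    """Map d sample paths to (v0, v1) following the change of variables.

    v0[i] is the first observation of path i; v1 stacks, path by path,
    Delta^{-1/2} * ln(x_{i,j+1} / x_ij) with Delta = t_{i,j+1} - t_ij.
    """
    v0, v1 = [], []
    for t, x in zip(times, paths):
        v0.append(x[0])
        for j in range(len(x) - 1):
            delta = t[j + 1] - t[j]
            v1.append(math.log(x[j + 1] / x[j]) / math.sqrt(delta))
    return v0, v1

# Two short illustrative paths with unequal numbers of observations.
times = [[0.0, 1.0, 5.0], [0.0, 2.0]]
paths = [[1.0, math.e, math.e ** 3], [2.0, 2.0]]
v0, v1 = to_v(times, paths)
```

After this transformation the components of `v1` are independent normal variables with common variance $\sigma^2$, which is what makes the likelihood of the next section tractable.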

4. Maximum Likelihood Estimation of the Parameters of the Process

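Consider a discrete sample of the process in the sense described in the previous section, including the transformation of it given by Equation (4). Denote $\eta=(\mu_1,\sigma_1^2)^T$ and suppose that $\eta$ and $\xi$ are functionally independent. Then, for a fixed value $\mathbf{v}$ of the sample, the log-likelihood function is
$$L_{\mathbf{v}}(\eta,\xi)=-\frac{(n+d)\ln(2\pi)}{2}-\frac{d\ln\sigma_1^2}{2}-\sum_{i=1}^{d}\ln v_{0i}-\frac{\sum_{i=1}^{d}\left[\ln v_{0i}-\mu_1\right]^2}{2\sigma_1^2}-\frac{n\ln\sigma^2}{2}-\frac{Z_1+\Phi_\xi-2\Gamma_\xi}{2\sigma^2} \tag{6}$$
where
$$Z_1=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}v_{ij}^2,\qquad \Phi_\xi=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{\left(m_\xi^{i,j+1,j}\right)^2}{\Delta_i^{j+1,j}},\qquad \Gamma_\xi=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{v_{ij}\,m_\xi^{i,j+1,j}}{\left(\Delta_i^{j+1,j}\right)^{1/2}}.$$
Taking into account Equation (6), and since $\eta$ and $\xi$ are functionally independent, the ML estimation of $\eta$ is obtained from the system of equations
$$\frac{\partial L_{\mathbf{v}}(\eta,\xi)}{\partial\eta^T}=\left(\frac{\partial L_{\mathbf{v}}(\eta,\xi)}{\partial\mu_1},\frac{\partial L_{\mathbf{v}}(\eta,\xi)}{\partial\sigma_1^2}\right)=\mathbf{0},$$
resulting in
$$\hat\mu_1=\frac{1}{d}\sum_{i=1}^{d}\ln v_{0i}\quad\text{and}\quad \hat\sigma_1^2=\frac{1}{d}\sum_{i=1}^{d}\left(\ln v_{0i}-\hat\mu_1\right)^2.$$
Here, given a function $f:\mathbb{R}^k\to\mathbb{R}$, we write $\dfrac{\partial f}{\partial x^T}=\left(\dfrac{\partial f}{\partial x_1},\dots,\dfrac{\partial f}{\partial x_k}\right)$; the notation $\partial/\partial x^T$ indicates that the result is a row vector.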
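On the other hand, by denoting
$$\begin{aligned}
\Omega_\xi&=\frac{1}{2}\frac{\partial\Phi_\xi}{\partial\theta^T}=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{m_\xi^{i,j+1,j}}{\Delta_i^{j+1,j}}\frac{\partial m_\xi^{i,j+1,j}}{\partial\theta^T}, &
\Psi_\theta&=\frac{\partial\Gamma_\xi}{\partial\theta^T}=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{v_{ij}}{\left(\Delta_i^{j+1,j}\right)^{1/2}}\frac{\partial m_\xi^{i,j+1,j}}{\partial\theta^T},\\
\Upsilon_\xi&=-\frac{\partial\Phi_\xi}{\partial\sigma^2}=\sum_{i=1}^{d}m_\xi^{i,n_i,1}, &
Z_2&=-2\frac{\partial\Gamma_\xi}{\partial\sigma^2}=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}v_{ij}\left(\Delta_i^{j+1,j}\right)^{1/2}
\end{aligned} \tag{7}$$
we have
$$\frac{\partial L_{\mathbf{v}}(\eta,\xi)}{\partial\theta^T}=\frac{1}{\sigma^2}\left(\Psi_\theta-\Omega_\xi\right),\qquad \frac{\partial L_{\mathbf{v}}(\eta,\xi)}{\partial\sigma^2}=-\frac{n}{2\sigma^2}+\frac{Z_1+\Phi_\xi-2\Gamma_\xi}{2\sigma^4}-\frac{Z_2-\Upsilon_\xi}{2\sigma^2}.$$
Thus, the ML estimation of $\xi$ is obtained as the solution of the following system of $k+1$ equations:
$$\Psi_\theta-\Omega_\xi=\mathbf{0} \tag{8}$$
$$Z_1+\Phi_\xi-2\Gamma_\xi-\sigma^2Z_2+\sigma^2\Upsilon_\xi=n\sigma^2 \tag{9}$$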
In the case where $h_\theta$ is a linear function in $\theta$, it is possible to determine an explicit solution for this system of equations (see [10,26]). In other cases, the existence of a closed-form solution cannot be guaranteed, and it is therefore necessary to use numerical procedures to solve it. The fact that these methods require initial solutions has motivated the construction of ad hoc procedures which depend on the process derived according to the function $h_\theta$ considered (see [18,19,22]). However, it is impossible to carry out a general study of the system of equations in order to check the conditions of convergence of the chosen numerical method, since the system is dependent on sample data and this may lead to unforeseeable behavior. One alternative is to use stochastic optimization procedures like simulated annealing, variable neighborhood search and the firefly algorithm. These algorithms are often more appropriate than classical numerical methods since they impose fewer restrictions on the space of solutions and on the analytical properties of the function to be optimized. Some examples of the application of these procedures in the context of diffusion processes can be seen in [19,21,23,25].
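As a sanity check of the likelihood system, the simplest linear case can be worked out by hand. The sketch below assumes a constant exogenous factor, $h_\theta(t)=\theta$ (so $k=1$ and $m_\xi^{i,j+1,j}=(\theta-\sigma^2/2)\Delta_i^{j+1,j}$); this special case, the function names and the data values are illustrative assumptions, not taken from the paper. The closed-form estimates are then substituted back into Equations (8) and (9) to confirm that both residuals vanish.

```python
import math

def homogeneous_ml(v1, deltas):
    """Closed-form ML estimates when h_theta(t) = theta is constant.

    With b = theta - sigma^2 / 2, the v_ij are N(b * sqrt(Delta), sigma^2),
    so b is a least-squares slope and sigma^2 the mean squared residual.
    """
    n = len(v1)
    z1 = sum(v * v for v in v1)
    z2 = sum(v * math.sqrt(d) for v, d in zip(v1, deltas))
    z3 = sum(deltas)
    b = z2 / z3
    sigma2 = (z1 - b * z2) / n
    theta = b + 0.5 * sigma2
    return theta, sigma2

def likelihood_equations(theta, sigma2, v1, deltas):
    """Residuals of Equations (8) and (9) for the constant-h case."""
    n = len(v1)
    m = [(theta - 0.5 * sigma2) * d for d in deltas]
    z1 = sum(v * v for v in v1)
    z2 = sum(v * math.sqrt(d) for v, d in zip(v1, deltas))
    phi = sum(mi * mi / d for mi, d in zip(m, deltas))
    gamma = sum(v * mi / math.sqrt(d) for v, mi, d in zip(v1, m, deltas))
    # dm/dtheta = Delta, so Psi = sum v * sqrt(Delta) and Omega = sum m.
    psi = sum(v * math.sqrt(d) for v, d in zip(v1, deltas))
    omega = sum(m)
    upsilon = sum(m)
    eq8 = psi - omega
    eq9 = z1 + phi - 2 * gamma - sigma2 * z2 + sigma2 * upsilon - n * sigma2
    return eq8, eq9

# Made-up transformed observations and time increments.
v1 = [0.30, 0.10, 0.25, 0.40, 0.05]
deltas = [1.0, 1.0, 2.0, 0.5, 1.5]
theta_hat, sigma2_hat = homogeneous_ml(v1, deltas)
eq8, eq9 = likelihood_equations(theta_hat, sigma2_hat, v1, deltas)
```

For non-linear $h_\theta$ the same residual function can serve as the target of a numerical root finder or as a check on the output of a stochastic optimizer.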

5. Distribution of the ML Estimators of the Parameters and Related Parametric Functions

In this section we will discuss some aspects related to the distribution of the estimators of the parameters of the model, and their repercussions in the corresponding distributions of parametric functions, which can be of interest for several applications.
With regard to the distribution of the estimators of $\eta$, it is straightforward to verify that
$$\hat\mu_1\sim N_1\left[\mu_1;\sigma_1^2/d\right]\quad\text{and}\quad \frac{d\,\hat\sigma_1^2}{\sigma_1^2}\sim\chi^2_{d-1}.$$
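If $h_\theta$ is linear, it is then possible to calculate exact distributions associated with the estimators of $\xi$, which allows us to establish confidence regions for the parameters as well as UMVU estimators and confidence intervals for linear combinations of $\theta$ and $\sigma^2$ (see [10,26]). However, in the non-linear case, the fact that an explicit expression for the estimators of $\xi$ is not always readily available precludes obtaining, in general, exact distributions for them. In that case, asymptotic distributions can be used instead. In fact, on the basis of the properties of the ML estimators, it is known that $\hat\xi$ is asymptotically distributed as a normal distribution with mean $\xi$ and covariance matrix $I(\xi)^{-1}$, where $I(\xi)$ is the Fisher information matrix associated with the full sample (in this case, ignoring the data of the initial distribution).
First we calculate the associated Hessian matrix (we have adopted the usual expression for the Hessian matrix of $f:\mathbb{R}^k\to\mathbb{R}$ using vectorial notation, that is, $\dfrac{\partial^2 f}{\partial x\,\partial x^T}$):
$$H(\xi)=\frac{\partial^2 L_{\mathbf{v}}(\eta,\xi)}{\partial\xi\,\partial\xi^T}=\begin{pmatrix}\dfrac{\partial^2 L_{\mathbf{v}}(\eta,\xi)}{\partial\theta\,\partial\theta^T} & \left(\dfrac{\partial^2 L_{\mathbf{v}}(\eta,\xi)}{\partial\sigma^2\,\partial\theta^T}\right)^T\\[2ex] \dfrac{\partial^2 L_{\mathbf{v}}(\eta,\xi)}{\partial\sigma^2\,\partial\theta^T} & \dfrac{\partial^2 L_{\mathbf{v}}(\eta,\xi)}{\partial(\sigma^2)^2}\end{pmatrix}=\frac{1}{\sigma^2}\begin{pmatrix}\Pi_\xi-\Xi_\xi & -\dfrac{1}{\sigma^2}\left(\Psi_\theta^T-\Omega_\xi^T\right)+\dfrac{1}{2}\left(\dfrac{\partial\Upsilon_\xi}{\partial\theta^T}\right)^T\\[2ex] -\dfrac{1}{\sigma^2}\left(\Psi_\theta-\Omega_\xi\right)+\dfrac{1}{2}\dfrac{\partial\Upsilon_\xi}{\partial\theta^T} & \dfrac{n}{2\sigma^2}-\dfrac{Z_1+\Phi_\xi-2\Gamma_\xi}{\sigma^4}+\dfrac{Z_2-\Upsilon_\xi}{\sigma^2}-\dfrac{Z_3}{4}\end{pmatrix}$$
where
$$\Pi_\xi=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{\partial^2 m_\xi^{i,j+1,j}}{\partial\theta\,\partial\theta^T}\left(\Delta_i^{j+1,j}\right)^{-1/2}\left[v_{ij}-\left(\Delta_i^{j+1,j}\right)^{-1/2}m_\xi^{i,j+1,j}\right]$$
and
$$\Xi_\xi=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\left(\Delta_i^{j+1,j}\right)^{-1}\left(\frac{\partial m_\xi^{i,j+1,j}}{\partial\theta^T}\right)^T\frac{\partial m_\xi^{i,j+1,j}}{\partial\theta^T},\qquad Z_3=\sum_{i=1}^{d}\Delta_i^{n_i,1}.$$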
Taking into account the distribution of the sample (see Section 3), we have
$$E[\Pi_\xi]=0,\quad E[Z_1]=n\sigma^2+\Phi_\xi,\quad E[Z_2]=\Upsilon_\xi,\quad E[\Psi_\theta]=\Omega_\xi,\quad E[\Gamma_\xi]=\Phi_\xi,$$
so the Fisher information matrix is given by
$$I(\xi)=-E[H(\xi)]=\frac{1}{\sigma^2}\begin{pmatrix}\Xi_\xi & -\dfrac{1}{2}\left(\dfrac{\partial\Upsilon_\xi}{\partial\theta^T}\right)^T\\[2ex] -\dfrac{1}{2}\dfrac{\partial\Upsilon_\xi}{\partial\theta^T} & \dfrac{n}{2\sigma^2}+\dfrac{Z_3}{4}\end{pmatrix},$$
from which it is concluded that $\hat\xi\xrightarrow{D}N_{k+1}\left[\xi;I(\xi)^{-1}\right]$. In addition, by applying the delta method, for a $q$-parametric function $g(\xi)$ ($q\le k+1$) it is verified that
$$g(\hat\xi)\xrightarrow{D}N_q\left[g(\xi);\nabla g(\xi)^T I(\xi)^{-1}\nabla g(\xi)\right]$$
where $\nabla g(\xi)$ represents the vector of partial derivatives of $g(\xi)$ with respect to $\xi$.
The elements in the diagonal of matrix $I(\xi)^{-1}$ provide asymptotic variances for the estimations of the parameters, while the delta method provides the asymptotic covariance matrix for $g(\hat\xi)$ (and consequently the elements of the diagonal are the asymptotic variances for the estimation of each parametric function of $g(\xi)$). For example, if we consider $g(\xi)=G_\xi^\lambda(t\,|\,y,\tau)$, that is, the general expression for the main characteristics of the process given by Equation (3), then
$$\nabla g(\xi)=g(\xi)\left(\lambda_1\frac{\partial H_\xi(\tau,t)}{\partial\theta^T},\;(t-\tau)\left[-\frac{\lambda_1}{2}+\lambda_2\lambda_4\left(\lambda_3\sigma_0^2+\sigma^2(t-\tau)\right)^{\lambda_4-1}\right]\right)^T.$$
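For a scalar parametric function, the delta method of this section reduces to one linear solve and one inner product. The following sketch is an illustration only: the helper names `solve`, `delta_method_se` and `wald_interval` are hypothetical, the information matrices in the example are made-up numbers, and a small pure-Python Gaussian elimination is used simply to avoid external dependencies.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def delta_method_se(grad, info):
    """Asymptotic standard error sqrt(grad^T I^{-1} grad) of g(xi_hat)."""
    x = solve(info, grad)  # x = I^{-1} grad, without forming the inverse
    return math.sqrt(sum(g * xi for g, xi in zip(grad, x)))

def wald_interval(estimate, se, z=1.959963984540054):
    """Symmetric asymptotic interval; the default z gives a 95% level."""
    return estimate - z * se, estimate + z * se

# Made-up information matrix and gradient, for illustration only.
info = [[4.0, 0.0], [0.0, 25.0]]
se = delta_method_se([1.0, 1.0], info)
lower, upper = wald_interval(10.0, se)
```

In practice `info` would be the Fisher information matrix evaluated at $\hat\xi$ and `grad` the gradient of the parametric function evaluated at the same point.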

6. Application: The Gompertz-Type Diffusion Process

In this section we focus on the Gompertz-type diffusion process introduced in [17] with the aim of obtaining a continuous stochastic model associated with the Gompertz curve whose limit value depends on the initial value. Concretely,
$$f(t)=x_0\exp\left(-\frac{m}{\beta}\left(e^{-\beta t}-e^{-\beta t_0}\right)\right),\quad t\ge t_0\ge 0,\ m,\beta>0\ \text{and}\ x_0>0.$$
To this end, the non-homogeneous lognormal diffusion process with infinitesimal moments
$$A_1(x,t)=m\,e^{-\beta t}x,\qquad A_2(x)=\sigma^2x^2 \tag{10}$$
was considered.
In order to apply the general scheme developed in the preceding sections, we consider the reparameterization $\theta=(\delta,\alpha)^T=(m/\beta,e^{-\beta})^T$, which leads to expressing the Gompertz curve as
$$f_\theta(t)=x_0\exp\left(-\delta\left(\alpha^t-\alpha^{t_0}\right)\right) \tag{11}$$
whereas the infinitesimal moments (10) are written in the form of Equation (1), with $h_\theta(t)=-\delta\alpha^t\ln\alpha$.
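Denoting $\varphi^\alpha_{i,j+1,j}=\alpha^{t_{i,j+1}}-\alpha^{t_{ij}}$ and $\omega^\alpha_{i,j+1,j}=\partial\varphi^\alpha_{i,j+1,j}/\partial\alpha=t_{i,j+1}\alpha^{t_{i,j+1}-1}-t_{ij}\alpha^{t_{ij}-1}$, one has $m_\xi^{i,j+1,j}=-\delta\varphi^\alpha_{i,j+1,j}-\dfrac{\sigma^2}{2}\Delta_i^{j+1,j}$ and
$$\frac{\partial m_\xi^{i,j+1,j}}{\partial\theta^T}=-\left(\varphi^\alpha_{i,j+1,j},\;\delta\,\omega^\alpha_{i,j+1,j}\right),$$
so, from Equation (8), and taking into account Equation (7), the following system of equations appears
$$\begin{aligned}X_1^\alpha+\delta X_2^\alpha+\frac{\sigma^2}{2}X_3^\alpha&=0\\ X_4^\alpha+\delta X_5^\alpha+\frac{\sigma^2}{2}X_6^\alpha&=0\end{aligned}$$
where
$$X_1^\alpha=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{v_{ij}\,\varphi^\alpha_{i,j+1,j}}{\left(\Delta_i^{j+1,j}\right)^{1/2}},\qquad X_2^\alpha=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{\left(\varphi^\alpha_{i,j+1,j}\right)^2}{\Delta_i^{j+1,j}},\qquad X_3^\alpha=\sum_{i=1}^{d}\varphi^\alpha_{i,n_i,1},$$
$$X_4^\alpha=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{v_{ij}\,\omega^\alpha_{i,j+1,j}}{\left(\Delta_i^{j+1,j}\right)^{1/2}},\qquad X_5^\alpha=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{\varphi^\alpha_{i,j+1,j}\,\omega^\alpha_{i,j+1,j}}{\Delta_i^{j+1,j}},\qquad X_6^\alpha=\sum_{i=1}^{d}\omega^\alpha_{i,n_i,1}.$$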
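After some algebra, one obtains
$$\delta_\alpha=\frac{X_3^\alpha X_4^\alpha-X_1^\alpha X_6^\alpha}{X_2^\alpha X_6^\alpha-X_3^\alpha X_5^\alpha}\quad\text{and}\quad \sigma_\alpha^2=2S_\alpha,\quad\text{where}\quad S_\alpha=\frac{X_1^\alpha X_5^\alpha-X_2^\alpha X_4^\alpha}{X_2^\alpha X_6^\alpha-X_3^\alpha X_5^\alpha}.$$
On the other hand, since
$$\Phi_\xi=\delta^2X_2^\alpha+\frac{\sigma^4}{4}Z_3+\delta\sigma^2X_3^\alpha,\qquad \Gamma_\xi=-\delta X_1^\alpha-\frac{\sigma^2}{2}Z_2,\qquad \Upsilon_\xi=-\delta X_3^\alpha-\frac{\sigma^2}{2}Z_3,$$
substituting these expressions together with $\delta=\delta_\alpha$ and $\sigma^2=2S_\alpha$ into Equation (9) results in
$$S_\alpha\left(2n+S_\alpha Z_3\right)-\delta_\alpha\left(2X_1^\alpha+\delta_\alpha X_2^\alpha\right)-Z_1=0. \tag{12}$$
The solution of this equation provides the estimation $\hat\alpha$ of $\alpha$, whereas those of the other parameters are given by $\delta_{\hat\alpha}$ and $\sigma^2_{\hat\alpha}$.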
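The reduction to a single equation in $\alpha$ lends itself to a short numerical sketch. The code below is an illustration under stated assumptions, not the authors' implementation: $\omega$ is taken as the derivative of $\varphi$ with respect to $\alpha$, the scalar equation is written with the term $S_\alpha^2 Z_3$ that follows from substituting $\Phi_\xi$, $\Gamma_\xi$ and $\Upsilon_\xi$ into Equation (9), and the data are noise-free values generated from assumed parameters ($\alpha=0.8$, $\delta=2$) so that the expected answer is known.

```python
import math

def gompertz_sums(alpha, times, v):
    """Return [X1, ..., X6] for paths with times[i] = t_ij and data v[i][j]."""
    X = [0.0] * 6
    for t, vi in zip(times, v):
        for j, vij in enumerate(vi):
            d = t[j + 1] - t[j]
            phi = alpha ** t[j + 1] - alpha ** t[j]
            # omega = d(phi)/d(alpha)
            om = t[j + 1] * alpha ** (t[j + 1] - 1) - t[j] * alpha ** (t[j] - 1)
            X[0] += vij * phi / math.sqrt(d)
            X[1] += phi * phi / d
            X[2] += phi          # telescopes to phi over the whole path
            X[3] += vij * om / math.sqrt(d)
            X[4] += phi * om / d
            X[5] += om           # telescopes to omega over the whole path
    return X

def profile(alpha, times, v):
    """delta_alpha and S_alpha (= sigma_alpha^2 / 2) for a fixed alpha."""
    x1, x2, x3, x4, x5, x6 = gompertz_sums(alpha, times, v)
    det = x2 * x6 - x3 * x5
    return (x3 * x4 - x1 * x6) / det, (x1 * x5 - x2 * x4) / det

def equation_12(alpha, times, v):
    """Left-hand side of the scalar equation in alpha."""
    x1, x2, *_ = gompertz_sums(alpha, times, v)
    delta, s = profile(alpha, times, v)
    n = sum(len(vi) for vi in v)
    z1 = sum(x * x for vi in v for x in vi)
    z3 = sum(t[-1] - t[0] for t in times)
    return s * (2 * n + s * z3) - delta * (2 * x1 + delta * x2) - z1

def bisection(f, lo, hi, tol=1e-10, max_iter=200):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    flo, fhi = f(lo), f(hi)
    if flo == 0:
        return lo
    if fhi == 0:
        return hi
    if flo * fhi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if fmid == 0 or hi - lo < tol:
            return mid
        if flo * fmid < 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

# Noise-free illustrative data: v_ij equals its mean for alpha0, delta0.
times = [[0.0, 1.0, 2.0, 3.0, 4.0]]
alpha0, delta0 = 0.8, 2.0
v = [[-delta0 * (alpha0 ** b - alpha0 ** a) / math.sqrt(b - a)
      for a, b in zip(ts[:-1], ts[1:])] for ts in times]
delta_hat, s_hat = profile(alpha0, times, v)
residual = equation_12(alpha0, times, v)
```

With real data one would call `bisection(lambda a: equation_12(a, times, v), lo, hi)` on a bracket `(lo, hi)` inside $(0,1)$ where the function changes sign; locating such a bracket, for instance by plotting the function, is left to the user.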
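As regards the asymptotic distribution of $\hat\xi$, it is a trivariate normal distribution with mean $\xi$ and covariance matrix given by $I(\xi)^{-1}$, where, from the general expression of the Fisher information matrix and $\partial\Upsilon_\xi/\partial\theta^T=-\left(X_3^\alpha,\delta X_6^\alpha\right)$,
$$I(\xi)=\frac{1}{\sigma^2}\begin{pmatrix}X_2^\alpha & \delta X_5^\alpha & \dfrac{X_3^\alpha}{2}\\[1.5ex] \delta X_5^\alpha & \delta^2X_7^\alpha & \dfrac{\delta X_6^\alpha}{2}\\[1.5ex] \dfrac{X_3^\alpha}{2} & \dfrac{\delta X_6^\alpha}{2} & \dfrac{n}{2\sigma^2}+\dfrac{Z_3}{4}\end{pmatrix}$$
with
$$X_7^\alpha=\sum_{i=1}^{d}\sum_{j=1}^{n_i-1}\frac{\left(\omega^\alpha_{i,j+1,j}\right)^2}{\Delta_i^{j+1,j}}.$$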
This distribution can be used to obtain the asymptotic standard errors for the estimation of the parameters as well as for some parametric functions of interest (see the last comment of the previous section). In particular, we focus on the inflection time and the corresponding expected value of the process at this instant, conditioned on $X(t_0)=x_0$. Another important parametric function in this context is the upper bound that determines the carrying capacity of the system modeled by the process. Concretely:
  • Upper bound, conditioned on $X(t_0)=x_0$: $g_1(\theta)=x_0\exp\left(\delta\alpha^{t_0}\right)$.
  • Inflection time: $g_2(\theta)=-\ln\delta/\ln\alpha$.
  • Value of the process at the time of inflection, conditioned on $X(t_0)=x_0$: $g_3(\theta)=g_1(\theta)/e$.
On the other hand, when using the model for predictive purposes some of the parametric functions of Table 1 can be used. In particular, the conditioned mean function adopts the expression
$$E[X(t)\,|\,X(\tau)=y]=g_4(\theta)=y\exp\left(-\delta\left(\alpha^t-\alpha^\tau\right)\right).$$
Note that this curve is of the type of Equation (11). For this reason, this function is useful for forecasting purposes. In this case, it is of interest to provide not only the value of the function at each time instant, but also the standard error of the prediction and a confidence interval determining a range of values that includes, with a given confidence level, the true value of the forecast.
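The parametric functions listed above and their gradients are easy to code, which makes the delta-method standard errors a matter of a quadratic form. The sketch below is illustrative only: the gradients are taken with respect to $(\delta,\alpha,\sigma^2)$ and derived here by direct differentiation, and the parameter values and covariance matrix are made-up numbers standing in for $\hat\xi$ and $I(\hat\xi)^{-1}$.

```python
import math

def upper_bound(delta, alpha, x0, t0=0.0):
    """g1: limit of the Gompertz curve started at x0 at time t0."""
    return x0 * math.exp(delta * alpha ** t0)

def inflection_time(delta, alpha):
    """g2: time at which delta * alpha^t = 1."""
    return -math.log(delta) / math.log(alpha)

def value_at_inflection(delta, alpha, x0, t0=0.0):
    """g3: curve value at the inflection time, equal to g1 / e."""
    return upper_bound(delta, alpha, x0, t0) / math.e

def conditional_mean(delta, alpha, y, tau, t):
    """g4: E[X(t) | X(tau) = y]."""
    return y * math.exp(-delta * (alpha ** t - alpha ** tau))

def grad_upper_bound(delta, alpha, x0, t0=0.0):
    """Gradient of g1 with respect to (delta, alpha, sigma^2)."""
    g = upper_bound(delta, alpha, x0, t0)
    return [g * alpha ** t0, g * delta * t0 * alpha ** (t0 - 1.0), 0.0]

def grad_inflection_time(delta, alpha):
    """Gradient of g2 with respect to (delta, alpha, sigma^2)."""
    la = math.log(alpha)
    return [-1.0 / (delta * la), math.log(delta) / (alpha * la * la), 0.0]

def delta_se(grad, cov):
    """sqrt(grad^T cov grad), with cov the asymptotic covariance matrix."""
    k = len(grad)
    return math.sqrt(sum(grad[i] * cov[i][j] * grad[j]
                         for i in range(k) for j in range(k)))

# Made-up estimates and covariance matrix, for illustration only.
delta_hat, alpha_hat, x0 = 4.0, 0.7, 50.0
cov = [[0.04, 0.001, 0.0],
       [0.001, 0.0004, 0.0],
       [0.0, 0.0, 1e-6]]
t_inf = inflection_time(delta_hat, alpha_hat)
se_t_inf = delta_se(grad_inflection_time(delta_hat, alpha_hat), cov)
```

Since $g_3=g_1/e$, its gradient and standard error are those of $g_1$ divided by $e$, so no separate gradient function is needed.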

Application to Real Data

The following example is based on a study developed in [27] on some aspects related to the growth of a population of rabbits. Figure 1 shows the weight (in grams) of 29 rabbits over 30 weeks. The sample paths begin at different initial values and show sigmoidal behavior, and their bounds depend on the initial values. These two aspects suggest that using the Gompertz-type model proposed above would be appropriate.
This data set has been used in various papers to illustrate some aspects of the Gompertz-type process, such as the estimation of the parameters and the study of some time variables that may be of interest in the analysis of growth phenomena of this nature. As regards the estimation of the parameters, in [17] the authors designed an iterative method for solving the likelihood system of equations, while in [24] the maximization of the likelihood function was directly addressed by simulated annealing. In addition, in [28] two time variables of interest for this type of data were analyzed: concretely the inflection time and the time instant in which the process reaches a certain percentage of total growth. Both cases were modeled as first-passage time problems.
In this paper the estimation of the parameters has been carried out by solving Equation (12) by means of the bisection method (see Figure 2) and then by using the expressions $\delta_{\hat\alpha}$ and $\sigma^2_{\hat\alpha}$.
Table 2 contains the estimated values for the parameters and the inflection time, as well as the asymptotic estimation error and 95% confidence intervals by applying the delta method.
As regards the weight value at the inflection time and the upper bound, remember that these values depend on the one observed at the initial instant. Taking into account the range of observed weight values at the initial instant of observation, several values have been considered within this range. For these values, the expected weight of a rabbit at the moment of inflection has been studied, as well as the possible value of the maximum weight (upper bound). Table 3 contains the estimated values, the asymptotic standard errors, and the 95% confidence intervals.
Function E [ X ( t ) | X ( t 0 ) = x 0 ] can be used to provide forecasts of the weight of a rabbit that presents an initial weight x 0 . Figure 3 shows, for a selection of four of the rabbits used in the study, the estimated mean function together with the 95% asymptotic confidence intervals obtained for each value of this function. Additionally, the observed values are included to check the quality of the adjustment made by the model under consideration. Obviously, this type of representation can also be obtained by considering any value of x 0 in the range of the initial distribution of the weight. Note that the estimated mean functions for each rabbit depend on the initial value, and so do the corresponding confidence intervals for the mean at each time instant. Therefore, the graphs in the figure are different for each rabbit although the estimation of the parameters is unique.

7. Conclusions

The present paper deals with some topics about inference for the non-homogeneous lognormal process (or with exogenous factors). Starting from the general form of the process, we studied the ML estimation of the parameters by using discrete sampling. This general overview enabled us to provide a unified method for several diffusion processes which can be built from particular cases of the non-homogeneous lognormal process for several choices of exogenous factors. In addition, we also looked into the asymptotic distribution of estimators, through which we can calculate the estimation errors and confidence intervals for the estimators of a wide range of parametric functions of interest in many fields. Finally, the process here described is applied to the Gompertz-type diffusion process introduced in [17].

Author Contributions

The three authors have participated equally in the development of this work, either in the theoretical developments or in the applied aspects. The paper was also written and reviewed cooperatively.

Acknowledgments

This work was supported in part by the Ministerio de Economía, Industria y Competitividad, Spain, under Grants MTM2014-58061-P and MTM2017-85568-P.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Cox, J.C.; Ross, S.A. The evaluation of options for alternative stochastic processes. J. Financ. Econ. 1976, 3, 145–166.
  2. Marcus, A.; Shaked, I. The relationship between accounting measures and prospective probabilities of insolvency: An application to the banking industry. Financ. Rev. 1984, 19, 67–83.
  3. Merton, R.C. Option pricing when underlying stock returns are discontinuous. J. Financ. Econ. 1976, 3, 125–144.
  4. Black, F.; Scholes, M. The pricing of options and corporate liabilities. J. Political Econ. 1973, 81, 637–654.
  5. Hunt, P.J.; Kennedy, J.G. Financial Derivatives in Theory and Practice, Revised Edition; John Wiley and Sons: Chichester, UK, 2004; ISBN 978-0-470-86359-6.
  6. Lamberton, D.; Lapeyre, B. Introduction to Stochastic Calculus Applied to Finance, 2nd ed.; Chapman and Hall: New York, NY, USA, 2007; ISBN 9781584886266.
  7. Tintner, G.; Sengupta, J.K. Stochastic Economics; Academic Press: New York, NY, USA, 1972; ISBN 9781483274027.
  8. Buonocore, A.; Caputo, L.; Pirozzi, E.; Nobile, A.G. A Non-Autonomous Stochastic Predator-Prey Model. Math. Biosci. Eng. 2014, 11, 167–188.
  9. D’Onofrio, G.; Lansky, P.; Pirozzi, E. On two diffusion neuronal models with multiplicative noise: The mean first-passage time properties. Chaos 2018, 28.
  10. Gutiérrez, R.; Román, P.; Romero, D.; Torres, F. Forecasting for the univariate lognormal diffusion process with exogenous factors. Cybern. Syst. 2003, 34, 709–724.
  11. Gutiérrez, R.; Rico, N.; Román, P.; Romero, D.; Serrano, J.J.; Torres, F. Lognormal diffusion process with polynomial exogenous factors. Cybern. Syst. 2006, 37, 293–309.
  12. Land, C.E. Hypothesis tests and interval estimates. In Lognormal Distributions, Theory and Applications; Crow, E.L., Shimizu, K., Eds.; Marcel Dekker: New York, NY, USA, 1988; pp. 87–112. ISBN 0-8247-7803-0.
  13. Bibby, B.; Jacobsen, M.; Sørensen, M. Estimating functions for discretely sampled diffusion type models. In Handbook of Financial Econometrics; Aït-Sahalia, Y., Hansen, L., Eds.; North-Holland: Amsterdam, The Netherlands, 2009; pp. 203–268. ISBN 978-0-444-50897-3.
  14. Hansen, L. Large sample properties of generalized method of moments estimators. Econometrica 1982, 50, 1029–1054.
  15. Fuchs, C. Inference for Diffusion Processes; Springer: Heidelberg, Germany, 2013; ISBN 978-3-642-25968-5.
  16. Tang, S.; Heron, E. Bayesian inference for a stochastic logistic model with switching points. Ecol. Model. 2008, 219, 153–169.
  17. Gutiérrez, R.; Román, P.; Romero, D.; Serrano, J.J.; Torres, F. A new gompertz-type diffusion process with application to random growth. Math. Biosci. 2007, 208, 147–165.
  18. Román-Román, P.; Romero, D.; Torres-Ruiz, F. A diffusion process to model generalized von Bertalanffy growth patterns: Fitting to real data. J. Theor. Biol. 2010, 263, 59–69.
  19. Román-Román, P.; Torres-Ruiz, F. Modelling logistic growth by a new diffusion process: Application to biological system. BioSystems 2012, 110, 9–21.
  20. Román-Román, P.; Torres-Ruiz, F. A stochastic model related to the Richards-type growth curve. Estimation by means of Simulated Annealing and Variable Neighborhood Search. Appl. Math. Comput. 2015, 266, 579–598.
  21. Román-Román, P.; Torres-Ruiz, F. The nonhomogeneous lognormal diffusion process as a general process to model particular types of growth patterns. In Lecture Notes of Seminario Interdisciplinare di Matematica; Università degli Studi della Basilicata: Potenza, Italy, 2015; Volume XII, pp. 201–219.
  22. Da Luz Sant’Ana, I.; Román-Román, P.; Torres-Ruiz, F. Modeling oil production and its peak by means of a stochastic diffusion process based on the Hubbert curve. Energy 2017, 133, 455–470.
  23. Barrera, A.; Román-Román, P.; Torres-Ruiz, F. A hyperbolastic type-I diffusion process: Parameter estimation by means of the firefly algorithm. Biosystems 2018, 163, 11–22.
  24. Román-Román, P.; Romero, D.; Rubio, M.A.; Torres-Ruiz, F. Estimating the parameters of a Gompertz-type diffusion process by means of simulated annealing. Appl. Math. Comput. 2012, 218, 5121–5131.
  25. Da Luz Sant’Ana, I.; Román-Román, P.; Torres-Ruiz, F. The Hubbert diffusion process: Estimation via simulated annealing and variable neighborhood search procedures. Application to forecasting peak oil production. Appl. Stoch. Models Bus. 2018.
  26. Gutiérrez, R.; Román, P.; Torres, F. Inference on some parametric functions in the univariate lognormal diffusion process with exogenous factors. Test 2001, 10, 357–373.
  27. Blasco, A.; Piles, M.; Varona, L. A Bayesian analysis of the effect of selection for growth rate on growth curves in rabbits. Genet. Sel. Evol. 2003, 35, 21–41.
  28. Gutiérrez-Jáimez, R.; Román, P.; Romero, D.; Serrano, J.J.; Torres, F. Some time random variables related to a Gompertz-type diffusion process. Cybern. Syst. 2008, 39, 467–479. [Google Scholar] [CrossRef]
Figure 1. Weight of 29 rabbits over 30 weeks.
Figure 2. Graph of the equation for α.
Figure 3. Observed values, estimated mean function, and confidence intervals for a selection of rabbits.
Table 1. Values used to obtain the n-th moment and the mode and quantile functions from $G_\xi^\lambda(t \mid z, \tau)$. $z_\alpha$ is the $\alpha$-quantile of a standard normal distribution.

| Function | Expression | $z$ | $\tau$ | $\lambda$ |
|---|---|---|---|---|
| n-th moment | $E[X(t)^n]$ | $\mu_0$ | $t_0$ | $(n, n^2/2, 1, 1)^T$ |
| n-th conditional moment | $E[X(t)^n \mid X(s) = y]$ | $\ln y$ | $s$ | $(n, n^2/2, 0, 1)^T$ |
| mode | $\mathrm{Mode}[X(t)]$ | $\mu_0$ | $t_0$ | $(1, -1, 1, 1)^T$ |
| conditional mode | $\mathrm{Mode}[X(t) \mid X(s) = y]$ | $\ln y$ | $s$ | $(1, -1, 0, 1)^T$ |
| $\alpha$-quantile | $C_\alpha[X(t)]$ | $\mu_0$ | $t_0$ | $(1, z_\alpha, 1, 1/2)^T$ |
| $\alpha$-conditional quantile | $C_\alpha[X(t) \mid X(s) = y]$ | $\ln y$ | $s$ | $(1, z_\alpha, 0, 1/2)^T$ |
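As a rough illustration of how the rows of Table 1 can be used, the sketch below evaluates the unconditional moment and quantile rows. It assumes that $G_\xi^\lambda$ has the form $\exp\{\lambda_1 m + \lambda_2(\lambda_3\sigma_0^2 + \sigma^2(t-\tau))^{\lambda_4}\}$, where $m$ is the mean of $\ln X(t)$. This form is inferred from the table entries and the standard lognormal formulas, not quoted from the paper. The function names and the numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist


def g_value(log_mean, sigma0_sq, sigma_sq_dt, lam):
    """Assumed form: exp(l1*log_mean + l2*(l3*sigma0_sq + sigma_sq_dt)**l4).

    log_mean    -- mean of ln X(t) (assumption: supplied by the caller)
    sigma0_sq   -- variance of the initial log-value
    sigma_sq_dt -- sigma^2 * (t - tau)
    lam         -- the 4-vector lambda taken from Table 1
    """
    l1, l2, l3, l4 = lam
    return math.exp(l1 * log_mean + l2 * (l3 * sigma0_sq + sigma_sq_dt) ** l4)


def moment_lambda(n):
    # Row "n-th moment": (n, n^2/2, 1, 1)^T
    return (n, n ** 2 / 2, 1, 1)


def quantile_lambda(alpha):
    # Row "alpha-quantile": (1, z_alpha, 1, 1/2)^T
    return (1, NormalDist().inv_cdf(alpha), 1, 0.5)


# Hypothetical inputs, chosen only to exercise the functions.
log_mean, sigma0_sq, sigma_sq_dt = 1.0, 0.04, 0.09

mean = g_value(log_mean, sigma0_sq, sigma_sq_dt, moment_lambda(1))
median = g_value(log_mean, sigma0_sq, sigma_sq_dt, quantile_lambda(0.5))
print(mean, median)
```

Under this assumed form, the conditional rows differ only in setting $\lambda_3 = 0$, which removes the initial variance $\sigma_0^2$ from the expression.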
Table 2. Estimated values, standard errors and 95% confidence intervals of the parameters and the inflection time.

| Parametric function | $\delta$ | $\alpha$ | $\sigma$ | $g_2(\theta)$ |
|---|---|---|---|---|
| Estimated value | 4.1020 | 0.8301 | 0.0708 | 7.5803 |
| Standard error | 0.0556 | 0.0021 | 0.0002 | 0.1053 |
| Confidence interval | (3.9929, 4.1063) | (0.8258, 0.8343) | (0.0704, 0.0713) | (7.3738, 7.7869) |
Table 3. Estimated values, standard errors, and 95% confidence intervals of the value at the inflection time and the upper bound for several values of the initial weight.

| Initial weight | Value at inflection time $g_3(\hat{\theta})$ | St. error | 95% interval | Upper bound $g_1(\hat{\theta})$ | St. error | 95% interval |
|---|---|---|---|---|---|---|
| 145 | 1772.836 | 70.546 | (1634.568, 1911.104) | 4819.068 | 191.764 | (4443.215, 5194.920) |
| 155 | 1772.836 | 75.411 | (1625.032, 1920.640) | 4819.068 | 204.990 | (4417.295, 5220.841) |
| 165 | 1883.638 | 80.276 | (1726.298, 2040.978) | 5120.260 | 218.215 | (4692.566, 5547.954) |
| 175 | 2105.243 | 85.142 | (1938.367, 2272.118) | 5722.643 | 231.440 | (5269.028, 6176.258) |
| 185 | 2216.045 | 90.007 | (2039.634, 2392.456) | 6023.835 | 244.665 | (5544.299, 6503.371) |
| 195 | 2216.045 | 94.872 | (2030.098, 2401.992) | 6023.835 | 257.890 | (5518.378, 6529.291) |
| 205 | 2105.243 | 99.737 | (1909.760, 2300.726) | 5722.643 | 271.115 | (5191.266, 6254.020) |
| 215 | 1883.638 | 104.603 | (1678.620, 2088.657) | 5120.260 | 284.341 | (4562.961, 5677.558) |
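The intervals in Table 3 appear to be symmetric asymptotic intervals of the form estimate ± z·(standard error). The following minimal sketch, assuming a normal quantile at the 95% level, recomputes the first interval in the row for initial weight 145 (estimate 1772.836, standard error 70.546). The function name `wald_interval` is hypothetical.

```python
from statistics import NormalDist


def wald_interval(estimate, std_error, level=0.95):
    """Symmetric normal-approximation interval: estimate +/- z * std_error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# First estimate and standard error in the row for initial weight 145.
lower, upper = wald_interval(1772.836, 70.546)
print(f"({lower:.3f}, {upper:.3f})")  # prints (1634.568, 1911.104)
```

The same function can be applied to any estimate and standard error pair in Tables 2 and 3; small differences in the last digit are to be expected, since the tabulated standard errors are rounded.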
