Berry–Esseen Bounds of the Quasi Maximum Likelihood Estimators for the Discretely Observed Diffusions

For stationary ergodic diffusions satisfying nonlinear homogeneous Itô stochastic differential equations, this paper obtains the Berry–Esseen bounds on the rates of convergence to normality of the distributions of the quasi maximum likelihood estimators based on stochastic Taylor approximation, under some regularity conditions, when the diffusion is observed at equally spaced dense time points over a long time interval (the high-frequency regime). It shows that the higher-order stochastic Taylor approximation-based estimators perform better than the basic Euler approximation in the sense of having smaller asymptotic


Introduction and Preliminaries
Parameter estimation in diffusion processes based on discrete observations is an active area of investigation in financial econometrics and mathematical biology, since the data available in finance and biology are discrete and high-frequency although the model is continuous. For a treatise on this subject, see Bishwal (2008, 2021) [1,2].
Consider the Itô stochastic differential equation

$$dX_t = f(\theta, X_t)\,dt + dW_t, \quad t \ge 0, \tag{1}$$

where $\{W_t, t \ge 0\}$ is a one-dimensional standard Wiener process, $\theta \in \Theta$, $\Theta$ is a compact subset of $\mathbb{R}$, $f$ is a known real-valued function defined on $\Theta \times \mathbb{R}$, and the unknown parameter $\theta$ is to be estimated on the basis of observation of the process $\{X_t, t \ge 0\}$. Let $\theta_0$ be the true value of the parameter, which is in the interior of $\Theta$. We assume that the process $\{X_t, t \ge 0\}$ is observed at $0 = t_0 < t_1 < \dots < t_n = T$ with $\Delta t_i := t_i - t_{i-1} = T/n = h$ and $T = d n^{1/2}$ for some fixed real number $d > 0$. We estimate $\theta$ from the observations $\{X_{t_0}, X_{t_1}, \dots, X_{t_n}\}$.
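As a hedged illustration of this sampling design, the sketch below simulates discrete observations of a unit-diffusion SDE $dX_t = f(\theta, X_t)\,dt + dW_t$ with $T = d\,n^{1/2}$ and $h = T/n$. The drift $f(\theta, x) = \theta x/\sqrt{1+x^2}$, the function names, the value $d = 2$, the starting point, the seed and the fine-grid Euler–Maruyama simulation are all assumptions made for illustration only.

```python
import math
import random

def hyperbolic_drift(theta, x):
    """Illustrative drift f(theta, x) = theta * x / sqrt(1 + x^2) (an assumption)."""
    return theta * x / math.sqrt(1.0 + x * x)

def simulate_observations(theta, n, d=2.0, substeps=20, x0=0.0, seed=0):
    """Return (h, [X_{t_0}, ..., X_{t_n}]) for dX = f(theta, X) dt + dW.

    The horizon is T = d * sqrt(n) and the observation step is h = T / n.
    The path is approximated by Euler-Maruyama on a grid that is
    `substeps` times finer than h; only every `substeps`-th point is kept.
    """
    rng = random.Random(seed)
    T = d * math.sqrt(n)
    h = T / n
    dt = h / substeps
    sd = math.sqrt(dt)
    x = x0
    obs = [x]
    for _i in range(n):
        for _j in range(substeps):
            x += hyperbolic_drift(theta, x) * dt + sd * rng.gauss(0.0, 1.0)
        obs.append(x)
    return h, obs

h, obs = simulate_observations(theta=-1.0, n=400)
print(f"h = {h}, number of observations = {len(obs)}")
```

Simulating on a finer grid than the observation grid keeps the simulation error separate from the discretization error of the estimators discussed in the paper.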
The conditional least squares estimator (CLSE) of $\theta$ is defined as

$$\hat\theta_{n,T} := \arg\min_{\theta\in\Theta} Q_{n,T}(\theta), \qquad Q_{n,T}(\theta) := \sum_{i=1}^{n} \frac{\bigl[X_{t_i} - X_{t_{i-1}} - f(\theta, X_{t_{i-1}})h\bigr]^2}{\Delta t_i}.$$

This estimator was first studied by Dorogovcev (1976) [3], who obtained its weak consistency under some regularity conditions as $T \to \infty$ and $T/n \to 0$. Kasonga (1988) [4] obtained the strong consistency of the CLSE under some regularity conditions as $n \to \infty$, assuming that $T = d n^{1/2}$ for some fixed real number $d > 0$. Prakasa Rao (1983) [5] obtained asymptotic normality of the CLSE as $T \to \infty$ and $T/n^{1/2} \to 0$.

Florens-Zmirou (1989) [6] studied the minimum contrast estimator based on a Euler-Maruyama-type first-order approximate discrete time scheme of the SDE (1), which is given by

$$Z_{t_i} = Z_{t_{i-1}} + f(\theta, Z_{t_{i-1}})(t_i - t_{i-1}) + W_{t_i} - W_{t_{i-1}}, \quad i \ge 1.$$

The log-likelihood function of $\{Z_{t_i}, 0 \le i \le n\}$ is given by

$$-\frac{1}{2}\sum_{i=1}^{n} \frac{\bigl[Z_{t_i} - Z_{t_{i-1}} - f(\theta, Z_{t_{i-1}})h\bigr]^2}{h} + C,$$

where $C$ is a constant independent of $\theta$. A contrast for the estimation of $\theta$ is derived from the above log-likelihood by substituting $\{Z_{t_i}, 0 \le i \le n\}$ with $\{X_{t_i}, 0 \le i \le n\}$. The resulting contrast is

$$H_{n,T}(\theta) := \frac{1}{2}\sum_{i=1}^{n} \frac{\bigl[X_{t_i} - X_{t_{i-1}} - f(\theta, X_{t_{i-1}})h\bigr]^2}{h},$$

and the resulting minimum contrast estimator, called the Euler-Maruyama estimator, is given by

$$\check\theta_{n,T} := \arg\min_{\theta\in\Theta} H_{n,T}(\theta).$$

Florens-Zmirou (1989) [6] showed the $L^2$-consistency of the estimator as $T \to \infty$ and $T/n \to 0$, and asymptotic normality as $T \to \infty$ and $T/n^{2/3} \to 0$.

Notice that the contrast $H_{n,T}$ would be the log-likelihood of $(X_{t_i}, 0 \le i \le n)$ if the transition probability from $x$ were $N(x + f(\theta, x)h,\, h)$. This led Kessler (1997) [7] to consider Gaussian approximation of the transition density. The most natural one is achieved by choosing its mean and variance to be the mean and variance of the transition density. Thus, the transition density is approximated by $N(E(X_{t_i} \mid X_{t_{i-1}}),\, h)$, which produces the contrast

$$K_{n,T}(\theta) := \frac{1}{2}\sum_{i=1}^{n} \frac{\bigl[X_{t_i} - E(X_{t_i} \mid X_{t_{i-1}})\bigr]^2}{h}.$$

Since the transition density is unknown, in general, there is no closed-form expression for $E(X_{t_i} \mid X_{t_{i-1}})$. Using the stochastic Taylor formula obtained in Florens-Zmirou (1989) [6], he obtained a closed-form expansion of $E(X_{t_i} \mid X_{t_{i-1}})$.
The contrast $H_{n,T}$ is an example of such an approximation when $E(X_{t_i} \mid X_{t_{i-1}}) \approx X_{t_{i-1}} + h f(\theta, X_{t_{i-1}})$. The resulting minimum contrast estimator, which is also the quasi-maximum likelihood estimator (QMLE), is given by

$$\theta_{n,T} := \arg\min_{\theta\in\Theta} K_{n,T}(\theta).$$

Kessler (1997) [7] showed the $L^2$-consistency of the estimator as $T \to \infty$ and $T/n \to 0$, and asymptotic normality as $T \to \infty$ and $T/n^{(p-1)/p} \to 0$ for an arbitrary integer $p$.
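Denote $\mu(\theta, X_{t_{i-1}}) := E(X_{t_i} \mid X_{t_{i-1}})$, which is the mean function of the transition probability distribution. Hence, the contrast is given by

$$K_{n,T}(\theta) = \frac{1}{2h}\sum_{i=1}^{n} \bigl[X_{t_i} - \mu(\theta, X_{t_{i-1}})\bigr]^2.$$

If continuous observation of $\{X_t\}$ on the interval $[0, T]$ were available, then the likelihood function of $\theta$ would be

$$L_T(\theta) := \exp\left\{\int_0^T f(\theta, X_t)\,dX_t - \frac{1}{2}\int_0^T f^2(\theta, X_t)\,dt\right\} \tag{9}$$

(see Liptser and Shiryayev (1977) [8]). Since we have discrete data, we have to approximate the likelihood to obtain the MLE. Taking the Itô-type approximation of the stochastic integral and the rectangle rule approximation of the ordinary integral in (9), we obtain the approximate likelihood function

$$\tilde L_{n,T}(\theta) := \exp\left\{\sum_{i=1}^{n} f(\theta, X_{t_{i-1}})(X_{t_i} - X_{t_{i-1}}) - \frac{h}{2}\sum_{i=1}^{n} f^2(\theta, X_{t_{i-1}})\right\}.$$

The Itô approximate maximum likelihood estimate (IAMLE) based on $\tilde L_{n,T}$ is defined as

$$\tilde\theta_{n,T} := \arg\max_{\theta\in\Theta} \tilde L_{n,T}(\theta).$$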
Weak consistency and asymptotic normality of this estimator were obtained by Yoshida (1992) [9] as $T \to \infty$ and $T/n \to 0$. Note that the CLSE, the Euler-Maruyama estimator and the IAMLE are the same estimator (see Shoji (1997) [10]).

For the Ornstein-Uhlenbeck process, Bishwal and Bose (2001) [11] studied the rates of weak convergence of approximate maximum likelihood estimators, which are of conditional least squares type. Also for the Ornstein-Uhlenbeck process, Ref. [12] studied the uniform rate of weak convergence for the minimum contrast estimator, which has a close connection to the Stratonovich-Milstein scheme. Ref. [13] studied Berry-Esseen inequalities for the conditional least squares estimator in discretely observed nonlinear diffusions. Ref. [14] studied the Stratonovich-based approximate M-estimator of discretely sampled nonlinear diffusions. Ref. [15] studied Milstein approximation of the posterior density of diffusions. Ref. [16] studied conditional least squares estimation in nonlinear diffusion processes based on Poisson sampling. Ref. [17] obtained some new estimators of integrated volatility using stochastic Taylor-type schemes, which could be useful for option pricing in stochastic volatility models; see also Bishwal (2021) [2].
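The coincidence of the Euler-Maruyama estimator and the IAMLE can be checked numerically. The sketch below assumes the Euler contrast has the form $\frac{1}{2h}\sum_i (\Delta X_i - f(\theta, X_{t_{i-1}})h)^2$ and the Itô approximate log-likelihood the form $\sum_i f(\theta, X_{t_{i-1}})\Delta X_i - \frac{h}{2}\sum_i f^2(\theta, X_{t_{i-1}})$; expanding the square shows that their sum equals $\frac{1}{2h}\sum_i (\Delta X_i)^2$, which does not depend on $\theta$, so minimizing one is maximizing the other. The drift, the function names, the step size and the seed are illustrative assumptions.

```python
import math
import random

def drift(theta, x):
    # Illustrative drift (an assumption): f(theta, x) = theta * x / sqrt(1 + x^2).
    return theta * x / math.sqrt(1.0 + x * x)

def simulate(theta, n, h, substeps=10, seed=1):
    # Euler-Maruyama on a finer grid; returns X at the n + 1 observation times.
    rng = random.Random(seed)
    dt = h / substeps
    sd = math.sqrt(dt)
    x, obs = 0.0, [0.0]
    for _i in range(n):
        for _j in range(substeps):
            x += drift(theta, x) * dt + sd * rng.gauss(0.0, 1.0)
        obs.append(x)
    return obs

def euler_contrast(theta, obs, h):
    # (1 / 2h) * sum over i of (dX_i - f(theta, X_{i-1}) * h)^2
    pairs = zip(obs, obs[1:])
    return sum((b - a - drift(theta, a) * h) ** 2 for a, b in pairs) / (2.0 * h)

def ito_loglik(theta, obs, h):
    # sum f(theta, X_{i-1}) * dX_i  -  (h / 2) * sum f(theta, X_{i-1})^2
    s1 = sum(drift(theta, a) * (b - a) for a, b in zip(obs, obs[1:]))
    s2 = sum(drift(theta, a) ** 2 for a in obs[:-1])
    return s1 - 0.5 * h * s2

h = 0.05
obs = simulate(theta=-1.0, n=2000, h=h)
# Theta-free constant: (1 / 2h) * sum of squared increments.
const = sum((b - a) ** 2 for a, b in zip(obs, obs[1:])) / (2.0 * h)
sums = [euler_contrast(t, obs, h) + ito_loglik(t, obs, h) for t in (-2.0, -1.0, -0.5)]
print(all(math.isclose(s, const, rel_tol=1e-9) for s in sums))
```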
Prime denotes the derivative with respect to $\theta$, dot denotes the derivative with respect to $x$, and $\vee$ denotes the max symbol throughout the paper. In order to obtain a better estimator in terms of lowering variance in Monte Carlo simulation, which may have a faster rate of convergence, we first use the algorithm proposed in Bishwal (2008) [1]. Note that the Itô integral and the Fisk-Stratonovich integral (FS, henceforth) are connected by

$$\int_0^T f(\theta, X_t)\,dX_t = \int_0^T f(\theta, X_t) \circ dX_t - \frac{1}{2}\int_0^T \dot f(\theta, X_t)\,dt,$$

where $\circ$ is Itô's circle notation for the FS integral. (Fisk, while introducing the concept of quasimartingale, used the trapezoidal approximation and Stratonovich used the midpoint approximation; both converge to the same mean square limit.) We transform the Itô integral in (9), which is the limit of the rectangular approximation that preserves the martingale property, to the FS integral, apply the FS-type trapezoidal approximation of the stochastic integral and the rectangular rule-type approximation of the Lebesgue integrals, and obtain the approximate likelihood

$$\bar L_{n,T}(\theta) := \exp\left\{\frac{1}{2}\sum_{i=1}^{n}\bigl[f(\theta, X_{t_{i-1}}) + f(\theta, X_{t_i})\bigr](X_{t_i} - X_{t_{i-1}}) - \frac{h}{2}\sum_{i=1}^{n}\dot f(\theta, X_{t_{i-1}}) - \frac{h}{2}\sum_{i=1}^{n} f^2(\theta, X_{t_{i-1}})\right\}.$$

The Fisk-Stratonovich approximate maximum likelihood estimator (FSAMLE) based on $\bar L_{n,T}$ is defined as

$$\bar\theta_{n,T} := \arg\max_{\theta\in\Theta} \bar L_{n,T}(\theta).$$
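To illustrate how the Itô-type and FS-type approximations differ in practice, the following sketch computes both estimators for a drift of the form $f(\theta, x) = \theta\, g(x)$, for which each approximate log-likelihood is quadratic in $\theta$ and the maximizer has a closed form. It assumes the FS approximate log-likelihood consists of a trapezoidal sum, minus $\frac{h}{2}\sum_i \dot f$, minus $\frac{h}{2}\sum_i f^2$. The choice $g(x) = x/\sqrt{1+x^2}$, the function names `iamle` and `fsamle`, $d = 2$ and the seed are assumptions for illustration, not the paper's implementation.

```python
import math
import random

def g(x):
    # Drift shape, f(theta, x) = theta * g(x) (illustrative assumption).
    return x / math.sqrt(1.0 + x * x)

def g_dot(x):
    # Derivative of g with respect to x.
    return (1.0 + x * x) ** -1.5

def simulate(theta, n, h, substeps=5, seed=2):
    # Euler-Maruyama on a finer grid; returns X at the n + 1 observation times.
    rng = random.Random(seed)
    dt = h / substeps
    sd = math.sqrt(dt)
    x, obs = 0.0, [0.0]
    for _i in range(n):
        for _j in range(substeps):
            x += theta * g(x) * dt + sd * rng.gauss(0.0, 1.0)
        obs.append(x)
    return obs

def iamle(obs, h):
    # Maximizer of theta * sum g dX - theta^2 * (h / 2) * sum g^2.
    num = sum(g(a) * (b - a) for a, b in zip(obs, obs[1:]))
    den = h * sum(g(a) ** 2 for a in obs[:-1])
    return num / den

def fsamle(obs, h):
    # Trapezoidal sum for the FS integral, minus the Ito-to-FS correction term.
    trap = 0.5 * sum((g(a) + g(b)) * (b - a) for a, b in zip(obs, obs[1:]))
    corr = 0.5 * h * sum(g_dot(a) for a in obs[:-1])
    den = h * sum(g(a) ** 2 for a in obs[:-1])
    return (trap - corr) / den

n = 10000
h = 2.0 * math.sqrt(n) / n      # T = d * sqrt(n) with d = 2 (assumption)
obs = simulate(-1.0, n, h)
theta_ito, theta_fs = iamle(obs, h), fsamle(obs, h)
print(f"IAMLE = {theta_ito:.4f}, FSAMLE = {theta_fs:.4f}")
```

For a small step $h$ the two estimates are typically close; the difference between them is a discretization effect of the kind the convergence rates in the paper quantify.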
Weak consistency as $T \to \infty$ and $T/n \to 0$, and asymptotic normality as $T \to \infty$ and $T/n^{2/3} \to 0$, of the FSAMLE were shown in Bishwal (2008) [1]. Berry-Esseen bounds for the IAMLE and the FSAMLE for the Ornstein-Uhlenbeck processes were obtained in Bishwal and Bose (2001) [11].

We shall use the following notation: $C$ is a generic constant independent of $h$, $n$ and other variables (it may depend on $\theta$). Throughout the paper, $\dot f$ denotes the derivative with respect to $x$ and $f'$ denotes the derivative with respect to $\theta$ of the function $f(\theta, x)$. Suppose that $\theta_0$ denotes the true value of the parameter and $\theta_0 \in \Theta$. We assume the following conditions:

(A3) The diffusion process $X$ is stationary and ergodic with invariant measure $\nu$, so that time averages of any $\nu$-integrable function $g$ converge to its mean under $\nu$.

(A6) $f$ is a continuously differentiable function in $x$ up to order $p$ for all $\theta$.
(A7) f (·, x) and all its derivatives are three times continuously differentiable with respect to θ for all x ∈ R. Moreover, these derivatives up to third order with respect to θ are of polynomial growth in x uniformly in θ.
The Fisher information is given by

$$I(\theta) := \int_{\mathbb{R}} \bigl(f'(\theta, x)\bigr)^2\,d\nu(x),$$

with an accompanying condition required for any $\delta > 0$ or any compact $\bar\Theta \subset \Theta$.

(A8) The Malliavin covariance of the process is nondegenerate. The Malliavin covariance matrix of a smooth random variable $S$ is defined as $\gamma_S := \langle DS, DS \rangle_H$, where $D$ denotes the Malliavin derivative and $H$ the underlying Hilbert space. The Malliavin covariance $\gamma_T$ associated with the functional $\omega \mapsto X(t, \omega)$ is almost surely positive and, for any $m \ge 1$, one has $\|1/\det(\gamma_T)\|_{L^m} < \infty$.

In the case of independent observations, in order to prove the validity of asymptotic expansion, one usually needs a certain regularity condition for the underlying distribution, such as the Cramér condition; see Bhattacharya and Ranga Rao (1976) [18]. This type of condition ensures the regularity of the distribution, and hence the smoothness assumption on the functional under the expectation whose martingale expansion is desired can be removed. For dependent observations, this type of condition leads to the regularity of the distribution of a functional with nondegenerate Malliavin covariance, which is known in Malliavin calculus; see Ikeda and Watanabe (1989) [19] and Nualart (1995) [20]. Malliavin covariance is connected to the Hörmander condition, which is a sufficient condition for a second-order differential operator to be hypoelliptic; see Bally (1991) [21]. For operators with analytic coefficients, this condition turns out to be also necessary, but this is not true for general smooth coefficients.
More precisely, let $X$ be a differentiable $\mathbb{R}$-valued Wiener functional defined on a Wiener space. Assume that there exists a functional $\psi$ such that the characteristic function of $X$, truncated by $\psi$, satisfies a regularity condition; this is a consequence of the nondegeneracy of the Malliavin covariance in the case of Wiener functionals. The functional $\psi$, which is a random variable satisfying $0 \le \psi \le 1$, is a truncation functional extracting from the Wiener space the portion on which the distribution is regular. If $X$ is almost regular, one may take $\psi$ nearly equal to one. Uniform nondegeneracy of the Malliavin covariance of the functional $T^{-1/2}\int_0^T f'(\theta_0, X_t)\,dW_t$ can be shown under (A8); see Yoshida (1997) [22].

Ref. [13] obtained the rate of convergence to normality of the Itô AMLE and the Fisk-Stratonovich AMLE of the order $O\bigl(T^{-1/2} \vee \frac{T^2}{n}\bigr)$ and $O\bigl(T^{-1/2} \vee \frac{T^3}{n^2}\bigr)$, respectively, under the regularity conditions given above with $q > 16$ for (A4). We obtain the rate of convergence to normality, i.e., a Berry-Esseen bound of the order $O\bigl(T^{-1/2} \vee \frac{T^{p+1}}{n^p}\bigr)$, for the QMLE $\theta_{n,T}$ for an arbitrary integer $p$.
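We need the following lemma from Michel and Pfanzagl (1971) [23] to prove our main results.

Lemma 1. Let $\xi$, $\zeta$ and $\eta$ be any three random variables on a probability space $(\Omega, \mathcal{F}, P)$ with $P(\eta > 0) = 1$. Then, for any $\epsilon > 0$, with $\Phi$ denoting the standard normal distribution function, we have

$$\text{(a)}\quad \sup_{x\in\mathbb{R}} \bigl|P\{\xi + \zeta \le x\} - \Phi(x)\bigr| \le \sup_{x\in\mathbb{R}} \bigl|P\{\xi \le x\} - \Phi(x)\bigr| + P\{|\zeta| > \epsilon\} + \epsilon,$$

$$\text{(b)}\quad \sup_{x\in\mathbb{R}} \bigl|P\{\xi/\eta \le x\} - \Phi(x)\bigr| \le \sup_{x\in\mathbb{R}} \bigl|P\{\xi \le x\} - \Phi(x)\bigr| + P\{|\eta - 1| > \epsilon\} + \epsilon.$$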

Main Results
We start with some preliminary lemmas. Let $L$ denote the generator of the diffusion process: for $g \in C^2(\mathbb{R})$,

$$L g(x) := f(\theta, x)\,\dot g(x) + \frac{1}{2}\,\ddot g(x).$$

The $k$-th iterate of $L$ is denoted by $L^k$; its domain is $C^{2k}(\mathbb{R})$. We set $L^0 = I$. The stochastic Taylor formula (Kloeden and Platen (1992) [24]) applies to any $p + 1$ times continuously differentiable function $g : \mathbb{R} \to \mathbb{R}$, for $t \in [0, T]$ and $p = 1, 2, 3, \dots$
Lemma 2. With $g(x) = x$, the stochastic Taylor expansion of $\mu(\theta, x)$ is given by

$$\mu(\theta, x) = \sum_{k=0}^{p} \frac{h^k}{k!}\,L^k g(x) + R(\theta, h^{p+1}, x),$$

where $R$ denotes a function for which there exists a constant $C$ such that $|R(\theta, a, x)| \le a\,C(1 + |x|)^C$.

Proof. Applying the stochastic Taylor formula of Florens-Zmirou (1989, Lemma 1) [6], one obtains the result. See also Kloeden and Platen (1992) [24].

Consider the following special cases. Euler scheme: $p = 1$, for which $\mu(\theta, x) \approx x + h f(\theta, x)$. This produces the CLSE. This estimator has been very well studied in the literature (see Shoji (1997) [10]).
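To make the leading terms of such an expansion concrete, the following worked computation assumes the unit-diffusion generator $L g = f\,\dot g + \tfrac12\,\ddot g$ applied to the identity function $g(x) = x$. The truncations shown omit the remainder term and are meant only as an illustration of the first two orders.

```latex
\begin{align*}
L^0 g(x) &= x, \\
L^1 g(x) &= f(\theta,x)\cdot 1 + \tfrac12 \cdot 0 = f(\theta,x), \\
L^2 g(x) &= f(\theta,x)\,\dot f(\theta,x) + \tfrac12\,\ddot f(\theta,x), \\[4pt]
p = 1:\quad \mu(\theta,x) &\approx x + h\,f(\theta,x), \\
p = 2:\quad \mu(\theta,x) &\approx x + h\,f(\theta,x)
   + \frac{h^2}{2}\Bigl( f(\theta,x)\,\dot f(\theta,x) + \tfrac12\,\ddot f(\theta,x) \Bigr).
\end{align*}
```

The $p = 1$ truncation is the Euler mean, while the $p = 2$ truncation adds a drift correction of order $h^2$ that involves the first two $x$-derivatives of $f$.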

Remark 2.
Note that the Milstein scheme is equivalent to Stratonovich approximation of the stochastic integral after converting the Itô integral to the Stratonovich integral.
Proof. First, we prove the claim for $p = 2$. We emphasize that the Itô formula is a stochastic Taylor formula of order 2, and the argument starts from the Itô formula. We employ a Taylor expansion in the local neighborhood of $\theta_0$: let $\theta = \theta_0 + T^{-1/2}u$, $u \in \mathbb{R}$. Lemma 2 is then applied to the resulting terms. For the term $B_{i,t}$, the last term of its expansion is zero due to the orthogonality of the integrals. On the other hand, for the term $A_{i,t}$, we have $E(A_{i,t}^2) \le C(t - t_{i-1})^2$ using (A4) and (A3). On substitution, the last term is dominated accordingly, and thus $J_1 + J_2 \le C\,\frac{T^4}{n^2}$. The remaining term is bounded by the same method, which completes the proof for $p = 2$.

Next, we consider the general case $p \ge 3$. By Lemma 2, and by combining the bounds for $J_1$, $J_2$, $J_3$ and $G_1$, the general bound follows.

The following lemma is from Bishwal (2008) [1].

Lemma 5. Let
Then, under the conditions (A1)-(A8), Our main result is the following theorem.

Remark 3.
With $p = 1$, for the Euler scheme, which produces the conditional least squares estimator, one obtains the rate $O\bigl(T^{-1/2} \vee \frac{T^2}{n}\bigr)$. With $p = 2$, for the Milstein scheme, one obtains the rate $O\bigl(T^{-1/2} \vee \frac{T^3}{n^2}\bigr)$. With $p = 4$, for the Simpson scheme, one obtains the rate $O\bigl(T^{-1/2} \vee \frac{T^5}{n^4}\bigr)$. With $p = 6$, for the Boole scheme, one obtains the rate $O\bigl(T^{-1/2} \vee \frac{T^7}{n^6}\bigr)$. Thus, the higher the $p$, the sharper the bound. In summary, the Itô/Euler scheme gives the first-order QMLE, the Milstein/Stratonovich scheme produces the second-order QMLE, the Simpson scheme produces the fourth-order QMLE and the Boole scheme produces the sixth-order QMLE. See Bishwal (2011) [28] for a connection of this area to the stochastic moment problem and hedging of generalized Black-Scholes options.
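The orders in this remark can be compared numerically. The sketch below evaluates $\max(T^{-1/2},\, T^{p+1}/n^p)$ with all constants ignored; the function name and the design $T = n^{0.45}$ are hypothetical choices made only so that the discretization term is visible for $p = 1$.

```python
def berry_esseen_rate(T, n, p):
    """Order of the bound, max(T^{-1/2}, T^{p+1} / n^p), with constants ignored."""
    return max(T ** -0.5, T ** (p + 1) / n ** p)

n = 10 ** 6
T = n ** 0.45   # hypothetical design with T -> infinity and T / n -> 0
rates = {p: berry_esseen_rate(T, n, p) for p in (1, 2, 4, 6)}
for p, r in rates.items():
    print(f"p = {p}: bound order ~ {r:.4g}")
```

Under this design the discretization term $T^{p+1}/n^p$ dominates for $p = 1$, while for $p \ge 2$ it already falls below $T^{-1/2}$, so the order is then capped by $T^{-1/2}$ and raising $p$ further does not change it.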

Example
Consider the stochastic differential equation

$$dX_t = \theta\,\frac{X_t}{\sqrt{1 + X_t^2}}\,dt + dW_t, \quad t \ge 0.$$

The solution to the above SDE is called the hyperbolic diffusion process because it has a hyperbolic stationary distribution when $\theta < 0$. The process has nonlinear drift and is stationary and ergodic, which distinguishes it from models with linear drift such as the Ornstein-Uhlenbeck process and the Cox-Ingersoll-Ross process. This model satisfies assumption (A3). In fact, the stationary density is proportional to $\exp\bigl(2\theta\sqrt{1 + x^2}\bigr)$. It is not possible to calculate the conditional expectation for the hyperbolic diffusion process in closed form, and hence one needs a higher-order Taylor expansion approach.
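As a hedged sketch of how a higher-order QMLE could be computed for this example, the code below assumes the hyperbolic diffusion has the form $dX_t = \theta X_t/\sqrt{1+X_t^2}\,dt + dW_t$ and uses the order-1 mean $x + h f$ and the order-2 mean $x + h f + \frac{h^2}{2}(f \dot f + \frac12 \ddot f)$ in a Gaussian contrast with variance $h$. The function names, the grid search over $[-3, 0]$, $d = 2$, the seed and the fine-grid simulation are all illustrative assumptions rather than the paper's procedure.

```python
import math
import random

def g(x):
    # Drift shape: f(theta, x) = theta * g(x).
    return x / math.sqrt(1.0 + x * x)

def g_dot(x):
    # First derivative of g in x.
    return (1.0 + x * x) ** -1.5

def g_ddot(x):
    # Second derivative of g in x.
    return -3.0 * x * (1.0 + x * x) ** -2.5

def simulate(theta, n, h, substeps=5, seed=3):
    # Euler-Maruyama on a finer grid; returns X at the n + 1 observation times.
    rng = random.Random(seed)
    dt = h / substeps
    sd = math.sqrt(dt)
    x, obs = 0.0, [0.0]
    for _i in range(n):
        for _j in range(substeps):
            x += theta * g(x) * dt + sd * rng.gauss(0.0, 1.0)
        obs.append(x)
    return obs

def make_contrast(obs, h, order):
    # Precompute the x-dependent pieces so each contrast evaluation is cheap.
    rows = [(a, b, g(a), g_dot(a), g_ddot(a)) for a, b in zip(obs, obs[1:])]

    def contrast(theta):
        total = 0.0
        for a, b, ga, gd, gdd in rows:
            mu = a + h * theta * ga                      # order-1 (Euler) mean
            if order == 2:
                # h^2 / 2 * (f * f_dot + f_ddot / 2) with f = theta * g
                mu += 0.5 * h * h * (theta * theta * ga * gd + 0.5 * theta * gdd)
            total += (b - mu) ** 2
        return total / (2.0 * h)

    return contrast

def grid_argmin(fn, lo=-3.0, hi=0.0, steps=150):
    # Crude grid search; a numerical optimizer could be used instead.
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return min(grid, key=fn)

n = 6400
h = 2.0 * math.sqrt(n) / n       # T = d * sqrt(n) with d = 2 (assumption)
obs = simulate(-1.0, n, h)
theta_p1 = grid_argmin(make_contrast(obs, h, order=1))
theta_p2 = grid_argmin(make_contrast(obs, h, order=2))
print(f"order-1 QMLE = {theta_p1:.2f}, order-2 QMLE = {theta_p2:.2f}")
```

The grid search keeps the sketch dependency-free; its resolution (here 0.02) limits how finely the two estimates can be distinguished.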
Remark 4 (Concluding Remark). It would be interesting to extend the results of the paper to diffusions with jumps, using the strong stochastic Taylor expansions for jump diffusions in Chapter 6 of Platen and Bruti-Liberati (2010) [29].