1. Introduction and Preliminaries
Parameter estimation in diffusion processes based on discrete observations is an active area of research in financial econometrics and mathematical biology, since the data available in finance and biology are discrete and often high-frequency, although the model is continuous. For a treatise on this subject, see Bishwal (2008, 2021) [1, 2].
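Consider the Itô stochastic differential equation
$$dX_t = f(\theta, X_t)\,dt + dW_t, \quad t \ge 0, \tag{1}$$
where $\{W_t, t \ge 0\}$ is a one-dimensional standard Wiener process, $\theta \in \Theta$, $\Theta$ is a compact subset of ℝ, $f$ is a known real-valued function defined on $\Theta \times \mathbb{R}$, and the unknown parameter $\theta$ is to be estimated on the basis of observation of the process $\{X_t, t \ge 0\}$. Let $\theta_0$ be the true value of the parameter that is in the interior of $\Theta$. We assume that the process $\{X_t, t \ge 0\}$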
 is observed at 
 with 
 and 
 for some fixed real number 
. We estimate 
 from the observations 
.
The conditional least squares estimator (CLSE) of 
 is defined as
      
This estimator was first studied by Dorogovcev (1976) [3], who obtained its weak consistency under some regularity conditions as 
 and 
. Kasonga (1988) [4] obtained the strong consistency of the CLSE under some regularity conditions as 
 assuming that 
 for some fixed real number 
. Prakasa Rao (1983) [5] obtained asymptotic normality of the CLSE as 
 and 
.
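As a numerical illustration of the CLSE, the following Python sketch computes it for one simulated path. It rests on assumptions that are not taken from the text: the Ornstein–Uhlenbeck drift f(θ, x) = −θx, a unit diffusion coefficient, an equally spaced grid with step T/n, and the Euler-type conditional mean X_{t_{i−1}} + f(θ, X_{t_{i−1}})(t_i − t_{i−1}) inside the sum of squares. The function names `simulate_ou` and `clse_ou` are hypothetical.

```python
import math
import random


def simulate_ou(theta, T, n, x0=0.0, seed=0):
    """Sample X_{t_0}, ..., X_{t_n} of dX = -theta*X dt + dW on an equally
    spaced grid, using the exact Gaussian transition of the OU process."""
    rng = random.Random(seed)
    dt = T / n
    a = math.exp(-theta * dt)                        # one-step autoregression coefficient
    sd = math.sqrt((1.0 - a * a) / (2.0 * theta))    # one-step conditional std deviation
    xs = [x0]
    for _ in range(n):
        xs.append(a * xs[-1] + sd * rng.gauss(0.0, 1.0))
    return xs, dt


def clse_ou(xs, dt):
    """Closed-form minimiser in theta of
    sum_i (X_i - X_{i-1} + theta * X_{i-1} * dt)^2 / dt  (assumed OU drift)."""
    num = sum(x0 * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    den = dt * sum(x0 * x0 for x0 in xs[:-1])
    return -num / den


xs, dt = simulate_ou(theta=1.0, T=200.0, n=20000, seed=1)
theta_hat = clse_ou(xs, dt)
print(f"CLSE of theta: {theta_hat:.3f}")
```

The closed form follows from setting the derivative of the quadratic criterion with respect to θ to zero. For a drift that is nonlinear in θ the minimisation would have to be done numerically.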
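Florens-Zmirou (1989) [6] studied the minimum contrast estimator, based on an Euler–Maruyama-type first-order approximate discrete time scheme of the SDE (1), which is given by
$$Z_{t_i} - Z_{t_{i-1}} = f(\theta, Z_{t_{i-1}})(t_i - t_{i-1}) + W_{t_i} - W_{t_{i-1}}, \quad i = 1, \ldots, n.$$
The log-likelihood function of $\{Z_{t_i}, 0 \le i \le n\}$ is given by
$$-\frac{1}{2}\sum_{i=1}^{n} \frac{\left[Z_{t_i} - Z_{t_{i-1}} - f(\theta, Z_{t_{i-1}})(t_i - t_{i-1})\right]^2}{t_i - t_{i-1}} + C,$$
where $C$ is a constant independent of $\theta$. A contrast for the estimation of $\theta$ is derived from the above log-likelihood by substituting $\{Z_{t_i}\}$ with $\{X_{t_i}\}$. The resulting contrast is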
      
      and the resulting minimum contrast estimator, called the Euler–Maruyama estimator, is given by
      
Florens-Zmirou (1989) [6] showed the 
-consistency of the estimator as 
 and 
 and asymptotic normality as 
 and 
.
Notice that the contrast 
 would be the log-likelihood of 
 if the transition probability was 
. This led Kessler (1997) [7] to consider a Gaussian approximation of the transition density. The most natural choice takes its mean and variance to be the mean and variance of the transition density. Thus, the transition density is approximated by 
, which produces the contrast
      
Since the transition density is unknown, in general, there is no closed-form expression for 
. Using the stochastic Taylor formula obtained in Florens-Zmirou (1989) [6], he obtained a closed-form expression of 
 The contrast 
 is an example of such an approximation when 
.
The resulting minimum contrast estimator, which is also the quasi-maximum likelihood estimator (QMLE), is given by
      
Kessler (1997) [7] showed the 
-consistency of the estimator as 
 and 
 and asymptotic normality as 
 and 
 for an arbitrary integer 
p.
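To make the role of the order p concrete, the following Python sketch compares truncated stochastic Taylor expansions of the conditional mean with its exact value. It is an illustration under assumptions not stated in the text: the Ornstein–Uhlenbeck drift f(θ, x) = −θx with unit diffusion coefficient, an observation step Δ, and an order-p approximation of the form Σ_{j=0}^{p} (Δ^j / j!) L^j x, where L = f ∂/∂x + (1/2) ∂²/∂x² is applied to the identity function. For this drift L^j x = (−θ)^j x, so the expansion is the Taylor polynomial of the exact conditional mean x e^{−θΔ}. The name `truncated_mean_ou` is hypothetical.

```python
import math


def truncated_mean_ou(theta, x, dt, p):
    """Order-p expansion sum_{j=0}^{p} dt^j / j! * L^j(id)(x) for the assumed
    drift f(theta, x) = -theta * x, for which L^j(id)(x) = (-theta)^j * x."""
    return x * sum((-theta * dt) ** j / math.factorial(j) for j in range(p + 1))


theta, x, dt = 1.5, 2.0, 0.1
exact = x * math.exp(-theta * dt)   # exact conditional mean of the OU process

# Orders 1, 2, 4, 6 correspond to the schemes named Euler, Milstein, Simpson
# and Boole in the paper; that correspondence is an assumption of this sketch.
errors = {p: abs(truncated_mean_ou(theta, x, dt, p) - exact) for p in (1, 2, 4, 6)}
for p, err in errors.items():
    print(f"p={p}: |approximation - exact| = {err:.2e}")
```

In this example the error of the order-p approximation is bounded by the first omitted term, x (θΔ)^{p+1} / (p+1)!, so it shrinks rapidly as p grows for small Δ.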
Denote
      
      which is the mean function of the transition probability distribution. Hence, the contrast is given by
      
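If continuous observation of $\{X_t\}$ on the interval $[0, T]$ were available, then the likelihood function of $\theta$ would be
$$L_T(\theta) = \exp\left\{\int_0^T f(\theta, X_t)\,dX_t - \frac{1}{2}\int_0^T f^2(\theta, X_t)\,dt\right\} \tag{9}$$
(see Liptser and Shiryayev (1977) [8]). Since we have discrete data, we have to approximate the likelihood to obtain the MLE. Taking the Itô-type approximation of the stochastic integral and the rectangle rule approximation of the ordinary integral in (9), we obtain the approximate likelihood function
$$L_{n,T}(\theta) = \exp\left\{\sum_{i=1}^{n} f(\theta, X_{t_{i-1}})(X_{t_i} - X_{t_{i-1}}) - \frac{1}{2}\sum_{i=1}^{n} f^2(\theta, X_{t_{i-1}})(t_i - t_{i-1})\right\}.$$
The Itô approximate maximum likelihood estimate (IAMLE) based on $L_{n,T}$ is defined as
$$\hat\theta_{n,T} := \arg\max_{\theta \in \Theta} L_{n,T}(\theta).$$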
Weak consistency and asymptotic normality of this estimator were obtained by Yoshida (1992) [9] as 
 and 
.
Note that the CLSE, the Euler–Maruyama estimator and the IAMLE are the same estimator (see Shoji (1997) [10]). For the Ornstein–Uhlenbeck process, Bishwal and Bose (2001) [11] studied the rates of weak convergence of approximate maximum likelihood estimators, which are of conditional least squares type. For the Ornstein–Uhlenbeck process, Bishwal (2010) [12] studied the uniform rate of weak convergence for the minimum contrast estimator, which has a close connection to the Stratonovich–Milstein scheme. Bishwal (2009) [13] studied Berry–Esseen inequalities for the conditional least squares estimator in discretely observed nonlinear diffusions. Bishwal (2009) [14] studied the Stratonovich-based approximate M-estimator of discretely sampled nonlinear diffusions. Bishwal (2011) [15] studied the Milstein approximation of the posterior density of diffusions. Bishwal (2010) [16] studied conditional least squares estimation in nonlinear diffusion processes based on Poisson sampling. Bishwal (2011) [17] obtained some new estimators of integrated volatility using stochastic Taylor-type schemes, which could be useful for option pricing in stochastic volatility models; see also Bishwal (2021) [2].
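The coincidence of the Euler–Maruyama contrast minimiser and the IAMLE can be checked numerically. The sketch below assumes a unit diffusion coefficient and an equally spaced grid with step Δ, and writes the contrast as Σ (X_{t_i} − X_{t_{i−1}} − f(θ, X_{t_{i−1}})Δ)² / Δ and the Itô-type approximate log-likelihood as Σ f(θ, X_{t_{i−1}})(X_{t_i} − X_{t_{i−1}}) − (1/2) Σ f²(θ, X_{t_{i−1}})Δ. Expanding the square shows that the contrast plus twice the log-likelihood equals Σ (X_{t_i} − X_{t_{i−1}})² / Δ, which does not depend on θ. The drift −tanh(θx) and all function names are hypothetical choices made for illustration only.

```python
import math
import random


def drift(theta, x):
    # Hypothetical drift, nonlinear in theta and x (not taken from the paper).
    return -math.tanh(theta * x)


def euler_path(theta, T, n, x0=0.0, seed=0):
    """Euler-Maruyama path of dX = drift dt + dW on an equally spaced grid."""
    rng = random.Random(seed)
    dt = T / n
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        xs.append(x + drift(theta, x) * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return xs, dt


def euler_contrast(theta, xs, dt):
    """Sum of squared Euler residuals, scaled by the step size."""
    return sum((x1 - x0 - drift(theta, x0) * dt) ** 2 / dt
               for x0, x1 in zip(xs, xs[1:]))


def ito_loglik(theta, xs, dt):
    """Ito-type approximation of the log-likelihood (assumed unit diffusion)."""
    return sum(drift(theta, x0) * (x1 - x0) - 0.5 * drift(theta, x0) ** 2 * dt
               for x0, x1 in zip(xs, xs[1:]))


xs, dt = euler_path(theta=1.0, T=40.0, n=2000, seed=2)
grid = [0.2 + 0.05 * k for k in range(40)]          # candidate values of theta

# contrast + 2 * log-likelihood should be the same number for every theta
offsets = [euler_contrast(th, xs, dt) + 2.0 * ito_loglik(th, xs, dt) for th in grid]
spread = max(offsets) - min(offsets)

theta_contrast = min(grid, key=lambda th: euler_contrast(th, xs, dt))
theta_loglik = max(grid, key=lambda th: ito_loglik(th, xs, dt))
print(f"spread of offsets: {spread:.2e}")
print(f"argmin contrast: {theta_contrast:.2f}, argmax log-likelihood: {theta_loglik:.2f}")
```

A grid search is used here only to keep the sketch dependency-free; any numerical optimiser would return the same point for both criteria up to its tolerance.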
Prime denotes the derivative with respect to $\theta$, dot denotes the derivative with respect to $x$ and ⋁ denotes the max symbol throughout the paper. In order to obtain a better estimator, in the sense of lower variance in Monte Carlo simulation and possibly a faster rate of convergence, we first use the algorithm proposed in Bishwal (2008) [1]. Note that the Itô integral and the Fisk–Stratonovich integral (FS, henceforth; Fisk, while introducing the concept of quasimartingale, used the trapezoidal approximation and Stratonovich used the midpoint approximation, both converging to the same mean square limit) are connected by
$$\int_0^T f(\theta, X_t)\,dX_t = \int_0^T f(\theta, X_t) \circ dX_t - \frac{1}{2}\int_0^T \dot f(\theta, X_t)\,dt,$$
where ∘ is Itô's circle notation for the FS integral. We transform the Itô integral (the limit of the rectangular approximation, which preserves the martingale property) in (9) to the FS integral, apply the FS-type trapezoidal approximation of the stochastic integral and the rectangular rule-type approximation of the Lebesgue integrals, and obtain the approximate likelihood
$$\tilde L_{n,T}(\theta) = \exp\left\{\frac{1}{2}\sum_{i=1}^{n} \left[f(\theta, X_{t_{i-1}}) + f(\theta, X_{t_i})\right](X_{t_i} - X_{t_{i-1}}) - \frac{1}{2}\sum_{i=1}^{n} \dot f(\theta, X_{t_{i-1}})(t_i - t_{i-1}) - \frac{1}{2}\sum_{i=1}^{n} f^2(\theta, X_{t_{i-1}})(t_i - t_{i-1})\right\}.$$
The Fisk–Stratonovich approximate maximum likelihood estimator (FSAMLE) based on $\tilde L_{n,T}$ is defined as
$$\tilde\theta_{n,T} := \arg\max_{\theta \in \Theta} \tilde L_{n,T}(\theta).$$
Weak consistency as 
 and 
 and asymptotic normality as 
 and 
 of the FSAMLE were shown in Bishwal (2008) [1]. Berry–Esseen bounds for the IAMLE and the FSAMLE for the Ornstein–Uhlenbeck process were obtained in Bishwal and Bose (2001) [11].
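For a concrete comparison of the two approximations, the following Python sketch evaluates the IAMLE and the FSAMLE on a simulated Ornstein–Uhlenbeck path. The assumptions are ours: drift f(θ, x) = −θx (so that the x-derivative of f is −θ), unit diffusion coefficient, equally spaced observations with step Δ = T/n, Itô-type log-likelihood Σ f(θ, X_{t_{i−1}})(X_{t_i} − X_{t_{i−1}}) − (1/2) Σ f² Δ, and FS-type log-likelihood obtained by replacing the first sum with the trapezoidal sum and subtracting (1/2) Σ ḟ Δ. Under these assumptions both maximisers are available in closed form, and the trapezoidal sum telescopes to (X_T² − X_0²)/2. All function names are hypothetical.

```python
import math
import random


def simulate_ou(theta, T, n, x0=0.0, seed=0):
    """Exact-transition samples of dX = -theta*X dt + dW on an equally spaced grid."""
    rng = random.Random(seed)
    dt = T / n
    a = math.exp(-theta * dt)
    sd = math.sqrt((1.0 - a * a) / (2.0 * theta))
    xs = [x0]
    for _ in range(n):
        xs.append(a * xs[-1] + sd * rng.gauss(0.0, 1.0))
    return xs, dt


def trapezoid_sum(xs):
    """Trapezoidal (FS-type) approximation of the integral of X against dX."""
    return sum(0.5 * (x0 + x1) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))


def iamle_ou(xs, dt):
    """Maximiser of -theta * sum X_{i-1} dX_i - 0.5 * theta^2 * dt * sum X_{i-1}^2."""
    num = sum(x0 * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    den = dt * sum(x0 * x0 for x0 in xs[:-1])
    return -num / den


def fsamle_ou(xs, dt):
    """Maximiser of -theta * trapezoid + 0.5 * theta * T - 0.5 * theta^2 * dt * sum X_{i-1}^2."""
    T = dt * (len(xs) - 1)
    den = dt * sum(x0 * x0 for x0 in xs[:-1])
    return (0.5 * T - trapezoid_sum(xs)) / den


xs, dt = simulate_ou(theta=1.0, T=200.0, n=20000, seed=4)
theta_ito = iamle_ou(xs, dt)
theta_fs = fsamle_ou(xs, dt)
trap = trapezoid_sum(xs)
print(f"IAMLE: {theta_ito:.3f}, FSAMLE: {theta_fs:.3f}")
```

The two estimators differ only through the term (Σ (X_{t_i} − X_{t_{i−1}})² − T) / 2 in the numerator, which is small when the sampling step is small.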
We shall use the following notations: , , C is a generic constant independent of  and other variables (it may depend on ). Throughout the paper,  denotes the derivative with respect to x and  denotes the derivative with respect to  of the function . Suppose that  denotes the true value of the parameter and . We assume the following conditions:
Assumption 1.
(A1) ,
         .
(A2)  for all 
where  for any integer r.
(A3) The diffusion process X is stationary and ergodic with invariant measure ν, i.e., for any g with 
(A4)  for all .
(A5) 
(A6) f is a continuously differentiable function in x up to order p for all θ.
(A7)  and all its derivatives are three times continuously differentiable with respect to θ for all . Moreover, these derivatives up to third order with respect to θ are of polynomial growth in x uniformly in θ.
The Fisher information is given by and for any , or any compact , 
(A8) The Malliavin covariance of the process is nondegenerate.
The Malliavin covariance matrix of a smooth random variable S is defined as , where  is the Malliavin derivative. The Malliavin covariance is nondegenerate if  is almost surely positive and, for any , one has  This, associated with the functional , is given by  where  and , respectively, satisfy 
In the case of independent observations, in order to prove the validity of an asymptotic expansion, one usually needs a certain regularity condition for the underlying distribution, such as the Cramér condition; see Bhattacharya and Ranga Rao (1976) [18]. This type of condition then ensures the regularity of the distribution, and hence the smoothness assumption on the functional under the expectation whose martingale expansion is desired can be removed. This type of condition for dependent observations leads to the regularity of the distribution of a functional with nondegenerate Malliavin covariance, which is known in Malliavin calculus; see Ikeda and Watanabe (1989) [19] and Nualart (1995) [20]. Malliavin covariance is connected to the Hörmander condition, which is a sufficient condition for a second-order differential operator to be hypoelliptic; see Bally (1991) [21]. For operators with analytic coefficients, this condition turns out to be also necessary, but this is not true for general smooth coefficients.
More precisely, let 
X be a differentiable ℝ-valued Wiener functional defined on a Wiener space. Assume that there exists a functional 
 such that
      
Thus, it is a regularity condition of the characteristic function, which is a consequence of the nondegeneracy of the Malliavin covariance in the case of Wiener functionals. The functional 
, which is a random variable satisfying 
, is a truncation functional extracting from the Wiener space the portion on which the distribution is regular. If 
X is almost regular, one may take 
 nearly equal to one. Uniform degeneracy of the Malliavin covariance of the functional 
 can be shown under (A8); see Yoshida (1997) [22].
Bishwal (2009) [13] obtained the rate of convergence to normality of the Itô AMLE and the Fisk–Stratonovich AMLE of the order 
 and 
, respectively, under the regularity conditions given above with 
 for (A4). We obtain the rate of convergence to normality, i.e., a Berry–Esseen bound of the order 
 for the QMLE 
 for an arbitrary integer p.
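We need the following lemma from Michel and Pfanzagl (1971) [23] to prove our main results.
Lemma 1. Let $\xi$, $\zeta$ and $\eta$ be any three random variables on a probability space $(\Omega, \mathcal{F}, P)$ with $P(\eta > 0) = 1$. Then, for any $\epsilon > 0$, we have
(a) $\sup_{x \in \mathbb{R}} \left|P\{\xi + \zeta \le x\} - \Phi(x)\right| \le \sup_{x \in \mathbb{R}} \left|P\{\xi \le x\} - \Phi(x)\right| + P\{|\zeta| > \epsilon\} + \epsilon$,
(b) $\sup_{x \in \mathbb{R}} \left|P\{\xi/\eta \le x\} - \Phi(x)\right| \le \sup_{x \in \mathbb{R}} \left|P\{\xi \le x\} - \Phi(x)\right| + P\{|\eta - 1| > \epsilon\} + \epsilon$,
where $\Phi$ denotes the standard normal distribution function.
2. Main Results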
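We start with some preliminary lemmas. Let $L$ denote the generator of the diffusion process,
$$L := f(\theta, x)\frac{\partial}{\partial x} + \frac{1}{2}\frac{\partial^2}{\partial x^2}.$$
The $k$-th iterate of $L$ is denoted as $L^k$. Its domain is $C^{2k}$. We set $L^0 := I$, the identity operator.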
Stochastic Taylor formula (Kloeden and Platen (1992) [24]): For a 
 times continuously differentiable function 
, we have for 
 and 

Lemma 2. With , the stochastic Taylor expansion of  is given by

where R denotes a function for which there exists a constant C such that 
Proof. Applying the stochastic Taylor formula of Florens-Zmirou (1989, Lemma 1) [6], one obtains the result. See also Kloeden and Platen (1992) [24].
Consider the following special cases:
Euler Scheme: For , 
Milstein Scheme: For , 
Simpson Scheme: For , 
Boole Scheme: For ,
 □
Remark 1. For ,  This produces the CLSE. This estimator has been well studied in the literature (see Shoji (1997) [10]).
Remark 2. Note that the Milstein scheme is equivalent to the Stratonovich approximation of the stochastic integral after converting the Itô integral to the Stratonovich integral.
 Proof.  First, we show that, for 
,
        
We emphasize that the Itô formula is a stochastic Taylor formula of order 2. By the Itô formula, we have
        
        where
        
We employ Taylor expansion in the local neighborhood of 
. Let 
. Then, we have
        
        where
        
        and 
. Further
        
By Lemma 2, we have
        
		Further
        
		Hence
        
Observe that, with 
 we have
        
		On the other hand, with 
 we have
        
However, 
 using (A4) and (A3). On substitution, the last term is dominated by
        
By the same method, we have
        
Thus, the proof for 
 is complete. Next, we consider the general case 
. Denote
        
Observe that, by Lemma 2, we have
        
Thus, by combining the bounds for 
, 
 and 
, we have
        
□
The following lemma is from Bishwal (2008) [1].
      
Lemma 4. Then, under the conditions (A1)–(A8), 
The following lemma follows from Theorem 7 in Yoshida (1997) [22].
      
Lemma 5. Then, under the conditions (A1)–(A8), 
Our main result is the following theorem.
Theorem 1. Under the conditions (A1)–(A8), for any , we have 
Proof. We start with 
 and 
 Let
        
By Taylor expansion, we have
        
        where 
. Since 
, we have
        
However, 
 from Lemma 4 (see also Pardoux and Veretennikov (2001) [25] and Yoshida (2011) [26]). It can be shown that 
 (see Altmeyer and Chorowski (2018) [27]). Hence
        
Further, by Lemma 1 (b), we have
        
        since, by Lemmas 1 (a) and 5, we have
        
Choosing , we have the result.
On the other hand, by Taylor expansion, we have
        
        where 
 Since 
, we have
        
Let 
 in 
 as 
 and 
. Similar to Lemma 4, it can be shown that 
 (see also Pardoux and Veretennikov (2001) [25] and Yoshida (2011) [26]). It can be shown that 
 (see Altmeyer and Chorowski (2018) [27]). Hence
        
Thus, by Lemma 1 (b), we have
        
        since, by Lemmas 1 (a) and 5, we have
        
Choosing , we have the result.
Now, we study the general case for arbitrary 
p. By Taylor expansion, we have
        
        where 
. Since 
, we have
        
Let 
 in 
 as 
 and 
. Similar to Lemma 4, it can be shown that 
 (see also Pardoux and Veretennikov (2001) [25] and Yoshida (2011) [26]). It can be shown that 
 (see Altmeyer and Chorowski (2018) [27]). Hence
        
Further, by Lemma 1 (b), we have
        
        since, by Lemmas 1 (a) and 5, we have
        
Choosing , we have the result.    □
Remark 3. With , for the Euler scheme, which produces the conditional least squares estimator, one obtains the rate  With , for the Milstein scheme, one obtains the rate  With , for the Simpson scheme, one obtains the rate  With , for the Boole scheme, one obtains the rate  Thus, the higher the p, the sharper the bound. In this terminology, the Itô/Euler scheme gives the first-order QMLE, the Milstein/Stratonovich scheme produces the second-order QMLE, the Simpson scheme produces the fourth-order QMLE and the Boole scheme produces the sixth-order QMLE. See Bishwal (2011) [28] for a connection of this area to the stochastic moment problem and hedging of generalized Black–Scholes options.
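The quantity controlled by a Berry–Esseen bound, namely the uniform distance between the distribution function of the normalized estimator and the standard normal distribution function, can be estimated by Monte Carlo. The sketch below does this for the conditional least squares (Euler) estimator only, and under assumptions that are ours rather than the paper's: the Ornstein–Uhlenbeck drift f(θ, x) = −θx with unit diffusion coefficient, equally spaced observations, a stationary initial value, Fisher information I(θ) = 1/(2θ) (the second moment of the invariant law for this drift), and normalization by the square root of T I(θ_0). The values of T, n and the number of replications are arbitrary, and the reported distance also contains Monte Carlo error, so it is not a verification of the rates in Theorem 1.

```python
import math
import random


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def clse_ou_once(theta, T, n, rng):
    """One replication: simulate a stationary OU path with its exact transition
    and return the conditional least squares estimate of theta."""
    dt = T / n
    a = math.exp(-theta * dt)
    sd = math.sqrt((1.0 - a * a) / (2.0 * theta))
    x = rng.gauss(0.0, math.sqrt(1.0 / (2.0 * theta)))   # draw from the invariant law
    num = 0.0
    den = 0.0
    for _ in range(n):
        x_new = a * x + sd * rng.gauss(0.0, 1.0)
        num += x * (x_new - x)
        den += x * x * dt
        x = x_new
    return -num / den


def kolmogorov_distance(sample):
    """sup_x |F_empirical(x) - Phi(x)|, evaluated at the jump points."""
    zs = sorted(sample)
    m = len(zs)
    return max(max(abs((k + 1) / m - normal_cdf(z)), abs(k / m - normal_cdf(z)))
               for k, z in enumerate(zs))


theta0, T, n, reps = 1.0, 40.0, 2000, 200    # illustrative values only
rng = random.Random(3)
fisher = 1.0 / (2.0 * theta0)                # assumed Fisher information for this drift
normalised = [math.sqrt(T * fisher) * (clse_ou_once(theta0, T, n, rng) - theta0)
              for _ in range(reps)]
dist = kolmogorov_distance(normalised)
print(f"estimated uniform distance to the normal law: {dist:.3f}")
```

Repeating the experiment for several values of T and n, with a larger number of replications, would give an empirical picture of how the distance decreases, against which a theoretical rate could be compared.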