Parameter Estimation of Linear Stochastic Differential Equations with Sparse Observations

Yuecai Han; Zhe Yin; Dingwen Zhang

doi:10.3390/sym14122500

¹ School of Mathematics, Jilin University, Changchun 130012, China

² Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Symmetry 2022, 14(12), 2500; https://doi.org/10.3390/sym14122500

Abstract

We consider parameter estimation for linear stochastic differential equations with independent experiments observed at infrequent and irregularly spaced follow-up times. The maximum likelihood method is used to obtain an asymptotically consistent estimator. A kernel-weighted score function is proposed for the parameter in drift terms. The strong consistency and the rate of convergence of the estimator are obtained. The numerical results show that the proposed estimator performs well with moderate sample sizes.

Keywords:

kernel-weighted estimation; linear stochastic differential equations; geometric Brownian motion; likelihood function

1. Introduction

Linear stochastic differential equations (LSDEs) are frequently used to model the dynamic behavior of complex systems. In many real-world applications, the parameters that define the system must be estimated from data. For example, geometric Brownian motion (GBM) is one of the most popular stochastic processes and an effective tool for modeling and predicting random changes in stock prices [1,2]. Pharmacokinetic and pharmacodynamic models likewise combine deterministic and stochastic components: although drug concentrations follow predictable trends, the precise concentration at a particular time cannot always be determined [3].

In biometrics, a GBM model and an estimation procedure were developed for predicting the height growth of even-aged forest stands as part of a methodology for modeling growth in forest plantations [4].

Due to their growing use in a variety of domains, parameter estimation problems for stochastic differential equations have received considerable attention recently. Several methods have been proposed to estimate the parameters that characterize the system from data, such as the least squares method [5,6,7,8,9], the maximum likelihood method [10,11,12,13,14], and the numerical approximation approach [15]. Other methods, such as the generalized method of moments [16,17], the local linearization method [11,18], and MCMC methods [19], have also been proposed.

Assume that n independent and identically distributed paths are observed. This situation arises, for example, in pharmacokinetics when a number of patients are monitored: each patient is given a bolus of the medication, and the “path” of its diffusion through the body can be observed [3]. Such observations are typically sparse and available only at infrequent and irregularly spaced follow-up times, so the above methods are no longer applicable. For this setting, we develop a computationally efficient method based on kernel weighting and apply it to the parameter estimation of LSDEs. The core of the proposed approach is to “smooth” each individual’s contribution to the likelihood according to the distance between their observation time and the time of interest. Smoothing is thus carried out at the individual level, in contrast to the population level, where all individuals receive the same weight. With a suitable choice of bandwidth, the consistency and asymptotic normality of the proposed estimator can be obtained. One can refer to [20,21] for statistical inference of diffusion processes.

In future research, we may consider parameter estimation for a more generalized drift term such as

f (X_{t}, θ)

and the nonparametric estimation for the drift function

f (X_{t})

where

f (\cdot)

is a measurable function, based on independent and identically distributed observations.

Our paper is organized as follows. In Section 2, we propose an estimator for the drift parameter of the LSDE; we also obtain the consistency of

μ

and

σ

and show their asymptotic normality where the convergence rate is

{(n h_{n})}^{1 / 2}

, which is slower than

n^{1 / 2}

. In Section 3, numerical simulations are performed. Simulation findings show that the large sample approximations are suitable for usage in practice. Section 4 is the conclusion.

2. Models and Methods

2.1. Description of Models

Consider an LSDE model as follows

\left\{\begin{matrix} d S (t) = μ S (t) d t + σ S (t) d W (t), \\ S (0) = s_{0}, \end{matrix}\right.

(1)

where

μ

is an unknown parameter,

σ

is a constant, and

W (t)

is a standard Brownian motion.

W (t)

is characterized by the following properties: (1)

W (0) = 0

; (2)

W (t)

has independent increments; that is, for every

t > 0

, the future increments

W (t + u) - W (t)

,

u ⩾ 0

, are independent of the past values

W (s), s ⩽ t

, and

W (t + u) - W (t) \sim N (0, u)

; (3)

W (t)

is continuous in t. Since the coefficients of (1) are linear, and hence satisfy the Lipschitz and linear growth conditions, the LSDE (1) has a unique strong solution

S (t)

,

S (t) = s_{0} exp \{(μ - \frac{1}{2} σ^{2}) t + σ W (t)\} .

(2)
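For readers who want the intermediate step, the following is a short sketch of how (2) follows from (1) by applying Itô's formula to $log S (t)$; it uses only the symbols already defined above and the standard rule ${(d W (t))}^{2} = d t$.

```latex
d \log S(t)
  = \frac{1}{S(t)}\, dS(t) - \frac{1}{2 S(t)^{2}}\, \bigl(dS(t)\bigr)^{2}
  = \Bigl(\mu - \tfrac{1}{2}\sigma^{2}\Bigr)\, dt + \sigma\, dW(t),
\qquad\text{hence}\qquad
\log S(t) = \log s_{0} + \Bigl(\mu - \tfrac{1}{2}\sigma^{2}\Bigr) t + \sigma W(t).
```

Exponentiating the last identity gives the closed form (2).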

Let

\{S_{i} (t), s_{0 i}, μ_{i}, σ_{i}; i = 1, \dots, n\}

be n independent copies of

\{S (t), s_{0}, μ, σ\}

. It is well known (see, e.g., [22]) that, for given t,

S_{i} (t), i = 1, \dots, n

follow a lognormal distribution

log S_{i} (t) \sim N (log s_{0} + μ t - \frac{1}{2} σ^{2} t, σ^{2} t) .

(3)
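The distributional statement (3) can be checked numerically. The sketch below is only an illustration: it draws $log S (t)$ from the closed form (2) and compares the sample mean and variance with the values in (3). The parameter values, the time point `t = 0.5`, the seed, and the function name `simulate_log_gbm` are assumptions chosen for the example.

```python
import math
import random
import statistics

def simulate_log_gbm(s0, mu, sigma, t, rng):
    """Draw log S(t) from the exact solution (2), using W(t) ~ N(0, t)."""
    w_t = rng.gauss(0.0, math.sqrt(t))
    return math.log(s0) + (mu - 0.5 * sigma**2) * t + sigma * w_t

# Illustrative values (assumed for this example only).
rng = random.Random(0)
s0, mu, sigma, t = 1.0, 2.0, 0.02, 0.5
draws = [simulate_log_gbm(s0, mu, sigma, t, rng) for _ in range(20000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
theory_mean = math.log(s0) + (mu - 0.5 * sigma**2) * t  # mean in (3)
theory_var = sigma**2 * t                               # variance in (3)

print("mean:", sample_mean, "vs", theory_mean)
print("variance:", sample_var, "vs", theory_var)
```

With a large number of draws, the sample moments should be close to the theoretical ones.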

We are aiming to use the observations to estimate

μ

, where the observations consist of

y = {y_{i} (t_{i k}); i = 1, \dots, n; k = 1, \dots, d_{i}}

, and

d_{i} < \infty

. For the ith subject, the probability density of the log-observation at time point

t_{i k}

is

p (y_{i} (t_{i k}); μ) = \frac{1}{\sqrt{2 π σ^{2} t_{i k}}} exp \{- \frac{{(log y_{i} (t_{i k}) - ϕ (t_{i k}))}^{2}}{2 σ^{2} t_{i k}}\},

where

ϕ (t) = log s_{0} + μ t - \frac{1}{2} σ^{2} t

.

If the observations are continuous, consider a time point

t^{*}

. We know that

y_{i}^{*} = y_{i} (t^{*})

are independent and identically distributed variables on the same probability space. Then we get the likelihood function

\begin{matrix} L_{n} (y^{*}; μ) = & \prod_{i = 1}^{n} p (y_{i}^{*}; μ) \\ = & \prod_{i = 1}^{n} \frac{1}{\sqrt{2 π σ_{i}^{2} t^{*}}} exp \{- \frac{{(log y_{i}^{*} - ϕ (t^{*}))}^{2}}{2 σ_{i}^{2} t^{*}}\}, \end{matrix}

where

y^{*} = (y_{1}^{*}, y_{2}^{*}, \dots, y_{n}^{*})

. The log-likelihood function is

l_{n} (y^{*}; μ) = \frac{1}{n} \sum_{i = 1}^{n} [- \frac{{(log y_{i}^{*} - ϕ (t^{*}))}^{2}}{2 {(Σ_{i} (t^{*}))}^{2}} - log \sqrt{2 π} - log Σ_{i} (t^{*})],

where

Σ_{i} (t^{*}) = \sqrt{σ_{i}^{2} t^{*}}

and the score function is

\begin{matrix} B_{n} (y^{*}, μ) = & \frac{\partial l_{n} (y^{*}; μ)}{\partial μ} . \end{matrix}

(4)
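As a worked illustration of the score function (4), assume a common diffusion coefficient $σ_{i} = σ$ for all subjects (as in model (1)), so that ${(Σ_{i} (t^{*}))}^{2} = σ^{2} t^{*}$. Differentiating the log-likelihood with respect to $μ$ and setting the result to zero then gives a closed-form estimator in the fully observed case:

```latex
B_{n}(y^{*}, \mu)
  = \frac{1}{n} \sum_{i=1}^{n} \frac{\bigl(\log y_{i}^{*} - \phi(t^{*})\bigr)\, t^{*}}{\sigma^{2} t^{*}}
  = \frac{1}{n \sigma^{2}} \sum_{i=1}^{n}
    \Bigl(\log y_{i}^{*} - \log s_{0} - \bigl(\mu - \tfrac{1}{2}\sigma^{2}\bigr) t^{*}\Bigr),
\qquad
B_{n}(y^{*}, \hat{\mu}) = 0
\;\Longrightarrow\;
\hat{\mu} = \frac{1}{t^{*}} \Bigl(\frac{1}{n} \sum_{i=1}^{n} \log y_{i}^{*} - \log s_{0}\Bigr) + \frac{\sigma^{2}}{2}.
```

This is the benchmark that the kernel-weighted construction in Section 2.2 approximates when subjects are not observed exactly at $t^{*}$.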

2.2. Kernel Estimation with Forward and Lagged Observation

The data

y_{i} (t), i = 1, \dots, n

are usually not observed continuously, and it is almost impossible for each individual to be observed at

t^{*}

. Hence

l_{n} (y^{*}; μ)

is not computable from the observations. We propose a method that formalizes the forwarding and lagging strategy, with kernel weighting enabling the use of all available forward and lagged observations. We “smooth” the observations’ contribution to the likelihood based on the distance of their observation time to the time of interest. If data continues to be collected on subjects for which observation has occurred, as in the case of the recurrent event, we use the kernel to impute missing values using both forward and backward-lagged observations. We construct a smoothed log-likelihood function by using kernel estimation

\begin{matrix} l_{n} (y^{*}; μ) = - \sum_{i = 1}^{n} \int [\frac{{(K_{h_{n}} (s - t_{k}) log y_{i} (s) - ϕ_{i} (t^{*}))}^{2}}{2 Σ_{i}^{2} (t^{*})} + log \sqrt{2 π} + log Σ_{i}^{2} (t^{*})] d N_{i} (s), \end{matrix}

(5)

where

Σ_{i}^{2} (t^{*}) < \infty

is the variance of

y_{i}^{*}

,

K_{h_{n}} (t) = K (t / h_{n}) / h_{n}

,

h_{n}

is the bandwidth, and the kernel function

K (t)

is a symmetric probability density with support

[- 1, 1]

, mean 0, and a bounded first derivative. In addition,

E [d N (t)] = λ (t) d t

, where

λ (t)

is twice continuously differentiable and strictly positive for

t \in [0, T]

. The scoring function is given by

U_{n} (μ) = \frac{1}{n} \sum_{i = 1}^{n} \int \frac{K_{h_{n}} (t^{*} - t) log y_{i} (t) - (log s_{0} + (μ - \frac{1}{2} Σ_{i}^{2} (t^{*})) t^{*})}{Σ_{i}^{2} (t^{*})} d N_{i} (t) .

(6)

Assume that the following conditions hold:

(A.1): $Θ_{μ_{0}}$ is an open subset of $\mathbb{R}$, with $Θ_{μ_{0}} = \{μ : | μ - μ_{0} | < ρ\}$ for some $ρ > 0$, where $μ_{0}$ is the true parameter.
(A.2): $λ (t)$ is twice continuously differentiable.
(A.3): $K (z)$ is a symmetric density function satisfying $\int_{- \infty}^{\infty} K {(z)}^{2} d z < \infty$. In addition, $h_{n} \to 0$, $n h_{n} \to \infty$, and $n h_{n}^{5} \to 0$.
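
As a worked check of the rate conditions in (A.3), consider the polynomial bandwidth family $h_{n} = n^{- a}$. This parametrization is an assumption made here for illustration; the conditions themselves do not require it. The two rate conditions then reduce to a range for the exponent $a$:

```latex
n h_{n} = n^{1 - a} \to \infty \iff a < 1,
\qquad
n h_{n}^{5} = n^{1 - 5a} \to 0 \iff a > \tfrac{1}{5},
\qquad\text{so}\qquad
\tfrac{1}{5} < a < 1 .
```

Under this assumption, the admissible bandwidths lie strictly between $n^{- 1}$ and $n^{- 1 / 5}$.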

Condition (A.1) is a usual assumption for the proof of consistency, and condition (A.2) ensures the Taylor expansion of the score function to the second order. Our methods depend on a proper choice of bandwidth, which is shown in condition (A.3). The estimator

{\hat{μ}}_{n}

is obtained by maximizing the smoothed log-likelihood (5), that is, by solving the score equation $U_{n} (μ) = 0$ from Equation (6), with a kernel bandwidth chosen to ensure consistency.
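
The estimator just described can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: for each subject it forms a kernel-weighted average of $log y_{i} (t)$ around $t^{*}$, averages these over the subjects that have at least one observation inside the kernel window, and converts the result to an estimate of $μ$. The function names, the data layout (a list of `(t, y)` pairs per subject), the choice of the Epanechnikov kernel, and the noise-free toy data are all assumptions made for the example.

```python
import math

def epanechnikov(x):
    """Symmetric density supported on [-1, 1]."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0

def kernel_drift_estimate(subjects, t_star, h, s0, sigma):
    """Kernel-weighted estimate of mu at t_star.

    subjects: list of subjects, each a list of (t, y) observation pairs.
    Subjects with no observation inside [t_star - h, t_star + h] are skipped.
    """
    ratios = []
    for observations in subjects:
        numerator = denominator = 0.0
        for t, y in observations:
            weight = epanechnikov((t - t_star) / h) / h
            numerator += weight * math.log(y)
            denominator += weight
        if denominator > 0.0:
            ratios.append(numerator / denominator)
    if not ratios:
        raise ValueError("no observations fall inside the kernel window")
    mean_log = sum(ratios) / len(ratios)
    # Invert E[log y(t*)] = log s0 + (mu - sigma^2 / 2) t*.
    return (mean_log - math.log(s0)) / t_star + 0.5 * sigma**2

def on_mean_curve(t, mu=2.0, sigma=0.02):
    """Noise-free toy observation lying exactly on the mean of log S(t)."""
    return (t, math.exp((mu - 0.5 * sigma**2) * t))

# Toy data: observation times symmetric around t* = 0.5; the third subject
# is observed outside the window and therefore contributes nothing.
subjects = [
    [on_mean_curve(0.45), on_mean_curve(0.55)],
    [on_mean_curve(0.40), on_mean_curve(0.60)],
    [on_mean_curve(0.90)],
]
mu_hat = kernel_drift_estimate(subjects, t_star=0.5, h=0.2, s0=1.0, sigma=0.02)
print(mu_hat)
```

Because the toy observations are noise-free and placed symmetrically around $t^{*}$, the estimate recovers the drift value used to build them up to floating-point error; with real data the bandwidth controls the trade-off between how many subjects contribute and how far their observation times are from $t^{*}$.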

Lemma 1.

Under conditions (A.1)–(A.3), we have

E [{(n h_{n})}^{\frac{1}{2}} U_{n} (μ_{0})] \to 0,

as

n \to \infty

.

Proof of Lemma 1.

From the smoothed likelihood function (5), we have the smoothed scoring function

\begin{matrix} {(n h_{n})}^{\frac{1}{2}} U_{n} (μ_{0}) \\ = & {(n h_{n})}^{\frac{1}{2}} \frac{1}{n} \sum_{i = 1}^{n} [\int \frac{K_{h_{n}} (t^{*} - t) log y_{i} (t) - (log s_{0} + (μ - \frac{1}{2} Σ_{i}^{2} (t^{*})) t^{*})}{Σ_{i}^{2} (t^{*})} d N_{i} (t)] \\ = & {(n h_{n})}^{\frac{1}{2}} \frac{1}{n} \sum_{i = 1}^{n} [\int \frac{\frac{1}{h_{n}} K (\frac{t^{*} - t}{h_{n}}) log y_{i} (t) - (log s_{0} + (μ - \frac{1}{2} Σ_{i}^{2} (t^{*})) t^{*})}{Σ_{i}^{2} (t^{*})} d N_{i} (t)] \\ = & I . \end{matrix}

Let

z = \frac{t^{*} - t}{h_{n}}

. We have

\begin{matrix} I = h_{n}^{\frac{1}{2}} n^{- \frac{1}{2}} \sum_{i = 1}^{n} [\int \frac{K (z) log y_{i} (t^{*} - h_{n} z) - (log s_{0} + (μ - \frac{1}{2} Σ_{i}^{2} (t^{*})) t^{*})}{Σ_{i}^{2} (t^{*})} d N_{i} (z)] . \end{matrix}

Define

F (r, s) = E [y (r - s) - (log s_{0} + (μ - Σ_{i}^{2} (t^{*}) / 2) r)] / Σ_{i}^{2} (t^{*})

. Obviously,

F (t^{*}, 0) = 0

. By taking expectations together with Taylor expansion,

\int_{- \infty}^{\infty} K (z) d z = 1

and

\int_{- \infty}^{\infty} z K (z) d z = 0

, we have

\begin{matrix} E [I] = & h_{n}^{\frac{1}{2}} n^{- \frac{1}{2}} \sum_{i = 1}^{n} [\int \frac{K (z) log y_{i} (t^{*} - h_{n} z) - (log s_{0} + (μ - \frac{Σ_{i}^{2} (t^{*})}{2}) t^{*})}{Σ_{i}^{2} (t^{*})} λ (t^{*} - h_{n} z) d z] \\ = & h_{n}^{\frac{1}{2}} n^{\frac{1}{2}} \frac{1}{σ} (- \frac{\partial F}{\partial s} |_{s = 0} h_{n} \frac{\partial λ}{\partial t} |_{t = t^{*}} h_{n} + \frac{1}{2} \frac{\partial^{2} F}{\partial s^{2}} |_{s = 0} h_{n}^{2}) \\ = & o (n^{\frac{1}{2}} h_{n}^{\frac{5}{2}}) . \end{matrix}

From condition (A.3), we have

E [I] = o (1)

. □

The following theorem shows the consistency of the proposed estimator

{\hat{μ}}_{n}

obtained by maximizing the smoothed log-likelihood (5).

Theorem 1.

Under conditions (A.1)–(A.3),

{\hat{μ}}_{n}

is consistent as

n \to \infty

.

Proof of Theorem 1.

Solving Equation (5), we have

{\hat{μ}}_{n} t^{*} = \frac{1}{n h_{n}} \sum_{i = 1}^{n h_{n}} \frac{\int K_{h_{n}} (t - t^{*}) log y (t) d N_{i} (t)}{\int K_{h_{n}} (t - t^{*}) d N_{i} (t)} - log s_{0} + \frac{1}{2} σ^{2} t^{*} .

By properties of the kernel function

K (\cdot)

, we have

\begin{matrix} | {\hat{μ}}_{n} t^{*} - μ t^{*} | = & |\frac{1}{n h_{n}} \sum_{i = 1}^{n h_{n}} \frac{\int K_{h_{n}} (t - t^{*}) (log y (t) - log s_{0} - (μ - \frac{σ^{2}}{2}) t^{*}) d N_{i} (t)}{\int K_{h_{n}} (t - t^{*}) d N_{i} (t)}| \\ \leq & |\frac{1}{n h_{n}} \sum_{i = 1}^{n h_{n}} (log y (b_{i}) - log s_{0} - (μ - \frac{σ^{2}}{2}) t^{*})|, \end{matrix}

where

b_{i} \in [t^{*} - h_{n}, t^{*} + h_{n}]

is some constant. By Equation (3), we have

log y (t) \sim N (log s_{0} + (μ - \frac{σ^{2}}{2}) t, σ^{2} t) .

Hence, we have

log y (b_{i}) - log s_{0} - (μ - \frac{σ^{2}}{2}) t^{*} \sim N (a, σ^{2} t),

where

a = O (h_{n})

is some constant. By the Wiener–Khinchin law of large numbers, we have

| {\hat{μ}}_{n} t^{*} - μ t^{*} | = O (h_{n}),

which goes to zero, as

n \to \infty

. □

The following theorem shows the asymptotic normality of

{\hat{μ}}_{n}

.

Theorem 2.

Assume that conditions (A.1)–(A.3) hold. Then

{\hat{μ}}_{n}

is consistent, and the asymptotic distribution of

{\hat{μ}}_{n}

satisfies

{(n h_{n})}^{\frac{1}{2}} ({\hat{μ}}_{n} - μ) \sim N (0, {(C_{n} (μ_{0}))}^{2} Γ (μ_{0})),

as

n \to \infty

, where

C_{n} (μ_{0}) = - {(\int_{0}^{1} \frac{\partial U_{n}}{\partial μ} (μ_{0} + λ (μ - μ_{0})) d λ)}^{- 1},

and

Γ (μ_{0}) = \int {(K (z))}^{2} \frac{1}{σ^{2}} t^{*} λ (t^{*} - h_{n} z) d z .

Proof of Theorem 2.

Let

{μ_{n}}

be a strongly consistent sequence of

μ

, i.e.,

μ_{n} \overset{a . s .}{\to} μ_{0}

, and

{μ_{n}} \in Θ_{μ_{0}}

. We can seek a solution

{\hat{μ}}_{n}

of the log-likelihood function

l_{n} (y^{*}, {\hat{μ}}_{n})

, and

{\hat{μ}}_{n}

is a strongly consistent sequence. Note that

U_{n} (μ) = \frac{1}{n} \sum_{i = 1}^{n} \int \frac{K_{h_{n}} (t^{*} - t) log y_{i} (t) - (log s_{0} + (μ - \frac{1}{2} Σ_{i}^{2} (t^{*})) t^{*})}{Σ_{i}^{2} (t^{*})} d N_{i} (t),

we denote

U_{n} (μ) = \frac{1}{n} \sum_{i = 1}^{n} ψ (y_{i}^{*}, μ)

. Expand

U_{n}

as

\begin{matrix} U_{n} (μ) = & U_{n} (μ_{0}) + \int_{μ_{0}}^{μ} \frac{\partial U_{n} (u)}{\partial μ} d u \\ = & U_{n} (μ_{0}) + \int_{0}^{1} \frac{\partial U_{n}}{\partial μ} (μ_{0} + λ (μ - μ_{0})) d λ (μ - μ_{0}) . \end{matrix}

Let

μ = {\hat{μ}}_{n}

, we have

U_{n} ({\hat{μ}}_{n}) = 0

. Then

{\hat{μ}}_{n} - μ = - {(\int_{0}^{1} \frac{\partial U_{n}}{\partial μ} (μ_{0} + λ (μ - μ_{0})) d λ)}^{- 1} U_{n} (μ_{0}) .

(7)

Multiplying both sides of Equation (7) by

{(n h_{n})}^{\frac{1}{2}}

, we denote

C_{n} (μ_{0}) = - {(\int_{0}^{1} \frac{\partial U_{n}}{\partial μ} (μ_{0} + λ (μ - μ_{0})) d λ)}^{- 1} .

Then

{(n h_{n})}^{\frac{1}{2}} ({\hat{μ}}_{n} - μ) = C_{n} (μ_{0}) {(n h_{n})}^{\frac{1}{2}} U_{n} (μ_{0}) .

We now compute the variance of

{(n h_{n})}^{\frac{1}{2}} U_{n} (μ_{0})

,

{(n h_{n})}^{\frac{1}{2}} U_{n} (μ_{0}) = \frac{{(n h_{n})}^{\frac{1}{2}}}{n} \sum_{i = 1}^{n} ψ (y_{i}^{*}, μ_{0}) = n^{\frac{1}{2}} (\frac{1}{n} \sum_{i = 1}^{n} h_{n}^{\frac{1}{2}} ψ (y_{i}^{*}, μ_{0})) .

From Lemma 1, we have that

E [h_{n}^{\frac{1}{2}} ψ (y_{i}^{*}, μ_{0})] = 0, i = 1, \dots, n

. By the central limit theorem,

var [{(n h_{n})}^{\frac{1}{2}} U_{n} (μ_{0})] = var [h_{n}^{\frac{1}{2}} ψ (y_{i}^{*}, μ_{0})]

, and we denote

ϕ_{i} (t) = y_{i} (t) - (log s_{0} + (μ_{0} - \frac{1}{2} σ^{2}) t^{*})

. Then

\begin{matrix} var [h_{n}^{\frac{1}{2}} ψ (y_{i}^{*}, μ_{0})] \\ = & E [{(h_{n}^{\frac{1}{2}} ψ (y_{i}^{*}, μ_{0}))}^{2}] \\ = & h_{n} E [{(\sum_{k = 1}^{n} \frac{K_{h_{n}} (t^{*} - t_{i k}) (y_{i} (t_{i k}) - (log s_{0} + (μ_{0} - \frac{1}{2} Σ_{i}^{2} (t^{*})) t^{*}))}{Σ_{i}^{2} (t^{*})})}^{2}] \\ = & h_{n} \frac{1}{σ^{4}} (\int_{t_{1} \neq t_{2}} K_{h_{n}} (t^{*} - t_{1}) K_{h_{n}} (t^{*} - t_{2}) E [ϕ_{i} (t_{1}) ϕ_{i} (t_{2})] E [d N (t_{1}) d N (t_{2})] \\ + \int_{t_{1} = t_{2}} {(K_{h_{n}} (t^{*} - t_{1}))}^{2} λ (t_{1}) E [ϕ_{i} {(t_{1})}^{2}] d t_{1}) . \end{matrix}

Assume that for

t_{1} \neq t_{2}

,

p r (d N (t_{1}) = 1 | N (t_{2}) - N (t_{2}^{-}) = 1) = g (t_{1}, t_{2}) d t_{1}

, where

p r

means the probability,

g (t_{1}, t_{2})

is continuous for

t_{1} \neq t_{2}

, and

g (t_{1} \pm, t_{2} \pm)

exists. Then

\begin{matrix} E [{(h_{n}^{\frac{1}{2}} ψ (y_{i}^{*}, μ_{0}))}^{2}] \\ = & h_{n} (\frac{1}{σ^{4}} \int_{t_{1} \neq t_{2}} K_{h_{n}} (t^{*} - t_{1}) K_{h_{n}} (t^{*} - t_{2}) E [ϕ_{i} (t_{1}) ϕ_{i} (t_{2})] g (t_{1}, t_{2}) E [d N (t_{2})] d t_{1} \\ + & \frac{1}{σ^{4}} \int_{t_{1} = t_{2}} {(K_{h_{n}} (t^{*} - t_{1}))}^{2} λ (t_{1}) E [ϕ_{i} {(t_{1})}^{2}] d t_{1}) \\ = & h_{n} (I_{1} + I_{2}) . \end{matrix}

Using a change of variables, we have

h_{n} I_{1} = O (h_{n})

.

With notation

E [{(ϕ_{i} (t^{*} - h_{n} z))}^{2}] = σ^{2} t^{*}

, we have

\begin{matrix} h_{n} I_{2} = & h_{n} \frac{1}{σ^{4}} \int {(h_{n})}^{- 2} {(K (z))}^{2} E [{(ϕ_{i} (t^{*} - h_{n} z))}^{2}] λ (t^{*} - h_{n} z) h_{n} d z \\ = & \int {(K (z))}^{2} \frac{1}{σ^{2}} t^{*} λ (t^{*} - h_{n} z) d z . \end{matrix}

From the Lyapunov central limit theorem, we have that

{(n h_{n})}^{\frac{1}{2}} U_{n} (μ_{0})

converges in distribution to a Gaussian random variable

Z \sim N (0, Γ (μ_{0}))

. Hence, we have

{(n h_{n})}^{\frac{1}{2}} ({\hat{μ}}_{n} - μ) = C_{n} (μ_{0}) {(n h_{n})}^{\frac{1}{2}} U_{n} (μ_{0}) \sim N (0, {(C_{n} (μ_{0}))}^{2} Γ (μ_{0})) .

□

Remark 1.

When there are several observations

d_{i} > 1

, one can estimate the drift parameter μ by a standard maximum likelihood method. Let

z_{i k} = log (\frac{y_{i} (t_{i k + 1})}{y_{i} (t_{i k})}),

for

k = 1, \dots, d_{i} - 1

and

i = 1, 2, \dots, n

. Thus, conditional on the observation times,

z_{i k} \sim N ((μ - σ^{2} / 2) (t_{i k + 1} - t_{i k}), σ^{2} (t_{i k + 1} - t_{i k}))

and they are independent. For example, if we reparameterize μ as

ν = μ - σ^{2} / 2

, the MLE for ν would be given by

\hat{ν} = \frac{\sum_{i = 1}^{n} \sum_{k = 1}^{d_{i} - 1} z_{i k} / Δ_{k}^{(i)}}{\sum_{i = 1}^{n} (d_{i} - 1)},

(8)

where

Δ_{k}^{(i)} = t_{i k + 1} - t_{i k}

. Then μ is estimated by

\hat{μ} = \hat{ν} + σ^{2} / 2

.
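
The increment-based estimator of Remark 1 can be sketched as follows. This is an illustration of Equation (8) as written above, under the assumption that each subject's observations are stored as a time-sorted list of `(t, y)` pairs; the function name and the noise-free toy data are hypothetical.

```python
import math

def increment_estimator(subjects, sigma):
    """Average of z / Delta over all consecutive observation pairs, as in (8),
    followed by the back-transformation mu = nu + sigma^2 / 2."""
    total, pairs = 0.0, 0
    for observations in subjects:  # each list is assumed sorted by time
        for (t0, y0), (t1, y1) in zip(observations, observations[1:]):
            delta = t1 - t0
            if delta <= 0.0:
                continue  # skip tied observation times
            total += math.log(y1 / y0) / delta
            pairs += 1
    if pairs == 0:
        raise ValueError("need at least one subject with two observations")
    nu_hat = total / pairs
    return nu_hat + 0.5 * sigma**2

def on_mean_curve(t, mu=2.0, sigma=0.02):
    """Noise-free toy observation lying exactly on the mean of log S(t)."""
    return (t, math.exp((mu - 0.5 * sigma**2) * t))

subjects = [
    [on_mean_curve(0.1), on_mean_curve(0.4), on_mean_curve(0.9)],
    [on_mean_curve(0.3)],  # a single observation yields no increment
    [on_mean_curve(0.2), on_mean_curve(0.7)],
]
mu_hat = increment_estimator(subjects, sigma=0.02)
print(mu_hat)
```

The subject with a single observation contributes no increment, which is the situation addressed in Remark 2. Note also that, by the distribution of the increments stated above, each ratio $z_{i k} / Δ_{k}^{(i)}$ has variance $σ^{2} / Δ_{k}^{(i)}$, so very short gaps between observations produce noisy terms.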

Remark 2.

When there is only one observation

d_{i} = 1

, the estimator proposed in Equation (8) is not effective. Our estimator performs reasonably well in this extreme case, and we could give an explicit asymptotic variance for our estimator, which is

σ^{2} t^{*}

.

3. Simulation

In this section, utilizing both forward and backward-lagged observation, we examine the kernel estimator. We generate 1000 datasets, and each dataset consists of

n = 100, 400, 900

subjects with different bandwidths (BD). The process is generated through model (1); we set the initial condition

s_{0} = 1

,

μ = 2

and

σ = 0.02

. Then the solution is

S (t) = s_{0} exp \{1.9998 t + 0.02 w (t)\},

(9)

where

w (t)

is a standard Brownian motion. The number of observation times for each subject is Poisson distributed with intensity rate 5. The observation times of each subject are generated from a uniform distribution, Unif

(0, 1)

. Results for other parameter choices are omitted because they are essentially identical. All simulations were performed on a laptop running R 4.2.9 with 8 GB of RAM.
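
The sampling design described above can be sketched as follows. The simulations in this paper were run in R; the Python code below is only an assumed re-expression of the data-generating step (Poisson number of observation times with rate 5, times drawn from Unif(0, 1), GBM values generated from the exact transition), and the function names and seed are hypothetical.

```python
import math
import random

def poisson_count(rng, rate):
    """Poisson(rate) draw: number of unit-rate exponential arrivals before `rate`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < rate:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def simulate_subject(rng, s0=1.0, mu=2.0, sigma=0.02, rate=5.0):
    """Return [(t, S(t)), ...] for one subject at sparse random times in [0, 1)."""
    times = sorted(rng.random() for _ in range(poisson_count(rng, rate)))
    log_s, previous, path = math.log(s0), 0.0, []
    for t in times:
        dt = t - previous
        # Exact GBM transition: Gaussian increment of log S over a gap of length dt.
        log_s += (mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append((t, math.exp(log_s)))
        previous = t
    return path

rng = random.Random(2022)  # arbitrary seed, for reproducibility of the sketch
dataset = [simulate_subject(rng) for _ in range(900)]
mean_count = sum(len(path) for path in dataset) / len(dataset)
print(f"subjects: {len(dataset)}, mean observations per subject: {mean_count:.2f}")
```

Some subjects may end up with no observations at all, which is consistent with a Poisson number of observation times.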

Based on Theorems 1 and 2, we obtain a kernel estimator with asymptotically negligible bias and employ bandwidths in the range

(n^{- 1}, n^{- \frac{1}{5}})

when calculating

{\hat{μ}}_{n}

using the smoothed likelihood score function (6). The kernel function we choose is the Epanechnikov kernel, which is

K (x) = \frac{3}{4} {(1 - x^{2})}_{+}

. According to additional simulations (not reported), the choice of kernel function has little effect on the estimator’s empirical performance.
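
As a quick numerical check, the sketch below verifies that the Epanechnikov kernel integrates to one and has mean zero, and evaluates $\int K {(z)}^{2} d z$, the quantity required to be finite in condition (A.3). The midpoint-rule helper `midpoint_integral` is a hypothetical utility written for this example.

```python
def epanechnikov(x):
    """K(x) = 3/4 (1 - x^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0

def midpoint_integral(f, a, b, steps=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

mass = midpoint_integral(epanechnikov, -1.0, 1.0)
mean = midpoint_integral(lambda x: x * epanechnikov(x), -1.0, 1.0)
roughness = midpoint_integral(lambda x: epanechnikov(x) ** 2, -1.0, 1.0)

print("integral of K:", mass)
print("integral of z K:", mean)
print("integral of K^2:", roughness)
```

Analytically, these three integrals equal 1, 0, and 3/5, so the kernel satisfies the density, symmetry, and square-integrability requirements.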

The simulation results show that the estimates of the model parameter are accurate. Table 1 summarizes the main findings from 1000 simulations. The bias is small and diminishes as the sample size grows. Performance improves with larger sample sizes and smaller bandwidths. The overall parameter estimates are evaluated by the bias and relative bias (RB), which are defined as

Bias (\hat{μ}) = \hat{μ} - μ_{0}, RB (\hat{μ}) = \frac{| Bias (\hat{μ}) |}{| μ_{0} |},

where

μ_{0}

denotes the true parameter.
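
The two summary measures are simple to compute. The sketch below uses the sign convention under which the bias entries of Table 1 are positive (estimate minus true value); the function name is hypothetical, and the input is the first row of Table 1.

```python
def bias_and_relative_bias(mu_hat, mu_true):
    """Return (bias, relative bias), with bias = estimate - true value."""
    bias = mu_hat - mu_true
    return bias, abs(bias) / abs(mu_true)

# First row of Table 1: n = 100, bandwidth n^(-0.25), estimate 2.095, true value 2.
bias, rb = bias_and_relative_bias(2.095, 2.0)
print(f"bias = {bias:.3f}, RB = {rb:.4f}")
```

For this row the bias is 0.095 and the relative bias is 0.0475, which Table 1 reports to three decimals.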

Table 1. Simulation results with different n and bandwidths.

4. Conclusions

In this paper, we have presented a kernel-weighting method for estimating the LSDE model (1) from repeated experiments in which the observation times are random and the number of observations per individual is random and possibly sparse. This extends the earlier literature, which usually assumes equally spaced observation intervals and cannot handle sparse observations. We consider maximum likelihood estimation of the drift parameter and, under conditions (A.1)–(A.3), establish the consistency and asymptotic normality of the proposed estimator. In numerical studies, we set the true parameter

μ_{0} = 2

,

σ_{0} = 0.02

, and the initial condition

s_{0} = 1

for each individual with sparse observations (the frequency of observation follows a Poisson distribution with mean 5). Using the smoothed scoring function, we obtain the estimation of the drift parameter.

Author Contributions

All authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by NSFC (grant 11871244) and the Fundamental Research Funds for the Central Universities, JLU.

Data Availability Statement

No data were used in this paper.

Conflicts of Interest

The authors declare no competing interests arising from the preparation or publication of this article.

References

1. Black, F.; Scholes, M. The pricing of options and corporate liabilities. J. Political Econ. 1973, 81, 637–654.
2. Merton, R.C. Theory of rational option pricing. Bell J. Econ. Manag. Sci. 1973, 4, 141–183.
3. Donnet, S.; Samson, A. A review on estimation of stochastic differential equations for pharmacokinetic/pharmacodynamic models. Adv. Drug Deliv. Rev. 2013, 65, 929–939.
4. Garcia, O. A stochastic differential equation model for the height growth of forest stands. Biometrics 1983, 39, 1059–1072.
5. Hu, Y.; Long, H. Least squares estimator for Ornstein–Uhlenbeck processes driven by α-stable motions. Stoch. Process. Their Appl. 2009, 119, 2465–2480.
6. Hu, Y.; Nualart, D.; Zhou, H. Drift parameter estimation for nonlinear stochastic differential equations driven by fractional Brownian motion. Stochastics 2019, 91, 1067–1091.
7. Long, H.; Ma, C.; Shimizu, Y. Least squares estimators for stochastic differential equations driven by small Lévy noises. Stoch. Process. Their Appl. 2017, 127, 1475–1495.
8. Neuenkirch, A.; Tindel, S. A least square-type procedure for parameter estimation in stochastic differential equations with additive fractional noise. Stat. Inference Stoch. Process. 2014, 17, 99–120.
9. Gallant, A.R.; Long, J.R. Estimating stochastic differential equations efficiently by minimum chi-squared. Biometrika 1997, 84, 125–141.
10. Elerian, O.; Chib, S.; Shephard, N. Likelihood inference for discretely observed nonlinear diffusions. Econometrica 2001, 69, 959–993.
11. Shoji, I.; Ozaki, T. A statistical method of estimation and simulation for systems of stochastic differential equations. Biometrika 1998, 85, 240–243.
12. Shimizu, Y. M-estimation for discretely observed ergodic diffusion processes with infinitely many jumps. Stat. Inference Stoch. Process. 2006, 9, 179–225.
13. Shimizu, Y.; Yoshida, N. Estimation of parameters for diffusion processes with jumps from discrete observations. Stat. Inference Stoch. Process. 2006, 9, 227–277.
14. Aït-Sahalia, Y. Maximum likelihood estimation of discretely sampled diffusions: A closed-form approximation approach. Econometrica 2002, 70, 223–262.
15. Milshtein, G.N. A method of second-order accuracy integration of stochastic differential equations. Theory Probab. Its Appl. 1979, 23, 396–401.
16. Andersen, T.G.; Sørensen, B.E. GMM estimation of a stochastic volatility model: A Monte Carlo study. J. Bus. Econ. Stat. 1996, 14, 328–352.
17. Hu, Y.; Xi, Y. Estimation of all parameters in the reflected Ornstein–Uhlenbeck process from discrete observations. Stat. Probab. Lett. 2021, 174, 109099.
18. Shoji, I. Approximation of continuous time stochastic processes by a local linearization method. Math. Comput. 1998, 67, 287–298.
19. Martin, J.; Wilcox, L.C.; Burstedde, C.; Ghattas, O. A stochastic Newton MCMC method for large-scale statistical inverse problems with application to seismic inversion. SIAM J. Sci. Comput. 2012, 34, 1460–1487.
20. Brown, B.M.; Hewitt, J.I. Asymptotic likelihood theory for diffusion processes. J. Appl. Probab. 1975, 12, 228–238.
21. Bladt, M.; Sørensen, M. Statistical inference for discretely observed Markov jump processes. J. R. Stat. Soc. Ser. B 2005, 67, 395–410.
22. Øksendal, B. Stochastic Differential Equations: An Introduction with Applications; Springer: Berlin/Heidelberg, Germany, 2006.

Table 1. Simulation results with different n and bandwidths.

n	BD	$\hat{μ}$	Bias($\hat{μ}$)	RB($\hat{μ}$)
100	$n^{- 0.25}$	2.095	0.095	0.048
100	$n^{- 0.3}$	2.065	0.065	0.033
400	$n^{- 0.25}$	2.052	0.052	0.026
400	$n^{- 0.3}$	2.030	0.030	0.015
900	$n^{- 0.25}$	2.020	0.020	0.010
900	$n^{- 0.3}$	2.020	0.020	0.010

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
