Abstract
This paper deals with the maximum likelihood estimator for the parameter of first-order autoregressive models driven by stationary Gaussian noise (colored noise) together with an input. First, we find the optimal input that maximizes the Fisher information; then, with the method of the Laplace transform, we investigate both the asymptotic properties and the asymptotic design problem of the maximum likelihood estimator. Numerical simulations confirm the theoretical analysis and show that the proposed maximum likelihood estimator performs well in finite samples.
1. Introduction
Experiment design has received a great deal of interest over the last decades, both in the early statistical literature [,,,] and in the engineering literature [,,]. The classical approach to experiment design is a two-step procedure: maximize the Fisher information under an energy constraint on the input, and then find an adaptive estimator that is asymptotically normal with the optimal convergence rate and whose variance achieves the inverse of this Fisher information, as presented in [].
In [], the authors showed that real inputs exhibit long-range dependence: the behavior of a real process after a given time t depends on the entire history of the process up to time t. Classical examples appear in finance [,,]. For this reason, researchers have considered control problems driven not only by white noise or standard Brownian motion, but also by fractional-type noise such as fractional Brownian motion [,]. Applications of the fractional case can be found in [,,,]. Let us take [] as an example: the authors considered the controlled drift estimation problem for the fractional Ornstein–Uhlenbeck (FOU) process:
where $W^H$ is a fractional Brownian motion with a known Hurst parameter $H \in (0,1)$, whose covariance function is $\mathbb{E}\left[W^H_t W^H_s\right] = \frac{1}{2}\left(t^{2H} + s^{2H} - |t-s|^{2H}\right)$.
Here, is a deterministic function in a control space and is the unknown drift parameter. The authors achieved the goal of the experiment design for this model.
However, in practice we deal with high-frequency or low-frequency data rather than with continuous observations, as in the previous example. Can we find a discrete model that is close to (1)? To this end, we apply the Euler approximation with discrete time to , which yields a time series:
where is a fractional Gaussian noise with step size . When is a positive constant, we can take a special input such that Equation (2) becomes a controlled AR(1) (autoregressive model of order 1) process with fractional Gaussian noise. For simplicity, we rewrite this model as
where is the fractional Gaussian noise (fGn) with the covariance function defined in (20) (since the variance does not affect the final results, we always take it to be 1). In fact, according to [], we can extend the noise to a centered regular stationary Gaussian noise satisfying
where is the spectral density of . Similar change-point and Kalman–Bucy problems can be found in [,].
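To fix ideas, the following minimal sketch (our code, not the authors'; the values theta = 0.5, H = 0.7 and the constant unit input are illustrative assumptions) simulates this controlled AR(1) model driven by fGn, sampling the noise exactly from its covariance function by a Cholesky factorization.

```python
import numpy as np

def fgn_cov(k, H):
    """Covariance rho(k) = 0.5*(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}) of fGn."""
    k = np.abs(k)
    return 0.5 * ((k + 1.0) ** (2 * H) - 2.0 * k ** (2 * H)
                  + np.abs(k - 1.0) ** (2 * H))

def sample_fgn(N, H, rng):
    """Exact fGn sample via Cholesky factorization (O(N^3), fine for small N)."""
    idx = np.arange(N)
    Gamma = fgn_cov(np.subtract.outer(idx, idx), H)
    return np.linalg.cholesky(Gamma) @ rng.standard_normal(N)

def controlled_ar1(theta, u, H, rng):
    """Discrete model X_n = theta*X_{n-1} + u_n + xi_n with X_0 = 0 (assumed)."""
    xi = sample_fgn(len(u), H, rng)
    X = np.zeros(len(u))
    for n in range(len(u)):
        X[n] = (theta * X[n - 1] if n > 0 else 0.0) + u[n] + xi[n]
    return X

rng = np.random.default_rng(0)
X = controlled_ar1(theta=0.5, u=np.ones(500), H=0.7, rng=rng)
```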
Now, let us return to the model (3). As with the function in (1), the input denotes a deterministic function of time. Obviously, for the problem of estimating the unknown parameter from the observation data , the case without input has been solved in []. When the input belongs to the control space defined in (17), let denote the likelihood function; then, the Fisher information can be written as
Therefore, we denote
Our main goal is to find a function, say , such that
and then, with this input, to find an adapted estimator of the parameter which is asymptotically efficient in the sense that, for any compact set ,
as .
In this paper, with the computation of the Laplace transform, we will find the optimal input for the model (3), and we check that the maximum likelihood estimator (MLE) satisfies (5).
Remark 1.
Here, we assume that the covariance structure of the noise ξ is known. In fact, if this covariance depends on only one parameter, for example, the Hurst parameter H of the fractional Gaussian noise presented in Section 4, we can estimate this parameter with the log-periodogram method [] or with generalized quadratic variations [] and study the plug-in estimator. All details can be found in [].
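As an illustration of the quadratic-variation route mentioned in this remark, here is a minimal sketch (ours, not the estimator of the cited works) of a standard ratio estimator of H based on second-order increments of the cumulated noise; it relies on the identity E[(B_{k+2a} - 2B_{k+a} + B_k)^2] = a^{2H}(4 - 2^{2H}) for fractional Brownian motion B.

```python
import numpy as np

def hurst_gqv(xi):
    """Estimate H from an fGn sample xi via generalized quadratic variations.
    B = cumulated xi is (a discretized) fBm; for second-order increments at
    scale a, E[(B_{k+2a} - 2B_{k+a} + B_k)^2] = a^{2H}(4 - 2^{2H}),
    hence V(2)/V(1) = 2^{2H}."""
    B = np.concatenate([[0.0], np.cumsum(xi)])
    d1 = B[2:] - 2 * B[1:-1] + B[:-2]      # second differences, scale a = 1
    d2 = B[4:] - 2 * B[2:-2] + B[:-4]      # second differences, scale a = 2
    return 0.5 * np.log2(np.mean(d2 ** 2) / np.mean(d1 ** 2))
```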
Remark 2.
In this paper, we will not estimate the input function ; we will simply find one function that maximizes the Fisher information.
The organization of this paper is as follows. In Section 2, we give some basic results on the regular stationary noise , derive the likelihood function, and present the formula for the Fisher information. Section 3 provides the main results of this paper, and Section 4 presents simulation examples illustrating the performance of the proposed MLE. All proofs are collected in Appendix A.
2. Preliminaries and Notations
2.1. Stationary Gaussian Sequences
First of all, let us construct the connection between stationary Gaussian noise and the independent case. Suppose that the covariance of the stationary Gaussian process is defined by
when it is positive definite, there exists an associated innovation sequence , with independent components, defined by the following relations:
By the theorem on Normal Correlation (Theorem 13.1, []), there exists a deterministic kernel such that
For , we will denote by the partial correlation coefficient
Via the Levinson–Durbin algorithm (see []), we have the following relationship between the covariance function defined in (6), the sequence of partial correlation coefficients , and the variances of the innovations :
Since the covariance matrix of is positive definite, there also exists an inverse deterministic kernel such that
The relationship between the kernels k and K can be found in [].
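The Levinson–Durbin recursion used above is straightforward to implement; the following sketch (variable names are ours) computes the partial correlation coefficients and the innovation variances from the covariances:

```python
import numpy as np

def levinson_durbin(rho):
    """rho[0..N]: covariances of the stationary noise.
    Returns the partial correlation coefficients alpha[1..N] and the
    innovation variances sigma2[1..N] of the associated innovation sequence."""
    N = len(rho) - 1
    phi = np.zeros(N + 1)      # prediction coefficients of the current order
    alpha = np.zeros(N)
    sigma2 = np.zeros(N)
    v = rho[0]                 # innovation variance at order 0
    for n in range(1, N + 1):
        # reflection coefficient = partial correlation at lag n
        k = (rho[n] - phi[1:n] @ rho[1:n][::-1]) / v
        phi[1:n] = phi[1:n] - k * phi[1:n][::-1]
        phi[n] = k
        v *= 1.0 - k ** 2
        alpha[n - 1], sigma2[n - 1] = k, v
    return alpha, sigma2
```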
Remark 3.
It is worth mentioning that the condition (4) implies that
This condition is theoretically verified for classical autoregressive moving-average (ARMA) noises. To our knowledge, no explicit form of the partial autocorrelation coefficients of fractional Gaussian noise (fGn) is known; however, since the explicit formula for the spectral density of fGn sequences was exhibited in [], condition (4) is fulfilled for any Hurst index . For the closely related fractional autoregressive integrated moving average (fractional ARIMA) processes, it has been proven that in [].
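As a purely numerical illustration (not a proof) of this remark, one can combine the two sketches above: compute the partial correlation coefficients of fGn with the Levinson–Durbin recursion and inspect n|alpha_n|, which appears to stay bounded, consistent with the O(1/n) decay proven in the fractional ARIMA case; H = 0.7 and N = 2000 are arbitrary choices of ours.

```python
import numpy as np

H, N = 0.7, 2000                        # arbitrary illustrative values
rho = fgn_cov(np.arange(N + 1), H)      # fgn_cov from the sketch above
alpha, _ = levinson_durbin(rho)
n = np.arange(1, N + 1)
print(np.max(n * np.abs(alpha)))        # appears to stay bounded as N grows
```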
2.2. Model Transformation
From Equalities (13) and (14), the process generates the same filtration as . In the following parts, let the observation be . In fact, it was shown in [] that the process Z can be considered as the first component of a 2-dimensional AR(1) process , which is defined by:
It is not hard to obtain that is a 2-dimensional Markov process, which satisfies the following equation:
with
and the components are independent. Following the idea of [], we define the control space of the input function :
From this control space, we can define the corresponding class for the transformed function :
where the norm is simply the absolute value.
2.3. Fisher Information
As explained above, the observation is the first component of the process . Now, from Equation (15), it is easy to write the likelihood function , which depends on the input function :
Consequently, the Fisher information can be written as
where .
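Since the closed-form expressions rely on the paper's notation, it may help to note that both quantities can be evaluated numerically from first principles. The sketch below (our construction, assuming X_0 = 0 and a given noise covariance matrix Gamma) writes the observations as X = A(theta)^{-1}(u + xi), evaluates the exact Gaussian log-likelihood, and approximates the Fisher information through the standard Gaussian identity I(theta) = dm' S^{-1} dm + (1/2) tr((S^{-1} dS)^2), with the derivatives taken by central differences.

```python
import numpy as np

def mean_cov(theta, u, Gamma):
    """Mean and covariance of X solving X_n = theta*X_{n-1} + u_n + xi_n, X_0 = 0."""
    N = len(u)
    A = np.eye(N) - theta * np.eye(N, k=-1)      # A X = u + xi
    Ainv = np.linalg.inv(A)
    return Ainv @ u, Ainv @ Gamma @ Ainv.T

def log_likelihood(theta, X, u, Gamma):
    """Exact Gaussian log-likelihood of the observations X."""
    m, S = mean_cov(theta, u, Gamma)
    r = X - m
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (logdet + r @ np.linalg.solve(S, r) + len(X) * np.log(2 * np.pi))

def fisher_information(theta, u, Gamma, eps=1e-5):
    """I(theta) = dm' S^{-1} dm + 0.5*tr((S^{-1} dS)^2); central differences."""
    m_lo, S_lo = mean_cov(theta - eps, u, Gamma)
    m_hi, S_hi = mean_cov(theta + eps, u, Gamma)
    dm, dS = (m_hi - m_lo) / (2 * eps), (S_hi - S_lo) / (2 * eps)
    _, S = mean_cov(theta, u, Gamma)
    SdS = np.linalg.solve(S, dS)
    return dm @ np.linalg.solve(S, dm) + 0.5 * np.trace(SdS @ SdS)
```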
3. Main Results
In this part, we will present the main results of this paper. First of all, from the representation of the Fisher information (19), we have the following.
Theorem 1.
The asymptotically optimal input in the control class is for and or for . Furthermore,
where for and for .
Remark 4.
Theorem 1 can be generalized to the AR(p) case using a norm of the Fisher information matrix, but the objective is then not as clear-cut as for AR(1), where a larger Fisher information directly means a smaller error. For this reason, we only illustrate the first-order result, not the order-p case.
From Theorem 1, since the optimal input does not depend on the unknown parameter , we can consider the MLE computed under this input. The following theorem states that will reach the efficiency of (5).
Theorem 2.
With the optimal input defined in Theorem 1, for , the MLE has the following properties:
- The MLE is strongly consistent, that is, it converges to the true parameter almost surely as .
- It is uniformly consistent on compact sets, i.e., for any
- It is uniformly on compacts asymptotically normal, i.e., as , where ξ is a zero-mean Gaussian random variable with the variance defined in Theorem 1. Moreover, we have the uniform convergence of the moments on compacts: for any ,
Remark 5.
From Theorem 2, we can see that the asymptotic properties of the MLE do not depend on the structure of the noise, in line with [].
4. Simulation Study
In this section, Monte Carlo simulations are carried out to verify the asymptotic normality of the MLE under different Gaussian noises, such as AR(1), MA(1) and fractional Gaussian noise (fGn). Since the quantities defined in (8) are explicit in the ARMA case, the MLE is easily obtained there, so we take fGn as our example. In fact, the covariance function of fGn is $\rho(n) = \frac{1}{2}\left(|n+1|^{2H} - 2|n|^{2H} + |n-1|^{2H}\right)$.
As presented in Remark 5, the asymptotic normality of the MLE does not depend on the structure of the stationary noise, which means it does not depend on the Hurst parameter H. Thus, unlike [] and other LSE methods, which exhibit a change of behavior at or at another point in the fractional case, we only need to verify the asymptotic normality for a single fixed H, and we take the value . We compare different and in Figure 1 and Figure 2.
Figure 1.
Histogram of the statistic with and .
Figure 2.
Histogram of the statistic with and .
Although we obtained the optimal input in Theorem 1, our simulation does not use the initial model (1), which is obviously too complicated in the fractional case. Instead, we compute the MLE through the transformation (15) with the corresponding , using the method of Wood and Chan (see []) to simulate the fractional Gaussian noise. All other simulation settings are the same as in [].
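The Wood–Chan method is the classical circulant-embedding algorithm; a minimal sketch (ours) for fGn is given below. For fGn the eigenvalues of the embedding circulant are known to be nonnegative, so the method is exact up to floating-point rounding, which the clip guards against.

```python
import numpy as np

def fgn_wood_chan(N, H, rng):
    """Simulate N points of fGn with Hurst index H by circulant embedding."""
    k = np.arange(N)
    rho = 0.5 * ((k + 1.0) ** (2 * H) - 2.0 * k ** (2 * H)
                 + np.abs(k - 1.0) ** (2 * H))
    # first row of the circulant of size M = 2(N-1):
    # rho(0), ..., rho(N-1), rho(N-2), ..., rho(1)
    row = np.concatenate([rho, rho[-2:0:-1]])
    M = len(row)
    lam = np.fft.fft(row).real          # circulant eigenvalues (real; >= 0 for fGn)
    lam = np.clip(lam, 0.0, None)       # guard tiny negative rounding errors
    W = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    Z = np.fft.fft(np.sqrt(lam / M) * W)
    return Z.real[:N]                   # Z.imag[:N] is an independent second sample
```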
Remark 6.
From the two figures, we can see that the standardized error of the MLE is asymptotically normal; we have also verified that its variance is close to the inverse of the Fisher information.
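For completeness, the normality check behind Figures 1 and 2 can be reproduced along the following lines with the sketches given earlier (fgn_cov, mean_cov, log_likelihood, fisher_information and fgn_wood_chan); the values theta = 0.5, H = 0.7, N = 200 and the constant unit input are our illustrative assumptions, not necessarily the paper's exact settings.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
theta0, H, N, reps = 0.5, 0.7, 200, 300
u = np.ones(N)
idx = np.arange(N)
Gamma = fgn_cov(np.subtract.outer(idx, idx), H)
I_N = fisher_information(theta0, u, Gamma)   # Fisher information of the whole sample

errors = []
for _ in range(reps):
    xi = fgn_wood_chan(N, H, rng)
    X = np.zeros(N)
    for n in range(N):
        X[n] = (theta0 * X[n - 1] if n > 0 else 0.0) + u[n] + xi[n]
    fit = minimize_scalar(lambda t: -log_likelihood(t, X, u, Gamma),
                          bounds=(-0.99, 0.99), method="bounded")
    errors.append(np.sqrt(I_N) * (fit.x - theta0))

# A histogram of `errors` should look approximately standard normal,
# mirroring the behavior reported in Figures 1 and 2.
```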
5. Conclusions
Starting from the approximation of the controlled fractional Ornstein–Uhlenbeck model (1), we considered the optimal input problem for the AR(1) process driven by stationary Gaussian noise. We found the control function that maximizes the Fisher information and, with the Laplace transform, proved the asymptotic normality of the MLE, whose asymptotic variance does not depend on the structure of the noise. Our future study will focus on the non-Gaussian case, such as the general fractional ARIMA process.
Author Contributions
Methodology, L.S., M.Z. and C.C.; formal analysis, M.Z.; Writing—original draft, C.C.; Writing—review and editing, L.S. and M.Z. All authors have read and agreed to the published version of the manuscript.
Funding
Lin Sun is supported by the Humanities and Social Sciences Research and Planning Fund of the Ministry of Education of China No. 20YJA630053.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
The authors would like to thank the editor and reviewers for their valuable comments, which improved the manuscript.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
The appendix provides the proofs of Theorems 1 and 2. Unless otherwise noted, we only consider ; for the other case, the proofs are the same.
Appendix A.1. Proof of Theorem 1
To prove Theorem 1, we separate the Fisher information of (19) into two parts:
where satisfies the following equation:
Obviously, does not depend on . Thus, and as presented in [], we have
and which can be deduced by (9) in [].
A standard calculation yields
To compute , let . Then, we can see that satisfies the following equation:
where and it is bounded.
Note that and ; we assume that for , and for sufficiently small positive constants and . Consequently, we can state the following result.
Lemma A1.
Let be the 2-dimensional vector which satisfies the following equation:
Then, we have
Proof.
For the sake of notational simplicity, we introduce a 2-dimensional vector , which satisfies the following equation:
In this situation, we have three comparisons. First, we compare . A standard calculation implies that
since , we have
which implies .
Now, we compare . A simple calculation shows that
On both sides of this equation, we then have
which implies .
Finally, since and the component of is bounded, we can easily obtain , which completes the proof. □
Now, we define . Then , where is in the space of . Since the initial value will not change our result, we assume without loss of generality.
Let . Then, it is clear that
Now to prove Theorem 1, we only need the following lemma.
Lemma A2.
Proof.
First of all, taking , then , we can conclude that
we can easily obtain that
This gives the lower bound
Furthermore, a simple calculation shows that
where
Obviously, we can rewrite as
or
where
Let with independent components. Then, we have
and
Let us mention that is a compact symmetric operator for fixed N. We need to estimate the spectral gap (the first eigenvalue ) of this operator. The estimation of the spectral gap is based on the Laplace transform of , which is written as
for sufficiently small negative . On the one hand, when , is a centered Gaussian process with a covariance operator . Using Mercer’s theorem and Parseval’s identity, can be represented by
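In our notation, the standard identity behind this representation is the Laplace transform of a Gaussian quadratic form: if $G = \sum_k \lambda_k Z_k^2$ with $Z_k$ i.i.d. standard normal and $(\lambda_k)$ the eigenvalues of the covariance operator, then for $2\mu\lambda_k < 1$,

```latex
\mathbb{E}\,e^{\mu G}
  = \prod_{k \ge 1}\bigl(1 - 2\mu\lambda_k\bigr)^{-1/2},
\qquad\text{i.e.}\qquad
\log \mathbb{E}\,e^{\mu G}
  = -\tfrac{1}{2}\sum_{k \ge 1}\log\bigl(1 - 2\mu\lambda_k\bigr).
```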
where is the sequence of positive eigenvalues of the covariance operator. A straightforward algebraic calculation shows the following.
where
For
there exist two real eigenvalues of the matrix
Then, we can see that
That is to say, for and for any , . Thus, and the proof is complete. □
Remark A1.
For , means and . As a consequence, we have .
Appendix A.2. Proof of Theorem 2
Let and be the process with the function . Then, we have
To estimate the parameter from the observations , we can write the MLE of with the help of (18):
where
A standard calculation yields
where
The second and third conclusions of Theorem 2, on the asymptotic normality, are crucially based on the asymptotic study of the Laplace transform
for .
First, we can rewrite by the following formula:
where .
As presented in [] and using the Cameron–Martin formula [], we have the following result.
Lemma A3.
For any N, the following equality holds:
where is the unique solution of the equation
and the function is the solution of the Riccati equation:
It is worth mentioning that is the unique solution of the equation
where .
With the explicit formula of the Laplace transform presented in Lemma A3, we have its asymptotical property.
Lemma A4.
Proof.
In [], we have stated that
Since the component of is bounded, we have
From this conclusion, it follows immediately that
Furthermore, using the central limit theorem for martingales, we have
Consequently, the asymptotic normality part of Theorem 2 is obtained.
Strong consistency is immediate when we replace with a suitable positive constant in Lemma A4, because the determinant part tends to 0, as presented in Section 5.2 of [], and the extra part is bounded.
References
- Kiefer, J. On the Efficient Design of Statistical Investigation. Ann. Stat. 1974, 2, 849–879.
- Mehra, R. Optimal Input Signal for Linear System Identification. IEEE Trans. Autom. Control 1974, 19, 192–200.
- Mehra, R. Optimal Input Signals for Parameter Estimation in Dynamic Systems-Survey and New Results. IEEE Trans. Autom. Control 1974, 19, 753–768.
- Ng, T.S.; Qureshi, Z.H.; Cheah, Y.C. Optimal Input Design for an AR Model with Output Constraints. Automatica 1984, 20, 359–363.
- Gevers, M. From the Early Achievements to the Revival of Experiment Design. Eur. J. Control 2005, 11, 1–18.
- Goodwin, G.; Rojas, C.; Welsh, J.; Feuer, A. Robust Optimal Experiment Design for System Identification. Automatica 2007, 43, 993–1008.
- Ljung, L. System Identification: Theory for the User; Prentice Hall: Englewood Cliffs, NJ, USA, 1987.
- Ovseevich, A.; Khasminskii, R.; Chow, P. Adaptive Design for Estimation of Unknown Parameters in Linear Systems. Probl. Inf. Transm. 2000, 36, 125–153.
- Leland, W.E.; Taqqu, M.S.; Willinger, W.; Wilson, D.V. On the Self-Similar Nature of Ethernet Traffic. IEEE/ACM Trans. Netw. 1994, 2, 1–15.
- Comte, F.; Renault, E. Long Memory in Continuous-Time Stochastic Volatility Models. Math. Financ. 1998, 8, 291–323.
- Gatheral, J.; Jaisson, T.; Rosenbaum, M. Volatility Is Rough. Quant. Financ. 2018, 18, 933–949.
- Yajima, Y. On Estimation of Long-Memory Time Series Models. Aust. J. Stat. 1985, 27, 302–320.
- Brouste, A.; Cai, C. Controlled Drift Estimation in Fractional Diffusion Process. Stoch. Dyn. 2013, 13, 1250025.
- Brouste, A.; Kleptsyna, M.; Popier, A. Design for Estimation of Drift Parameter in Fractional Diffusion System. Stat. Inference Stoch. Process. 2012, 15, 133–149.
- Cao, K.; Gu, J.; Mao, J.; Liu, C. Sampled-Data Stabilization of Fractional Linear System under Arbitrary Sampling Periods. Fractal Fract. 2022, 6, 416.
- Chen, S.; Huang, W.; Liu, Q. A New Adaptive Robust Sliding Mode Control Approach for Nonlinear Singular Fractional Order System. Fractal Fract. 2022, 6, 253.
- Jia, T.; Chen, X.; He, L.; Zhao, F.; Qiu, J. Finite-Time Synchronization of Uncertain Fractional-Order Delayed Memristive Neural Networks via Adaptive Sliding Mode Control and Its Application. Fractal Fract. 2022, 6, 502.
- Liu, R.; Wang, Z.; Zhang, X.; Ren, J.; Gui, Q. Robust Control for Variable-Order Fractional Interval System Subject to Actuator Saturation. Fractal Fract. 2022, 6, 159.
- Brouste, A.; Cai, C.; Kleptsyna, M. Asymptotic Properties of the MLE for the Autoregressive Process Coefficients under Stationary Gaussian Noise. Math. Methods Stat. 2014, 23, 103–115.
- Brouste, A.; Cai, C.; Soltane, M.; Wang, L. Testing for the Change of the Mean-Reverting Parameter of an Autoregressive Model with Stationary Gaussian Noise. Stat. Inference Stoch. Process. 2020, 23, 301–318.
- Brouste, A.; Kleptsyna, M. Kalman Type Filter under Stationary Noises. Syst. Control Lett. 2012, 61, 1229–1234.
- Robinson, P. Log-Periodogram Regression of Time Series with Long-Range Dependence. Ann. Stat. 1995, 23, 1048–1072.
- Istas, J.; Lang, G. Quadratic Variations and Estimation of the Local Hölder Index of a Gaussian Process. Ann. Inst. Henri Poincaré Probab. Stat. 1997, 33, 407–436.
- Ben Hariz, S.; Brouste, A.; Cai, C.; Soltane, M. Fast and Asymptotically-Efficient Estimation in a Fractional Autoregressive Process. 2021. Available online: https://hal.archives-ouvertes.fr/hal-03221391 (accessed on 8 May 2021).
- Liptser, R.S.; Shiryaev, A.N. Statistics of Random Processes II: Applications; Springer: Berlin/Heidelberg, Germany, 2001; Volume 2.
- Durbin, J. The Fitting of Time Series Models. Rev. Inst. Int. Stat. 1960, 28, 233–243.
- Sinai, Y.G. Self-Similar Probability Distributions. Theory Probab. Appl. 1976, 21, 64–80.
- Hosking, J. Fractional Differencing. Biometrika 1981, 68, 165–176.
- Chen, Y.; Li, T.; Li, Y. Second Estimator for an AR(1) Model Driven by a Long Memory Gaussian Noise. arXiv 2020, arXiv:2008.12443.
- Wood, A.; Chan, G. Simulation of Stationary Gaussian Processes in [0,1]^d. J. Comput. Graph. Stat. 1994, 3, 409–432.
- Kleptsyna, M.L.; Le Breton, A.; Viot, M. New Formulas Concerning Laplace Transforms of Quadratic Forms for General Gaussian Sequences. Int. J. Stoch. Anal. 2002, 15, 309–325.