Self-Weighted Quasi-Maximum Likelihood Estimators for a Class of MA-GARCH Model

Danni Xie; Xin Liang; Ruilin Liang

doi:10.3390/sym14081723

¹ School of Mathematics and Statistics, Guangxi Normal University, Guilin 541004, China

² Faculty of Science and Technology, University of Macau, Macau 999078, China

* Author to whom correspondence should be addressed.

Symmetry 2022, 14(8), 1723; https://doi.org/10.3390/sym14081723

This article belongs to the Section Mathematics


Abstract

In financial time series analysis, symmetric and asymmetric GARCH models have become essential tools for measuring the characteristics of economic volatility. In this article, we establish the consistency and asymptotic normality of the self-weighted quasi-maximum likelihood estimator for the moving average model with a class of GARCH errors, without assuming the existence of the second moment. Numerical simulation shows that the parameter estimation performs well; empirical analysis shows that the self-weighted quasi-maximum likelihood estimation of the moving average model with a class of GARCH errors can improve the data fitting effect and prediction ability.

Keywords: a class of MA-GARCH models; self-weighted quasi-maximum likelihood estimation; consistency; asymptotic normality

1. Introduction

With the development of time series theory and technology, and in order to accurately describe the dynamic, time-varying volatility of financial time series data, Engle [1] proposed the ARCH model for the volatility of the UK inflation rate, and Bollerslev [2] extended it to the GARCH model. On this basis, many researchers have studied the GARCH model in combination with the asymmetry and long memory of volatility. For example, given that the impact of positive and negative news on stock market volatility is asymmetric, Nelson [3] proposed the exponential GARCH (EGARCH) model; Glosten, Jagannathan, and Runkle [4] proposed the asymmetric GARCH (GJR) model; and H. Viet Long and Bin Jebreen et al. [5] employed the generalized autoregressive conditional heteroskedasticity (GARCH) model for risk management. However, in the traditional GARCH model, the conditional heteroscedasticity is a function of the unobservable lagged squared residuals, which makes the parameter estimation of this model complex. Li and Zhang et al. [6] combined the ideas of a factor model and a symmetric GARCH model to describe the dynamics of a high-dimensional conditional covariance matrix. In addition, Baillie [7] described the population characteristics of various long memory processes in FIGARCH, where the conditional variance of the process implies a slow hyperbolic decay rate for the influence of lagged squared innovations. Ling [8] proposed the DAR(p) model based on the ARCH model, for which a better estimate of the parameters can be obtained. Moreover, Ling [9] established the consistency and asymptotic normality of the self-weighted quasi-maximum likelihood estimator for the ARMA-GARCH model in the presence of fractional moments. Then, Zhu and Ling [10] established the strong consistency and asymptotic normality of the global self-weighted quasi-maximum exponential likelihood estimator for the ARMA-GARCH model in the presence of fractional moments. Similar analyses have been carried out for many related models. For example, Pan and Chen et al. [11] discussed the asymptotic properties of WLAD estimation for the ARFIMA model under stationary and non-stationary conditions. For long-memory and heteroscedastic models, Ramírez-Parietti and Contreras-Reyes et al. [12] computed cross-sample entropy measures for synchrony level, which involved ARMA, ARFIMA, ARMA-GARCH, and FIGARCH processes.

Based on the classical ARCH model, Ling [13] proposed the DAR model, whose parameters are easier to estimate, and proved the consistency and asymptotic normality of the QMLE without assuming the existence of the second-order moment of the sequence. Zhu and Zhang et al. [14] then proposed a class of MA-GARCH models (the DAR(p) model as p tends to infinity) and obtained results similar to those of Ling, again without assuming the existence of second-order moments. However, the QMLE gives every observation a weight of 1, so extreme observations can introduce considerable error. Ling [9] therefore proved that the self-weighted QMLE of the ARMA-GARCH model is consistent and asymptotically normal in the presence of fractional moments. The purpose of the weights is to reduce the influence of leverage points and make the estimates more precise and more efficient.

In this paper, to reduce this error, we study the self-weighted QMLE for the model of Zhu and Zhang et al. [14]. Giving different weights to different observations makes our estimates more robust, and we prove some asymptotic properties of the estimator. The rest of the article is organized as follows. Section 2 introduces the self-weighted QMLE and its consistency and asymptotic normality; Section 3 presents the numerical simulation of the self-weighted QMLE; Section 4 gives real examples; Section 5 concludes the article. The proofs are in Appendix A.

2. Self-Weighted QMLE

The class of MA-GARCH models considered in this paper can be written as

\begin{matrix} y_{t} & = ϕ ε_{t - 1} + ε_{t} \end{matrix}

(1)

\begin{matrix} ε_{t} & = e_{t} \sqrt{h_{t}} \end{matrix}

(2)

\begin{matrix} h_{t} & = ω + α y_{t - 1}^{2} + β h_{t - 1} \end{matrix}

(3)

where $| ϕ | < 1$, $ω, α > 0$, $0 < β < 1$, and, for $t > s$, $e_{t}$ and $y_{s}$ are independent of each other. Denote $δ = (ω, α, β)$ and $θ = {(ϕ, δ)}^{'} \in Θ$, where

Θ = \{θ : - 1 < \underline{ϕ} < ϕ < \bar{ϕ} < 1, 0 < \underline{ω} < ω < \bar{ω}, 0 < \underline{α} < α < \bar{α}, 0 < \underline{β} < β < \bar{β} < 1\}

Denote $Φ (z) = 1 + ϕ z$, $β (z) = 1 - β z$, $Φ^{- 1} (z) = \sum_{i = 0}^{\infty} a_{Φ} (i) z^{i}$ with $\sup_{Θ} | a_{Φ} (i) | = O (ρ_{0}^{i})$, and $β^{- 1} (z) = \sum_{i = 0}^{\infty} β^{i} z^{i}$, where $0 < ρ_{0} < 1$.

Compared with the traditional MA-GARCH model, the term $ε_{t - 1}^{2}$ in (3) is replaced by $y_{t - 1}^{2}$. Because $y_{t - 1}$ is observable, this class of MA-GARCH models is easier to predict than the traditional MA-GARCH model.

Remark 1.

With simple iteration, the model can be transformed into:

y_{t} = \sum_{i = 1}^{\infty} {(- 1)}^{i - 1} ϕ^{i} y_{t - i} + e_{t} \sqrt{ω / (1 - β) + α \sum_{j = 1}^{\infty} β^{j - 1} y_{t - j}^{2}}

From this, we can see that, compared with the DAR(p) model proposed by Ling [14], the above formula corresponds to letting the order p of both the conditional mean and the conditional variance tend to infinity. Then, we refer to Zhu [15] to obtain the parameter space of models (1)–(3).
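The expansion in Remark 1 can be checked numerically. The following is a minimal sketch, not the authors' code: it simulates models (1)–(3) with Gaussian $e_{t}$ and compares the recursive conditional mean and variance with the sums in Remark 1. The start-up values ($ε_{0} = 0$, $y_{0} = 0$, $h_{0} = ω / (1 - β)$) and the function names `simulate` and `expanded_terms` are assumptions made for illustration; with these start-up values the truncated sums agree with the recursion up to rounding error.

```python
import random

def simulate(n, phi, omega, alpha, beta, seed=0):
    """Simulate y_1..y_n from models (1)-(3) with Gaussian e_t.

    Assumed start-up (not from the paper): eps_0 = 0, y_0 = 0,
    h_0 = omega / (1 - beta).
    """
    rng = random.Random(seed)
    y, eps, h = [0.0], [0.0], [omega / (1.0 - beta)]
    for t in range(1, n + 1):
        h_t = omega + alpha * y[t - 1] ** 2 + beta * h[t - 1]   # Equation (3)
        eps_t = rng.gauss(0.0, 1.0) * h_t ** 0.5                # Equation (2)
        y.append(phi * eps[t - 1] + eps_t)                      # Equation (1)
        eps.append(eps_t)
        h.append(h_t)
    return y, eps, h

def expanded_terms(y, t, phi, omega, alpha, beta):
    """Conditional mean and variance of y_t from the sums in Remark 1,
    truncated at the start of the sample (y_s = 0 for s <= 0)."""
    mean = sum((-1) ** (i - 1) * phi ** i * y[t - i] for i in range(1, t))
    var = omega / (1.0 - beta) + alpha * sum(
        beta ** (j - 1) * y[t - j] ** 2 for j in range(1, t))
    return mean, var

phi, omega, alpha, beta = 0.5, 0.1, 0.4, 0.3   # theta_0 used in Section 3
y, eps, h = simulate(200, phi, omega, alpha, beta)
t = 150
mean, var = expanded_terms(y, t, phi, omega, alpha, beta)
# Recursive counterparts: conditional mean phi * eps_{t-1} and variance h_t.
gap_mean = abs(mean - phi * eps[t - 1])
gap_var = abs(var - h[t])
```

Both gaps should be at the level of floating-point rounding, which illustrates that the conditional mean and variance depend only on past observations.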

We introduce the following conditions:

A1: $θ_{0}$ is an interior point of $Θ$, and $\{y_{t}\}$ is stationary, ergodic, and identifiable.
A2: $\{e_{t}\}$ is independent and identically distributed with a mean of 0 and a variance of 1.
A3: $e_{t}^{2}$ has a nondegenerate distribution with $E e_{t}^{4} < \infty$.
A4: $ω_{t} = ω (y_{t - 1}, y_{t - 2}, \dots)$, where $ω (\cdot)$ is a measurable, positive, and bounded function on $R^{Z_{0}}$ with $E [ω_{t} (ξ_{ρ, t - 1}^{2} + ζ_{ρ, t - 1}^{2} + ξ_{ρ, t - 1}^{2} ζ_{ρ, t - 1}^{2})] < \infty$, $Z_{0} = \{0, 1, 2, \dots\}$, and $ρ \in (0, 1)$; here $ξ_{ρ, t}$ and $ζ_{ρ, t}$ are defined in Lemmas 1 and 2.

Given the observations $\{y_{n}, \dots, y_{1}\}$ and the initial values $\{y_{0}, y_{- 1}, y_{- 2}, \dots\}$, we can rewrite the parametric models (1)–(3) as

\begin{matrix} ε_{t} & = y_{t} - ϕ ε_{t - 1} \end{matrix}

(4)

\begin{matrix} ε_{t} & = e_{t} \sqrt{h_{t}} \end{matrix}

(5)

\begin{matrix} h_{t} & = ω + α y_{t - 1}^{2} + β h_{t - 1} \end{matrix}

(6)

Following Yoshida’s [16] research on the asymptotic properties of the likelihood estimators of various nonlinear stochastic processes, the weighted log-quasi-likelihood function (ignoring a constant) can be written as follows:

L_{s n} (θ) = \frac{1}{n} \sum_{t = 1}^{n} ω_{t} l_{t} (θ), \quad l_{t} (θ) = \log h_{t} (θ) + \frac{ε_{t}^{2} (θ)}{h_{t} (θ)}

Remark 2.

Taking the logarithm is important because it ensures that the log-quasi-likelihood attains its extremum at the same point as the original quasi-likelihood function. The quasi-score function and the quasi-information matrix are given in Appendix A. Equation (A11) in Appendix A.2 shows that

\sup_{Θ} ‖ \frac{\partial^{2} l_{t} (θ)}{\partial ϕ \partial ϕ^{'}} ‖ \leq O (1) ξ_{ρ, t - 1}^{2}

Hence, it is possible to obtain an asymptotically normal estimator if the weight $ω_{t}$ downweights $ξ_{ρ, t - 1}^{2}$.

We look for the minimizer, ${\hat{θ}}_{s n} = {({\hat{ϕ}}_{s n}, {\hat{δ}}_{s n})}^{'}$, of $L_{s n}$ on $Θ$; that is,

{\hat{θ}}_{s n} = \arg\min_{Θ} L_{s n} (θ)
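The objective that ${\hat{θ}}_{s n}$ minimizes can be evaluated with the recursions (4)–(6). The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `weighted_objective`, the start-up values $ε_{0} = 0$ and $h_{0} = ω / (1 - β)$, and the toy data are all hypothetical choices made for illustration.

```python
import math

def weighted_objective(theta, y, weights):
    """L_sn(theta) = (1/n) * sum_t w_t * [log h_t + eps_t^2 / h_t].

    y[0] is a pre-sample value; y[1..n] are the observations.
    Assumed start-up (not from the paper): eps_0 = 0, h_0 = omega / (1 - beta).
    weights[t] is the weight attached to observation t (weights[0] is unused).
    """
    phi, omega, alpha, beta = theta
    eps_prev, h_prev = 0.0, omega / (1.0 - beta)
    total, n = 0.0, len(y) - 1
    for t in range(1, n + 1):
        h_t = omega + alpha * y[t - 1] ** 2 + beta * h_prev   # Equation (6)
        eps_t = y[t] - phi * eps_prev                          # Equation (4)
        total += weights[t] * (math.log(h_t) + eps_t ** 2 / h_t)
        eps_prev, h_prev = eps_t, h_t
    return total / n

# Toy data (hypothetical values, for illustration only).
y = [0.0, 0.3, -0.5, 1.2, -0.1, 0.4]
unit = [1.0] * len(y)      # unit weights give the ordinary QMLE objective
half = [0.5] * len(y)
theta = (0.5, 0.1, 0.4, 0.3)
value = weighted_objective(theta, y, unit)
```

In practice this function would be handed to a numerical optimizer restricted to $Θ$; the choice of optimizer is left open here because the paper does not specify one.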

We first give two lemmas.

Lemma 1.

Let $ξ_{ρ, t} = 1 + \sum_{i = 0}^{\infty} ρ^{i} | y_{t - i} |$, where $0 < ρ < 1$. Then there exist constants $ρ$ and $c$ and a neighborhood $Θ_{0}$ of $θ_{0}$ such that:

$(i)$: $\sup_{Θ} | ε_{t - 1} (θ) | \leq c ξ_{ρ, t - 1}$ ;
$(i i)$: $\sup_{Θ} ‖ \frac{\partial ε_{t} (θ)}{\partial ϕ} ‖ \leq c ξ_{ρ, t - 1}$ ;
$(i i i)$: $\sup_{Θ} ‖ \frac{\partial^{2} ε_{t} (θ)}{\partial ϕ \partial ϕ^{'}} ‖ \leq c ξ_{ρ, t - 1}$ .

Lemma 2.

Let $ζ_{ρ, t} = 1 + \sum_{i = 0}^{\infty} ρ^{i} y_{t - i}^{2}$. Then there exist constants $ρ$ and $c$, with $0 < β < ρ < 1$, and a neighborhood $Θ_{0}$ of $θ_{0}$ such that:

$(i)$: $\sup_{Θ} h_{t} (θ) \leq c ζ_{ρ, t - 1}$
$(i i)$: $\sup_{Θ} ‖ \frac{1}{h_{t} (θ)} \frac{\partial h_{t} (θ)}{\partial δ} ‖ \leq c ζ_{ρ, t - 1}$
$(i i i)$: $\sup_{Θ} ‖ \frac{1}{h_{t} (θ)} \frac{\partial^{2} h_{t} (θ)}{\partial δ \partial δ^{'}} ‖ \leq c ζ_{ρ, t - 1}$
$(i v)$: $‖ \frac{1}{\sqrt{h_{t} (θ)}} \frac{\partial h_{t} (θ)}{\partial ϕ} ‖ = 0$

We now can state our main results as follows.

Theorem 1.

If Assumptions A1–A4 hold, then

$(i)$: ${\hat{θ}}_{s n} ⟶_{p} θ_{0}$
$(i i)$: $\sqrt{n} ({\hat{θ}}_{s n} - θ_{0}) ⟶_{d} N (0, Σ_{0}^{- 1} Ω_{0} Σ_{0}^{- 1})$ where

Σ_{0} = E [\begin{matrix} 2 \frac{ω_{t}}{h_{t} (θ_{0})} \frac{\partial ε_{t} (θ_{0})}{\partial ϕ} \frac{\partial ε_{t} (θ_{0})}{\partial ϕ^{'}} & 0 \\ 0 & \frac{ω_{t}}{h_{t}^{2} (θ_{0})} \frac{\partial h_{t} (θ_{0})}{\partial δ} \frac{\partial h_{t} (θ_{0})}{\partial δ^{'}} \end{matrix}]

Ω_{0} = E [\begin{matrix} 4 \frac{ω_{t}^{2}}{h_{t} (θ_{0})} {(\frac{\partial ε_{t} (θ_{0})}{\partial ϕ})}^{2} & 0 \\ 0 & ω_{t}^{2} \frac{ς}{h_{t}^{2} (θ_{0})} {(\frac{\partial h_{t} (θ_{0})}{\partial δ})}^{2} \end{matrix}]

ς = E e_{t}^{4} - 1

\frac{\partial ε_{t} (θ)}{\partial ϕ} = \sum_{i = 1}^{\infty} {(- 1)}^{i} i ϕ^{i - 1} y_{t - i}

\frac{\partial h_{t} (θ)}{\partial δ} = [\frac{1}{1 - β}, \sum_{j = 1}^{\infty} β^{j - 1} y_{t - j}^{2}, \sum_{j = 1}^{\infty} β^{j - 1} h_{t - j}]
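The closed form of $\partial ε_{t} (θ) / \partial ϕ$ given above can be sanity-checked against a finite-difference derivative of the recursion $ε_{t} = y_{t} - ϕ ε_{t - 1}$. The sketch below is illustrative only: the data are hypothetical toy values, the start-up value $ε_{0} = 0$ is an assumption, and the infinite sum is truncated at the start of the sample.

```python
def residuals(y, phi):
    """eps_t = y_t - phi * eps_{t-1}, with eps_0 = 0 (assumed start-up)."""
    eps = [0.0]
    for t in range(1, len(y)):
        eps.append(y[t] - phi * eps[t - 1])
    return eps

def d_eps_d_phi(y, phi, t):
    """Truncated version of sum_{i>=1} (-1)^i * i * phi^(i-1) * y_{t-i}."""
    return sum((-1) ** i * i * phi ** (i - 1) * y[t - i] for i in range(1, t))

# Toy data (hypothetical values); y[0] is a pre-sample value set to zero.
y = [0.0, 0.3, -0.5, 1.2, -0.1, 0.4, 0.7, -0.9]
phi, t, step = 0.5, 7, 1e-6
analytic = d_eps_d_phi(y, phi, t)
# Central finite difference of eps_t with respect to phi.
numeric = (residuals(y, phi + step)[t] - residuals(y, phi - step)[t]) / (2 * step)
```

The two values agree to within the finite-difference error, which supports the sign pattern ${(- 1)}^{i}$ and the factor $i ϕ^{i - 1}$ in the formula.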

3. Simulation

In this section, on the basis of a class of MA-GARCH models, we compared the SQMLE with the QMLE. In the simulation, we studied the cases where $\{e_{t}\}$ follows $N (0, 1)$, $Laplace (0, 1)$, and $t_{3}$. The sequences $\{y_{t}\}$ were generated by models (1)–(3) with $θ_{0} = (0.5, 0.1, 0.4, 0.3)$. We needed to select a weight $ω_{t}$. By analogy with Ling [17] and the influence function of Huber [18], it seemed reasonable to use the following weight:

ω_{t} = \begin{cases} 1, & a_{t} = 0; \\ \frac{c^{3}}{a_{t}^{3}}, & a_{t} \neq 0 \end{cases}

(7)

where $a_{t} = | y_{t - 1} | I (| y_{t - 1} | \geq c)$ and c is the 95% quantile of the sequence $\{y_{t}\}$. This weight satisfies assumption A4. For models (1)–(3), we compared the SQMLE with the QMLE of Zhu and Zhang et al. [14]. We set the sample size at $n = 400, 800, 1200$ and used 1000 replications.
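The weight in Equation (7) is straightforward to compute from the data. The following is a minimal sketch, not the authors' code: the function name `self_weights`, the order-statistic rule used for the empirical quantile, and the toy data are assumptions, and the example uses `q=0.8` only so that a ten-point sample contains downweighted observations (the paper uses the 95% quantile).

```python
def self_weights(y, q=0.95):
    """Weights from Equation (7); y[0] is a pre-sample value.

    Assumption (not stated in the paper): c is the empirical q-quantile
    of y_1, ..., y_n, taken as a simple order statistic.
    """
    ordered = sorted(y[1:])
    c = ordered[min(len(ordered) - 1, int(q * len(ordered)))]
    if c <= 0:
        raise ValueError("the quantile c must be positive for Equation (7)")
    weights = [1.0]                        # placeholder for index 0
    for t in range(1, len(y)):
        # a_t = |y_{t-1}| * I(|y_{t-1}| >= c)
        a_t = abs(y[t - 1]) if abs(y[t - 1]) >= c else 0.0
        weights.append(1.0 if a_t == 0.0 else (c / a_t) ** 3)
    return c, weights

# Toy data (hypothetical values) with two extreme observations.
y = [0.0, 0.1, -0.2, 0.3, 2.0, -0.1, 0.05, -3.0, 0.2, 0.1, -0.15]
c, weights = self_weights(y, q=0.8)
```

In this toy sample the observations following the two extreme values receive weights far below 1, while all other observations keep weight 1, which is how the weight reduces the influence of leverage points.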

From Table 1, when $e_{t} \sim N (0, 1)$, we can see that all estimators have very small biases, and the SD and AD of the self-weighted QMLE are close to each other. As the sample size n increases, the values of SD and AD both decrease. Table 1 also compares the two estimation methods, the quasi-maximum likelihood estimation (QMLE) and the self-weighted quasi-maximum likelihood estimation (SQMLE). As n increases, the bias, AD, and SD show different trends, and the QMLE and SQMLE values get closer and closer. Still, the bias, AD, and SD of the QMLE are always slightly smaller than those of the SQMLE, which is to be expected, as pointed out by Ling [9]. Therefore, the simulation results illustrate that the self-weighted quasi-maximum likelihood estimates are applicable.

Table 1. Estimators for the models (1)–(3) when $e_{t} \sim N (0, 1)$.

When $e_{t} \sim Laplace (0, 1)$ and $e_{t} \sim t_{3}$, respectively, we can see that Table 2 and Table 3 show simulation results similar to those in Table 1. Therefore, the self-weighted quasi-maximum likelihood estimation for a class of MA-GARCH models is valid.

Table 2. Estimators for the models (1)–(3) when $e_{t} \sim Laplace (0, 1)$.

Table 3. Estimators for the models (1)–(3) when $e_{t} \sim t_{3}$.

There are many other weights, such as $ω_{t} = {(1 + C | y_{t} |^{2})}^{- 3 / 2}$ and $I (\max_{1 \leq i \leq p} | y_{t - i} | \leq C)$, that satisfy assumption A4. However, our simulation results, which are not reported in this paper, show that the self-weighted quasi-maximum likelihood estimator based on weight (7) is much more efficient than those based on these weights.

4. A Real Example

In this section, we empirically analyzed models (1)–(3) by using two real data sets. We used the SQMLE to compare the goodness of fit and prediction performance of the class of MA-GARCH models and the traditional MA-GARCH model.

Firstly, we studied the opening price of ST Daji (000564) from 1 January 2018 to 1 January 2022, which has, in total, 837 observations. Let $x_{t}$ be the logarithm of the opening price of ST Daji (000564), shown in Figure 1, and let $y_{t} = x_{t} - x_{t - 1}$; $\{y_{t}\}$ is plotted in Figure 2. Ling [8] makes a self-weighted quasi-maximum likelihood estimate for ARMA-GARCH.

Figure 1. $x_{t}$, the logarithms of the opening price of ST Daji (000564).

Figure 2. Time series diagram of $y_{t} = x_{t} - x_{t - 1}$.

Following Ling [8], the ARMA(0,1)-GARCH(0,1) (MA-GARCH) model was adopted to fit $\{y_{t}\}$, and the result is

\begin{matrix} y_{t} & = - 0.0047 (0.0000) ε_{t - 1} + ε_{t} \end{matrix}

(8)

\begin{matrix} ε_{t} & = e_{t} \sqrt{h_{t}} = e_{t} \sqrt{0.0010 (0.0059) + 0.2703 (3.5188) ε_{t - 1}^{2} + 0.0427 (0.1955) h_{t - 1}} \end{matrix}

(9)

where the standard errors are in parentheses.

Similarly, we used models (1)–(3) to fit $\{y_{t}\}$, and the result is

\begin{matrix} y_{t} & = - 0.0817 (0.0487) ε_{t - 1} + ε_{t} \end{matrix}

(10)

\begin{matrix} ε_{t} & = e_{t} \sqrt{h_{t}} = e_{t} \sqrt{0.0010 (0.0002) + 0.2614 (0.1060) y_{t - 1}^{2} + 0.0443 (0.1585) h_{t - 1}} \end{matrix}

(11)

where the standard errors are in parentheses (obtained from Theorem 1).

We compared the root-mean-squared error (RMSE) of the one-step-ahead forecast and the log-likelihood value for models (8) and (9) and models (10) and (11), respectively. The results are shown in Table 4. In addition, for the part of the data set from 24 March 2021 to 19 August 2021, we give 95 percent confidence intervals for the sequence $x_{t}$, as shown in Figure 3 below. Figure 4 is a partial enlargement of Figure 3. As can be seen from Table 4, the RMSE of models (10) and (11) is smaller than that of models (8) and (9), and the log-likelihood value of models (10) and (11) is larger than that of models (8) and (9). Thus, the goodness of fit of models (10) and (11) is better than that of models (8) and (9). In addition, as can be seen from Figure 3, at almost all points, models (10) and (11) have a narrower confidence interval than models (8) and (9), and all points fall within the confidence interval. This shows that for this data set, models (10) and (11) are better than models (8) and (9) in terms of goodness of fit and predictive ability. Therefore, in this paper, we consider models (10) and (11) to be better than models (8) and (9) to a certain extent.
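A one-step-ahead interval of the kind discussed above can be sketched as follows. This is an illustration under explicit assumptions, not the procedure used for Figure 3: the paper does not state how its 95% intervals were computed, so the Gaussian quantile 1.96, the function name `one_step_interval`, and all input values other than the point estimates from (10) and (11) are hypothetical.

```python
import math

def one_step_interval(x_prev, y_prev, eps_prev, h_prev, theta, z=1.96):
    """One-step-ahead interval for x_t = x_{t-1} + y_t under models (1)-(3).

    z = 1.96 is an assumed Gaussian quantile; the paper does not state
    how its 95% intervals were obtained.
    """
    phi, omega, alpha, beta = theta
    h_t = omega + alpha * y_prev ** 2 + beta * h_prev   # Equation (3)
    center = x_prev + phi * eps_prev                    # conditional mean of x_t
    half_width = z * math.sqrt(h_t)
    return center - half_width, center + half_width, h_t

theta_hat = (-0.0817, 0.0010, 0.2614, 0.0443)   # point estimates in (10)-(11)
# The state values below are hypothetical and chosen only for illustration.
lower, upper, h_t = one_step_interval(
    x_prev=1.20, y_prev=0.02, eps_prev=0.015, h_prev=0.0012, theta=theta_hat)
```

Because $h_{t}$ in models (1)–(3) depends on the observed $y_{t - 1}$ rather than on the unobserved $ε_{t - 1}$, the interval width can be computed directly from past data once the parameters are estimated.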

Table 4. Comparison of the fitting effect of models (8) and (9) and models (10) and (11).

Figure 3. Time series diagram of the data $x_{t}$ (green line) and 95% forecasting confidence intervals based on models (8) and (9) (red line) and models (10) and (11) (black line), respectively.

Figure 4. Part of Figure 3 is enlarged.

Secondly, we studied the 21-day China Interbank Offered Rate from 2 November 2015 to 8 March 2016, which has 178 observations. Let $x_{t}$ be the logarithm of the 21-day China Interbank Offered Rate, shown in Figure 5, and let $y_{t} = x_{t} - x_{t - 1}$; $\{y_{t}\}$ is plotted in Figure 6. Ling [8] makes a self-weighted quasi-maximum likelihood estimate for ARMA-GARCH.

Figure 5. $x_{t}$, the logarithms of the 21-day China Interbank Offered Rate.

Figure 6. Time series diagram of $y_{t} = x_{t} - x_{t - 1}$.

Following Ling [8], the ARMA(0,1)-GARCH(0,1) (MA-GARCH) model was adopted to fit $\{y_{t}\}$, and the result is

\begin{matrix} y_{t} & = - 0.5143 (0.0000) ε_{t - 1} + ε_{t} \end{matrix}

(12)

\begin{matrix} ε_{t} & = e_{t} \sqrt{h_{t}} = e_{t} \sqrt{0.0061 (0.0064) + 0.0773 (5.0485) ε_{t - 1}^{2} + 0.0010 (3.8925) h_{t - 1}} \end{matrix}

(13)

where the standard errors are in parentheses.

Similarly, we used models (1)–(3) to fit $\{y_{t}\}$, and the result is

\begin{matrix} y_{t} & = - 0.4982 (0.0069) ε_{t - 1} + ε_{t} \end{matrix}

(14)

\begin{matrix} ε_{t} & = e_{t} \sqrt{h_{t}} = e_{t} \sqrt{0.0062 (0.0092) + 0.0496 (0.1182) y_{t - 1}^{2} + 0.0010 (1.4060) h_{t - 1}} \end{matrix}

(15)

where the standard errors are in parentheses (obtained from Theorem 1).

From Table 5, we can see that the RMSE and the log-likelihood values of models (12) and (13) and models (14) and (15) are equal. In addition, we give 95 percent confidence intervals for the sequence $x_{t}$, as shown in Figure 7 below. Figure 8 is a partial enlargement of Figure 7. From Figure 7, we can see that models (14) and (15) have a narrower confidence interval than models (12) and (13), and all points fall within the confidence interval. This shows that for this data set, models (14) and (15) are better than models (12) and (13) in terms of predictive ability. Therefore, in this paper, we consider models (14) and (15) to be better than models (12) and (13) to a certain extent.

Table 5. Comparison of the fitting effect of models (12) and (13) and models (14) and (15).

Figure 7. Time series diagram of the data $x_{t}$ (green line) and 95% forecasting confidence intervals based on models (12) and (13) (red line) and models (14) and (15) (black line), respectively.

Figure 8. Part of Figure 7 is enlarged.

5. Conclusions

Motivated by the study of the DAR(p) model with p going to ∞, in this article, we study a class of MA-GARCH models based on Zhu and Zhang et al. [14]. The consistency and asymptotic normality of the self-weighted quasi-maximum likelihood estimator for this class of MA-GARCH models are established. Through simulation, we found that the self-weighted quasi-maximum likelihood estimation of this class of MA-GARCH models is feasible. Empirical results show that this class of MA-GARCH models is better than the traditional MA-GARCH model in terms of goodness of fit and predictive ability. This implies that the model we considered has some applicability in the field of MA-GARCH models.

Author Contributions

Methodology and Simulation, D.X.; Guide and Supervision, X.L.; Software, R.L. All authors have read and agreed to the published version of the manuscript.

Funding

The work is partially supported by the National Natural Science Foundation of China under grant no. 12161009, the Science and Technology Project of Guangxi under grant no. 2021AC06001, and the Scientific Research Foundation of Development Institute of Zhujiang-Xijiang Economic Zone under grant no. ZX2020013.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The empirical data are those of ST Daji (000564), listed on the Shenzhen Stock Exchange, and were obtained from the Choice financial data terminal.

Conflicts of Interest

The authors declare no conflict of interest.


Appendix A

Appendix A.1

Proof of Lemma 1.

(i):

\begin{matrix} \sup_{Θ} | ε_{t - 1} (θ) | & = \sup_{Θ} | Φ^{- 1} (B) y_{t - 1} | = \sup_{Θ} | \sum_{i = 0}^{\infty} a_{Φ} (i) B^{i} y_{t - 1} | \leq \sum_{i = 0}^{\infty} O (ρ^{i}) | y_{t - 1 - i} | \\ & \leq c ξ_{ρ, t - 1} \end{matrix}

(ii):

\begin{matrix} \sup_{Θ} ‖ \frac{\partial ε_{t} (θ)}{\partial ϕ} ‖ & = \sup_{Θ} ‖ - Φ^{- 2} (B) y_{t - 1} ‖ = \sup_{Θ} ‖ - Φ^{- 1} (B) [Φ^{- 1} (B) y_{t - 1}] ‖ \\ & \leq \sum_{j = 0}^{\infty} O (ρ_{0}^{j}) \sum_{i = 0}^{\infty} O (ρ_{0}^{i}) | y_{t - 1 - i - j} | \\ & = c \sum_{k = 0}^{\infty} (k + 1) ρ_{0}^{k} | y_{t - 1 - k} | \leq c \sum_{i = 0}^{\infty} ρ^{i} | y_{t - 1 - i} | \leq c ξ_{ρ, t - 1} \end{matrix}

for some $ρ$ with $0 < \sqrt[k]{k + 1} ρ_{0} \leq ρ < 1$.

(iii):

\begin{matrix} \sup_{Θ} ‖ \frac{\partial^{2} ε_{t} (θ)}{\partial ϕ \partial ϕ^{'}} ‖ & = \sup_{Θ} ‖ 2 Φ^{- 3} (B) y_{t - 2} ‖ = \sup_{Θ} ‖ 2 Φ^{- 2} (B) ε_{t - 2} ‖ \\ & \leq c ‖ 2 Φ^{- 1} (B) \sum_{i = 0}^{\infty} ρ_{0}^{i} ε_{t - 2 - i} ‖ \leq c ‖ \sum_{j = 0}^{\infty} ρ_{0}^{j} \sum_{i = 0}^{\infty} ρ_{0}^{i} ε_{t - 2 - i - j} ‖ \\ & = c ‖ \sum_{k = 0}^{\infty} (k + 1) ρ_{0}^{k} ε_{t - 2 - k} ‖ = c ‖ \sum_{k = 0}^{\infty} (k + 1) ρ_{0}^{k} [\frac{1}{ϕ} y_{t - 1 - k} - \frac{1}{ϕ} ε_{t - 1 - k}] ‖ \\ & \leq c \sum_{k = 0}^{\infty} (k + 1) ρ_{0}^{k} | y_{t - 1 - k} | + c \sum_{k = 0}^{\infty} (k + 1) ρ_{0}^{k} | ε_{t - 1 - k} | \\ & \leq c \sum_{k = 0}^{\infty} ρ_{1}^{k} | y_{t - 1 - k} | + c \sum_{k = 0}^{\infty} ρ_{1}^{k} ξ_{ρ, t - 1 - k} \\ & = c \sum_{k = 0}^{\infty} ρ_{1}^{k} | y_{t - 1 - k} | + c \sum_{k = 0}^{\infty} ρ_{1}^{k} [1 + \sum_{s = 0}^{\infty} ρ^{s} | y_{t - 1 - k - s} |] \leq c ξ_{ρ, t - 1} \end{matrix}

for some $ρ_{0}$, $ρ_{1}$, and $ρ_{2}$ with $0 < \sqrt[k]{k + 1} ρ_{0} \leq ρ_{1} \leq ρ_{2} \leq ρ < 1$. □

Proof of Lemma 2.

(i): From models (1)–(3), we can deduce that $h_{t} (θ) = β^{- 1} (B) (ω + α y_{t - 1}^{2})$. Hence,

\begin{matrix} h_{t} (θ) & = β^{- 1} (B) (ω + α y_{t - 1}^{2}) = \sum_{i = 0}^{\infty} β^{i} ω + α \sum_{i = 0}^{\infty} β^{i} y_{t - 1 - i}^{2} \leq \sum_{i = 0}^{\infty} ρ^{i} ω + \bar{α} \sum_{i = 0}^{\infty} ρ^{i} y_{t - 1 - i}^{2} \\ & \leq c ζ_{ρ, t - 1} \end{matrix}

(ii): From models (1)–(3), one iteratively obtains

\begin{matrix} h_{t} (θ) & = ω + α y_{t - 1}^{2} + β h_{t - 1} = ω + α y_{t - 1}^{2} + β (ω + α y_{t - 2}^{2} + β h_{t - 2}) \\ = \dots = \frac{ω}{1 - β} + α \sum_{j = 0}^{\infty} β^{j} y_{t - j - 1}^{2} \end{matrix}

Then

\frac{1}{h_{t} (θ)} \frac{\partial h_{t} (θ)}{\partial ω} = \frac{1}{\frac{ω}{1 - β} + α \sum_{j = 0}^{\infty} β^{j} y_{t - j - 1}^{2}} \frac{1}{1 - β} \leq \frac{1}{\frac{ω}{1 - β}} \frac{1}{1 - β} = \frac{1}{ω} \leq \frac{1}{\underline{ω}}

\begin{matrix} \frac{1}{h_{t} (θ)} \frac{\partial h_{t} (θ)}{\partial α} & = \frac{1}{h_{t} (θ)} β^{- 1} (B) y_{t - 1}^{2} = \frac{1}{\frac{ω}{1 - β} + α \sum_{j = 0}^{\infty} β^{j} y_{t - j - 1}^{2}} \sum_{j = 0}^{\infty} β^{j} y_{t - j - 1}^{2} \\ \leq \frac{1}{\frac{ω}{1 - β}} \sum_{j = 0}^{\infty} β^{j} y_{t - j - 1}^{2} \leq c ζ_{ρ, t - 1} \end{matrix}

Because

\frac{\partial h_{t} (θ)}{\partial β} = \frac{ω}{{(1 - β)}^{2}} + α \sum_{j = 0}^{\infty} j β^{j - 1} y_{t - j - 1}^{2}

we have

\begin{matrix} \frac{1}{h_{t} (θ)} \frac{\partial h_{t} (θ)}{\partial β} & = \frac{1}{\frac{ω}{1 - β} + α \sum_{j = 0}^{\infty} β^{j} y_{t - j - 1}^{2}} [\frac{ω}{{(1 - β)}^{2}} + α \sum_{j = 0}^{\infty} j β^{j - 1} y_{t - j - 1}^{2}] \\ & \leq \frac{1}{\frac{ω}{1 - β}} [\frac{ω}{{(1 - β)}^{2}} + α \sum_{j = 0}^{\infty} j β^{j - 1} y_{t - j - 1}^{2}] \\ & = \frac{1}{1 - β} + \frac{1 - β}{ω} \frac{α}{β} \sum_{j = 0}^{\infty} j β^{j} y_{t - j - 1}^{2} \\ & \leq \frac{1}{1 - β} + \frac{1 - β}{ω} \frac{α}{β} \sum_{j = 0}^{\infty} ρ^{j} y_{t - j - 1}^{2} \\ & \leq c ζ_{ρ, t - 1} \end{matrix}

where $0 < β < \sqrt[j]{j} β < ρ < 1$.

(iii): The proof of (iii) is similar to that of (ii).

(iv): Because $\frac{\partial h_{t} (θ)}{\partial ϕ} = 0$, it follows that $\frac{1}{\sqrt{h_{t} (θ)}} \frac{\partial h_{t} (θ)}{\partial ϕ} = 0$. □

Appendix A.2

Proof of Theorem 1.

(i): First, the space $Θ$ is compact, and $θ_{0}$ is an interior point of $Θ$. Second, $L_{s n} (θ)$ is continuous in $θ \in Θ$ and is a measurable function of $\{y_{i}, i = t, t - 1, \dots\}$ for all $θ \in Θ$. Third, by Lemmas 1 and 2, it follows that

\begin{matrix} | e_{t} | = | \frac{ε_{t} (θ)}{\sqrt{h_{t} (θ)}} | = | \frac{\sum_{i = 0}^{\infty} {(- 1)}^{i} ϕ^{i} y_{t - i}}{\sqrt{\frac{ω}{1 - β} + α \sum_{j = 0}^{\infty} β^{j} y_{t - j - 1}^{2}}} | \leq c ξ_{ρ, t - 1} \end{matrix}

(A1)

\begin{matrix} \sup_{Θ} | ε_{t} (θ) | & = \sup_{Θ} | ε_{t} (θ_{0}) + {(θ - θ_{0})}^{'} \frac{\partial ε_{t} (θ^{*})}{\partial θ} | \\ & \leq | ε_{t} (θ_{0}) | + \sup_{Θ} ‖ θ - θ_{0} ‖ \sup_{Θ} ‖ \frac{\partial ε_{t} (θ)}{\partial θ} ‖ \\ & \leq | e_{t} | \sqrt{h_{t} (θ_{0})} + O (1) ξ_{ρ, t - 1} \\ & \leq O (1) [ξ_{ρ, t - 1} ζ_{ρ, t - 1}^{1 / 2} + ξ_{ρ, t - 1}] \end{matrix}

(A2)

\begin{matrix} \sup_{Θ} h_{t} (θ) \leq c ζ_{ρ, t - 1} \end{matrix}

(A3)

By (A1)–(A3), we can show that

\begin{matrix} E \sup_{Θ} [ω_{t} \frac{ε_{t}^{2} (θ)}{h_{t} (θ)}] & \leq E \sup_{Θ} [ω_{t} O (1) \frac{{(ξ_{ρ, t - 1} ζ_{ρ, t - 1}^{1 / 2} + ξ_{ρ, t - 1})}^{2}}{\underline{ω}}] \\ & \leq c E [ω_{t} ξ_{ρ, t - 1}^{2} ζ_{ρ, t - 1} + ω_{t} ξ_{ρ, t - 1}^{2} + 2 ω_{t} ζ_{ρ, t - 1}^{1 / 2} ξ_{ρ, t - 1}^{2}] \\ & < \infty \end{matrix}

(A4)

\begin{matrix} E \sup_{Θ} | ω_{t} \log h_{t} (θ) | \leq O (1) E \sup_{Θ} \log h_{t} (θ) \leq O (1) \log E \sup_{Θ} h_{t} (θ) < \infty \end{matrix}

(A5)

Therefore, by (A4) and (A5), we have

E \sup_{Θ} ω_{t} l_{t} (θ) \leq E \sup_{Θ} | ω_{t} \log h_{t} (θ) | + E \sup_{Θ} | ω_{t} \frac{ε_{t}^{2} (θ)}{h_{t} (θ)} | < \infty

By the ergodic theorem, we have

L_{s n} (θ) \to E [ω_{t} l_{t} (θ)] \quad a.s. \quad \forall θ \in Θ .

By Theorem 3.1 in Ling and McAleer (2003a) [19], it follows that

\sup_{Θ} | L_{s n} (θ) - E [ω_{t} l_{t} (θ)] | ⟶_{p} 0

Expanding $ε_{t} (θ)$ around $θ_{0}$, with $θ^{*}$ lying between $θ$ and $θ_{0}$, we obtain

\begin{matrix} E [ω_{t} l_{t} (θ)] & = E ω_{t} \log h_{t} (θ) + E ω_{t} \frac{{[ε_{t} (θ_{0}) + {(θ - θ_{0})}^{'} \frac{\partial ε_{t} (θ^{*})}{\partial θ}]}^{2}}{h_{t} (θ)} \\ & = E ω_{t} \log h_{t} (θ) + E ω_{t} \frac{ε_{t}^{2} (θ_{0})}{h_{t} (θ)} + 2 E ω_{t} \frac{ε_{t} (θ_{0}) {(θ - θ_{0})}^{'} \frac{\partial ε_{t} (θ^{*})}{\partial θ}}{h_{t} (θ)} \\ & \quad + {(θ - θ_{0})}^{'} E [\frac{ω_{t}}{h_{t} (θ)} \frac{\partial ε_{t} (θ^{*})}{\partial θ} \frac{\partial ε_{t} (θ^{*})}{\partial θ^{'}}] (θ - θ_{0}) \end{matrix}

Because $E e_{t}^{2} = 1$ and $E e_{t} = 0$, we have

E ω_{t} \frac{ε_{t}^{2} (θ_{0})}{h_{t} (θ)} = E ω_{t} e_{t}^{2} \frac{h_{t} (θ_{0})}{h_{t} (θ)} = E \{ω_{t} \frac{h_{t} (θ_{0})}{h_{t} (θ)} E [e_{t}^{2} | F_{t - 1}]\} = E ω_{t} \frac{h_{t} (θ_{0})}{h_{t} (θ)}

and

2 E ω_{t} \frac{ε_{t} (θ_{0}) {(θ - θ_{0})}^{'} \frac{\partial ε_{t} (θ^{*})}{\partial θ}}{h_{t} (θ)} = 0

Therefore, $E [ω_{t} l_{t} (θ)] = A + B$, where

A = E ω_{t} \log h_{t} (θ) + E ω_{t} \frac{h_{t} (θ_{0})}{h_{t} (θ)}

B = {(θ - θ_{0})}^{'} E [\frac{ω_{t}}{h_{t} (θ)} \frac{\partial ε_{t} (θ^{*})}{\partial θ} \frac{\partial ε_{t} (θ^{*})}{\partial θ^{'}}] (θ - θ_{0})

Consider the function $f (x) = \log x + \frac{a}{x}$ with $a > 0$, which reaches its minimum at $x = a$. Thus, A reaches its minimum if and only if $θ = θ_{0}$. For B, $B \geq 0$, and B reaches its minimum if and only if $θ = θ_{0}$. Thus, we can claim that $E [ω_{t} l_{t} (θ)]$ is uniquely minimized at $θ = θ_{0}$.
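The minimization property of $f$ used here can be verified with a short derivative argument. The following is a sketch, assuming $a > 0$ and $x > 0$; identifying $x$ with $h_{t} (θ)$ and $a$ with $h_{t} (θ_{0})$ is our reading of how the result is applied to A, not a step stated explicitly in the text.

```latex
f'(x) = \frac{1}{x} - \frac{a}{x^{2}} = \frac{x - a}{x^{2}}
\quad\Longrightarrow\quad
f'(x) < 0 \ \text{for } 0 < x < a, \qquad f'(x) > 0 \ \text{for } x > a,
\\
\text{so } f(x) \geq f(a) = \log a + 1 \ \text{for all } x > 0,
\text{ with equality if and only if } x = a .
```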

Next, we use the method of Theorem 4.1.1 in Amemiya [20]. Let V be any open neighborhood of $θ_{0} \in Θ$. Then $V^{c} ⋂ Θ$ is a compact set, and $E [ω_{t} l_{t} (θ)]$ attains a minimum on $V^{c} ⋂ Θ$.

We set

\begin{matrix} ε = \min_{θ \in V^{c} ⋂ Θ} E [ω_{t} l_{t} (θ)] - E [ω_{t} l_{t} (θ_{0})] \end{matrix}

(A6)

and let A be the event that $| L_{s n} (θ) - E ω_{t} l_{t} (θ) | < \frac{ε}{2}$ for all $θ \in Θ$.

Thus, $A \Rightarrow E ω_{t} l_{t} (θ) > L_{s n} (θ) - \frac{ε}{2}$ and $A \Rightarrow L_{s n} (θ) > E ω_{t} l_{t} (θ) - \frac{ε}{2}$ for all $θ \in Θ$.

Because $θ_{0}, {\hat{θ}}_{s n} \in Θ$, we have

\begin{matrix} A \Rightarrow E ω_{t} l_{t} (θ_{0}) > L_{s n} (θ_{0}) - \frac{ε}{2} \end{matrix}

(A7)

\begin{matrix} A \Rightarrow L_{s n} ({\hat{θ}}_{s n}) > E ω_{t} l_{t} ({\hat{θ}}_{s n}) - \frac{ε}{2} \end{matrix}

(A8)

and because $L_{s n} ({\hat{θ}}_{s n}) \leq L_{s n} (θ_{0})$, then, by (A7), we have

\begin{matrix} A \Rightarrow E ω_{t} l_{t} (θ_{0}) > L_{s n} ({\hat{θ}}_{s n}) - \frac{ε}{2} \end{matrix}

(A9)

By (A8) and (A9), we have

\begin{matrix} A \Rightarrow E ω_{t} l_{t} (θ_{0}) > E ω_{t} l_{t} ({\hat{θ}}_{s n}) - ε \end{matrix}

(A10)

By (A6) and (A10), we have

A \Rightarrow E ω_{t} l_{t} (θ_{0}) > E ω_{t} l_{t} ({\hat{θ}}_{s n}) - \min_{θ \in V^{c} ⋂ Θ} E [ω_{t} l_{t} (θ)] + E [ω_{t} l_{t} (θ_{0})]

so that

A \Rightarrow \min_{θ \in V^{c} ⋂ Θ} E [ω_{t} l_{t} (θ)] > E ω_{t} l_{t} ({\hat{θ}}_{s n})

Thus, $A \Rightarrow {\hat{θ}}_{s n} \in V$, and then $P (A) \leq P ({\hat{θ}}_{s n} \in V)$. Because $\lim_{n \to \infty} P (A) = 1$, we have ${\hat{θ}}_{s n} \to_{p} θ_{0}$.

(ii): We further give the quasi-score function and the quasi-information matrix as follows.

Take the derivatives of $l_{t} (θ)$:

\begin{matrix} \frac{\partial l_{t} (θ)}{\partial θ} = 2 \frac{ε_{t} (θ)}{h_{t} (θ)} \frac{\partial ε_{t} (θ)}{\partial θ} + \frac{1}{h_{t} (θ)} (1 - \frac{ε_{t}^{2} (θ)}{h_{t} (θ)}) \frac{\partial h_{t} (θ)}{\partial θ} \\ \frac{\partial^{2} l_{t} (θ)}{\partial ϕ \partial ϕ^{'}} = \frac{2}{h_{t} (θ)} \frac{\partial ε_{t} (θ)}{\partial ϕ} \frac{\partial ε_{t} (θ)}{\partial ϕ^{'}} + \frac{2 ε_{t} (θ)}{h_{t} (θ)} \frac{\partial^{2} ε_{t} (θ)}{\partial ϕ \partial ϕ^{'}} \\ \frac{\partial^{2} l_{t} (θ)}{\partial δ \partial δ^{'}} = - \frac{1}{h_{t}^{2} (θ)} \frac{\partial h_{t} (θ)}{\partial δ} \frac{\partial h_{t} (θ)}{\partial δ^{'}} + \frac{1}{h_{t} (θ)} \frac{\partial^{2} h_{t} (θ)}{\partial δ \partial δ^{'}} + \frac{2 ε_{t}^{2} (θ)}{h_{t}^{3} (θ)} \frac{\partial h_{t} (θ)}{\partial δ} \frac{\partial h_{t} (θ)}{\partial δ^{'}} - \frac{ε_{t}^{2} (θ)}{h_{t}^{2} (θ)} \frac{\partial^{2} h_{t} (θ)}{\partial δ \partial δ^{'}} \end{matrix}

Then

\begin{matrix} \sup_{Θ} ‖ \frac{\partial^{2} l_{t} (θ)}{\partial ϕ \partial ϕ^{'}} ‖ & \leq O (1) \sup_{Θ} \{‖ \frac{\partial ε_{t} (θ)}{\partial ϕ} ‖^{2} + | e_{t} (θ) | ‖ \frac{\partial^{2} ε_{t} (θ)}{\partial ϕ \partial ϕ^{'}} ‖\} \\ & \leq O (1) ξ_{ρ, t - 1}^{2} \end{matrix}

(A11)

\begin{matrix} \sup_{Θ} ‖ \frac{\partial^{2} l_{t} (θ)}{\partial δ \partial δ^{'}} ‖ & \leq O (1) \{ζ_{ρ, t - 1}^{2} + ζ_{ρ, t - 1} + ζ_{ρ, t - 1}^{2} | e_{t} |^{2} + | e_{t} |^{2} ζ_{ρ, t - 1}\} \\ & \leq O (1) \{ζ_{ρ, t - 1}^{2} (1 + ξ_{ρ, t - 1}^{2})\} \end{matrix}

(A12)

Therefore,

E \sup_{Θ} ‖ ω_{t} \frac{\partial^{2} l_{t} (θ)}{\partial θ \partial θ^{'}} ‖ < \infty

By ergodic and Theorem 3.1 in Ling and McAleer(2003) [19], it follows that

\frac{\partial^{2} L_{s n} (θ)}{\partial θ \partial θ^{'}} ⟶_{p} E [ω_{t} \frac{\partial^{2} l_{t} (θ)}{\partial θ \partial θ^{'}}]

if and only if

n ⟶ \infty

.

By the double expectation property, it follows that

\begin{matrix} E [ω_{t} \frac{\partial^{2} l_{t} (θ)}{\partial θ \partial θ^{'}}] & = E {E (ω_{t} \frac{\partial^{2} l_{t} (θ)}{\partial θ \partial θ^{'}} | F_{t - 1})} \\ = E [2 \frac{ω_{t}}{h_{t} (θ)} \frac{\partial ε_{t} (θ)}{\partial ϕ} \frac{\partial ε_{t} (θ)}{\partial ϕ^{'}} + \frac{ω_{t}}{h_{t}^{2} (θ)} \frac{\partial h_{t} (θ)}{\partial δ} \frac{\partial h_{t} (θ)}{\partial δ^{'}}] \\ = Σ_{0} \end{matrix}

Then,

\frac{\partial^{2} L_{s n} (θ_{0})}{\partial θ \partial θ^{'}} = \frac{1}{n} \sum_{t = 1}^{n} ω_{t} \frac{\partial^{2} l_{t} (θ_{0})}{\partial θ \partial θ^{'}} ⟶_{p} Σ_{0}

Similarly, we can show that

E \sup_{Θ} ‖ \frac{\partial l_{t} (θ)}{\partial θ} ‖ < \infty

holds.

Note that

\sqrt{n} \frac{\partial L_{s n} (θ_{0})}{\partial θ} = \frac{1}{\sqrt{n}} \sum_{t = 1}^{n} ω_{t} \frac{\partial l_{t} (θ_{0})}{\partial θ} = \sum_{t = 1}^{n} \frac{1}{\sqrt{n}} ω_{t} \frac{\partial l_{t} (θ_{0})}{\partial θ}

Therefore,

E (\frac{1}{\sqrt{n}} ω_{t} \frac{\partial l_{t} (θ_{0})}{\partial θ} | F_{t - 1}) = 0

Thus,

E (\sqrt{n} \frac{\partial L_{s n} (θ_{0})}{\partial θ} | F_{t - 1}) = 0

,

\mathrm{Var} (\frac{1}{\sqrt{n}} ω_{t} \frac{\partial l_{t} (θ_{0})}{\partial θ} | F_{t - 1}) = \frac{1}{n} E (ω_{t}^{2} \frac{\partial l_{t} (θ_{0})}{\partial θ} \frac{\partial l_{t} (θ_{0})}{\partial θ^{'}} | F_{t - 1})

Because

\frac{\partial h_{t} (θ)}{\partial θ} \frac{\partial ε_{t} (θ)}{\partial θ^{'}} = 0

, then

\begin{matrix} \frac{\partial l_{t} (θ_{0})}{\partial θ} \frac{\partial l_{t} (θ_{0})}{\partial θ^{'}} & = \frac{1}{h_{t}^{2} (θ_{0})} {(1 - \frac{ε_{t}^{2} (θ_{0})}{h_{t} (θ_{0})})}^{2} \frac{\partial h_{t} (θ_{0})}{\partial θ} \frac{\partial h_{t} (θ_{0})}{\partial θ^{'}} + 4 \frac{ε_{t}^{2} (θ_{0})}{h_{t}^{2} (θ_{0})} \frac{\partial ε_{t} (θ_{0})}{\partial θ} \frac{\partial ε_{t} (θ_{0})}{\partial θ^{'}} \\ & + 4 \frac{ε_{t} (θ_{0})}{h_{t}^{2} (θ_{0})} (1 - \frac{ε_{t}^{2} (θ_{0})}{h_{t} (θ_{0})}) \frac{\partial h_{t} (θ_{0})}{\partial θ} \frac{\partial ε_{t} (θ_{0})}{\partial θ^{'}} \\ & = \frac{1}{h_{t}^{2} (θ_{0})} {(1 - e_{t}^{2})}^{2} \frac{\partial h_{t} (θ_{0})}{\partial θ} \frac{\partial h_{t} (θ_{0})}{\partial θ^{'}} + 4 \frac{e_{t}^{2}}{h_{t} (θ_{0})} \frac{\partial ε_{t} (θ_{0})}{\partial θ} \frac{\partial ε_{t} (θ_{0})}{\partial θ^{'}} \end{matrix}

\begin{matrix} \mathrm{Var} (\sqrt{n} \frac{\partial L_{s n} (θ_{0})}{\partial θ} | F_{t - 1}) & = \sum_{t = 1}^{n} \mathrm{Var} (\frac{1}{\sqrt{n}} ω_{t} \frac{\partial l_{t} (θ_{0})}{\partial θ} | F_{t - 1}) \\ & = \frac{1}{n} \sum_{t = 1}^{n} E (ω_{t}^{2} \frac{\partial l_{t} (θ_{0})}{\partial θ} \frac{\partial l_{t} (θ_{0})}{\partial θ^{'}} | F_{t - 1}) \\ & = \frac{1}{n} \sum_{t = 1}^{n} [4 \frac{ω_{t}^{2}}{h_{t} (θ_{0})} {(\frac{\partial ε_{t} (θ_{0})}{\partial θ})}^{2} + ω_{t}^{2} \frac{E e_{t}^{4} - 1}{h_{t}^{2} (θ_{0})} {(\frac{\partial h_{t} (θ_{0})}{\partial θ})}^{2}] \\ & \to_{p} Ω_{0} \end{matrix}

where

Ω_{0} = E [4 \frac{ω_{t}^{2}}{h_{t} (θ_{0})} {(\frac{\partial ε_{t} (θ_{0})}{\partial θ})}^{2} + ω_{t}^{2} \frac{ς}{h_{t}^{2} (θ_{0})} {(\frac{\partial h_{t} (θ_{0})}{\partial θ})}^{2}]

,

ς = E e_{t}^{4} - 1
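
The constant ς and the factor 4/h_t can be traced back to two conditional moments of the standardized innovation. The short derivation below is a sketch under the usual assumptions, not restated in this part of the proof, that e_t is independent of F_{t-1} with E e_t² = 1, and it uses ε_t(θ_0) = e_t \sqrt{h_t(θ_0)}.

```latex
\begin{aligned}
E\left[(1 - e_t^{2})^{2} \mid F_{t-1}\right]
  &= 1 - 2\,E e_t^{2} + E e_t^{4}
   = E e_t^{4} - 1 = \varsigma, \\
E\left[\frac{4\,\varepsilon_t^{2}(\theta_0)}{h_t^{2}(\theta_0)} \,\Big|\, F_{t-1}\right]
  &= \frac{4\,E e_t^{2}}{h_t(\theta_0)}
   = \frac{4}{h_t(\theta_0)} .
\end{aligned}
```

Multiplying the first line by ω_t²/h_t²(θ_0) and the second by ω_t² gives the two summands inside Ω_0.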

Then,

Ω_{0} = E [4 \frac{ω_{t}^{2}}{h_{t} (θ_{0})} {(\frac{\partial ε_{t} (θ_{0})}{\partial θ})}^{2} + ω_{t}^{2} \frac{ς}{h_{t}^{2} (θ_{0})} {(\frac{\partial h_{t} (θ_{0})}{\partial θ})}^{2}] \leq c E [ω_{t}^{2} ξ_{ρ, t - 1}^{2} + ς ω_{t}^{2} ζ_{ρ, t - 1}^{2}]

Because

ω_{t}

is bounded, by Assumptions A3 and A4, it follows that

Ω_{0} < \infty

By the central limit theorem for martingale difference sequences, we have

\sqrt{n} \frac{\partial L_{s n} (θ_{0})}{\partial θ} \to_{d} N (0, Ω_{0}) .

By Theorem 4.1.3 in Amemiya (1985) [20], it follows that

\sqrt{n} ({\hat{θ}}_{s n} - θ_{0}) \to_{d} N (0, Σ_{0}^{- 1} Ω_{0} Σ_{0}^{- 1})

□
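
In practice the limiting covariance Σ_0^{-1} Ω_0 Σ_0^{-1} can be approximated by replacing the expectations with sample averages. The following is a minimal sketch and not the authors' procedure: it assumes the per-observation scores, Hessians and self-weights have already been evaluated at the estimate, it uses the observed Hessian and the outer product of scores as plug-in quantities, and the function names `sandwich_covariance` and `standard_errors` are hypothetical.

```python
import numpy as np


def sandwich_covariance(scores, hessians, weights):
    """Plug-in estimate of Sigma^{-1} Omega Sigma^{-1}.

    scores   : (n, k) array, row t holds d l_t / d theta at the estimate
    hessians : (n, k, k) array, slice t holds d^2 l_t / d theta d theta'
    weights  : (n,) array of self-weights omega_t
    """
    scores = np.asarray(scores, dtype=float)
    hessians = np.asarray(hessians, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n = scores.shape[0]

    # Sigma_hat = (1/n) * sum_t omega_t * H_t
    sigma_hat = np.einsum("t,tij->ij", weights, hessians) / n

    # Omega_hat = (1/n) * sum_t omega_t^2 * s_t s_t'
    weighted_scores = scores * weights[:, None]
    omega_hat = weighted_scores.T @ weighted_scores / n

    sigma_inv = np.linalg.inv(sigma_hat)
    return sigma_inv @ omega_hat @ sigma_inv


def standard_errors(covariance, n):
    """Standard errors of theta_hat implied by sqrt(n)-asymptotics."""
    return np.sqrt(np.diag(covariance) / n)


if __name__ == "__main__":
    # Tiny one-parameter illustration with made-up inputs.
    V = sandwich_covariance([[1.0], [-1.0]], [[[2.0]], [[2.0]]], [1.0, 1.0])
    print(V, standard_errors(V, n=2))
```

Because ω_t enters Σ_0 linearly and Ω_0 quadratically, the weights do not cancel in the sandwich, which is why they are passed explicitly rather than folded into the scores.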

References

1. Engle, R.F. Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 1982, 50, 987–1007.
2. Bollerslev, T. Generalized autoregressive conditional heteroskedasticity. J. Econom. 1986, 31, 307–327.
3. Nelson, D.B. Conditional heteroskedasticity in asset returns: A new approach. Econometrica 1991, 59, 347–370.
4. Glosten, L.R.; Jagannathan, R.; Runkle, D.E. On the relationship between the expected value and the volatility of the nominal excess return on stocks. J. Financ. 1993, 48, 1779–1801.
5. Long, H.V.; Jebreen, H.B.; Dassios, I.; Baleanu, D. On the Statistical GARCH Model for Managing the Risk by Employing a Fat-Tailed Distribution in Finance. Symmetry 2020, 12, 1698.
6. Li, X.; Zhang, X.; Li, Y. High-Dimensional Conditional Covariance Matrices Estimation Using a Factor-GARCH Model. Symmetry 2022, 14, 158.
7. Baillie, R.T. Long memory processes and fractional integration in econometrics. J. Econom. 1996, 73, 5–59.
8. Ling, S. Estimation and testing stationarity for double autoregressive models. J. R. Stat. Soc. 2004, 66, 63–78.
9. Ling, S. Self-weighted and local quasi-maximum likelihood estimators for ARMA-GARCH/IGARCH models. J. Econom. 2007, 140, 849–873.
10. Zhu, K.; Ling, S. Global self-weighted and local quasi-maximum exponential likelihood estimators for ARMA-GARCH/IGARCH models. Ann. Stat. 2011, 39, 2131–2163.
11. Pan, B.; Chen, M.; Wang, Y.; Xia, W. Weighted least absolute deviations estimation for ARFIMA time series with finite or infinite variance. J. Korean Stat. Soc. 2015, 44, 1–11.
12. Ramírez-Parietti, I.; Contreras-Reyes, J.E.; Idrovo-Aguirre, B.J. Cross-sample entropy estimation for time series analysis: A nonparametric approach. Nonlinear Dyn. 2021, 105, 2485–2508.
13. Ling, S. A double AR(p) model: Structure and estimation. Stat. Sin. 2007, 17, 161–175.
14. Zhu, H.; Zhang, X.; Liang, X.; Li, Y. Moving Average Model with an Alternative GARCH-type Error. J. Syst. Sci. Inf. 2018, 6, 165–177.
15. Zhu, K.; Ling, S. Quasi-maximum exponential likelihood for a double AR(p) model. Stat. Sin. 2013, 23, 251–270.
16. Yoshida, N. Quasi-likelihood analysis and its applications. Stat. Inference Stoch. Process. 2022, 25, 43–60.
17. Ling, S. Self-weighted least absolute deviation estimation for infinite variance autoregressive models. J. R. Stat. Soc. 2005, 67, 381–389.
18. Huber, P.J. Robust Statistical Procedures; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1997.
19. Ling, S.; McAleer, M. Asymptotic theory for a new vector ARMA-GARCH model. Econom. Theory 2003, 19, 280–310.
20. Amemiya, T. Advanced Econometrics; Harvard University Press: Cambridge, MA, USA, 1985.

Figure 1. $x_{t}$, the logarithms of the opening price of ST Daji (000564).

Figure 2. Time series diagram of $y_{t} = x_{t} - x_{t - 1}$.

Figure 3. Time series diagram of the data $x_{t}$ (green line) and 95% forecasting confidence intervals based on models (8) and (9) (red line) and models (10) and (11) (black line), respectively.

Figure 4. Part of Figure 3 is enlarged.

Figure 5. $x_{t}$, the logarithms of the 21-day China Interbank Offered Rate.

Figure 6. Time series diagram of $y_{t} = x_{t} - x_{t - 1}$.

Figure 7. Time series diagram of the data $x_{t}$ (green line) and 95% forecasting confidence intervals based on models (12) and (13) (red line) and models (14) and (15) (black line), respectively.

Figure 8. Part of Figure 7 is enlarged.

Table 1. Estimators for the models (1)–(3) when $e_{t} \sim N (0, 1)$.

$θ_{0} = (0.5, 0.1, 0.4, 0.3)$
		$\hat{ϕ}$	$\hat{ω}$	$\hat{α}$	$\hat{β}$
n = 400	QMLE(Bias)	−0.0008	0.0053	−0.0024	−0.0167
	SQMLE(Bias)	0.0001	0.0051	0.0048	−0.0196
	QMLE(SD)	0.0470	0.0263	0.0860	0.1030
	SQMLE(SD)	0.0456	0.0260	0.0929	0.1037
	QMLE(AD)	0.0416	0.0224	0.0848	0.0694
	SQMLE(AD)	0.0423	0.0222	0.0938	0.0691
n = 800	QMLE(Bias)	−0.0014	0.0028	0.0008	−0.0099
	SQMLE(Bias)	−0.0012	0.0028	0.0033	−0.0111
	QMLE(SD)	0.0330	0.0177	0.0529	0.0707
	SQMLE(SD)	0.0334	0.0177	0.0635	0.0718
	QMLE(AD)	0.0324	0.0219	0.0603	0.0722
	SQMLE(AD)	0.0329	0.0227	0.0653	0.0757
n = 1200	QMLE(Bias)	−0.0011	0.0013	0.0000	−0.0054
	SQMLE(Bias)	−0.0006	0.0012	0.0010	−0.0052
	QMLE(SD)	0.0272	0.0139	0.0487	0.0564
	SQMLE(SD)	0.0276	0.0140	0.0539	0.0578
	QMLE(AD)	0.0241	0.0140	0.0441	0.0592
	SQMLE(AD)	0.0244	0.0145	0.0492	0.0612
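
The Bias and SD rows of such a table are Monte Carlo summaries over repeated simulations. As a hedged illustration of how they could be computed for one parameter, the sketch below assumes that Bias is the mean of the replicated estimates minus the true value and that SD is their empirical standard deviation. The AD rows are not reproduced, since they depend on the model-specific asymptotic covariance. The function name `summarize_replications` and the sample numbers are hypothetical.

```python
import statistics


def summarize_replications(estimates, true_value):
    """Return (bias, sd) for a list of replicated estimates of one parameter."""
    mean_estimate = statistics.fmean(estimates)
    bias = mean_estimate - true_value
    # Sample standard deviation across replications (n - 1 denominator).
    sd = statistics.stdev(estimates)
    return bias, sd


if __name__ == "__main__":
    # Made-up replicated estimates of a parameter whose true value is 0.5.
    replicated = [0.49, 0.51, 0.50, 0.52, 0.48]
    bias, sd = summarize_replications(replicated, true_value=0.5)
    print(f"bias = {bias:.4f}, sd = {sd:.4f}")
```

Applying the same function column by column, once for the QMLE and once for the SQMLE, would yield the layout used in the table.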

Table 2. Estimators for the models (1)–(3) when $e_{t} \sim \mathrm{Laplace} (0, 1)$.

$θ_{0} = (0.5, 0.1, 0.4, 0.3)$
		$\hat{ϕ}$	$\hat{ω}$	$\hat{α}$	$\hat{β}$
n = 400	QMLE(Bias)	−0.0003	0.0080	−0.0002	−0.0304
	SQMLE(Bias)	0.0024	0.0060	0.0254	−0.0334
	QMLE(SD)	0.0506	0.0403	0.1709	0.1809
	SQMLE(SD)	0.0529	0.0405	0.2018	0.1876
	QMLE(AD)	0.0576	0.0614	0.1345	0.2946
	SQMLE(AD)	0.0702	0.0561	0.2902	0.2805
n = 800	QMLE(Bias)	−0.0005	0.0057	0.0026	−0.0219
	SQMLE(Bias)	−0.0008	0.0041	0.0184	−0.0221
	QMLE(SD)	0.0368	0.0323	0.1299	0.1447
	SQMLE(SD)	0.0383	0.0323	0.1617	0.1519
	QMLE(AD)	0.0333	0.0156	0.0908	0.0812
	SQMLE(AD)	0.0362	0.0170	0.1083	0.0952
n = 1200	QMLE(Bias)	−0.0001	0.0032	0.0003	−0.0119
	SQMLE(Bias)	−0.0006	0.0021	0.0065	−0.0100
	QMLE(SD)	0.0297	0.0249	0.1042	0.1174
	SQMLE(SD)	0.0307	0.0257	0.1305	0.1260
	QMLE(AD)	0.0289	0.0204	0.0966	0.0880
	SQMLE(AD)	0.0308	0.0200	0.1218	0.0895

Table 3. Estimators for the models (1)–(3) when $e_{t} \sim t_{3}$.

$θ_{0} = (0.5, 0.1, 0.4, 0.3)$
		$\hat{ϕ}$	$\hat{ω}$	$\hat{α}$	$\hat{β}$
n = 400	QMLE(Bias)	−0.0014	0.0072	−0.0160	−0.0485
	SQMLE(Bias)	0.0003	0.0060	0.0293	−0.0490
	QMLE(SD)	0.0536	0.0426	0.2223	0.1944
	SQMLE(SD)	0.0555	0.0430	0.2463	0.2016
	QMLE(AD)	0.0581	0.0166	0.1437	0.1026
	SQMLE(AD)	0.0638	0.0159	0.1993	0.2042
n = 800	QMLE(Bias)	−0.0010	−0.0046	−0.0034	−0.0265
	SQMLE(Bias)	−0.0010	0.0042	0.0191	−0.0303
	QMLE(SD)	0.0408	0.0357	0.1921	0.1769
	SQMLE(SD)	0.0436	0.0361	0.2191	0.1846
	QMLE(AD)	0.0371	0.0244	0.1128	0.1032
	SQMLE(AD)	0.0393	0.0228	0.1304	0.0967
n = 1200	QMLE(Bias)	−0.0009	0.0039	0.0059	−0.0243
	SQMLE(Bias)	−0.0010	0.0037	0.0188	−0.0279
	QMLE(SD)	0.0322	0.0328	0.1767	0.1600
	SQMLE(SD)	0.0334	0.0324	0.2017	0.1603
	QMLE(AD)	0.0304	0.0217	0.1243	0.0885
	SQMLE(AD)	0.0316	0.0229	0.1406	0.0941

Table 4. Comparison of the fitting effect of models (8) and (9) and models (10) and (11).

Models	RMSE	Log-Likelihood Value
(8) and (9)	0.0361	2315.2
(10) and (11)	0.0360	2316.4

Table 5. Comparison of the fitting effect of models (12) and (13) and models (14) and (15).

Models	RMSE	Log-Likelihood Value
(12) and (13)	0.0818	340.91
(14) and (15)	0.0818	340.91

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
