Article

Second-Order Least Squares Estimation in Nonlinear Time Series Models with ARCH Errors

1 Department of Statistics, Cairo University, Giza 12613, Egypt
2 Department of Statistics, University of Manitoba, Winnipeg, MB R3T 2N2, Canada
* Author to whom correspondence should be addressed.
Econometrics 2021, 9(4), 41; https://doi.org/10.3390/econometrics9040041
Submission received: 20 September 2021 / Revised: 23 November 2021 / Accepted: 23 November 2021 / Published: 27 November 2021

Abstract

Many financial and economic time series exhibit nonlinear patterns or relationships. However, most statistical methods for time series analysis are developed for mean-stationary processes, which requires transformation of the data, such as differencing. In this paper, we study a dynamic regression model with a nonlinear, time-varying mean function and autoregressive conditionally heteroscedastic errors. We propose an estimation approach based on the first two conditional moments of the response variable, which does not require specification of the error distribution. Strong consistency and asymptotic normality of the proposed estimator are established under a strong-mixing condition, so that the results apply to both stationary and mean-nonstationary processes. Moreover, the proposed approach is shown to be superior to the commonly used quasi-likelihood approach, and the efficiency gain is significant when the (conditional) error distribution is asymmetric. We demonstrate through a real data example that the proposed method can identify a more accurate model than the quasi-likelihood method.

1. Introduction

Dynamic models have been widely applied in the analysis of economic and financial data. Most theories and methods are developed for mean-stationary data generating processes, specifically ARMA processes, although ARIMA processes have also been studied (Koul and Ling 2006; Ling 2003; Ling and McAleer 2003; Meitz and Saikkonen 2008). However, many financial and economic variables exhibit nonlinear behaviour or relationships (e.g., Enders 2010; Franses and Van Dijk 2000). As pointed out by Li et al. (2002), consistent estimation of variance parameters may be misleading or impossible if the conditional mean function is not adequately specified. Therefore, more general and flexible models with time-varying mean functions are desirable to capture the nonlinear dynamic behaviour and the structural relationships in real data.
On the other hand, the autoregressive conditional heteroscedasticity (ARCH) model and its various generalizations have been widely used to analyze economic and financial data. These models allow both the conditional means and variances of a process to evolve jointly over time. The mainstream method for estimation and inference in generalized ARCH (GARCH) models is likelihood based (e.g., Engle 1982; Engle and Gonzalez-Rivera 1991; Weiss 1986), although the estimating function approach has also been studied (Li and Turtle 2000). So far, most research has focused on the quasi-likelihood method for various generalizations of the ARCH error component while keeping the process mean function very simple, e.g., constant or linear. Therefore, from both theoretical and practical points of view, it is important to develop methodologies for mean-nonstationary ARCH processes.
In this paper, we consider a model with a fairly general time-varying nonlinear mean function and ARCH errors that covers both stationary and mean-nonstationary processes. In particular, we propose a so-called second-order least squares (SLS) approach based on the first two conditional moments of the process. We establish the consistency and asymptotic normality of the proposed estimator under general mixing conditions. We demonstrate that this approach is more efficient than the commonly used quasi-likelihood method and that the efficiency gain is significant when the conditional error distribution is asymmetric. We also carry out extensive simulations to study the finite sample properties of our proposed estimator and compare its performance with other related estimators. Our results show that in most cases the optimal SLS estimator outperforms the estimating function estimators based on the same set of moments. Finally, we apply our approach to the empirical example of U.K. inflation in Engle (1982), which leads to a different model specification than the quasi-likelihood method.
It is worthwhile to note that some researchers have obtained more efficient estimators than the Gaussian quasi-maximum likelihood estimator (QMLE) by assuming various parametric families of error distributions (see the recent survey by Zhu and Li (2015)). However, our approach is based on the first two conditional moments of the process only and does not require any distributional assumptions. The SLS method was first used by Wang (2003, 2004) to estimate nonlinear measurement error models. Later, it was extended to nonlinear longitudinal data models by Wang (2007) and to censored linear models by Abarin and Wang (2009). Wang and Leblanc (2008) showed that under a nonlinear cross-sectional data model, the SLS estimator is asymptotically more efficient than the ordinary least squares estimator when the error term has nonzero third moment, and both estimators are equally efficient otherwise. Further, Kim and Ma (2012) showed that the SLS estimator attains the optimal semiparametric efficiency bound in general. More recently, Rosadi and Peiris (2014), Rosadi and Filzmoser (2019), and Salamh and Wang (2021) used this method in dynamic models. It has also been applied to optimal design problems by several researchers, e.g., Bose and Mukerjee (2015), Gao and Zhou (2017), Yin and Zhou (2017), and He and Yue (2019).
The paper is organized as follows. In Section 2 we introduce the model and the SLS estimator and establish its consistency and asymptotic normality. In Section 3 we derive the optimal SLS estimator and propose a feasible version of it. We also investigate the efficiency gain of the optimal SLS estimator relative to the QMLE and highlight the differences between our approach and that of the estimating functions. In Section 4 we carry out Monte Carlo simulations to study the finite sample behavior of the SLS estimator in the cases of both skewed and leptokurtic conditional error distributions, and compare it with the QMLE and some other related estimators. In Section 5 we apply the SLS approach to an empirical analysis of the U.K. inflation data. Finally, conclusions and discussion are given in Section 6, while regularity assumptions and mathematical proofs are given in Appendix A.

2. Model and SLS Estimation

Let $\{y_t, x_t\}$ be a sequence of random vectors defined on a complete probability space $(\Omega, \mathcal{F}, P)$ and denote $v_t = (x_t', y_{t-1}, x_{t-1}', \ldots, y_{t-\tau}, x_{t-\tau}')'$ for some nonnegative integer $\tau$. We consider the model
$$y_t = f_t(v_t, \theta_0) + \epsilon_t, \quad t \in \mathbb{Z}, \tag{1}$$
where $f_t: \mathbb{R}^{\upsilon} \times \Theta \to \mathbb{R}^1$ are known measurable functions on $\mathbb{R}^{\upsilon}$ for each $\theta \in \Theta \subset \mathbb{R}^q$, and continuous on $\Theta$ uniformly in $t$ a.s.-$P$. Further, denote the $\sigma$-field $\mathcal{F}_{t-1} = \mathcal{F}(x_i, y_{i-1}, i \le t)$ and assume that $\epsilon_t = \sigma_t \varepsilon_t$ satisfies
$$E(\varepsilon_t \mid \mathcal{F}_{t-1}) = 0 \ \text{a.s.-}P, \qquad E(\varepsilon_t^2 \mid \mathcal{F}_{t-1}) = 1 \ \text{a.s.-}P, \qquad \sigma_t^2 = \phi_{00} + \sum_{i=1}^{p} \phi_{0i}\, \epsilon_{t-i}^2, \tag{2}$$
where $\phi_{00}, \phi_{0p} > 0$ and $\phi_{0i} \ge 0$ for $i = 1, 2, \ldots, p-1$. It is easy to see that the linear model with ARCH errors is a special case of model (1) and (2). Our main goal is to estimate $\gamma_0 = (\theta_0', \phi_0')' \in \Gamma$, which is a compact subset of $\mathbb{R}^{q+p+1}$.
Given the random sample $(y_T, x_T'), \ldots, (y_{1-p-\tau}, x_{1-p-\tau}')$, let $\epsilon_t(\theta) = y_t - f_t(v_t, \theta)$, $\sigma_t^2(\gamma) = \phi_0 + \sum_{i=1}^{p} \phi_i\, \epsilon_{t-i}^2(\theta)$ and $h_t(\gamma) = \left(\epsilon_t(\theta),\; y_t^2 - f_t^2(v_t, \theta) - \sigma_t^2(\gamma)\right)'$. Then the second-order least squares (SLS) estimator is defined as the $\mathcal{F}$-measurable function satisfying
$$\hat{\gamma}_T = \operatorname*{argmin}_{\gamma \in \Gamma} Q_T(\gamma) \quad \text{a.s.-}P, \tag{3}$$
where
$$Q_T(\gamma) = T^{-1} \sum_{t=1}^{T} h_t'(\gamma)\, W_t\, h_t(\gamma) \tag{4}$$
and $W_t$ is a nonnegative definite matrix that is measurable with respect to $\mathcal{F}(v_t, \ldots, v_{t-p})$.
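To make the definition concrete, the following sketch (our own illustration, not code from the paper) evaluates the objective (4) for the special case of an AR(1) mean with ARCH(1) errors and the identity weight $W_t = I_2$, and minimizes it numerically; the function names, starting values, and optimizer settings are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def sls_objective(gamma, y):
    """Q_T(gamma) in (4) for an AR(1) mean with ARCH(1) errors and W_t = I_2."""
    theta, phi0, phi1 = gamma
    eps = y[1:] - theta * y[:-1]            # eps_t(theta) = y_t - theta * y_{t-1}
    sig2 = phi0 + phi1 * eps[:-1] ** 2      # sigma_t^2(gamma) = phi0 + phi1 * eps_{t-1}^2
    e = eps[1:]                             # eps_t aligned with sigma_t^2
    f = theta * y[1:-1]                     # fitted mean f_t(v_t, theta)
    yt = y[2:]
    h1 = e                                  # first component of h_t(gamma)
    h2 = yt ** 2 - f ** 2 - sig2            # second component: y_t^2 - f_t^2 - sigma_t^2
    return np.mean(h1 ** 2 + h2 ** 2)       # h_t' W_t h_t averaged over t

def sls_estimate(y, start=(0.1, 0.5, 0.3)):
    """Identity-weight SLS estimate of gamma = (theta, phi0, phi1)."""
    res = minimize(sls_objective, x0=np.asarray(start), args=(y,),
                   bounds=[(-0.99, 0.99), (1e-6, None), (0.0, 0.99)],
                   method="L-BFGS-B")
    return res.x
```

The identity weight already yields a consistent estimator by Theorem 1; Section 3 replaces $W_t$ with an estimate of the optimal weight to attain the smallest asymptotic variance.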
Next we establish the consistency and asymptotic normality of $\hat{\gamma}_T$ under the regularity conditions given in Appendix A. The consistency of $\hat{\gamma}_T$ follows from the uniform convergence of $Q_T(\gamma)$ on $\Gamma$ to a non-random sequence $\bar{Q}_T(\gamma)$, which has a unique minimizer at $\gamma_0$ for sufficiently large $T$.
Theorem 1.
Under Assumptions A1–A3, $\hat{\gamma}_T \xrightarrow{a.s.} \gamma_0$ as $T \to \infty$.
Theorem 2.
Under Assumptions A1–A9, $V_T^{-1/2} \bar{A}_T(\gamma_0) \sqrt{T}\,(\hat{\gamma}_T - \gamma_0) \xrightarrow{d} N(0, I_{q+p+1})$ as $T \to \infty$, where
$$\bar{A}_T(\gamma_0) = 2 T^{-1} \sum_{t=1}^{T} E\left[\nabla_\gamma' h_t(\gamma_0)\, W_t\, \nabla_\gamma h_t(\gamma_0)\right]$$
and
$$V_T = 4 T^{-1} \sum_{t=1}^{T} E\left[\nabla_\gamma' h_t(\gamma_0)\, W_t\, h_t(\gamma_0)\, h_t'(\gamma_0)\, W_t\, \nabla_\gamma h_t(\gamma_0)\right].$$
The proofs are given in Appendix A.

3. Optimal SLS Estimator

From Theorem 2, the asymptotic covariance (acov) of $\sqrt{T}\,(\hat{\gamma}_T - \gamma_0)$ is given by $\bar{A}_T^{-1}(\gamma_0)\, V_T\, \bar{A}_T^{-1}(\gamma_0)$, which depends on the weights $W_t$, $t = 1, 2, \ldots, T$. Therefore it is of interest to find the (asymptotically) optimal estimator, say $\hat{\gamma}_T^o$, which has the smallest asymptotic variance in the class of estimators defined by (3), i.e., for any estimator $\hat{\gamma}_T$ satisfying (3), $\operatorname{acov}[\sqrt{T}(\hat{\gamma}_T - \gamma_0)] - \operatorname{acov}[\sqrt{T}(\hat{\gamma}_T^o - \gamma_0)]$ is nonnegative definite. The following theorem gives the optimal choice of $W_t$ to achieve this goal.
Theorem 3.
Suppose $U_t = E\left[h_t(\gamma_0) h_t'(\gamma_0) \mid v_t, \ldots, v_{t-p}\right]$ is nonsingular a.s.-$P$, and Assumptions A2, A3, A6–A9 hold with $W_t = U_t^{-1}$. Then, the asymptotically optimal SLS (OSLS) estimator $\hat{\gamma}_T^o$ is obtained by using $W_t = U_t^{-1}$, $t = 1, 2, \ldots, T$. Further, the corresponding (inverse) optimal covariance matrix is given by
$$\operatorname{acov}^{-1}\left[\sqrt{T}\,(\hat{\gamma}_T^o - \gamma_0)\right] = T^{-1} \sum_{t=1}^{T} E\left[\nabla_\gamma' h_t(\gamma_0)\, U_t^{-1}\, \nabla_\gamma h_t(\gamma_0)\right] \tag{6}$$
$$= T^{-1} \sum_{t=1}^{T} E\left[B_t\, \Omega_t^{-1}\, B_t'\right], \tag{7}$$
where
$$B_t = \begin{pmatrix} \nabla_\theta f_t(v_t, \theta_0) & \nabla_\theta \sigma_t^2(\gamma_0) \\ 0 & \nabla_\phi \sigma_t^2(\gamma_0) \end{pmatrix} \tag{8}$$
and
$$\Omega_t = \sigma_t^2(\gamma_0) \begin{pmatrix} 1 & \sigma_t(\gamma_0)\, E(\varepsilon_t^3 \mid v_t, \ldots, v_{t-p}) \\ \cdot & \sigma_t^2(\gamma_0)\left[E(\varepsilon_t^4 \mid v_t, \ldots, v_{t-p}) - 1\right] \end{pmatrix}. \tag{9}$$
However, the OSLS estimator $\hat{\gamma}_T^o$ is infeasible because $U_t$ depends on $\gamma_0$ and $E(\varepsilon_t^j \mid v_t, \ldots, v_{t-p})$, $j = 3, 4$. In practice a two-step procedure can be used as follows. First, a consistent first-step estimator of $\gamma_0$ is calculated, such as the QMLE or simply the SLS $\hat{\gamma}_T$ using the identity weight matrix. Second, the residuals $\hat{\varepsilon}_t$ are calculated and suitable autoregressive models are fitted to $\hat{\varepsilon}_t^3$ and $\hat{\varepsilon}_t^4$, respectively. Finally, these fitted values are substituted into $U_t$ and the second-step estimator is calculated using the estimated optimal weights $W_t = \hat{U}_t^{-1}$. Under some general conditions this two-step estimator is consistent and, moreover, it has the same asymptotic variance as in (7) if $\hat{U}_t$ is consistent for $U_t$. Henceforth, this two-step estimator will be called the feasible optimal SLS estimator (FSLS). Note that fitting the autoregressive models is useful if the errors $\varepsilon_t$ are not i.i.d.; otherwise the sample means of $\hat{\varepsilon}_t^3$ and $\hat{\varepsilon}_t^4$ can be used. Alternatively, if the conditioning set $v_t, \ldots, v_{t-p}$ is reasonably small, one can use nonparametric estimators of the conditional skewness and kurtosis to obtain $\hat{U}_t$. More details about two-step estimators can be found in White (1996, Section 6.3).
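A minimal sketch of the two-step procedure just described, for the AR(1)-ARCH(1) case with i.i.d. innovations, so that the sample third and fourth moments of the standardized first-step residuals replace the fitted autoregressions. Writing $h_t(\gamma_0) = (\epsilon_t,\; 2 f_t \epsilon_t + \epsilon_t^2 - \sigma_t^2)'$ gives the closed form for $U_t$ used below; this algebra and all function names are our own, and `first_step` can be the QMLE or the identity-weight SLS.

```python
import numpy as np
from scipy.optimize import minimize

def fsls_estimate(y, first_step):
    """Two-step feasible optimal SLS (FSLS) for an AR(1)-ARCH(1) model (sketch)."""
    th, p0, p1 = first_step
    eps = y[1:] - th * y[:-1]                      # eps_t at the first-step estimate
    s2 = p0 + p1 * eps[:-1] ** 2                   # sigma_t^2 at the first-step estimate
    e = eps[1:]
    z = e / np.sqrt(s2)                            # standardized residuals
    mu3, mu4 = np.mean(z ** 3), np.mean(z ** 4)    # i.i.d. case: sample moments
    f = th * y[1:-1]                               # fitted mean f_t

    # U_t = E[h_t h_t' | v_t, ..., v_{t-p}] evaluated at the first-step estimates
    U11 = s2
    U12 = 2 * f * s2 + s2 ** 1.5 * mu3
    U22 = 4 * f ** 2 * s2 + 4 * f * s2 ** 1.5 * mu3 + s2 ** 2 * (mu4 - 1)
    det = U11 * U22 - U12 ** 2
    W11, W12, W22 = U22 / det, -U12 / det, U11 / det   # W_t = U_t^{-1}, held fixed

    yt, ylag1, ylag2 = y[2:], y[1:-1], y[:-2]

    def objective(g):                              # Q_T(gamma) with estimated optimal weights
        t, q0, q1 = g
        res, res_lag = yt - t * ylag1, ylag1 - t * ylag2
        s2g = q0 + q1 * res_lag ** 2
        h1 = res
        h2 = yt ** 2 - (t * ylag1) ** 2 - s2g
        return np.mean(W11 * h1 ** 2 + 2 * W12 * h1 * h2 + W22 * h2 ** 2)

    out = minimize(objective, x0=np.asarray(first_step),
                   bounds=[(-0.99, 0.99), (1e-6, None), (0.0, 0.99)],
                   method="L-BFGS-B")
    return out.x
```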
In the rest of this section we investigate the efficiency gain of the OSLS estimator compared to the Gaussian QMLE, which is one of the most popular methods of estimation in GARCH models. The asymptotic properties of the QMLE are studied by Weiss (1986) and Bollerslev and Wooldridge (1992). Specifically, for model (1) and (2), the Gaussian QMLE is defined as
$$\hat{\gamma}_T^Q = \operatorname*{argmin}_{\gamma \in \Gamma}\; T^{-1}\sum_{t=1}^{T}\left[\log \sigma_t^2(\gamma) + \frac{\epsilon_t^2(\theta)}{\sigma_t^2(\gamma)}\right] \quad \text{a.s.-}P. \tag{10}$$
Under conditions similar to Assumptions A2–A9, and similarly to the proofs of Theorems 1 and 2, we can show that $\hat{\gamma}_T^Q$ is $\sqrt{T}$-consistent with $\operatorname{acov}[\sqrt{T}(\hat{\gamma}_T^Q - \gamma_0)]$ given by
$$T\left[\sum_{t=1}^{T} E\left(B_t \Sigma_t^{-1} B_t'\right)\right]^{-1}\left[\sum_{t=1}^{T} E\left(B_t \Sigma_t^{-1} \Omega_t \Sigma_t^{-1} B_t'\right)\right]\left[\sum_{t=1}^{T} E\left(B_t \Sigma_t^{-1} B_t'\right)\right]^{-1}, \tag{11}$$
where $B_t$ and $\Omega_t$ are defined in (8) and (9), respectively, and
$$\Sigma_t = \begin{pmatrix} \sigma_t^2(\gamma_0) & 0 \\ 0 & 2\sigma_t^4(\gamma_0) \end{pmatrix}.$$
Further, similar to the proof of Theorem 3, we can show that
$$\operatorname{acov}\left[\sqrt{T}\, a'(\hat{\gamma}_T^o - \gamma_0)\right] \le \operatorname{acov}\left[\sqrt{T}\, a'(\hat{\gamma}_T^Q - \gamma_0)\right]$$
for any $a \in \mathbb{R}^{q+p+1}$, and the equality holds if and only if, for $t = 1, 2, \ldots, T$,
$$\Omega_t \Sigma_t^{-1} B_t'\, a = B_t' \left[\sum_{t=1}^{T} E\left(B_t \Omega_t^{-1} B_t'\right)\right]^{-1}\left[\sum_{t=1}^{T} E\left(B_t \Sigma_t^{-1} B_t'\right)\right] a \quad \text{a.s.-}P. \tag{12}$$
The above general condition can be simplified in specific settings. For example, if the process $\{y_t, x_t, \sigma_t, \epsilon_t\}$ is stationary with $E(\varepsilon_t^3 \mid v_t, \ldots, v_{t-p}) = 0$ and $E(\varepsilon_t^4 \mid v_t, \ldots, v_{t-p}) = \mu_4$, then it can be shown that Equation (12) is equivalent to
$$a_1'\left(I_q - C\right)\nabla_\theta f_t(v_t, \theta_0) = 0, \qquad a_1'\left(\frac{\mu_4 - 1}{2}\, I_q - C\right)\nabla_\theta \sigma_t^2(\gamma_0) = 0, \tag{13}$$
where $C = \left(C_1 + \tfrac{1}{2} C_2\right)\left(C_1 + \tfrac{1}{\mu_4 - 1} C_2\right)^{-1}$, $C_1 = E\left[\sigma_t^{-2}(\gamma_0)\, \nabla_\theta f_t(v_t, \theta_0)\, \nabla_\theta' f_t(v_t, \theta_0)\right]$, $C_2 = E\left[\sigma_t^{-4}(\gamma_0)\, \nabla_\theta \sigma_t^2(\gamma_0)\, \nabla_\theta' \sigma_t^2(\gamma_0)\right]$, and $a_1$ is the subvector of the first $q$ elements of $a$.
Since it is difficult to quantify the difference between the asymptotic covariance matrices in (7) and (11) in general, in the following we calculate some examples under a simple AR(1) model with ARCH(1) errors,
$$y_t = \theta_0\, y_{t-1} + \epsilon_t, \qquad \sigma_t^2 = (1 - \phi_0) + \phi_0\, \epsilon_{t-1}^2, \tag{14}$$
where the innovations $\varepsilon_t = \epsilon_t/\sigma_t$ are i.i.d. with zero mean and unit variance. We consider both symmetric and skewed distributions for $\varepsilon_t$. In particular, we choose the Gamma distribution with various shape parameters and Student's t distribution with various degrees of freedom to reflect different degrees of skewness and kurtosis, respectively.
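For the numerical examples and the simulations below, data from model (14) can be generated as in the following sketch (ours, not the authors' code), which standardizes Gamma innovations to zero mean and unit variance; a Student's t version would divide a $t_\nu$ variate by $\sqrt{\nu/(\nu-2)}$ instead.

```python
import numpy as np

def simulate_ar1_arch1(T, theta0, phi0, shape=2.0, burn=200, rng=None):
    """Simulate (14): y_t = theta0*y_{t-1} + eps_t with sigma_t^2 = (1 - phi0) + phi0*eps_{t-1}^2
    and i.i.d. standardized Gamma(shape, 1) innovations."""
    rng = np.random.default_rng() if rng is None else rng
    # a Gamma(k, 1) variate has mean k and variance k, so (X - k)/sqrt(k) is standardized
    z = (rng.gamma(shape, 1.0, size=T + burn) - shape) / np.sqrt(shape)
    y = np.zeros(T + burn)
    eps_prev = 0.0
    for t in range(1, T + burn):
        sig2 = (1.0 - phi0) + phi0 * eps_prev ** 2
        eps = np.sqrt(sig2) * z[t]
        y[t] = theta0 * y[t - 1] + eps
        eps_prev = eps
    return y[burn:]                                # discard the burn-in period
```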
The asymptotic variances of the OSLS and QMLE for various parameter values are given in Table 1, where the asymptotic variance of the true maximum likelihood estimator (MLE) is also given as a benchmark. The results show clearly that the efficiency gain of the OSLS over the QMLE is significant in the case of highly skewed distributions such as the Gamma, while this is less so in the case of symmetric distributions such as Student's t. Note that the QMLE and OSLS of $\phi_0$ have the same asymptotic variances under the Student's t distribution, which is consistent with the theoretical result (13).
In order to see how much of the efficiency loss of the QMLE is recovered by the OSLS estimator, we next calculate the relative reduction in the QMLE efficiency loss (inefficiency),
$$RIEL_a(\gamma_0) = 100 \times \frac{\operatorname{acov}\left[\sqrt{T}\, a'(\hat{\gamma}_T^Q - \gamma_0)\right] - \operatorname{acov}\left[\sqrt{T}\, a'(\hat{\gamma}_T^o - \gamma_0)\right]}{\operatorname{acov}\left[\sqrt{T}\, a'(\hat{\gamma}_T^Q - \gamma_0)\right] - \operatorname{acov}\left[\sqrt{T}\, a'(\hat{\gamma}_T^M - \gamma_0)\right]},$$
where $\hat{\gamma}_T^M$ is the true MLE of $\gamma_0$. This measure also indicates which estimator approaches the asymptotic variance lower bound faster.
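For example, using the Gamma(2) entries of Table 1 with $\theta_0 = \phi_0 = 0.2$, the asymptotic variances of $\hat{\theta}_T^Q$, $\hat{\theta}_T^o$, and $\hat{\theta}_T^M$ are 1.35, 0.83, and 0.15, so $RIEL(\theta_0) = 100 \times (1.35 - 0.83)/(1.35 - 0.15) \approx 43\%$; that is, the OSLS recovers roughly 43% of the efficiency lost by the QMLE relative to the true MLE in that case.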
Figure 1 contains a sample of the numerical outputs. First, Figure 1a shows that 45–60% of the inefficiency of $\hat{\theta}_T^Q$ is recovered by $\hat{\theta}_T^o$ when the innovations $\varepsilon_t$ have a heavy-tailed Student's t distribution. Further, $RIEL(\theta_0)$ declines sharply as the degrees of freedom increase, indicating that as the distribution of $\varepsilon_t$ gets close to the Gaussian, the QMLE improves quickly and gets close to the OSLS estimator, and both of them approach the variance lower bound. However, the situation in Figure 1b,c is the opposite, where $RIEL(\theta_0)$ and $RIEL(\phi_0)$ increase with the shape parameter of the Gamma distribution. This indicates that the OSLS improves significantly faster than the QMLE as the skewed distribution gets closer to the Gaussian. In other words, the efficiency loss of the QMLE is persistent, and therefore the QMLE is not desirable under an asymmetric conditional error distribution.

4. Simulation Studies

In this section we carry out Monte Carlo simulations to study the finite sample behaviour of the feasible optimal SLS (FSLS) estimator $\hat{\gamma}_T^o$ and compare it with some other related, commonly used estimators.

4.1. Comparison with Quasi-MLE

We first compare the FSLS with the quasi-maximum likelihood estimator (QMLE). Specifically, we generate data from the AR(1)-ARCH(1) model in (14) with innovations $\varepsilon_t$ drawn from standardized distributions with different levels of skewness and kurtosis. We consider various sample sizes, including T = 10,000 to approximate the asymptotic results. In each simulation, we vary the values of the parameters $(\theta_0, \phi_0)$ to represent different levels of persistence in the mean and variance components. For each estimator, we calculate the mean estimates $(\hat{\theta}_0, \hat{\phi}_0)$ and the root mean squared errors $RMSE(\hat{\theta}_0)$ and $RMSE(\hat{\phi}_0)$ based on 3000 independent replications for four different pairs of true values $(\theta_0, \phi_0)$.
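The design described above can be reproduced with a simple Monte Carlo loop. The following sketch (illustrative only) relies on the `simulate_ar1_arch1` and `fsls_estimate` helpers sketched earlier and on a Gaussian QMLE coded directly from (10); runtimes for 3000 replications can be substantial, so a smaller number of replications may be used for a quick check.

```python
import numpy as np
from scipy.optimize import minimize

def qmle_estimate(y, start=(0.1, 0.5, 0.3)):
    """Gaussian QMLE (10) for the AR(1)-ARCH(1) model."""
    yt, ylag1, ylag2 = y[2:], y[1:-1], y[:-2]

    def neg_quasi_loglik(g):
        th, p0, p1 = g
        eps, eps_lag = yt - th * ylag1, ylag1 - th * ylag2
        s2 = p0 + p1 * eps_lag ** 2
        return np.mean(np.log(s2) + eps ** 2 / s2)

    res = minimize(neg_quasi_loglik, x0=np.asarray(start),
                   bounds=[(-0.99, 0.99), (1e-6, None), (0.0, 0.99)],
                   method="L-BFGS-B")
    return res.x

def rmse_study(theta0=0.2, phi0=0.2, T=100, reps=1000, shape=2.0, seed=1):
    """RMSE of the QMLE and the two-step FSLS over `reps` replications."""
    rng = np.random.default_rng(seed)
    est_q = np.empty((reps, 3))
    est_f = np.empty((reps, 3))
    for r in range(reps):
        y = simulate_ar1_arch1(T, theta0, phi0, shape=shape, rng=rng)
        est_q[r] = qmle_estimate(y)
        est_f[r] = fsls_estimate(y, first_step=est_q[r])   # QMLE as the first step
    truth = np.array([theta0, 1.0 - phi0, phi0])            # gamma_0 under model (14)
    rmse = lambda est: np.sqrt(np.mean((est - truth) ** 2, axis=0))
    return rmse(est_q), rmse(est_f)
```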
Table 2 contains the summary results for Gamma(2,1) innovations, which show clearly that the FSLS outperforms the QMLE for all sample sizes and panels, while both estimators have the same degree of bias. We have also run the simulations with Student's t(5) innovations, and the results show that the FSLS has a moderate efficiency gain over the QMLE, which performs fairly well in small samples.
To understand how the values of $(\theta_0, \phi_0)$, the shape parameter, and the sample size T affect the RMSE of the FSLS relative to that of the QMLE, we use the numerical results of the simulations to fit two regression equations with $RRMSE(\hat{\theta}_T^o) = RMSE(\hat{\theta}_T^o)/RMSE(\hat{\theta}_T^Q)$ and $RRMSE(\hat{\phi}_T^o)$ as the response variables, respectively. The results in Table 3 show that the shape parameter has a positive effect on both $RRMSE(\hat{\theta}_T^o)$ and $RRMSE(\hat{\phi}_T^o)$, while T is negatively associated with both RRMSEs, indicating that the outperformance of the FSLS is more evident in large samples than in small samples if the innovation distribution is skewed. Moreover, the negative sign of $\theta_0$ in the $RRMSE(\hat{\theta}_T^o)$ equation indicates that the performance of the QMLE improves quickly as the value of $\theta_0$ gets larger.

4.2. Comparison with Estimating Function Estimators

Our approach is related to the estimating function (EF) approach. Following Durairajan (1992), it can be shown that under model (1) and (2) the EF estimator $\hat{\gamma}_T^{EF}$ is obtained by solving the estimating equation
$$\sum_{t=1}^{T} B_t(\gamma)\, \Omega_t^{-1}(\gamma)\begin{pmatrix} \epsilon_t(\gamma) \\ \epsilon_t^2(\gamma) - \sigma_t^2(\gamma) \end{pmatrix} = 0, \tag{15}$$
where $B_t(\gamma)$ and $\Omega_t(\gamma)$ are given in (8) and (9) with $\theta_0$, $\gamma_0$ and $E(\varepsilon_t^j \mid v_t, \ldots, v_{t-p})$, $j = 3, 4$, replaced by $\theta$, $\gamma$ and $E_\gamma(\varepsilon_t^j(\gamma) \mid v_t, \ldots, v_{t-p})$, $j = 3, 4$, respectively. The EF in (15) is optimal with respect to the so-called Godambe information criterion. Moreover, under regularity conditions similar to Assumptions A2–A9, the EF estimator can be shown to be $\sqrt{T}$-consistent with $\operatorname{acov}[\sqrt{T}(\hat{\gamma}_T^{EF} - \gamma_0)]$ given by Equation (7). However, although the FSLS and EF estimators have the same asymptotic variance, they are distinct in the following respects. First, the FSLS is an extremum estimator, while the EF estimator is the solution of the optimal estimating Equation (15). Second, if $E_\gamma(\varepsilon_t^j(\gamma) \mid v_t, \ldots, v_{t-p})$, $j = 3, 4$, are known functions of $\gamma$, then the EF estimator can be calculated in one step, while the FSLS remains a two-step estimator due to the dependence of $W_t$ on $\gamma_0$. Third, the two estimators may behave differently in finite-sample situations because they have different estimating equations. This can be seen by comparing Equation (15) with the first-order condition for the FSLS, which can be written as
$$\sum_{t=1}^{T} B_t(\gamma)\, H_t(\theta)\, \Omega_t^{-1}\, H_t'(\theta)\begin{pmatrix} \epsilon_t(\gamma) \\ \epsilon_t^2(\gamma) - \sigma_t^2(\gamma) \end{pmatrix} = 0, \tag{16}$$
where
$$H_t(\theta) = \begin{pmatrix} 1 & 2\left[f_t(v_t, \theta) - f_t(v_t, \theta_0)\right] \\ 0 & 1 \end{pmatrix}.$$
Next we calculate some numerical examples to compare the FSLS with four different versions of the EF estimator that are commonly used in practice. First, since the $\varepsilon_t$ are i.i.d., Equation (15) can be written as
$$\sum_{t=1}^{T} B_t(\gamma_1)\, \Omega_t^{-1}(\gamma_2, \mu_3, \mu_4)\begin{pmatrix} \epsilon_t(\gamma) \\ \epsilon_t^2(\gamma) - \sigma_t^2(\gamma) \end{pmatrix} = 0.$$
Then we calculate four variants of the EF estimator as follows: the estimator EF0 is obtained by taking $\gamma_1 = \gamma_2 = \hat{\gamma}_T^Q$, $\mu_3 = T^{-1}\sum_{t=1}^{T} \varepsilon_t^3(\hat{\gamma}_T^Q)$ and $\mu_4 = T^{-1}\sum_{t=1}^{T} \varepsilon_t^4(\hat{\gamma}_T^Q)$; EF1 is the same as EF0 except that $\gamma_1 = \gamma$; EF is the same as EF0 except that $\gamma_1 = \gamma_2 = \gamma$; and EF2 is obtained by letting $\gamma_1 = \gamma_2 = \gamma$, $\mu_3 = T^{-1}\sum_{t=1}^{T} \varepsilon_t^3(\gamma)$ and $\mu_4 = T^{-1}\sum_{t=1}^{T} \varepsilon_t^4(\gamma)$. The four variants have the same asymptotic covariance matrix.
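To illustrate how the EF0 variant can be computed in the AR(1)-ARCH(1) case, the sketch below (our own construction, with hypothetical helper names) holds $B_t$, $\Omega_t$, and the sample moments fixed at the QMLE $\hat{\gamma}_T^Q$ and solves the resulting estimating equation numerically.

```python
import numpy as np
from scipy.optimize import root

def ef0_estimate(y, gamma_q):
    """EF0: solve the estimating equation with gamma_1 = gamma_2 = QMLE (sketch)."""
    thq, p0q, p1q = gamma_q
    yt, ylag1, ylag2 = y[2:], y[1:-1], y[:-2]
    eq, eq_lag = yt - thq * ylag1, ylag1 - thq * ylag2     # residuals at the QMLE
    s2q = p0q + p1q * eq_lag ** 2
    zq = eq / np.sqrt(s2q)
    mu3, mu4 = np.mean(zq ** 3), np.mean(zq ** 4)          # nuisance moments at the QMLE

    n = len(yt)
    # B_t(gamma_q): columns are the gradients of f_t and sigma_t^2 w.r.t. (theta, phi0, phi1)
    B = np.zeros((n, 3, 2))
    B[:, 0, 0] = ylag1                                     # d f_t / d theta
    B[:, 0, 1] = -2.0 * p1q * eq_lag * ylag2               # d sigma_t^2 / d theta
    B[:, 1, 1] = 1.0                                       # d sigma_t^2 / d phi0
    B[:, 2, 1] = eq_lag ** 2                               # d sigma_t^2 / d phi1
    # Omega_t(gamma_q, mu3, mu4) as in (9)
    Om = np.empty((n, 2, 2))
    Om[:, 0, 0] = s2q
    Om[:, 0, 1] = Om[:, 1, 0] = s2q ** 1.5 * mu3
    Om[:, 1, 1] = s2q ** 2 * (mu4 - 1.0)
    Om_inv = np.linalg.inv(Om)

    def score(g):
        th, p0, p1 = g
        e, e_lag = yt - th * ylag1, ylag1 - th * ylag2
        s2 = p0 + p1 * e_lag ** 2
        u = np.stack([e, e ** 2 - s2], axis=-1)            # (eps_t, eps_t^2 - sigma_t^2)'
        return np.einsum("tij,tjk,tk->i", B, Om_inv, u) / n

    return root(score, x0=np.asarray(gamma_q)).x
```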
Figure 2 shows the ratio of the RMSE of the FSLS to the RMSE of the QMLE and EF estimators, respectively. Figure 2a is based on 4790 simulation settings with 3000 independent replications each, where $\varepsilon_t$ are generated from the standardized Gamma distribution with shape parameters 2, 3, 4, 5, 6, and 7. Similarly, Figure 2b is based on 4050 settings with $\varepsilon_t$ generated from the standardized Student's t distribution with 5, 6, 7, 8, and 9 degrees of freedom. In all cases, the sample size T varies over the range 30, 40, 50, 60, 70, 80, 90, 100, 500, and 1000, and the RMSEs of the estimators are calculated on the parameter grid $(\theta_0, \phi_0) \in \{(0.1, 0.1), (0.1, 0.2), \ldots, (0.1, 0.9), (0.2, 0.1), \ldots, (0.9, 0.9)\}$.
The results show clearly that the FSLS outperforms the EF estimators for $\theta_0$ in almost all cases and for $\phi_0$ in the majority of cases. In particular, the results for EF2 show that replacing the nuisance parameters with highly nonlinear functions of the estimated parameter makes the performance worse. Therefore, in practice a two-step EF estimator such as EF or EF1 should be recommended. Finally, the QMLE performs reasonably well in the case of symmetric error distributions, but less so in the case of skewed error distributions.

5. Application

In this section we apply our method to an empirical example of Engle (1982) (see also Enders 2010), who used an AR model with ARCH errors to study the wage/price spiral in the U.K. over the period 1958Q2–1977Q2. Specifically, let $p_t$ denote the log of the consumer price index and $w_t$ the log of the index of nominal wage rates. Then $y_t = p_t - p_{t-1}$ and $r_t = w_t - p_t$ are the rate of inflation and the real wage, respectively. Engle (1982) first fitted the following equation using the least squares (LS) method:
$$y_t = \underset{(0.006)}{0.0257} + \underset{(0.103)}{0.334}\, y_{t-1} + \underset{(0.110)}{0.408}\, y_{t-4} - \underset{(0.114)}{0.404}\, y_{t-5} + \underset{(0.014)}{0.0559}\, r_{t-1} + \epsilon_t, \qquad \hat{\sigma}^2 = 8.9 \times 10^{-5}, \tag{17}$$
where the standard errors are given in parentheses. Since the Lagrange multiplier (LM) test for an ARCH(1) error was not significant, whereas the test for an ARCH(4) process was, the following conditional variance equation was specified:
$$\sigma_t^2 = \phi_0 + \phi_1\left(0.4\,\epsilon_{t-1}^2 + 0.3\,\epsilon_{t-2}^2 + 0.2\,\epsilon_{t-3}^2 + 0.1\,\epsilon_{t-4}^2\right), \tag{18}$$
where the two-parameter variance function with declining weights was chosen to satisfy the nonnegativity and stationarity constraints. Further, Engle (1982) fitted Equations (17) and (18) jointly using the Gaussian quasi-likelihood method, where all coefficients (except the first lag of the inflation rate) were significant at the 0.05 level.
Since we could not obtain the wage rates before 1963, we use data from 1963Q1 through 1982Q1 to compensate for the 19 missing quarters. The data were obtained from the OECD website http://dx.doi.org/10.1787/data-00052-en (accessed on 15 January 2014; see OECD, Main Economic Indicators–complete database). To see whether there is a major structural difference between our data and the data used by Engle (1982), we carried out an initial investigation and found no evidence of structural change due to replacing the 19 quarters. So our results are to some extent comparable with those in Engle (1982).
We start by fitting the following regression model for inflation using the LS method:
$$y_t = \theta_0 + \theta_1 y_{t-1} + \theta_2 y_{t-2} + \theta_3 y_{t-3} + \theta_4 y_{t-4} + \theta_5 y_{t-5} + \theta_6 r_{t-1} + \epsilon_t. \tag{19}$$
The results are shown in the first part of Table 4 under Model-I (LS), where White's correction for the standard errors is reported in parentheses.
We calculate the Ljung-Box statistic Q for $\hat{\epsilon}_t$ (denoted by Q1) and $\hat{\epsilon}_t^2$ (denoted by Q2) at lags 5, 10, 15, and 20. They are all insignificant at the 0.1 level except for Q2(5), which agrees with Engle (1982) in including four lags in the variance equation. Further, we use the squared residuals from this regression to fit an ARCH(4) model for the conditional variance:
$$\sigma_t^2 = \phi_0 + \phi_1 \epsilon_{t-1}^2 + \phi_2 \epsilon_{t-2}^2 + \phi_3 \epsilon_{t-3}^2 + \phi_4 \epsilon_{t-4}^2. \tag{20}$$
The results are shown in the second part of Table 4 under Model-I (LS). Again, the ARCH(4) model is confirmed by the LM test at the 0.05 significance level. We report only the LS estimates of the variance function, without standard errors, because those estimates are used only as starting values to compute the QMLE. The Q1 and Q2 statistics for Model-I (LS) (in the third part of Table 4) indicate that the mean and variance equations are fairly well specified, since none of these diagnostics is significant at the 0.1 level. Therefore, we fit Model-I again using the QMLE, which is more efficient than the LS procedure. Although the diagnostics of the standardized innovations from Model-I (QMLE) do not show serial correlation of the first or second order, all coefficients in the variance function are insignificant except for the constant term. This contradicts the ARCH(4) specification that we previously found to be correct. However, it can be explained by the lack of efficiency of the QMLE due to the moderate level of skewness in the corresponding residuals.
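For reproducibility, the LS fit of the mean Equation (19) with White-corrected standard errors, the Ljung-Box diagnostics, and an ARCH(4) LM test could be obtained with standard tools along the following lines; this is only a sketch, and the data-frame column names ('p' for log CPI, 'w' for log wages) are our assumptions.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_ljungbox, het_arch

def fit_mean_and_diagnose(df):
    """OLS fit of (19) plus Q1/Q2 Ljung-Box statistics and an ARCH(4) LM test (sketch)."""
    y = df["p"].diff()                                  # inflation y_t = p_t - p_{t-1}
    r = df["w"] - df["p"]                               # real wage r_t = w_t - p_t
    X = pd.DataFrame({f"y_lag{k}": y.shift(k) for k in range(1, 6)})
    X["r_lag1"] = r.shift(1)
    data = pd.concat([y.rename("y"), X], axis=1).dropna()
    ols = sm.OLS(data["y"], sm.add_constant(data.drop(columns="y"))).fit(cov_type="HC0")
    resid = ols.resid
    q1 = acorr_ljungbox(resid, lags=[5, 10, 15, 20], return_df=True)       # Q1 statistics
    q2 = acorr_ljungbox(resid ** 2, lags=[5, 10, 15, 20], return_df=True)  # Q2 statistics
    lm_stat, lm_pval, _, _ = het_arch(resid, nlags=4)                      # ARCH(4) LM test
    return ols, q1, q2, (lm_stat, lm_pval)
```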
On the other hand, our FSLS estimation yields a significant fourth lag in the variance function, in addition to the correct specification indicated by Q1 and Q2. Accordingly, we use the model fitted by FSLS in a stepwise regression algorithm to obtain a reduced model (Model-II in Table 4):
$$y_t = \theta_0 + \theta_1 y_{t-1} + \theta_4 y_{t-4} + \theta_5 y_{t-5} + \theta_6 r_{t-1} + \epsilon_t, \tag{21}$$
$$\sigma_t^2 = \phi_0 + \phi_4 \epsilon_{t-4}^2. \tag{22}$$
Note that while the mean equation is identical to that in Engle (1982), only the fourth lag is significant in the variance equation. This ARCH structure can only be detected by using the full model, which is more flexible than the two-parameter variance function in Equation (18). Moreover, the more efficient FSLS estimation identifies the ARCH(4) structure, whereas the QMLE would conclude with a misspecified homoscedastic model.

6. Conclusions and Discussion

Although ARCH-type models have been studied extensively for decades, most theories and methods are developed for stationary data processes whose conditional mean functions are either ARMA or simple linear functions of covariates. Moreover, recent research has focused on generalizations of the error component while leaving the mean function in a simple linear form. However, many economic and financial time series are nonlinear and/or nonstationary in the mean; therefore, data transformation is required in order to apply standard methodologies in the analysis.
In this paper, we proposed the second-order least squares (SLS) approach to estimate a flexible and general model with nonlinear and time-varying conditional mean and ARCH conditional variance function. This approach is applicable to both stationary and mean-nonstationary processes and therefore can be used to analyze the transformed as well as the original data. Another advantage of this approach is that it does not require specification of the underlying distribution for the errors. Moreover, the feasible optimal SLS estimator (FSLS) is more efficient than the commonly used QMLE, and the efficiency gain is significant when the conditional error distribution is asymmetric. We have demonstrated through a real data example that the efficiency gain of the proposed approach leads to a more accurate model than the QMLE. The third and fourth conditional moments of the innovation provide useful information that is utilized by the FSLS (through the weight matrix) to gain efficiency over the QMLE. This information is even more important in the case of skewed and/or leptokurtic error distributions. Our simulation studies also show that the SLS approach has better finite sample properties than the estimating function approach based on the same set of conditional moments.
There are some issues remaining to be studied in the future. Some of the assumptions for the asymptotic theory are sufficient but not necessary. They are adopted here mainly because of the method of proof we used. As is common in statistics and econometrics, there are usually different ways to prove an asymptotic result, and each of them requires a specific set of assumptions. Therefore, it is interesting to explore the possibility of establishing the asymptotic properties of the FSLS estimator under conditions similar to those used for the QMLE. Indeed, our Monte Carlo simulation studies have shown that the FSLS estimator performs well in finite sample situations even if the innovation does not have finite moments of order higher than four. From the application point of view, it is also important to extend the method of this paper to models with more general GARCH errors. This is possible by modifying some of the assumptions to accommodate the mixing process.

Author Contributions

Conceptualization, M.S. and L.W.; methodology, M.S. and L.W.; software, M.S.; validation, M.S. and L.W.; formal analysis, M.S.; investigation, M.S.; resources, L.W.; data curation, M.S.; writing—original draft preparation, M.S.; writing—review and editing, L.W.; visualization, M.S.; supervision, L.W.; project administration, L.W.; funding acquisition, L.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) grant number 546719.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset in the Application section was accessible at OECD website http://dx.doi.org/10.1787/data-00052-en (accessed on 15 January 2014).

Acknowledgments

We are grateful to the Editor and the three anonymous referees for their comments and suggestions, which were helpful in improving the previous version of this paper. The research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC). The first author also gratefully acknowledges financial support by the University of Manitoba Graduate Fellowship and Manitoba Graduate Scholarship.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In this Appendix we first list the technical assumptions that are sufficient for the asymptotic properties of the SLS estimator $\hat{\gamma}_T$, followed by the mathematical proofs of the theorems. To simplify the notation, we write $f_t(\cdot, \theta) = f_t(v_t, \theta)$.

Appendix A.1. Regularity Conditions

We make the following assumptions for the consistency of the SLS estimator $\hat{\gamma}_T$.
Assumption A1.
The process $\{y_t, x_t\}$ is strong mixing of size $-a$ for some $a > 1$. That is, there exists $\delta > 0$ such that $\alpha(m) = O(m^{-a-\delta})$, where
$$\alpha(m) = \sup_{n}\ \sup_{F \in \mathcal{F}^n,\, G \in \mathcal{F}_{n+m}} \left|P(F \cap G) - P(F)P(G)\right|,$$
$\mathcal{F}^n = \mathcal{F}(y_t, x_t,\; t \le n)$ and $\mathcal{F}_{n+m} = \mathcal{F}(y_t, x_t,\; t \ge n+m)$.
This is a high-level assumption which allows for considerable dependence and heterogeneity in the underlying process. As noted by White and Domowitz (1984), it preserves the asymptotic independence of the observed process even under further transformations. The assumption can be justified on a case-by-case basis. For example, if $\sum_{i=1}^{p} \phi_{0i} < 1$, then $\epsilon_t$ is strong mixing with geometric rate provided the innovation sequence $\varepsilon_t$ is i.i.d., has finite second moment, and has a Lebesgue density that is strictly positive in a neighbourhood of zero (Lindner 2009). The geometric memory decay implies that $a$ can be set to an arbitrarily large number. It can also be shown that finite order Gaussian ARMA processes are strong mixing (Ibragimov and Linnik 1971, pp. 312–13).
Assumption A2.
Let $\|\cdot\|$ denote the Euclidean norm. Then for some $r > \frac{a}{a-1}$,
$$\sup_{t \in \mathbb{N}} E\left[\left\{\|W_t\|\left(1 + \sum_{i=0}^{p}\left(\epsilon_{t-i}^4 + \sup_{\Theta} f_{t-i}^4(\cdot, \theta)\right)\right)\right\}^{r}\right] < \infty.$$
Assumption A3.
For any open neighbourhood $N \subset \Gamma$ of $\gamma_0$, there exists $T_0(N)$ such that
$$\inf_{T \ge T_0} T^{-1}\sum_{t=1}^{T}\min_{\gamma \in N^c \cap \Gamma} E\left[\left(h_t(\gamma) - h_t(\gamma_0)\right)' W_t\left(h_t(\gamma) - h_t(\gamma_0)\right)\right] > 0.$$
Note that Assumption A2 ensures the uniform convergence of $Q_T(\gamma)$, and Assumption A3 is sufficient for parameter identification. If the process $\{y_t, x_t\}$ is stationary, $f_t = f: \mathbb{R}^{\upsilon} \times \Theta \to \mathbb{R}^1$, and $W_t = W(v_t, \ldots, v_{t-p})$ is positive definite a.s.-$P$, then Assumption A3 is equivalent to the condition that $f(\cdot, \theta) = f(\cdot, \theta_0)$ a.s.-$P$ only if $\theta = \theta_0$.
Further, we make the following additional assumptions for the asymptotic normality of $\hat{\gamma}_T$.
Assumption A4.
The true value γ 0 is an interior point of Γ.
Assumption A5.
The random functions $f_t(\cdot, \theta)$ are twice continuously differentiable on $\Theta$ uniformly in $t$ a.s.-$P$.
Assumption A6.
For some $r > \frac{a}{a-1}$, it holds that
$$\sup_{t \in \mathbb{N}} E\left[\left\{\|W_t\| \sup_{\Theta}\left( \left\|\nabla_\theta^2 f_t(\cdot, \theta)\right\|^2 + \sum_{i=0}^{p} \left\|\nabla_\theta f_{t-i}(\cdot, \theta)\right\|^4 + \sum_{i=1}^{p} \epsilon_{t-i}^2 \left\|\nabla_\theta^2 f_{t-i}(\cdot, \theta)\right\|^2 + \sum_{i=0}^{p} f_{t-i}^2(\cdot, \theta)\left\|\nabla_\theta^2 f_{t-i}(\cdot, \theta)\right\|^2 \right)\right\}^{r}\right] < \infty.$$
Assumption A7.
The sequence $\bar{A}_T(\gamma_0) = 2T^{-1}\sum_{t=1}^{T} E\left[\nabla_\gamma' h_t(\gamma_0)\, W_t\, \nabla_\gamma h_t(\gamma_0)\right]$ is bounded and $\liminf_{T \to \infty} |\bar{A}_T(\gamma_0)| > 0$.
Assumption A8.
For some $r > \frac{a}{a-1}$,
$$\sup_{t \in \mathbb{N}} E\left[\left\{\|W_t\|^2\left(\left(1 + f_t^2(\cdot, \theta_0)\right)\left\|\nabla_\theta f_t(\cdot, \theta_0)\right\|^2 + \sum_{i=1}^{p} \epsilon_{t-i}^2\left\|\nabla_\theta f_{t-i}(\cdot, \theta_0)\right\|^2 + \left\|\nabla_\theta f_t(\cdot, \theta_0)\right\|^2 + \sum_{i=1}^{p} \epsilon_{t-i}^4\right)\left(1 + \sum_{i=0}^{p} \epsilon_{t-i}^4 + \epsilon_t^2 f_t^2(\cdot, \theta_0)\right)\right\}^{r}\right] < \infty.$$
Assumption A9.
The sequence $V_T = 4T^{-1}\sum_{t=1}^{T} E\left[\nabla_\gamma' h_t(\gamma_0)\, W_t\, h_t(\gamma_0)\, h_t'(\gamma_0)\, W_t\, \nabla_\gamma h_t(\gamma_0)\right]$ is bounded and $\liminf_{T \to \infty} |V_T| > 0$.
Assumptions A2, A6, and A8 are stated for the general case and can be simplified for specific choices of $W_t$. For example, for the optimal weight $W_t = U_t^{-1}$ of Section 3, these assumptions can be simplified to the following assumptions, respectively.
Assumption A10
(Assumption A2). For $k = 0, 1, \ldots, p$ and some $r > \frac{a}{a-1}$,
$$\sup_{t \in \mathbb{N}} E\left[\left(\varepsilon_t^4 + \sigma_t^{-4}\sup_{\Theta} f_{t-k}^4(\cdot, \theta)\right)^{r}\right] < \infty.$$
Assumption A11
(Assumption A6). For $s = 1, 2$, $k = 0, 1, \ldots, p$, and some $r > \frac{a}{a-1}$,
$$\sup_{t \in \mathbb{N}} E\left[\left(\sigma_t^{-4}\sup_{\Theta}\left\|\nabla_\theta^s f_{t-k}(\cdot, \theta)\right\|^4\right)^{r}\right] < \infty.$$
Assumption A12
(Assumption A8). For $k = 0, 1, \ldots, p$ and some $r > \frac{a}{a-1}$,
$$\sup_{t \in \mathbb{N}} E\left[\left(\varepsilon_t^8 + \sigma_t^{-8} f_t^8(\cdot, \theta_0) + \sigma_t^{-8}\left\|\nabla_\theta f_{t-k}(\cdot, \theta_0)\right\|^8\right)^{r}\right] < \infty.$$

Appendix A.2. Proof of Theorem 1

First, by using Hölder's and the $C_r$ inequalities, we can easily verify that the sequence $\{h_t'(\gamma) W_t h_t(\gamma)\}$ is dominated by uniformly $L_r$-bounded variables (i.e., $\sup_t E|h_t'(\gamma) W_t h_t(\gamma)|^r < \infty$). Therefore, $\bar{Q}_T(\gamma) = T^{-1}\sum_{t=1}^{T} E[h_t'(\gamma) W_t h_t(\gamma)]$ is well defined and is continuous on $\Gamma$ uniformly in $T$. Then, by the uniform law of large numbers (ULLN) (White and Domowitz 1984, Theorem 2.3), we have $\sup_{\gamma \in \Gamma}|Q_T(\gamma) - \bar{Q}_T(\gamma)| \xrightarrow{a.s.} 0$ as $T \to \infty$. Further, since $\{h_t(\gamma_0), \mathcal{F}_t\}$ is a martingale difference sequence and $W_t$ is measurable with respect to $\mathcal{F}_{t-1}$, we have
$$E\left[h_t'(\gamma) W_t h_t(\gamma)\right] = E\left[\left(h_t(\gamma) - h_t(\gamma_0)\right)' W_t\left(h_t(\gamma) - h_t(\gamma_0)\right)\right] + E\left[h_t'(\gamma_0) W_t h_t(\gamma_0)\right].$$
Since $W_t$ is nonnegative definite a.s.-$P$, Assumption A3 ensures the uniqueness of the minimum of $\bar{Q}_T(\gamma)$ for sufficiently large $T$. Thus the result follows from Theorem 3.4 of White (1996). □

Appendix A.3. Proof of Theorem 2

The proof consists of the following four steps.
(i) First we apply the mean value theorem for random functions to the first-order condition for a minimum of $Q_T(\gamma)$. Since $\hat{\gamma}_T \xrightarrow{a.s.} \gamma_0$ and $\gamma_0$ is interior to $\Gamma$, there is a neighbourhood $N \subset \Gamma$ of $\gamma_0$ such that $\hat{\gamma}_T \in N$ a.s. for sufficiently large $T$. Further, since $f_t(\cdot, \theta)$ is twice continuously differentiable on $\Theta$ uniformly in $t$, by Jennrich (1969, Lemma 3), for sufficiently large $T$,
$$\nabla_\gamma^2 Q_T(\tilde{\gamma}_T)\,(\hat{\gamma}_T - \gamma_0) = -\nabla_\gamma Q_T(\gamma_0), \tag{A1}$$
where $\nabla_\gamma^2 Q_T(\gamma)$ is the Hessian matrix of $Q_T(\gamma)$ and $\|\tilde{\gamma}_T - \gamma_0\| \le \|\hat{\gamma}_T - \gamma_0\|$.
(ii) Let $\bar{A}_T(\gamma) = 2T^{-1}\sum_{t=1}^{T} E[\nabla_\gamma' h_t(\gamma)\, W_t\, \nabla_\gamma h_t(\gamma)]$. We show that
$$\nabla_\gamma^2 Q_T(\tilde{\gamma}_T) - \bar{A}_T(\gamma_0) \xrightarrow{a.s.} 0 \quad \text{as } T \to \infty. \tag{A2}$$
Using Hölder's, the triangle, and the $C_r$ inequalities, we can verify that Assumptions A2 and A6 imply that the $A_t(\gamma)$ are dominated by uniformly $L_r$-bounded functions, where
$$A_t(\gamma) = 2\nabla_\gamma' h_t(\gamma)\, W_t\, \nabla_\gamma h_t(\gamma) + 2\left[h_t'(\gamma) W_t \otimes I_{q+p+1}\right] \nabla_\gamma \operatorname{vec}\left[\nabla_\gamma' h_t(\gamma)\right].$$
Hence by the ULLN we have
$$\sup_{\gamma \in \Gamma}\left\| \nabla_\gamma^2 Q_T(\gamma) - T^{-1}\sum_{t=1}^{T} E[A_t(\gamma)] \right\| \xrightarrow{a.s.} 0 \quad \text{as } T \to \infty.$$
Moreover, by the triangle inequality we have
$$\left\| \nabla_\gamma^2 Q_T(\tilde{\gamma}_T) - T^{-1}\sum_{t=1}^{T} E[A_t(\gamma_0)] \right\| \le \sup_{\gamma \in \Gamma}\left\| \nabla_\gamma^2 Q_T(\gamma) - T^{-1}\sum_{t=1}^{T} E[A_t(\gamma)] \right\| + \sup_{K \in \mathbb{N}}\left\| K^{-1}\sum_{t=1}^{K} E[A_t(\tilde{\gamma}_T)] - K^{-1}\sum_{t=1}^{K} E[A_t(\gamma_0)] \right\| \quad \text{a.s.}$$
Since $E[A_t(\gamma)]$ is continuous on $\Gamma$ uniformly in $t$ and $E[A_t(\gamma_0)] = 2E[\nabla_\gamma' h_t(\gamma_0)\, W_t\, \nabla_\gamma h_t(\gamma_0)]$, Equation (A2) follows by letting $T \to \infty$ in the last inequality. Further, since $\bar{A}_T(\gamma_0)$ and $V_T$ are uniformly nonsingular by Assumptions A7 and A9, respectively, for sufficiently large $T$ we have
$$V_T^{-1/2}\bar{A}_T(\gamma_0)\sqrt{T}\,(\hat{\gamma}_T - \gamma_0) = -V_T^{-1/2}\sqrt{T}\,\nabla_\gamma Q_T(\gamma_0) + V_T^{-1/2}\bar{A}_T(\gamma_0)\left[\bar{A}_T^{-1}(\gamma_0) - \left(\nabla_\gamma^2 Q_T(\tilde{\gamma}_T)\right)^{-1}\right] V_T^{1/2}\; V_T^{-1/2}\sqrt{T}\,\nabla_\gamma Q_T(\gamma_0).$$
(iii) Now we use the Cramér-Wold device (Rao 1973, p. 123) to show that
$$V_T^{-1/2}\sqrt{T}\,\nabla_\gamma Q_T(\gamma_0) \xrightarrow{d} N(0, I_{q+p+1}) \quad \text{as } T \to \infty. \tag{A3}$$
Let $\lambda \in \mathbb{R}^{q+p+1}$ with $\|\lambda\| = 1$. Then it is sufficient to show that $T^{-1/2}\sum_{t=1}^{T} \lambda' V_T^{-1/2} S_t(\gamma_0) \xrightarrow{d} N(0, 1)$ as $T \to \infty$, where $S_t(\gamma_0) = 2\nabla_\gamma' h_t(\gamma_0)\, W_t\, h_t(\gamma_0)$. By Assumption A9 we have $\|V_T^{-1/2}\| = O(1)$ and, therefore, by Assumption A8 and applying again Hölder's and the $C_r$ inequalities, the double array $m_{Tt} = \lambda' V_T^{-1/2} S_t(\gamma_0)$ is uniformly $L_r$-bounded for all $T$ sufficiently large. Further, since $\{h_t(\gamma_0), \mathcal{F}_t\}$ is a martingale difference, we have $E(m_{Tt}) = 0$ and $\operatorname{var}\left(T^{-1/2}\sum_{t=1}^{T} m_{Tt}\right) = 1$ for all $T$ sufficiently large. It follows from Theorem 14.1 of Davidson (1994) and Assumption A1 that $m_{Tt}$ is strong mixing of size $-a$. Hence by Theorem 5.20 of White (2001) we have $T^{-1/2}\sum_{t=1}^{T} m_{Tt} \xrightarrow{d} N(0, 1)$ as $T \to \infty$.
(iv) By (A2), Assumption A7, and Theorem 2.16 of White (2001), we have $\left(\nabla_\gamma^2 Q_T(\tilde{\gamma}_T)\right)^{-1} - \bar{A}_T^{-1}(\gamma_0) = o_p(1)$. Since $V_T^{-1/2}\sqrt{T}\,\nabla_\gamma Q_T(\gamma_0) = O_p(1)$ from (iii), it follows that
$$V_T^{-1/2}\bar{A}_T(\gamma_0)\left[\bar{A}_T^{-1}(\gamma_0) - \left(\nabla_\gamma^2 Q_T(\tilde{\gamma}_T)\right)^{-1}\right]\sqrt{T}\,\nabla_\gamma Q_T(\gamma_0) = o_p(1).$$
By the method of subsequences (Davidson 1994, Theorem 18.6), there exists a subsequence $\{T'\}$ such that
$$V_{T'}^{-1/2}\bar{A}_{T'}(\gamma_0)\left[\bar{A}_{T'}^{-1}(\gamma_0) - \left(\nabla_\gamma^2 Q_{T'}(\tilde{\gamma}_{T'})\right)^{-1}\right]\sqrt{T'}\,\nabla_\gamma Q_{T'}(\gamma_0) \xrightarrow{a.s.} 0 \quad \text{as } T' \to \infty,$$
which implies
$$V_{T'}^{-1/2}\sqrt{T'}\left[\bar{A}_{T'}(\gamma_0)\,(\hat{\gamma}_{T'} - \gamma_0) + \nabla_\gamma Q_{T'}(\gamma_0)\right] \xrightarrow{a.s.} 0 \quad \text{as } T' \to \infty.$$
Finally, since the subsequence $\{T'\}$ is arbitrary, we have
$$V_T^{-1/2}\bar{A}_T(\gamma_0)\sqrt{T}\,(\hat{\gamma}_T - \gamma_0) + V_T^{-1/2}\sqrt{T}\,\nabla_\gamma Q_T(\gamma_0) \xrightarrow{P} 0 \quad \text{as } T \to \infty,$$
and the proof is completed by applying result (2c.4.12) of Rao (1973). □

Appendix A.4. Proof of Theorem 3

Let $R = \frac{1}{\sqrt{T}}(R_1', R_2', \ldots, R_T')'$ and $M = \frac{1}{\sqrt{T}}(M_1', M_2', \ldots, M_T')'$, where $R_t = \nabla_\gamma' h_t(\gamma_0)\, W_t\, U_t^{1/2}$ and $M_t = \nabla_\gamma' h_t(\gamma_0)\, U_t^{-1/2}$. Then the proof follows by noting that
$$E\left[\left(R - M\left[E(M'M)\right]^{-1}E(M'R)\right)'\left(R - M\left[E(M'M)\right]^{-1}E(M'R)\right)\right] \tag{A4}$$
is a nonnegative definite matrix. Moreover, the equality in (A4) holds if $W_t = U_t^{-1}$, $t = 1, 2, \ldots, T$, which justifies Equation (6). The equivalence between Equations (6) and (7) follows from substituting $\Omega_t^{-1}$ and $B_t$ into Equation (7). □

References

  1. Abarin, Taraneh, and Liqun Wang. 2009. Second-order least squares estimation of censored regression models. Journal of Statistical Planning and Inference 139: 125–35. [Google Scholar] [CrossRef]
  2. Bollerslev, Tim, and Jeffrey M. Wooldridge. 1992. Quasi-maximum likelihood estimation and inference in dynamic models with time-varying covariances. Econometric Reviews 11: 143–72. [Google Scholar] [CrossRef]
  3. Bose, Mausumi, and Rahul Mukerjee. 2015. Optimal design measures under asymmetric errors, with application to binary design points. Journal of Statistical Planning and Inference 159: 28–36. [Google Scholar] [CrossRef] [Green Version]
  4. Davidson, James. 1994. Stochastic Limit Theory: An Introduction for Econometricians. Oxford: Oxford University Press. [Google Scholar]
  5. Durairajan, T. M. 1992. Optimal estimating function for non-orthogonal model. Journal of Statistical Planning and Inference 33: 381–84. [Google Scholar] [CrossRef]
  6. Enders, Walter. 2010. Applied Econometric Time Series. New York: Wiley. [Google Scholar]
  7. Engle, Robert F. 1982. Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50: 987–1007. [Google Scholar] [CrossRef]
  8. Engle, Robert F., and Gloria Gonzalez-Rivera. 1991. Semiparametric ARCH models. Journal of Business & Economic Statistics 9: 345–59. [Google Scholar]
  9. Franses, Philip Hans, and Dick Van Dijk. 2000. Non-Linear Time Series Models in Empirical Finance. Cambridge: Cambridge University Press. [Google Scholar]
  10. Gao, Lucy L., and Julie Zhou. 2017. D-optimal designs based on the second-order least squares estimator. Statistical Papers 58: 77–94. [Google Scholar] [CrossRef]
  11. He, Lei, and Rong-Xian Yue. 2019. R-optimality criterion for regression models with asymmetric errors. Journal of Statistical Planning and Inference 199: 318–26. [Google Scholar] [CrossRef]
  12. Ibragimov, I. A., and Yu V. Linnik. 1971. Independent and Stationary Sequences of Random Variables. Gröningen: Wolters-Noordhoff. [Google Scholar]
  13. Jennrich, Robert I. 1969. Asymptotic properties of non-linear least squares estimators. The Annals of Mathematical Statistics 40: 633–43. [Google Scholar] [CrossRef]
  14. Kim, Mijeong, and Yanyuan Ma. 2012. The efficiency of the second-order nonlinear least squares estimator and its extension. Annals of the Institute of Statistical Mathematics 64: 751–64. [Google Scholar] [CrossRef] [Green Version]
  15. Koul, Hira L., and Shiqing Ling. 2006. Fitting an error distribution in some heteroscedastic time series models. The Annals of Statistics 34: 994–1012. [Google Scholar] [CrossRef] [Green Version]
  16. Li, David X., and Harry J. Turtle. 2000. Semiparametric ARCH models: An estimating function approach. Journal of Business & Economic Statistics 18: 174–86. [Google Scholar]
  17. Li, Wai Keung, Shiqing Ling, and Michael McAleer. 2002. Recent theoretical results for time series models with GARCH errors. Journal of Economic Surveys 16: 245–69. [Google Scholar] [CrossRef]
  18. Lindner, Alexander M. 2009. Stationarity, Mixing, Distributional Properties and Moments of GARCH (p, q)–Processes. Handbook of Financial Time Series; Berlin: Springer, pp. 43–69. [Google Scholar]
  19. Ling, Shiqing. 2003. Adaptive estimators and tests of stationary and nonstationary short-and long-memory ARFIMA–GARCH models. Journal of the American Statistical Association 98: 955–67. [Google Scholar] [CrossRef]
  20. Ling, Shiqing, and Michael McAleer. 2003. On adaptive estimation in nonstationary ARMA models with GARCH errors. The Annals of Statistics 31: 642–74. [Google Scholar] [CrossRef]
  21. Meitz, Mika, and Pentti Saikkonen. 2008. Stability of nonlinear AR-GARCH models. Journal of Time Series Analysis 29: 453–75. [Google Scholar] [CrossRef] [Green Version]
  22. Rao, Calyampudi Radhakrishna. 1973. Linear Statistical Inference and Its Applications. New York: Wiley. [Google Scholar]
  23. Rosadi, Dedi, and Peter Filzmoser. 2019. Robust second-order least-squares estimation for regression models with autoregressive errors. Statistical Papers 60: 105–22. [Google Scholar] [CrossRef]
  24. Rosadi, Dedi, and Shelton Peiris. 2014. Second-order least-squares estimation for regression models with autocorrelated errors. Computational Statistics 29: 931–43. [Google Scholar] [CrossRef]
  25. Salamh, Mustafa, and Liqun Wang. 2021. Second-order least squares method for dynamic panel data models with application. Journal of Risk and Financial Management 14: 410. [Google Scholar] [CrossRef]
  26. Wang, Liqun. 2003. Estimation of nonlinear Berkson-type measurement error models. Statistica Sinica 13: 1201–10. [Google Scholar]
  27. Wang, Liqun. 2004. Estimation of nonlinear models with Berkson measurement errors. The Annals of Statistics 32: 2559–79. [Google Scholar] [CrossRef] [Green Version]
  28. Wang, Liqun. 2007. A unified approach to estimation of nonlinear mixed effects and Berkson measurement error models. Canadian Journal of Statistics 35: 233–48. [Google Scholar] [CrossRef]
  29. Wang, Liqun, and Alexandre Leblanc. 2008. Second-order nonlinear least squares estimation. Annals of the Institute of Statistical Mathematics 60: 883–900. [Google Scholar] [CrossRef]
  30. Weiss, Andrew A. 1986. Asymptotic theory for ARCH models: Estimation and testing. Econometric Theory 2: 107–31. [Google Scholar] [CrossRef] [Green Version]
  31. White, Halbert. 1996. Estimation, Inference and Specification Analysis. Cambridge, UK: Cambridge University Press. [Google Scholar]
  32. White, Halbert. 2001. Asymptotic Theory for Econometricians. New York: Academic Press. [Google Scholar]
  33. White, Halbert, and Ian Domowitz. 1984. Nonlinear regression with dependent observations. Econometrica 52: 143–61. [Google Scholar] [CrossRef] [Green Version]
  34. Yin, Yue, and Julie Zhou. 2017. Optimal designs for regression models using the second-order least squares estimator. Statistica Sinica 27: 1841–56. [Google Scholar] [CrossRef] [Green Version]
  35. Zhu, Ke, and Wai Keung Li. 2015. A new Pearson-type QMLE for conditionally heteroskedastic models. Journal of Business and Economic Statistics 33: 552–65. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Reduction (%) in the QMLE efficiency loss. (a) $\varepsilon_t \sim$ Student's t, $\phi_0 = 0.5$; (b) $\varepsilon_t \sim$ Gamma, $\phi_0 = 0.5$; (c) $\varepsilon_t \sim$ Gamma, $\theta_0 = 0.5$.
Figure 2. Ratio of the RMSE of the FSLS to the RMSE of the QML and EF estimators, respectively. (a) $\varepsilon_t \sim$ Gamma with shape parameter (2, 3, 4, 5, 6, 7). (b) $\varepsilon_t \sim$ Student's t with df (5, 6, 7, 8, 9).
Table 1. Asymptotic variances of OSLS, QMLE, and MLE under the AR(1)-ARCH(1) model.

                          θ0 = 0.2, ϕ0 = 0.2     θ0 = 0.2, ϕ0 = 0.6     θ0 = 0.8, ϕ0 = 0.2     θ0 = 0.8, ϕ0 = 0.6
                          v(θ̂0)    v(ϕ̂0)        v(θ̂0)    v(ϕ̂0)        v(θ̂0)    v(ϕ̂0)        v(θ̂0)    v(ϕ̂0)
Gamma (2)     OSLS        0.83      2.89          0.81      1.23          0.24      2.77          0.19      1.23
              QML         1.35      4.48          1.63      2.02          0.42      4.48          0.38      2.03
              ML          0.15      0.25          0.08      0.14          0.04      0.06          0.04      0.13
Gamma (8)     OSLS        0.97      2.08          0.88      0.94          0.31      2.06          0.22      0.94
              QML         1.18      2.52          1.09      1.15          0.38      2.51          0.28      1.14
              ML          0.87      1.44          0.69      0.64          0.27      1.35          0.19      0.62
Gamma (12)    OSLS        1.02      2.00          0.90      0.90          0.32      1.99          0.23      0.90
              QML         1.17      2.30          1.05      1.04          0.37      2.29          0.27      1.04
              ML          0.97      1.58          0.79      0.69          0.30      1.50          0.21      0.72
Gamma (20)    OSLS        1.06      1.93          0.91      0.88          0.34      1.93          0.24      0.88
              QML         1.15      2.11          1.00      0.96          0.37      2.11          0.26      0.96
              ML          1.03      1.67          0.86      0.73          0.33      1.62          0.23      0.75
t (5)         OSLS        1.34      6.32          1.51      2.86          0.37      6.26          0.29      2.86
              QML         1.56      6.32          2.34      2.86          0.41      6.26          0.41      2.86
              ML          1.05      2.44          1.04      1.11          0.29      2.41          0.21      1.11
t (7)         OSLS        1.26      3.30          1.26      1.51          0.37      3.32          0.29      1.51
              QML         1.30      3.30          1.41      1.51          0.38      3.32          0.32      1.51
              ML          1.10      2.31          1.05      1.05          0.32      2.34          0.25      1.05
t (13)        OSLS        1.20      2.31          1.08      1.06          0.37      2.33          0.26      1.06
              QML         1.20      2.31          1.10      1.06          0.37      2.33          0.26      1.06
              ML          1.15      2.12          1.03      0.97          0.36      2.17          0.24      0.97
Table 2. Simulation results for FSLS and QMLE under the AR(1)-ARCH(1) model with ε_t ~ Gamma(2,1).

(a): θ0 = ϕ0 = 0.2 (left block); (b): θ0 = 0.2, ϕ0 = 0.6 (right block)

T                   θ̂0      RMSE(θ̂0)   ϕ̂0      RMSE(ϕ̂0)      θ̂0      RMSE(θ̂0)   ϕ̂0      RMSE(ϕ̂0)
60        QMLE      0.21     0.149       0.30     0.213          0.19     0.152       0.58     0.163
          FSLS      0.21     0.119       0.27     0.172          0.19     0.121       0.59     0.136
100       QMLE      0.20     0.116       0.25     0.169          0.20     0.121       0.58     0.134
          FSLS      0.20     0.091       0.23     0.138          0.20     0.094       0.59     0.108
1000      QMLE      0.20     0.038       0.20     0.063          0.20     0.040       0.60     0.037
          FSLS      0.20     0.029       0.20     0.052          0.20     0.028       0.60     0.030
10,000    QMLE      0.20     0.012       0.20     0.020          0.20     0.013       0.60     0.012
          OSLS      0.20     0.009       0.20     0.016          0.20     0.009       0.60     0.010

(c): θ0 = 0.8, ϕ0 = 0.2 (left block); (d): θ0 = 0.8, ϕ0 = 0.6 (right block)

T                   θ̂0      RMSE(θ̂0)   ϕ̂0      RMSE(ϕ̂0)      θ̂0      RMSE(θ̂0)   ϕ̂0      RMSE(ϕ̂0)
60        QMLE      0.77     0.098       0.29     0.215          0.77     0.097       0.59     0.161
          FSLS      0.78     0.073       0.27     0.178          0.78     0.077       0.59     0.139
100       QMLE      0.78     0.074       0.25     0.172          0.78     0.071       0.58     0.128
          FSLS      0.79     0.054       0.23     0.140          0.79     0.054       0.59     0.106
1000      QMLE      0.80     0.021       0.20     0.062          0.80     0.021       0.60     0.038
          FSLS      0.80     0.016       0.19     0.050          0.80     0.015       0.60     0.030
10,000    QMLE      0.80     0.007       0.20     0.020          0.80     0.007       0.60     0.012
          OSLS      0.80     0.005       0.20     0.016          0.80     0.005       0.60     0.010
Table 3. Effect of the shape parameter, sample size, and parameter values on the RRMSEs under the Gamma distribution. All coefficients are significant at the 0.0001 level.

                   Coefficients                                                       R²         Error df
                   Const       Shape       T           θ0          ϕ0
RRMSE(θ̂_T^o)      0.76546     0.02887     -0.00007    -0.02783    0.02525           0.78692    4785
RRMSE(ϕ̂_T^o)      0.79952     0.02078     -0.00003    -0.01013    0.00962           0.70135    4785
Table 4. Fitted Model-I (19) and (20) and Model-II (21) and (22), where standard errors are in parentheses. The superscript a indicates statistical significance at the 5% level. Q1(n) (Q2(n)) is the Ljung-Box statistic for the (squared) standardized innovations and the corresponding p-values are in parentheses. JB is the standard Jarque-Bera test.

                             Model-I                                    Model-II
Coef.            LS             QMLE           FSLS                    QMLE           FSLS

Conditional Mean Equation
θ0               0.071 a        0.079 a        0.049 a                 0.067 a        0.063 a
                 (0.016)        (0.015)        (0.014)                 (0.015)        (0.013)
θ1               0.417 a        0.339 a        0.280 a                 0.323 a        0.257 a
                 (0.141)        (0.1)          (0.093)                 (0.096)        (0.086)
θ2               0.039          0.004          0.050
                 (0.095)        (0.086)        (0.081)
θ3               0.180          0.235 a        -0.158
                 (0.148)        (0.101)        (0.094)
θ4               0.436 a        0.481 a        0.563 a                 0.328 a        0.339 a
                 (0.184)        (0.108)        (0.101)                 (0.116)        (0.104)
θ5               0.350 a        0.310 a        0.294 a                 0.246 a        0.234 a
                 (0.101)        (0.094)        (0.088)                 (0.094)        (0.084)
θ6               0.076 a        0.086 a        0.051 a                 0.073 a        0.067 a
                 (0.018)        (0.017)        (0.016)                 (0.017)        (0.015)

Conditional Variance Equation
ϕ0               0.0001         0.000 a        0.000                   0.000 a        0.000
                                (0.000)        (0.000)                 (0.000)        (0.000)
ϕ1               0.1064         0.093          0.021
                                (0.134)        (0.126)
ϕ2               0.0000         0.000          0.000
                                (0.077)        (0.072)
ϕ3               0.0806         0.100          0.102
                                (0.146)        (0.137)
ϕ4               0.3364         0.389          0.479 a                 0.553          0.556 a
                                (0.252)        (0.235)                 (0.308)        (0.281)

Diagnostic Statistics of the Standardized Innovations
Q1(5)            0.9 (0.97)     1.0 (0.96)     2.1 (0.83)              2.0 (0.85)     1.9 (0.86)
Q1(10)           4.4 (0.93)     7.4 (0.68)     6.4 (0.78)              8.4 (0.59)     8.5 (0.58)
Q1(15)           12.8 (0.62)    16.1 (0.38)    14.1 (0.52)             19.6 (0.19)    18.7 (0.23)
Q2(5)            3.7 (0.59)     4.4 (0.50)     0.4 (0.99)              6.5 (0.26)     6.4 (0.27)
Q2(10)           5.7 (0.84)     6.9 (0.74)     1.3 (0.99)              8.2 (0.61)     8.1 (0.62)
Q2(15)           9.2 (0.87)     9.8 (0.83)     2.7 (0.99)              9.1 (0.87)     9.4 (0.85)
Skewness         0.78           0.61           1.67                    0.73           0.99
Kurtosis         4.07           3.49           8.48                    3.73           4.27
JB               11.0 a         5.35           115 a                   8.12 a         16.93 a
