Article

Daily Semiparametric GARCH Model Estimation Using Intraday High-Frequency Data

1
School of Economics and Statistics, Guangzhou University, Guangzhou 510006, China
2
Research Centre for Applied Mathematics, Shenzhen Polytechnic, ShenZhen 518055, China
*
Author to whom correspondence should be addressed.
Symmetry 2023, 15(4), 908; https://doi.org/10.3390/sym15040908
Submission received: 27 February 2023 / Revised: 5 April 2023 / Accepted: 12 April 2023 / Published: 13 April 2023
(This article belongs to the Special Issue Mathematical Models and Methods in Various Sciences)

Abstract

The GARCH model is one of the most influential models for characterizing and predicting fluctuations in economic and financial studies. However, most traditional GARCH models commonly use daily frequency data to predict the return, correlation, and risk indicator of financial assets, without taking data with other frequencies into account. Hence, financial market information may not be sufficiently applied to the estimation of GARCH-type models. To partially solve this problem, this paper introduces intraday high-frequency data to improve estimation of the volatility function of a semiparametric GARCH model. To achieve this objective, a semiparametric volatility proxy model was proposed, which includes both symmetric and asymmetric cases. Under mild conditions, the asymptotic normality of estimators was established. Furthermore, we also discuss the impact of different volatility proxies on estimation precision. Both the simulation and empirical results showed that estimation of the volatility function could be improved by the introduction of high-frequency data.
JEL Classification:
C13; C14; C22

1. Introduction

The GARCH model has been widely applied in the study of financial volatility since the seminal papers of Engle [1] and Bollerslev [2]. Since then, lots of extended GARCH models have been proposed to deal with certain types of financial time series. GARCH models can be roughly divided into two types: symmetric GARCH models and asymmetric GARCH models. A symmetric GARCH model assumes that the response of the conditional variance (volatility) to shocks is only a function of the shock intensity, with no relation to the sign of the shock. An asymmetric GARCH model assumes that the response of the conditional variance (volatility) to shocks depends on both the intensity and sign (direction) of the shock. For more details, one can refer to [3,4,5,6,7,8,9,10,11].
For both symmetric and asymmetric GARCH models, estimation usually requires a specification of the volatility function or distributional assumptions. However, such specifications and assumptions can be restrictive in practice. To solve this problem, nonparametric or semiparametric ARCH/GARCH models have been proposed; see [12,13,14,15,16,17]. Compared with parametric GARCH-type models, these models are more flexible and impose no restrictive function specifications or distributional assumptions. Due to their flexibility, nonparametric or semiparametric ARCH/GARCH models have been widely investigated. However, most existing studies have focused on daily sequences, and few of them have used information from high-frequency data. Consequently, financial market data have not been sufficiently used.
With the development of electronic trading systems, intraday high-frequency data can be obtained easily. Such data contain a lot of useful information, which can be applied to improve the estimation of daily models. Following this point, Visser [18] proposed a method to estimate the daily parametric GARCH model with high-frequency data based on the framework of the volatility proxy model. Most existing results following Visser [18] mainly focus on parametric GARCH-type models, such as [19,20,21,22,23,24]. It has been shown that asymmetric/symmetric GARCH model estimation can obviously be improved by introducing intraday high-frequency data. Liang et al. [25] utilized high-frequency data to study the estimation of the daily nonparametric ARCH model. As with cases of the parametric GARCH model, they found that the estimation of the nonparametric ARCH model could be significantly improved by using intraday high-frequency data. Their framework gave an insight into studying other nonparametric or semiparametric GARCH models. It is quite straightforward to extend the work of Liang et al. [25] to nonparametric or semiparametric GARCH models. For simplicity, this paper introduces the use of high-frequency data to estimate a semiparametric GARCH model proposed by Yang [26], which has extensive applications in practice.
The paper is arranged as follows. Section 2 introduces the daily semiparametric GARCH model. Section 3 proposes the estimation method and establishes the asymptotic normality of the estimators. Simulations are given in Section 4 to show the finite-sample performance of the estimation method. An empirical study of the Shanghai Stock Index is conducted in Section 5. Assumptions and proofs are given in Appendix A.

2. The Daily Semiparametric GARCH Model

2.1. The Semiparametric GARCH Model

Motivated by asymmetric GARCH models and nonparametric models, Yang [26] introduced the following semiparametric GARCH model to describe the conditional volatility of a daily return series $\{Y_t\}$:
$$Y_t = \sigma_t \xi_t, \qquad \sigma_t^2 = g\Big(\sum_{j=1}^{t} \alpha^{j-1} v(Y_{t-j}; \eta)\Big), \qquad t = 1, 2, \ldots, \tag{1}$$
where $\{\xi_t\}_{t=0}^{\infty}$ are i.i.d. random variables independent of $Y_0$ satisfying $E(\xi_t) = 0$, $E(\xi_t^2) = 1$, and $E(\xi_t^4) = m_4 < +\infty$; $\{\sigma_t^2\}_{t=0}^{\infty}$ denotes the conditional volatility series such that $\sigma_t^2 = \mathrm{var}(Y_t \mid Y_{t-1}, Y_{t-2}, \ldots)$; the function $g$ is an unknown smooth non-negative link function defined on $R_+ = [0, +\infty)$; $\alpha \in (0, 1)$; and $\{v(y; \eta)\}_{\eta \in H}$ is a known family of non-negative functions, continuous in $y$ and twice continuously differentiable in the parameter $\eta \in H$, where $H = [\eta_1, \eta_2]$ is a compact interval with $\eta_1 < \eta_2$. The model in (1) contains many symmetric and asymmetric models as special cases. For example, when $g(x) = \beta x + w/(1 - \alpha)$ and $v(y; \eta) = y^2 + \eta y^2 1(y < 0)$, the model in (1) reduces to the GJR model (an asymmetric model).
As in Yang’s work [26], we define
$$U_t = \sum_{j=0}^{t} \alpha^{j} v(Y_{t-j}; \eta), \qquad t = 1, 2, \ldots, \tag{2}$$
for convenience and, then, the model in (1) can be transformed as follows:
$$Y_t = g^{1/2}(U_{t-1})\, \xi_t, \qquad \sigma_t^2 = g(U_{t-1}), \qquad t = 1, 2, \ldots, \tag{3}$$
where the process $\{U_t\}_{t=0}^{\infty}$ satisfies the Markovian equation
$$U_t = \alpha U_{t-1} + v\big(g^{1/2}(U_{t-1})\, \xi_t; \eta\big), \qquad t = 1, 2, \ldots. \tag{4}$$
Therefore, the model in (3) can be rewritten as
$$Y_t^2 = g(U_{t-1}) + g(U_{t-1})(\xi_t^2 - 1). \tag{5}$$
From (5), it follows that
$$E(Y_t^2 \mid U_{t-1} = u) = g(u), \qquad \mathrm{var}(Y_t^2 \mid U_{t-1} = u) = g^2(u)(m_4 - 1), \qquad m_4 = E \xi_t^4. \tag{6}$$
For the semiparametric GARCH(1, 1) model (1), we need to estimate the unknown link function $g(u)$ and the parameters $\alpha, \eta$. We state the estimation method in Section 3.
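The recursion for $U_t$ and the return equation can be simulated directly. The following is a minimal sketch, not the authors' code: it assumes standard normal innovations, a fixed start $U_0 = 0$, the GJR-type function $v$ named above, and the linear link of Example 1 in Section 4. The function names `gjr_news` and `simulate_daily` are hypothetical.

```python
import numpy as np

def gjr_news(y, eta):
    """v(y; eta) = y^2 + eta * y^2 * 1(y < 0), the GJR-type choice named in the text."""
    y = np.asarray(y, dtype=float)
    return y**2 + eta * y**2 * (y < 0)

def simulate_daily(n, g, alpha, eta, u0=0.0, seed=0):
    """Simulate Y_t = g(U_{t-1})^{1/2} * xi_t with U_t = alpha * U_{t-1} + v(Y_t; eta).

    Assumptions for illustration: standard normal xi_t and a fixed start U_0 = u0.
    Returns (Y, U): Y holds Y_1..Y_n, U holds U_0..U_n.
    """
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(n)
    U = np.empty(n + 1)
    Y = np.empty(n)
    U[0] = u0
    for t in range(1, n + 1):
        Y[t - 1] = np.sqrt(g(U[t - 1])) * xi[t - 1]             # return equation
        U[t] = alpha * U[t - 1] + gjr_news(Y[t - 1], eta)       # Markovian update
    return Y, U

alpha, eta = 0.5, 0.1
g = lambda u: 0.1 * (2.0 * u + 1.0) / (1.0 - alpha)   # linear link of Example 1
Y, U = simulate_daily(500, g, alpha, eta)
```

With $U_0 = 0$, unrolling the update gives $U_t = \sum_{j=0}^{t-1} \alpha^j v(Y_{t-j}; \eta)$, which is the weighted sum that enters the link function.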

2.2. The Semiparametric Volatility Proxy Model

Following Visser [18], to introduce high-frequency data, we propose a semiparametric scaling model by normalizing the trading day into a unit time interval:
$$Y_t(m) = g^{1/2}(U_{t-1})\, \xi_t(m), \qquad 0 \le m \le 1, \qquad \sigma_t^2 = g(U_{t-1}), \qquad t = 1, 2, \ldots, \tag{7}$$
where $U_t$ is defined in (2), $Y_t(m)$ is the continuous-time intraday log-return process on day $t$, and $\xi_k(m)$ and $\xi_l(m)$, $k \ne l$, denote the standard noise processes on different days, which are assumed to be independent and identically distributed. When $m = 1$, we have
$$Y_t = Y_t(1), \qquad \xi_t = \xi_t(1), \qquad E \xi_t^2(1) = 1, \tag{8}$$
and the model in (7) reduces to the model in (3). The models in (7) and (3) share the same volatility function and the same $U_t$ structure, while the model in (7) additionally carries the intraday high-frequency information $Y_t(m)$. Hence, a more precise estimate of the volatility can be expected from (7). However, $Y_t(m)$ and $U_t$ are observed at different frequencies, so the model in (7) cannot be estimated directly. To estimate the volatility function, we further introduce a volatility proxy model.
Let $H_t$ be a function of $Y_t(m)$ with the positive homogeneity property, namely
$$H(s Y_t(m)) = s H(Y_t(m)) > 0, \qquad \text{for } s > 0. \tag{9}$$
Following Visser [18], $H_t$ is called the volatility proxy. A frequently used volatility proxy is the daily realized volatility, which is given by the following:
$$H_t = RV_t = \sqrt{\sum_{k} (r_{t,k} - r_{t,k-1})^2}, \tag{10}$$
where $r_{t,k}$ denotes the return over the $k$th intraday interval on day $t$. Based on (7), the homogeneity indicates that
$$H(Y_t(m)) = g^{1/2}(U_{t-1})\, H(\xi_t(m)). \tag{11}$$
Denote
$$H_t \equiv H(Y_t(m)), \quad Z_{Ht} \equiv H(\xi_t(m)), \quad \mu_{2H} = E Z_{Ht}^2, \quad \xi_{Ht} = Z_{Ht}/\sqrt{\mu_{2H}}, \quad g_H(u) = g(u)\, \mu_{2H},$$
then we can obtain the following semiparametric volatility proxy model:
$$H_t = g_H^{1/2}(U_{t-1})\, \xi_{Ht}, \qquad \sigma_t^2 = g_H(U_{t-1}), \qquad t = 1, 2, \ldots, \tag{12}$$
where $E(H_t^2 \mid U_{t-1}) = g_H(U_{t-1})$, $g_H(u) = g(u)\, \mu_{2H}$, and $E \xi_{Ht}^2 = 1$. In the model in (12), the link function $g_H(u)$, the parameters $\alpha, \eta$, and a new constant $\mu_{2H}$ need to be estimated. When $H_t = |Y_t(1)| = |Y_t|$, $E(H_t^2 \mid Y_{t-1}) = g_H(Y_{t-1})$ reduces to $E(Y_t^2 \mid Y_{t-1}) = g(Y_{t-1})$, which means that only the daily information, $Y_t$, is used for estimation. Consequently, (12) includes the traditional daily model in (3) as a special case. Moreover, the frequencies of the different variables in (12) are unified through the volatility proxy; hence, the model in (12) can be easily estimated.
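A realized-volatility proxy and its positive homogeneity can be illustrated with a few lines of NumPy. This is a sketch under stated assumptions: the proxy is taken to be the square root of the sum of squared increments of the intraday path (so that it is homogeneous of degree one), the path is a toy random walk rather than market data, and `realized_volatility` is a hypothetical helper name.

```python
import numpy as np

def realized_volatility(path, step=1):
    """Square root of the sum of squared increments of one day's intraday path.

    path : cumulative intraday log-returns Y_t(m_0), ..., Y_t(m_K).
    step : sampling interval in grid points (e.g. 5 for 5 min sampling of 1 min data).
    """
    sampled = np.asarray(path, dtype=float)[::step]
    return np.sqrt(np.sum(np.diff(sampled) ** 2))

rng = np.random.default_rng(1)
# Toy path on a 240-point intraday grid (an assumption, not market data).
path = np.concatenate([[0.0], np.cumsum(0.05 * rng.standard_normal(240))])

h1 = realized_volatility(path)            # finest sampling
h5 = realized_volatility(path, step=5)    # every fifth grid point
h_daily = abs(path[-1])                   # the proxy |Y_t|, which uses only the close
```

Scaling the path by any $s > 0$ scales each of the three proxies by $s$, which is the homogeneity property that makes the factorization of $H(Y_t(m))$ into $g^{1/2}(U_{t-1})$ and $H(\xi_t(m))$ possible.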

3. Volatility Function Estimation

3.1. Estimation When Parameters Are Known

In this section, we first suppose that the parameters $\alpha, \eta$ are known and discuss the estimation of the volatility function $g(\cdot)$ in (12). In practice, we observe $\{Y_t\}_{t=1}^{n}$ and calculate $\{H_t\}_{t=1}^{n}$ from a discrete intraday high-frequency sequence. Define the following:
$$V = \begin{pmatrix} V_2 \\ V_3 \\ \vdots \\ V_n \end{pmatrix} = \begin{pmatrix} H_2^2 \\ H_3^2 \\ \vdots \\ H_n^2 \end{pmatrix}, \qquad Z = \begin{pmatrix} 1 & \frac{U_1 - u}{h} \\ 1 & \frac{U_2 - u}{h} \\ \vdots & \vdots \\ 1 & \frac{U_{n-1} - u}{h} \end{pmatrix},$$
where $E_1 = (1, 0)^{\tau}$, $E_2 = (0, 1)^{\tau}$, $W = \mathrm{diag}\{\frac{1}{n} K_h(U_1 - u), \ldots, \frac{1}{n} K_h(U_{n-1} - u)\}$, and $K_h(\cdot) = \frac{1}{h} K(\cdot / h)$ for any kernel function $K(\cdot)$ and a given bandwidth $h$.
Referring to Yang [26], based on (12), the local linear estimator of $g_H(u)$ is given as
$$\hat{g}_H(u) = E_1^{\tau} (Z^{\tau} W Z)^{-1} Z^{\tau} W V. \tag{13}$$
From (13), we can obtain the local linear estimator $\hat{g}(u)$ for $g(u)$ by setting $H_t = |Y_t(1)| = |Y_t|$, where no intraday high-frequency information is used. Note that $\mu_{ZH} = g_H(u)/g(u)$ is an unknown parameter depending on $H_t$. Following Liang [25], an estimator for $\mu_{ZH}$ is defined as
$$\hat{\mu}_{ZH} = \frac{1}{n-1} \sum_{t=2}^{n} \frac{\hat{g}_H(U_{t-1})}{\hat{g}(U_{t-1})}, \tag{14}$$
where $\hat{g}_H(U_{t-1})$ and $\hat{g}(U_{t-1})$ can be calculated according to (13). Then, the final estimator for the volatility function, $g(u)$, introducing the intraday high-frequency information is defined by
$$\tilde{g}(u) = \frac{\hat{g}_H(u)}{\hat{\mu}_{ZH}}. \tag{15}$$
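The two-step construction (local linear fit of $H_t^2$ on $U_{t-1}$, then rescaling by the average ratio) can be sketched as follows. This is an illustrative implementation under assumptions, not the authors' code: it uses the Epanechnikov kernel from Section 4, solves the weighted least squares system directly, and the names `local_linear` and `g_tilde` are hypothetical.

```python
import numpy as np

def epanechnikov(x):
    """K(x) = 0.75 * (1 - x^2)_+."""
    return 0.75 * np.clip(1.0 - x**2, 0.0, None)

def local_linear(u, U_lag, V, h):
    """Local linear estimate at u of E(V_t | U_{t-1} = u).

    U_lag holds U_1..U_{n-1}; V holds the matching responses V_2..V_n.
    Returns the intercept of the kernel-weighted least squares fit.
    """
    U_lag, V = np.asarray(U_lag, float), np.asarray(V, float)
    z = (U_lag - u) / h
    w = epanechnikov(z) / (h * len(U_lag))          # diagonal of W
    Z = np.column_stack([np.ones_like(z), z])
    A = Z.T @ (w[:, None] * Z)                      # Z' W Z
    b = Z.T @ (w * V)                               # Z' W V
    return np.linalg.solve(A, b)[0]

def g_tilde(grid, U_lag, H, Y, h):
    """Rescaled estimator: g_H-hat divided by the average ratio g_H-hat / g-hat.

    H and Y hold the proxy and the daily return for days 2..n, aligned with U_lag.
    """
    gH = lambda u: local_linear(u, U_lag, H**2, h)
    g0 = lambda u: local_linear(u, U_lag, Y**2, h)
    mu_hat = np.mean([gH(u) / g0(u) for u in U_lag])
    return np.array([gH(u) for u in grid]) / mu_hat, mu_hat
```

Two sanity checks follow from the construction: a local linear fit reproduces an exactly linear response, and with the proxy $H_t = 2|Y_t|$ the estimated ratio is 4 and `g_tilde` coincides with the daily estimator.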
The following theorems show that $\hat{g}_H(u)$ and $\tilde{g}(u)$ behave like the standard univariate local linear estimator and may have a smaller asymptotic variance when a proper volatility proxy is chosen.
Theorem 1.
Under Assumptions A1–A12, for any fixed $u \in A$ given in Assumption A3, as $nh \to \infty$, $nh^5 = O(1)$,
$$\sqrt{nh}\, \big(\hat{g}_H(u) - g_H(u) - h^2 b_H(u)\big) \xrightarrow{D} N(0, V_H(u)),$$
where $b_H(u) = \Lambda_{0,2}\, g_H^{(2)}(u)/2$, $V_H(u) = \|K_0\|_2^2\, (m_{4H} - 1)\, g_H^2(u)\, \varphi^{-1}(u)$, and $\Lambda_{0,2}$ is defined in (A4).
Theorem 2.
Under Assumptions A1–A12, for any fixed $u \in A$ given in Assumption A3, if $h \sim n^{-r}$ for some $r \in (1/4, 1)$, as $n \to \infty$, $(\hat{\mu}_{ZH} - \mu_{ZH}) = o_p(n^{-1/2})$ and
$$\sqrt{nh}\, \big(\tilde{g}(u) - g(u) - h^2 \tilde{b}(u)\big) \xrightarrow{D} N(0, \tilde{V}(u)),$$
where $\tilde{b}(u) = \Lambda_{0,2}\, g^{(2)}(u)/2$ and $\tilde{V}(u) = (m_{4H} - 1)\, \|K_0\|_2^2\, g^2(u)\, \varphi^{-1}(u)$.
Remark 1.
In Theorem 1, when $H_t = |Y_t|$, $\hat{g}_H(u)$ and $g_H(u)$ become $\hat{g}(u)$ and $g(u)$, respectively. Hence, $b_H(u) = b(u) = \Lambda_{0,2}\, g^{(2)}(u)/2!$ and $V_H(u) = V(u) = \|K_0\|_2^2\, (m_4 - 1)\, g^2(u)\, \varphi^{-1}(u)$. Note that the estimator, $\hat{g}(u)$, does not use high-frequency information.
Remark 2.
In Theorem 2, the revised estimator, $\tilde{g}(u)$, has the same bias term as $\hat{g}(u)$, while the asymptotic variance differs: the factor $(m_4 - 1)$ becomes $(m_{4H} - 1)$ after introducing high-frequency information. Consequently, a smaller $m_{4H}$ leads to a smaller asymptotic variance of $\tilde{g}(u)$. Therefore, one can choose a proper volatility proxy by comparing values of $m_{4H}$, which helps to obtain a more precise function estimator.
In practice, $m_{4H} = c \cdot E H_t^4 / (E H_t^2)^2$, where $c = [E(g(U_{t-1}))]^2 / E[g^2(U_{t-1})]$. Let
$$m_{4H}^{*} = E H_t^4 / (E H_t^2)^2. \tag{16}$$
Because $c > 0$ does not depend on the proxy and $m_{4H} = c \cdot m_{4H}^{*} > 0$, one can choose the optimal proxy, $H_t$, as the one with the smallest value of $m_{4H}^{*}$.
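The link between $m_{4H}$ and the ratio in (16) follows from the proxy model (12). A short derivation is given below, assuming that $\xi_{Ht}$ is independent of $U_{t-1}$ and that all moments are taken under the stationary distribution:

```latex
\begin{align*}
E H_t^4 &= E\big[g_H^2(U_{t-1})\big]\, E\,\xi_{Ht}^4
         = \mu_{2H}^2\, E\big[g^2(U_{t-1})\big]\, m_{4H}, \\
(E H_t^2)^2 &= \big(E\big[g_H(U_{t-1})\big]\, E\,\xi_{Ht}^2\big)^2
         = \mu_{2H}^2\, \big[E\, g(U_{t-1})\big]^2, \\
\frac{E H_t^4}{(E H_t^2)^2} &= \frac{E\big[g^2(U_{t-1})\big]}{\big[E\, g(U_{t-1})\big]^2}\, m_{4H}
         = \frac{m_{4H}}{c}.
\end{align*}
```

The constant $c$ involves only $g$ and the law of $U_{t-1}$, so ranking proxies by the ratio in (16) gives the same ordering as ranking them by $m_{4H}$.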

3.2. Estimating the Parameters

In practice, $\gamma = (\alpha, \eta)$ is usually unknown, and we discuss its estimation in this section. Without loss of generality, assume that $\gamma$ is located inside $\Gamma = [\alpha_1, \alpha_2] \times [\eta_1, \eta_2]$, where $0 < \alpha_1 < \alpha_2 < 1$ and $-\infty < \eta_1 < \eta_2 < +\infty$. Let $\gamma' = (\alpha', \eta')$ be a generic parameter vector in $\Gamma$, and consider regressing the $H_t^2$ series on $U_{\gamma', t-1}$, where $U_{\gamma', t}$ is given by
$$U_{\gamma', t} = \sum_{j=0}^{t} \alpha'^{\,j}\, v(Y_{t-j}; \eta') = \sum_{j=0}^{t} \alpha'^{\,j}\, v\big(g^{1/2}(U_{t-j-1})\, \xi_{t-j}; \eta'\big), \qquad t = 1, 2, \ldots, \quad \gamma' = (\alpha', \eta') \in \Gamma. \tag{17}$$
Based on $U_{\gamma', t-1}$, the predictor of $H_t^2$ can be defined as
$$g_{H, \gamma'}(u) = E(H_t^2 \mid U_{\gamma', t-1} = u) \tag{18}$$
for any $\gamma' \in \Gamma$, and the weighted mean square prediction error can be expressed as
$$L(\gamma') = \lim_{t \to \infty} E\big\{H_t^2 - g_{H, \gamma'}(U_{\gamma', t-1})\big\}^2 \pi(\tilde{U}_{t-1}), \tag{19}$$
where $\pi(\cdot)$ is a non-negative and continuous weight function whose compact support is contained in $A$, which is defined in Assumption A3. Following Yang [26], we can define a series, $\tilde{U}_t$:
$$\tilde{U}_t = \sum_{j=0}^{t} \alpha_2^{\,j}\, v\big(g_H^{1/2}(U_{t-j-1})\, \xi_{t-j}; \tilde{\eta}\big), \qquad t = 1, 2, \ldots, \tag{20}$$
where $\tilde{\eta}$ is given in Assumption A4. According to A4, it is not difficult to see that $U_{\gamma', t} \le \tilde{U}_t$, $t = 1, 2, \ldots$, $\gamma' \in \Gamma$. Then, the usual bias-variance decomposition of $L(\gamma')$ gives
$$L(\gamma') = \lim_{t \to \infty} E\big\{g_H(U_{t-1}) - g_{H, \gamma'}(U_{\gamma', t-1})\big\}^2 \pi(\tilde{U}_{t-1}) + (m_4 - 1) \times \lim_{t \to \infty} E\, g_H^2(U_{t-1})\, \pi(\tilde{U}_{t-1}). \tag{21}$$
From Assumption A7, $L(\gamma')$ has a unique minimum point at $\gamma$ and is locally convex. Consequently, we can estimate the true parameter, $\gamma$, consistently by minimizing the prediction error of $H_t^2$ on $U_{\gamma', t-1}$.
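Note that $V = (V_t)_{2 \le t \le n}$, $V_t = H_t^2$, $t = 2, \ldots, n$. For each $u \in A$ and $\gamma' \in \Gamma$, define the estimator of $g_{H, \gamma'}(u)$:
$$\hat{g}_{H, \gamma'}(u) = E_1^{\tau} (Z_{\gamma'}^{\tau} W_{\gamma'} Z_{\gamma'})^{-1} Z_{\gamma'}^{\tau} W_{\gamma'} V, \tag{22}$$
where
$$Z_{\gamma'} = \Big(1, \frac{U_{\gamma', t} - u}{h}\Big)_{1 \le t \le n-1}, \qquad W_{\gamma'} = \mathrm{diag}\Big\{\frac{1}{n} K_h(U_{\gamma', t} - u)\Big\}_{t=1}^{n-1}.$$
Define the estimated function
$$\hat{L}(\gamma') = \frac{1}{n} \sum_{t=2}^{n} \big\{H_t^2 - \hat{g}_{H, \gamma'}(U_{\gamma', t-1})\big\}^2 \pi(\tilde{U}_{t-1}). \tag{23}$$
Let $\hat{\gamma}$ be the estimator of $\gamma$. Then, $\hat{\gamma}$ can be obtained by minimizing the above function $\hat{L}(\gamma')$, i.e.,
$$\hat{\gamma} = \arg\min_{\gamma' \in \Gamma} \hat{L}(\gamma'). \tag{24}$$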
The following theorem shows the asymptotic normality of the estimator, γ ^ .
Theorem 3.
Under Assumptions A1–A12, if $h \sim n^{-r}$ for some $r \in (1/5, 1/4)$, then as $n \to \infty$, the $\hat{\gamma}$ defined by (24) satisfies
$$\sqrt{n}\, (\hat{\gamma} - \gamma) \to N\big(0, \{\nabla^2 L(\gamma)\}^{-1}\, \Sigma\, \{\nabla^2 L(\gamma)\}^{-1}\big),$$
where $\Sigma = 4 (m_{4H} - 1)\, E\big[ g_H^2(U_1)\, \pi^2(\tilde{U}_1)\, \{\nabla g_{H, \gamma'}(U_{\gamma', 1})\} \{\nabla g_{H, \gamma'}(U_{\gamma', 1})\}^{T} \big]_{\gamma' = \gamma}$.
Note that when $\gamma' = \gamma$, the estimator $\hat{g}_{H, \gamma'}(u)$ of the function in (18) becomes $\hat{g}_H(u)$, as given in (13). Consequently, by treating the estimated value, $\hat{\gamma}$, as the true value of the parameter, $\hat{g}_H(u)$ can be approximated well by $\hat{g}_{H, \hat{\gamma}}(u)$. Then, the final estimate of the function, $\tilde{g}(\cdot)$, can be obtained based on (14) and (15).
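The minimization of the estimated prediction error can be sketched as a grid search over candidate parameter values. This is a simplified illustration, not the authors' procedure: it assumes the GJR-type $v$, an indicator weight $\pi$ on the central 80% of $\tilde{U}_t$ (built with the largest candidate $\alpha$ and $\eta$), a rule-of-thumb bandwidth, and a coarse grid instead of a numerical optimizer. All function names are hypothetical.

```python
import numpy as np

def v(y, eta):
    """GJR-type news function v(y; eta) = y^2 + eta * y^2 * 1(y < 0)."""
    return y**2 + eta * y**2 * (y < 0)

def build_U(Y, alpha, eta):
    """U_t for a candidate (alpha, eta) via U_t = alpha * U_{t-1} + v(Y_t; eta), U_0 = 0."""
    U = np.zeros(len(Y))
    prev = 0.0
    for t, y in enumerate(Y):
        prev = alpha * prev + v(y, eta)
        U[t] = prev
    return U

def local_linear(u, x, resp, h):
    """Intercept of the Epanechnikov-weighted linear fit of resp on x around u."""
    z = (x - u) / h
    w = 0.75 * np.clip(1.0 - z**2, 0.0, None)
    Z = np.column_stack([np.ones_like(z), z])
    # lstsq tolerates a rank-deficient system when few points fall in the window
    coef = np.linalg.lstsq(Z.T @ (w[:, None] * Z), Z.T @ (w * resp), rcond=None)[0]
    return coef[0]

def loss(gamma, Y, H, U_tilde, A):
    """Sample prediction error of H_t^2 on U_{t-1} for a candidate gamma."""
    alpha, eta = gamma
    U = build_U(Y, alpha, eta)
    x, resp = U[:-1], H[1:] ** 2                      # regress H_t^2 on U_{t-1}
    h = 1.06 * np.std(x) * len(x) ** (-0.2)
    weight = (U_tilde[:-1] >= A[0]) & (U_tilde[:-1] <= A[1])   # indicator weight pi
    fitted = np.array([local_linear(u, x, resp, h) for u in x[weight]])
    return np.sum((resp[weight] - fitted) ** 2) / len(Y)

def estimate_gamma(Y, H, alphas, etas):
    """Grid search for the candidate (alpha, eta) with the smallest loss."""
    U_tilde = build_U(Y, max(alphas), max(etas))      # dominating series
    A = np.quantile(U_tilde, [0.1, 0.9])
    candidates = [(a, e) for a in alphas for e in etas]
    losses = [loss(c, Y, H, U_tilde, A) for c in candidates]
    return candidates[int(np.argmin(losses))], min(losses)
```

A call such as `estimate_gamma(Y, H, [0.4, 0.5, 0.6], [0.0, 0.1, 0.2])` returns the best grid point and its loss; a finer grid or a numerical optimizer over $\Gamma$ would be needed for the accuracy reported in the paper.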

4. Simulation

To assess the finite-sample performance of the proposed estimator, $\tilde{g}(u)$, we first need to simulate the intraday noise process, $\xi_t(m)$, to generate $Y_t$ and $Y_t(m)$. Following Visser [18], $\xi_t(m)$ is simulated by the following two processes:
$$d\gamma_t(m) = -\delta\, (\gamma_t(m) - \mu)\, dm + \sigma_{\gamma}\, dB_t^{(2)}(m),$$
$$d\xi_t(m) = e^{\gamma_t(m)}\, dB_t^{(1)}(m), \qquad m \in [0, 1].$$
The Brownian motions $B_t^{(1)}$ and $B_t^{(2)}$ are uncorrelated. We set $\xi_t(0) = 0$ and sample $\gamma_t(0)$ from $N(\mu, \sigma_{\gamma}^2)$. We divide the unit time interval $[0, 1]$ into 240 small intervals and set $\delta = 1/2$, $\sigma_{\gamma} = 1/4$, $\mu = -1/16$.
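The two stochastic differential equations can be discretized with a simple Euler scheme on the 240-point grid. The sketch below is an assumption-laden illustration rather than the authors' simulation code: it uses a mean-reverting drift $-\delta(\gamma_t(m) - \mu)$ with $\mu = -1/16$, the value for which $E e^{2\gamma} = 1$ when $\gamma \sim N(\mu, 1/16)$, and the function name `simulate_intraday_noise` is hypothetical.

```python
import numpy as np

def simulate_intraday_noise(n_days, n_steps=240, delta=0.5, sigma=0.25,
                            mu=-1.0 / 16.0, seed=0):
    """Euler scheme for d gamma = -delta (gamma - mu) dm + sigma dB2, d xi = exp(gamma) dB1.

    Returns an array of shape (n_days, n_steps + 1); column 0 is xi_t(0) = 0
    and the last column is xi_t(1), the daily innovation.
    """
    rng = np.random.default_rng(seed)
    dm = 1.0 / n_steps
    gamma = rng.normal(mu, sigma, size=n_days)        # gamma_t(0) ~ N(mu, sigma^2)
    xi = np.zeros((n_days, n_steps + 1))
    for k in range(n_steps):
        dB1 = np.sqrt(dm) * rng.standard_normal(n_days)   # independent increments
        dB2 = np.sqrt(dm) * rng.standard_normal(n_days)
        xi[:, k + 1] = xi[:, k] + np.exp(gamma) * dB1
        gamma = gamma - delta * (gamma - mu) * dm + sigma * dB2
    return xi

xi = simulate_intraday_noise(4000)
```

Multiplying each row by $g^{1/2}(U_{t-1})$ then gives the intraday path $Y_t(m)$ of the scaling model, and the last column gives the daily return $Y_t$.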
Following Liang [25], for each day, we consider three volatility proxies: the realized volatility in 5 min ($H_{5t}$), the realized volatility in 30 min ($H_{30t}$), and $|Y_t|$ (corresponding to the case without using high-frequency information). Here, $H_{5t}$ and $H_{30t}$ were calculated based on (10), and the estimate of the function $g$ was computed according to Section 3.
For model (7), we considered two asymmetric models as simulation examples:
Example 1.
$\alpha = 0.5$, $\eta = 0.1$, $\Gamma = [0.4, 0.6] \times [0, 0.2]$, $g(u) = 0.1 (2u + 1)/(1 - \alpha)$, and $v(y; \eta) = y^2 + \eta y^2 1(y < 0)$.
Example 2.
$\alpha = 0.2$, $\eta = 0.1$, $\Gamma = [0.1, 0.3] \times [0, 0.2]$, $g(u) = \frac{1}{\sqrt{2 \pi \sigma^2}}\, e^{-\frac{(u - 0.5)^2}{2 \sigma^2}}$, where $\sigma = 0.4$, and $v(y; \eta) = y^2 + \eta y^2 1(y < 0)$.
For the above two examples, the sample sizes were $n = 300$, 600, and 900, and the number of replications was 100. To obtain the estimator, $\tilde{g}(u)$, based on (15), the bandwidth was set as $1.06 \times \mathrm{std}(U_t) \times n^{-1/5}$ and the kernel function was $K(x) = 0.75 (1 - x^2)_{+}$. According to the 10% and 90% percentiles of the simulated $U_{t-1}$, the subset $A$ in Theorem 1 was defined as $[0.2, 2]$, and the grid point vector was set as $G = [0.2 : 0.025 : 2]$ for each proxy $H_{5t}$, $H_{30t}$, and $|Y_t|$.
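The tuning choices of this simulation design are easy to reproduce. The snippet below is a small sketch: the sample standard deviation with `ddof=1` is an assumption (the text only says std), and `rule_of_thumb_bandwidth` is a hypothetical helper name.

```python
import numpy as np

def epanechnikov(x):
    """Kernel K(x) = 0.75 * (1 - x^2)_+ used in the simulations."""
    x = np.asarray(x, dtype=float)
    return 0.75 * np.clip(1.0 - x**2, 0.0, None)

def rule_of_thumb_bandwidth(U):
    """Bandwidth h = 1.06 * std(U_t) * n^(-1/5)."""
    U = np.asarray(U, dtype=float)
    return 1.06 * np.std(U, ddof=1) * len(U) ** (-0.2)

# Grid G = [0.2 : 0.025 : 2], i.e. 73 equally spaced points on A = [0.2, 2].
grid = np.linspace(0.2, 2.0, 73)
```

For this kernel, $\int K^2(x)\,dx = 0.6$, which is the constant that enters the asymptotic variance when the equivalent kernel of the local linear fit coincides with $K$ (as it does for a symmetric kernel).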
For function estimation, we display the results in Figure 1 (Example 1) and Figure 2 (Example 2). In both figures, the green curves are the 100 replicated estimated curves $\tilde{g}(G_i)$, and the bold black line is the true curve for each sample size and each proxy. The three columns correspond to $n = 300$, 600, and 900 from left to right. In the first column, subplots (a$i$) ($i = 1, 2, 3$) are the estimated curves for the proxies $H_{5t}$, $H_{30t}$, and $|Y_t|$, respectively. Similarly, subplots (b$i$) and (c$i$) ($i = 1, 2, 3$) are the estimation results for $n = 600$ and 900, respectively. Subplots (a4,b4,c4) are the box plots of the ratio in (16) for $H_{5t}$, $H_{30t}$, and $|Y_t|$ (from left to right in each subplot) under different sample sizes.
Figure 1 and Figure 2 show that the estimator under the proxy $H_{5t}$ performed best among the three proxies considered for each sample size, especially for small samples. This was consistent with subplots (a4), (b4), and (c4), which showed that, under the proxy $H_{5t}$, the values of the ratio in (16) were generally smaller than under the other proxies. The estimator under the proxy $H_{30t}$ was more precise than that under the proxy $|Y_t|$. As the sample size increased, each proxy showed a better fit, which is consistent with the asymptotic result in Theorem 2.
In addition, the mean estimates of $\alpha$ and $\eta$ for data of different frequencies are shown in Table 1, Table 2 and Table 3. Compared with the daily proxy $|Y_t|$, the mean parameter estimates under the high-frequency proxies were more precise. Overall, the simulation results indicate that introducing intraday high-frequency data can effectively improve the estimation accuracy of the semiparametric GARCH model.

5. Empirical Study

In this section, we applied the proposed model to the Shanghai Stock Index. The data spanned the period from 2 April 2004 to 2 April 2010 and consisted of 1458 days' worth of high-frequency data based on a 1 min sampling frequency. The intraday log-return can be computed as
$$Y_t(m) = 100\, [\log P_t(m) - \log P_{t-1}(1)], \qquad m \in [0, 1],$$
where $P_t(m)$ denotes the $m$th intraday price on day $t$. First, we considered 11 different volatility proxies for each day: a 1 min realized volatility, $H_{1t}$, up to a 10 min realized volatility, $H_{10t}$, and a daily absolute return, $|Y_t|$. Here, we calculated the 1 min proxy as
$$H_{1t} = \sqrt{\sum_{k=1}^{240} [Y_t(m_k) - Y_t(m_{k-1})]^2},$$
where $Y_t(m_k)$ indicates the $k$th observation in the intraday sequence, $Y_t(m)$. According to (16), the estimated values of the ratio for the proxies $|Y_t|$ and $H_{1t}$ to $H_{10t}$ were (4.7623, 3.6370, 2.4960, 2.2525, 2.1855, 2.1993, 2.1783, 2.2584, 2.3804, 2.3623, 2.3910), where $H_{6t}$ had the smallest value. For comparison, $H_{1t}$, $H_{6t}$, $H_{10t}$, and $|Y_t|$ were considered in our study to compare the impacts of different frequencies.
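The proxy selection step amounts to computing a moment ratio for each candidate and taking the smallest. The sketch below applies this to the eleven values reported above; `m4h_star` is a hypothetical helper for the sample version of the ratio in (16).

```python
import numpy as np

def m4h_star(H):
    """Sample version of the ratio in (16): mean(H^4) / mean(H^2)^2."""
    H = np.asarray(H, dtype=float)
    return np.mean(H**4) / np.mean(H**2) ** 2

# Values reported in the text for |Y_t| and H_1t, ..., H_10t.
labels = ["|Y_t|"] + [f"H_{k}t" for k in range(1, 11)]
reported = np.array([4.7623, 3.6370, 2.4960, 2.2525, 2.1855, 2.1993,
                     2.1783, 2.2584, 2.3804, 2.3623, 2.3910])

best = labels[int(np.argmin(reported))]
print(best)   # H_6t
```

With raw data, `m4h_star` would be applied to each proxy series before the comparison; here only the reported numbers are used.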
To estimate the volatility function for the considered data, according to the 10% and 90% percentiles of $U_{t-1}$, we set the subset $A$ in Theorem 1 as $[5, 50]$ and the grid point vector as $G = [5 : 0.01 : 50]$. Referring to Section 3.2, we first calculated $\{U_{\gamma', t}\}_{t=1}^{1458}$ for $\gamma'$ in each estimation domain. The bandwidth was simply set as $1.06\, \mathrm{std}(U_t)\, n^{-1/5}$, and the 95% confidence interval of $g(u)$ was calculated as
$$\hat{g}(u) \pm z_{0.975} \sqrt{\frac{\|K_0\|_2^2\, (m_{4H} - 1)\, \hat{g}^2(u)}{\hat{\varphi}(u)\, n h}},$$
where $z_{0.975} = 1.96$ and $\hat{\varphi}(u)$ is an ordinary kernel density estimator as defined by Silverman (1986) [27].
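The pointwise interval can be computed once the function estimate, a density estimate, and a value for $m_{4H}$ are available. The following sketch makes several assumptions that are not in the text: a Gaussian kernel for the density estimate, the default $\|K_0\|_2^2 = 0.6$ (the Epanechnikov value), and a user-supplied `m4h`. Both function names are hypothetical.

```python
import numpy as np

def kde(u, sample, h):
    """Kernel density estimate at u with a Gaussian kernel (an illustrative choice)."""
    z = (u - np.asarray(sample, dtype=float)) / h
    return np.mean(np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)) / h

def confidence_interval(g_hat, phi_hat, m4h, n, h, k0_norm_sq=0.6, z=1.96):
    """Pointwise interval g_hat +/- z * sqrt(k0_norm_sq * (m4h - 1) * g_hat^2 / (phi_hat * n * h))."""
    half = z * np.sqrt(k0_norm_sq * (m4h - 1.0) * g_hat**2 / (phi_hat * n * h))
    return g_hat - half, g_hat + half

lo, hi = confidence_interval(g_hat=2.0, phi_hat=0.5, m4h=3.0, n=1000, h=0.1)
```

Because the half-width grows with $\sqrt{m_{4H} - 1}$, a proxy with a smaller fourth-moment constant gives a narrower interval at the same $u$, $n$, and $h$.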
The parameter estimation results are shown in Table 4. Figure 3 shows the estimates of the function $g(u)$ in the semiparametric GARCH(1, 1) model. The estimated parameters and function under $|Y_t|$ and $H_{1t}$ showed different performances. This was not surprising, as $H_{1t}$ used extremely high-frequency data, which might contain a lot of noise and hence lead to unreliable results. For the proxy $|Y_t|$, no high-frequency information was used, and its results might be less adequate. In comparison, the results under $H_{6t}$ and $H_{10t}$ seemed more similar, stable, and reasonable, which was consistent with the fact that $H_{6t}$ was optimal among the proxies considered.
It is of interest to compare the confidence intervals under different proxies. For clarity, we only plotted the confidence intervals for $g(u_t)$ under $H_{6t}$ and $|Y_t|$. Figure 4 and Figure 5 show the time series plots of the estimated $g(u_t)$ and its 95% confidence intervals. The estimated function $g(u_t)$ under $H_{6t}$ and $|Y_t|$ showed different fluctuations. In addition, the confidence intervals under $H_{6t}$ were generally narrower than those under $|Y_t|$. Therefore, the estimation results under $H_{6t}$ seemed more precise and reasonable.

6. Concluding Remarks

The volatility model has been widely applied to the study of financial markets. In this paper, we proposed a semiparametric volatility model by introducing high-frequency data. The proposed model includes many symmetric and asymmetric GARCH models as special cases. Estimators for both parameters and unknown functions were given and related limiting properties were also developed. Simulation and empirical studies implied that the estimation precision for the considered model could be effectively improved by choosing a proper volatility proxy. The work in this paper is insightful and could be used for the further study of other symmetric/asymmetric GARCH models by using intraday high-frequency data.

Author Contributions

Methodology, F.C.; formal analysis, Z.C.; data curation, X.Z.; writing—original draft, F.C.; writing—review and editing, X.Z.; supervision, Y.L.; funding acquisition, F.C., Y.L. and X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the Guangdong Basic and Applied Basic Research Foundation (2022A1515010046), Science and Technology Projects in Guangzhou (SL2022A03J00654), and the Graduate Innovation Research Funding Program of Guangzhou University (2021GDJC-D02).

Data Availability Statement

Data is unavailable due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

The following assumptions, which have been used in Fan and Yao [28] and Yang [26], are needed for our theoretical development.
Assumption A1.
The link function $g(\cdot)$ is positive everywhere on $R_+$ and has a Lipschitz continuous second derivative.
Assumption A2.
There exist constants δ , c 1 , c 2 > 0 such that
lim sup u + g δ / 2 ( u ) / u = g 0 0 , 1 α 2 m δ c 1 + c 2 η ,
where m δ = E ξ 1 δ < and that for every a > 0
E υ ( a ξ 1 ; η ) a δ m δ c 1 + c 2 η .
Assumption A3.
Both $\{U_t\}_{t=0}^{\infty}$ and $\{Y_t\}_{t=0}^{\infty}$ are geometrically ergodic processes, where the variable $U_t$ has a stationary density, $\varphi(\cdot)$, which is Lipschitz continuous and satisfies $\inf_{u \in A} \varphi(u) > 0$, where $A$ is a compact subset of $R$ with a nonempty interior.
Assumption A4.
There exists an $\tilde{\eta} \in H$ such that for any $y \in R$
$$\upsilon(y; \tilde{\eta}) = \max_{\eta \in H} \upsilon(y; \eta)$$
and that
lim sup u + g δ / 2 ( u ) / u = g 0 0 , 1 α 2 m δ c 1 + c 2 η ¯ .
Assumption A5.
The processes $\{U_t, U_{\gamma', t}, \tilde{U}_t\}_{t \ge 1}$ have stationary densities $\varphi(u_t, u_{\gamma'}, \tilde{u})$, and there are constants $m$ and $M$ such that $0 < m \le \varphi_{\gamma'}(u) \le M$, $u \in A$, $\gamma' \in \Gamma$, where $\varphi_{\gamma'}(\cdot)$ is the marginal stationary density of $U_{\gamma', 1}$.
Assumption A6.
The functions $g_{H, \gamma'}(u)$, $\gamma' \in \Gamma$, defined in (18) satisfy $\sup_{\gamma' \in \Gamma} \sup_{u \in A} |g_{H, \gamma'}^{(P+1)}(u)| < +\infty$, while the process $\{Y_t\}_{t=0}^{\infty}$ satisfies $E \exp\{a |Y_t|^{r}\} < +\infty$ for some constants $a > 0$ and $r > 0$.
Assumption A7.
The function L ( γ ) has a positive definite Hessian matrix at its unique minimum, γ, and it is locally convex: there is a constant C > 0 such that L ( γ ) L ( γ ) + γ γ , γ Γ , where · is the Euclidean norm.
Assumption A8.
The sequence $(Y_1, V_2), \ldots, (Y_{n-1}, V_n)$ is a stationary $\alpha$-mixing sequence with mixing coefficient $\alpha(k)$. For $\alpha$-mixing sequences, there exists a sequence of positive integers $s_n$ satisfying $s_n \to \infty$ and $s_n = o\{(n h_n)^{1/2}\}$ such that $(n/h_n)^{1/2} \alpha(s_n) \to 0$ as $n \to \infty$.
Assumption A9.
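The kernel function, $K$, is bounded with a bounded support. Let $\|K\|_2^2 = \int K^2(u)\, du$, $\mu_r(K) = \int u^r K(u)\, du$,
$$S = \begin{pmatrix} \mu_0(K) & \mu_1(K) \\ \mu_1(K) & \mu_2(K) \end{pmatrix}, \qquad S^{-1} = \begin{pmatrix} S_{00} & S_{01} \\ S_{10} & S_{11} \end{pmatrix},$$
$$K_{\lambda}(u) = \sum_{\lambda' = 0}^{1} S_{\lambda \lambda'}\, u^{\lambda'} K(u), \qquad \Lambda_{\lambda, 2} = \int K_{\lambda}(u)\, u^2\, du, \qquad \lambda = 0, 1. \tag{A4}$$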
Assumption A10.
The conditional density satisfies $f_{Y_0, Y_l \mid V_1, V_{l+1}}(y_0, y_l \mid v_1, v_{l+1}) \le A_1 < \infty$, $\forall\, l \ge 1$.
Assumption A11.
For $\alpha$-mixing sequences, we assume that for some $\delta > 2$ and $a > 1 - 2/\delta$, $\sum_{l} l^{a} [\alpha(l)]^{1 - 2/\delta} < \infty$, $E|V_0|^{\delta} < \infty$, and $f_{Y_0 \mid V_1}(y_0 \mid v_1) \le A_2 < \infty$.
Assumption A12.
The random variable $\xi_{Ht}$ has a continuous density function, which is positive everywhere, and $m_{4H} = E \xi_{Ht}^4 < \infty$.
The proof of Theorem 3 in this paper is analogous to Theorem 3 in Yang [26] and hence omitted. A detailed proof can be found in Yang [26] and Fan and Yao [28]. Next, we give a simple deduction for Theorem 1 and Theorem 2.
Proof of Theorem  1.
By the definitions of the matrices $Z$, $W$, $E_1$, and $E_2$, we have
$$E_1^{\tau} (Z^{\tau} W Z)^{-1} Z^{\tau} W Z E_1 = 1, \qquad E_1^{\tau} (Z^{\tau} W Z)^{-1} Z^{\tau} W Z E_2 = 0. \tag{A5}$$
Applying (13) and Lemma A2 of Yang [26], we have
$$\hat{g}_H(u) - g_H(u) = E_1^{\tau} (Z^{\tau} W Z)^{-1} Z^{\tau} W V - g_H(u)\, E_1^{\tau} (Z^{\tau} W Z)^{-1} Z^{\tau} W Z E_1 - g_H'(u)\, h\, E_1^{\tau} (Z^{\tau} W Z)^{-1} Z^{\tau} W Z E_2 \equiv Q_1 - Q_2 - Q_3, \tag{A6}$$
where
$$\begin{aligned} Q_1 &= \frac{1}{n \varphi(u)} \sum_{i=1}^{n-1} K_{0h}(U_i - u)\, g_H(U_i)\, \xi_{H(i+1)}^2\, [1 + o_p(1)], \\ Q_2 &= \frac{1}{n \varphi(u)}\, g_H(u) \sum_{i=1}^{n-1} K_{0h}(U_i - u)\, [1 + o_p(1)], \\ Q_3 &= \frac{1}{n \varphi(u)}\, g_H'(u) \sum_{i=1}^{n-1} K_{0h}(U_i - u)(U_i - u)\, [1 + o_p(1)]. \end{aligned}$$
Combining (A5) and (A6), the above becomes
$$\begin{aligned} \hat{g}_H(u) - g_H(u) ={}& \frac{1}{n \varphi(u)} \sum_{i=1}^{n-1} K_{0h}(U_i - u)\, g_H(U_i)\, \xi_{H(i+1)}^2\, [1 + o_p(1)] - \frac{1}{n \varphi(u)}\, g_H(u) \sum_{i=1}^{n-1} K_{0h}(U_i - u)\, [1 + o_p(1)] \\ & - \frac{1}{n \varphi(u)}\, g_H'(u) \sum_{i=1}^{n-1} K_{0h}(U_i - u)(U_i - u)\, [1 + o_p(1)] \equiv T_1 + T_2, \end{aligned} \tag{A7}$$
where
$$\begin{aligned} T_1 &= \frac{1}{n \varphi(u)} \sum_{i=1}^{n-1} K_{0h}(U_i - u)\, [g_H(U_i) - g_H(u) - g_H'(u)(U_i - u)]\, [1 + o_p(1)], \\ T_2 &= \frac{1}{n \varphi(u)} \sum_{i=1}^{n-1} K_{0h}(U_i - u)\, g_H(U_i)\, (\xi_{H(i+1)}^2 - 1)\, [1 + o_p(1)]. \end{aligned}$$
By the change of variable $U = u + hv$ and a Taylor expansion, we have
$$T_1 = \frac{\Lambda_{0,2}\, g_H^{(2)}(u)}{2!}\, h^2 + o_p(h^2) = h^2 b_H(u) + o_p(h^2). \tag{A8}$$
On the other hand, applying the martingale central limit theorem, as in Härdle et al. [29], the term $T_2$ is asymptotically normal, with variance
$$\frac{m_{4H} - 1}{n^2 \varphi(u)^2} \sum_{i=1}^{n-1} [K_{0h}(U_i - u)\, g_H(U_i)]^2\, [1 + o_p(1)], \tag{A9}$$
which, by the change of variable $U = u + hv$ and a Taylor expansion, becomes
$$\frac{(m_{4H} - 1)\, \|K_0\|_2^2\, g_H^2(u)}{n h\, \varphi(u)}\, [1 + o_p(1)] = \frac{1}{nh} V_H(u)\, [1 + o_p(1)]. \tag{A10}$$
Combining Equations (A7), (A8), and (A10), Theorem 1 holds. □
Proof of Theorem  2.
It is not difficult to obtain
$$\hat{\mu}_{ZH} - \mu_{ZH} = \frac{1}{n-1} \sum_{t=2}^{n} \frac{1}{\hat{g}(U_{t-1})} \{\hat{g}_H(U_{t-1}) - g_H(U_{t-1})\} - \frac{1}{n-1} \sum_{t=2}^{n} \frac{g_H(U_{t-1})}{\hat{g}(U_{t-1})\, g(U_{t-1})} \{\hat{g}(U_{t-1}) - g(U_{t-1})\} \equiv I_1 + I_2.$$
Under Assumptions A1–A5 and A12, applying Lemma A3 of Yang [26], as $h \sim n^{-r}$ for some $r \in (1/4, 1)$, we have that
I 1 = i = 1 n 1 1 n φ ( y ) j = 1 n 1 k 0 , h ( U j U i ) { g H ( U j ) g H ( U i ) g H ( U i ) ( U j U i h ) } + i = 1 n 1 1 n φ ( y ) j = 1 n 1 k 0 , h ( U j U i ) g H ( U j ) ( ξ j + 1 2 1 ) = O p ( h 2 ) + Z .
Further, applying Lemma 2 and Lemma 3 of Yoshihara [30], one can show that $Z = O_p(n^{-1} h^{-1/2}) = o_p(n^{-1/2})$. Then, $I_1 = O_p(h^2) + o_p(n^{-1/2}) = o_p(n^{-1/2})$, because $h^2 \sim n^{-2r}$ with $r > 1/4$. By the same argument, $I_2 = o_p(n^{-1/2})$. Hence, $\hat{\mu}_{ZH} - \mu_{ZH} = o_p(n^{-1/2})$.
Next, we can write
$$\sqrt{nh}\, \{\tilde{g}(u) - g(u)\} = \sqrt{nh}\, \Big\{\frac{1}{\hat{\mu}_{ZH}} [\hat{g}_H(u) - g_H(u)]\Big\} - \sqrt{nh}\, \Big\{\frac{g_H(u)}{\hat{\mu}_{ZH}\, \mu_{ZH}} [\hat{\mu}_{ZH} - \mu_{ZH}]\Big\} \equiv I_3 + I_4.$$
Using $\hat{\mu}_{ZH} - \mu_{ZH} = o_p(n^{-1/2})$, this yields $I_3 = (1/\mu_{ZH}) \sqrt{nh}\, \{\hat{g}_H(u) - g_H(u)\} + o_p(1)$ and $I_4 = o_p(1)$. Therefore,
$$\sqrt{nh}\, \{\tilde{g}(u) - g(u)\} = \frac{1}{\mu_{ZH}} \sqrt{nh}\, \{\hat{g}_H(u) - g_H(u)\} + o_p(1),$$
where $\hat{g}_H(u)$ is asymptotically normal by Theorem 1. This completes the proof of Theorem 2. □

References

  1. Engle, R.F. Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica 1982, 50, 987–1008.
  2. Bollerslev, T. Generalized autoregressive conditional heteroskedasticity. J. Econom. 1986, 31, 307–327.
  3. Engle, R.F.; Lilien, D.M.; Robbins, R.P. Estimating time varying risk premia in the term structure: The ARCH-M model. Econometrica 1987, 55, 391–407.
  4. Nelson, D.B. Conditional heteroskedasticity in asset returns: A new approach. Econometrica 1991, 59, 347–370.
  5. Glosten, L.R.; Jagannathan, R.; Runkle, D. On the relation between the expected value and the volatility of the nominal excess return on stocks. J. Financ. 1993, 48, 1779–1801.
  6. Engle, R.F.; Sokalska, M.E.; Chanda, A. High frequency multiplicative component GARCH. Comput. Econ. Financ. 2005, 409.
  7. Liaqat, M.I.; Akgül, A.; De la Sen, M.; Bayram, M. Approximate and Exact Solutions in the Sense of Conformable Derivatives of Quantum Mechanics Models Using a Novel Algorithm. Symmetry 2023, 15, 744.
  8. Hasan, A.; Akgül, A.; Farman, M.; Chaudhry, F.; Sultan, M.; De la Sen, M. Epidemiological Analysis of Symmetry in Transmission of the Ebola Virus with Power Law Kernel. Symmetry 2023, 15, 665.
  9. Attia, N.; Akgül, A.; Alqahtani, R.T. Extension of the Reproducing Kernel Hilbert Space Method’s Application Range to Include Some Important Fractional Differential Equations. Symmetry 2023, 15, 532.
  10. Farman, M.; Shehzad, A.; Akgül, A.; Baleanu, D.; De la Sen, M. Modelling and Analysis of a Measles Epidemic Model with the Constant Proportional Caputo Operator. Symmetry 2023, 15, 468.
  11. Attia, N.; Akgül, A.; Seba, D.; Nour, A.; De la Sen, M. An Efficient Approach for Solving Differential Equations in the Frame of a New Fractional Derivative Operator. Symmetry 2023, 15, 144.
  12. Engle, R.F.; Ng, V.K. Measuring and testing the impact of news on volatility. J. Financ. 1993, 48, 1749–1778.
  13. Yang, L.; Härdle, W.; Nielsen, J. Nonparametric autoregression with multiplicative volatility and additive mean. J. Time Ser. Anal. 1999, 20, 597–604.
  14. Hafner, C. Nonlinear Time Series Analysis with Applications to Foreign Exchange Rate Volatility; Physica-Verlag: Heidelberg, Germany, 1998.
  15. Carroll, R.; Härdle, W.; Mammen, E. Estimation in an additive model when the components are linked parametrically. Econom. Theory 2002, 18, 886–912.
  16. Yang, L. Finite nonparametric GARCH model for foreign exchange volatility. Commun. Stat.-Theory Methods 2000, 5–6, 1347–1365.
  17. Yang, L. Direct estimation in an additive model when the components are proportional. Stat. Sin. 2002, 12, 801–821.
  18. Visser, M.P. GARCH parameter estimation using high-frequency data. J. Financ. Econom. 2011, 9, 162–197.
  19. Huang, J.S.; Wu, W.Q.; Chen, Z.; Zhou, J.J. Robust M-estimate of GJR model with high frequency data. Acta Math. Appl. Sin. Engl. Ser. 2015, 31, 591–606.
  20. Wang, M.; Chen, Z.; Wang, C.D. Composite quantile regression for GARCH models using high-frequency data. Econom. Stat. 2018, 7, 115–133.
  21. Deng, C.; Zhang, X.; Li, Y.; Xiong, Q. GARCH Model Test Using High-Frequency Data. Mathematics 2020, 8, 1922.
  22. Zhang, X.; Zhang, R.; Li, Y.; SL, C. LADE-based inferences for autoregressive models with heavy-tailed G-GARCH(1, 1) noise. J. Econom. 2022, 227, 228–240.
  23. Li, L.; Zhang, X.; Deng, C.; Li, Y. Quasi Maximum Exponential Likelihood Estimation of GARCH Model Based on High Frequency Data. Acta Math. Appl. Sin. Engl. Ser. 2022, 45, 652–664.
  24. Li, L.; Zhang, X.; Li, Y. Daily GARCH Model Estimation Using High Frequency Data. J. Guangxi Norm. Univ. (Nat. Sci. Ed.) 2021, 39, 68–78.
  25. Liang, X.; Zhang, X.; Li, Y.; Deng, C. Daily nonparametric ARCH(1) model estimation using intraday high frequency data. AIMS Math. 2021, 6, 3455–3464.
  26. Yang, L. A semiparametric GARCH model for foreign exchange volatility. J. Econom. 2006, 130, 365–384.
  27. Silverman, B.W. Density Estimation; Chapman & Hall: London, UK, 1986.
  28. Fan, J.; Yao, Q. Nonlinear Time Series: Nonparametric and Parametric Methods; Springer: New York, NY, USA, 2003.
  29. Härdle, W.; Tsybakov, A.B.; Yang, L. Nonparametric vector autoregression. J. Stat. Plan. Inference 1998, 68, 221–245.
  30. Yoshihara, K. Limiting behavior of U-statistics for stationary, absolutely regular processes. Probab. Theory Relat. Fields 1976, 35, 237–252.
Figure 1. Subplots (ai, bi, ci), $i = 1, 2, 3$, show the estimated curves $\tilde{g}(G_i)$ (green lines) and the true function $g(G_i)$ (bold black line) for different sample sizes and different proxies. Subplots (a4, b4, c4) show box plots of $m_{4H}$ in (16) for $H_{5t}$, $H_{30t}$, and $|Y_t|$ (from left to right in each subplot) under different sample sizes.
Figure 2. Subplots (ai, bi, ci), $i = 1, 2, 3$, show the estimated curves $\tilde{g}(G_i)$ (green lines) and the true function $g(G_i)$ (bold black line) for different sample sizes and different proxies. Subplots (a4, b4, c4) show box plots of $m_{4H}$ in (16) for $H_{5t}$, $H_{30t}$, and $|Y_t|$ (from left to right in each subplot) under different sample sizes.
Figure 3. Estimation results for the function $g(u)$ in the semiparametric GARCH model under the $H_{1t}$, $H_{6t}$, $H_{10t}$, and $|Y_t|$ proxies.
Figure 4. Estimate and 95% confidence interval of the function $g(u)$ in the semiparametric GARCH model under the $H_{6t}$ and $|Y_t|$ proxies.
Figure 5. Estimate and 95% confidence interval of the function $g(u)$ in the semiparametric GARCH model under the $H_{6t}$ and $|Y_t|$ proxies.
Table 1. Mean estimates of $\alpha$ and $\eta$ using data of different frequencies ($n = 300$).

| Proxy | Example 1: $\hat{\alpha}$ | Example 1: $\hat{\eta}$ | Example 2: $\hat{\alpha}$ | Example 2: $\hat{\eta}$ |
|---|---|---|---|---|
| $H_{5t}$ | 0.4965 | 0.1030 | 0.1958 | 0.1013 |
| $H_{30t}$ | 0.4865 | 0.1051 | 0.1991 | 0.0840 |
| $Y_t$ | 0.4852 | 0.1055 | 0.2050 | 0.1051 |
Table 2. Mean estimates of $\alpha$ and $\eta$ using data of different frequencies ($n = 600$).

| Proxy | Example 1: $\hat{\alpha}$ | Example 1: $\hat{\eta}$ | Example 2: $\hat{\alpha}$ | Example 2: $\hat{\eta}$ |
|---|---|---|---|---|
| $H_{5t}$ | 0.5003 | 0.1061 | 0.1957 | 0.1112 |
| $H_{30t}$ | 0.4950 | 0.1012 | 0.2097 | 0.0991 |
| $Y_t$ | 0.4906 | 0.1147 | 0.2041 | 0.1143 |
Table 3. Mean estimates of $\alpha$ and $\eta$ using data of different frequencies ($n = 900$).

| Proxy | Example 1: $\hat{\alpha}$ | Example 1: $\hat{\eta}$ | Example 2: $\hat{\alpha}$ | Example 2: $\hat{\eta}$ |
|---|---|---|---|---|
| $H_{5t}$ | 0.5029 | 0.0969 | 0.1941 | 0.0973 |
| $H_{30t}$ | 0.4983 | 0.1010 | 0.2027 | 0.1080 |
| $Y_t$ | 0.5031 | 0.1034 | 0.1814 | 0.1013 |
Table 4. Parameter estimation results, with ASD in parentheses.

| Parameter | $Y_t$ | $H_{1t}$ | $H_{6t}$ | $H_{10t}$ |
|---|---|---|---|---|
| $\hat{\alpha}$ | 0.70 (0.1912) | 0.71 (0.1431) | 0.59 (0.1638) | 0.59 (0.1907) |
| $\hat{\eta}$ | 0.39 (0.0639) | 0.55 (0.0210) | 0.93 (0.0331) | 0.93 (0.0528) |

Share and Cite

MDPI and ACS Style

Chai, F.; Li, Y.; Zhang, X.; Chen, Z. Daily Semiparametric GARCH Model Estimation Using Intraday High-Frequency Data. Symmetry 2023, 15, 908. https://doi.org/10.3390/sym15040908

