Article

The Convergence Rates of Large Volatility Matrix Estimator Based on Noise, Jumps, and Asynchronization

1 School of Mathematics and Statistics, Xuzhou University of Technology, Xuzhou 221018, China
2 School of Mathematics Sciences, Huaibei Normal University, Huaibei 235000, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(6), 1425; https://doi.org/10.3390/math11061425
Submission received: 13 February 2023 / Revised: 12 March 2023 / Accepted: 14 March 2023 / Published: 15 March 2023

Abstract

At the turn of the 21st century, the wide availability of high-frequency data created an increasing demand for better modeling and statistical inference. A challenging problem in statistics and econometrics is the estimation of the integrated volatility matrix from high-frequency data. Existing estimators work well for diffusion processes with micro-structural noise but may deteriorate when jumps are present. This paper proposes a novel estimator in the presence of jumps, micro-structural noise, and asynchronization. First, we adopt sub-sampling to synchronize the high-frequency data. Then, we use the two-time-scale realized co-volatility to handle noise. Finally, we employ threshold parameters to handle jumps and sparsity in two steps. Both the minimax bound and the convergence rate are discussed in the paper. Estimation procedures for heavy-tailed data are left for future work.

1. Introduction

At the turn of the 21st century, with the advance of technology in high-frequency trading, high-frequency financial data were recorded at intervals of several seconds, and even milli- and microseconds. In the United States, tick-by-tick stock transactions can be obtained from the Trade and Quote (TAQ2) database, which includes the New York Stock Exchange. In China, one-second records for all stocks can be obtained from the databases of some fund management companies. Such data are widely used for volatility estimation in finance, the environment, and other fields, providing guidance and prediction for financial risk management, environmental monitoring, and other aspects [1,2,3,4,5]. Therefore, the wide availability of high-frequency data creates an increasing demand for better modeling and statistical inference. Financial applications often involve dozens or even hundreds of assets, so the corresponding integrated volatility matrix is high-dimensional. When the number of assets is larger than the sample size, the estimation of the integrated volatility matrix is a challenging problem that has recently been a focus in finance and statistics.
It has long been recognized that market micro-structure noise plays a significant role in the estimation of volatility [6]. The market micro-structure noise is formed by the interaction of transaction cost and transaction friction, which mainly include bid–ask spread, the discreteness of price, etc. The Itô process is used to model the logarithmic price of assets in high-frequency finance, and various non-parametric methods are developed to estimate the integrated (co)volatility for multiple assets based on high-frequency data contaminated with micro-structural noise over a period of time. Such methods include Hayashi and Yoshida (HY) estimators based on overlap intervals [7], multi-scale realized co-volatility (MSRV) based on previous tick data synchronization [8], a quasi-maximum likelihood estimator (QMLE) based on generalized sampling time [9], realized kernel volatility estimator based on a refresh time scheme [10], and pre-averaging realized volatility [11]. In portfolio risk and hedging of funds management, a large number of assets are often encountered, and one key and challenging problem is estimating the integrated matrix based on high-frequency data. Reference [12] proposed the ARVM estimator under the sparse integrated volatility matrix based on contaminated high-frequency data. Reference [13] employed a large volatility matrix estimator using high-frequency data and established that the convergence rate of the estimator depends on sample size with 1/6-exponent under the general diffusion setup with micro-structure noise in the data. Under sparse conditions, the convergence rate of the kernel and the pre-averaging realized volatility is slower than that of the multi-scale realized volatility, due to the trade-off between positive semi-definiteness and fast convergence rate [14]. 
An adaptive thresholding estimator of a large volatility matrix with varying entries is developed, and the optimal spectral norm convergence rate of the estimator is shown under sparse conditions [15].
However, in [12,13,14,15], the estimations were carried out only under a continuous framework. In fact, in an efficient market environment, the release of major "news" will trigger significant changes in very short time frames, which are called jumps in the pricing process, and the effect of jumps has generally been acknowledged in the high-frequency literature. For a single asset, [16] proposed a non-parametric estimator for the integrated volatility in the presence of jumps and micro-structural noise. Reference [17] developed a statistical test against the necessity of a diffusion component. They used the stock price records of Microsoft (MSFT) on 1 November, 1 December, and 11 December 2000 to implement their test, and rejected the existence of the diffusion component. A novel test was proposed to check whether the underlying process of high-frequency data can be modeled using a pure-jump process that was robust to jumps of infinite variation [18]. Reference [19] estimated the integrated volatility in the presence of jumps and endogeneity. Reference [20] developed multipower estimators in the presence of jumps, micro-structural noise, and multiple records of observations. Reference [21] considered a new estimator of the integrated volatility in the presence of jumps and micro-structural noise when the sampling was endogenous. For multiple assets, [22] proposed a threshold estimator to estimate the covariation of two asset prices in the simultaneous presence of jumps and micro-structural noise, but the observation times were synchronous.
These works all concluded that the effect of jumps cannot be appropriately characterized by a continuous model; hence, the estimators proposed in these papers may perform poorly, even if the number of assets is small. When the number of assets becomes larger, the existing studies did not consider the estimation of the integrated volatility matrix in the presence of jumps, micro-structural noise, and nonsynchronous observations.
This paper proposes a thresholding estimator for the integrated volatility matrix, which is constructed as follows. First, we synchronize observation time points based on previous-tick data. Second, we adopt the threshold method to remove the effect of jumps, such that those increments smaller than the threshold level are included in the calculation of the estimator, and those increments larger than the threshold level are excluded as $n \to \infty$. Finally, we use another threshold to exploit the sparsity condition. We show that the thresholding estimator is robust to the simultaneous presence of non-synchronization, jumps, and micro-structural noise, and we establish an asymptotic theory for the proposed estimator when both the number of assets and the sample size approach infinity. In this paper, we restrict our study to financial high-frequency data. In addition, our results can be extended to engineering, biomedical, imaging science, and photographic technology [23,24,25,26,27,28,29].
The remainder of the paper proceeds as follows. The price model, observed data, and the estimation problem are described in Section 2. Section 3 presents the proposed estimator in three steps, together with the sparsity condition. The asymptotic theory is presented in Section 4, and proofs are given in Section 5. Conclusions and directions for future research are given in Section 6.

2. Methodology

2.1. Price Model

Let $X(t) = (X_1(t), \ldots, X_p(t))^T$ be the vector of true log prices of $p$ assets at time $t$. It is well known that $X(t)$ is a semimartingale process [30]; thus
$$X(t) = X^c(t) + X^d(t), \tag{1}$$
where
$$dX^c(t) = b(t)\,dt + \sigma(t)^T\,dB_t, \quad t \in [0,1],$$
$$dX^d(t) = \int_{|x| \le 1} x\,(\mu - \nu)(dt, dx) + \int_{|x| > 1} x\,\mu(dt, dx),$$
where $b(t) = (b_1(t), \ldots, b_p(t))^T$ is a drift vector, $B_t = (B_{1t}, \ldots, B_{pt})^T$ is a standard $p$-dimensional Brownian motion, $\sigma(t)$ is a càdlàg and locally bounded $p$-by-$p$ matrix, $\mu$ is the jump measure compensated by $\nu$, and $\nu$ has the form $dt\,dF_t(x)$, where $F_t(dx)$ is a transition measure from $\Omega \times \mathbb{R}_+$ endowed with the predictable $\sigma$-field into $\mathbb{R} \setminus \{0\}$. The jump activity index is defined as $\beta := \inf\{s : \int_{|x| \le 1} |x|^s F_t(dx) < \infty\}$, and if $0 \le \beta < 1$, we say that $X^d(t)$ has finite variation.
Remark 1.
Equation (1) is a rather canonical model in finance theory when both the continuous part and the discontinuous (jump) part are considered. All powers of $\sigma$ are locally integrable with respect to the Lebesgue measure, since $\sigma$ is càdlàg and locally bounded.

2.2. Observed Data

Because different assets are traded at distinct times in high-frequency finance, the data for multiple assets often encounter non-synchronization problems, and the true log prices $X(t)$ are observed with contamination by the micro-structural noise. On this basis, it is assumed that the observed high-frequency financial data $Y_i(t_{i,r})$ obey the model
$$Y_i(t_{i,r}) = X_i^c(t_{i,r}) + X_i^d(t_{i,r}) + \epsilon_i(t_{i,r}), \quad i = 1, \ldots, p;\ r = 0, \ldots, n_i,$$
where $t_{i,r}$ denotes the $r$-th observation time point for the $i$-th asset.
Assumption 1.
Let $\epsilon_i(t_{i,r})$, $i = 1, \ldots, p$, $r = 0, \ldots, n_i$, be independent noises with mean zero. For each fixed $i$, $\epsilon_i(t_{i,r})$, $r = 0, \ldots, n_i$, are i.i.d. random variables with variance $\eta_{ii}$, and $\epsilon_i(\cdot)$, $X_i^c(\cdot)$, and $X_i^d(\cdot)$ are independent.
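To make the observation model concrete, the following minimal Python sketch simulates noisy, irregularly spaced observations for a single asset. It is an illustration only: the zero drift, the constant volatility, the compound Poisson jump part with normal jump sizes, the uniformly distributed observation times, and all names (`simulate_observations`, `sigma`, `jump_rate`, `jump_sd`, `noise_sd`) are assumptions made for the example and are not specified in the paper.

```python
import numpy as np

def simulate_observations(n_obs, sigma=0.3, jump_rate=2.0, jump_sd=0.05,
                          noise_sd=0.001, seed=0):
    """Simulate (t_{i,r}, Y_i(t_{i,r})), r = 0..n_obs, for one hypothetical asset."""
    rng = np.random.default_rng(seed)
    # Irregular observation times in [0, 1] (assumed uniform, then sorted).
    times = np.sort(rng.uniform(0.0, 1.0, size=n_obs + 1))
    dt = np.diff(times, prepend=0.0)
    # Continuous part: zero drift and constant volatility (assumption).
    x_cont = np.cumsum(sigma * np.sqrt(dt) * rng.standard_normal(n_obs + 1))
    # Jump part: compound Poisson with normal jump sizes (finite activity).
    n_jumps = rng.poisson(jump_rate)
    jump_times = rng.uniform(0.0, 1.0, size=n_jumps)
    jump_sizes = jump_sd * rng.standard_normal(n_jumps)
    x_jump = np.array([jump_sizes[jump_times <= t].sum() for t in times])
    # i.i.d. mean-zero micro-structural noise, independent of the price.
    noise = noise_sd * rng.standard_normal(n_obs + 1)
    return times, x_cont + x_jump + noise
```

Calling the function with different `seed` and `n_obs` values for each asset gives a simple way to produce non-synchronous data for experiments.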
Define the quadratic variation of the continuous part $X^c(t)$ as
$$[X^c, X^c]_1 = \left( \int_0^1 \sum_{k=1}^p \sigma_{ik}(s)\,\sigma_{kj}(s)\,ds \right)_{1 \le i,j \le p},$$
and denote this quadratic variation as the large volatility matrix $\Gamma$, i.e.,
$$\Gamma = (\Gamma_{ij})_{1 \le i,j \le p} = [X^c, X^c]_1,$$
where $\Gamma_{ij}$ is the $ij$-th element of matrix $\Gamma$.
The focus of the current paper is to construct a new estimator for the large volatility matrix Γ , and to investigate some asymptotic properties of the proposed estimator in the presence of non-synchronization, micro-structural noise, and jumps.

3. The Estimator of the Large-Volatility Matrix

In quantitative finance, the volatility of asset prices or returns is an important index for measuring investment risk. A large number of authors have conducted in-depth studies on financial volatility. Building on this literature, this paper considers the estimation of the integrated volatility matrix when the asset price process has jumps in the high-dimensional case. It provides a theoretical basis for later statistical inference, asset allocation, risk management, and optimization.
To estimate the large volatility matrix $\Gamma$, we give a new estimator $\hat\Gamma = (\hat\Gamma_{ij}^T(\varpi))_{1 \le i,j \le p}$ in this section; the main results, i.e., the asymptotic properties, are provided in the next section. Now, let us describe the estimator in detail.
For a fixed integer $m$, we partition the interval $[0, 1]$ into $m$ equal sub-intervals, and let $K = n/m$. Take $\tau_l^k = \frac{l}{m} + \frac{k-1}{n}$, where $k = 1, \ldots, K$; $l = 0, 1, \ldots, m-1$, and $n$, the average sample size of the $p$ assets,
$$n = \frac{1}{p} \sum_{i=1}^p n_i,$$
is the pre-determined sampling frequency.
For a given $k$, $k = 1, \ldots, K$, and asset $i$, we define the previous-tick times as
$$\tau_{i,l}^k = \max\{t_{i,r} \le \tau_l^k;\ r = 1, \ldots, n_i\}, \quad l = 0, 1, \ldots, m-1.$$
Meanwhile, let
$$\Delta_l = \tau_l^k - \tau_{l-1}^k = \frac{1}{m},$$
and define $\Delta_l^k Y_i = Y_i(\tau_{i,l}^k) - Y_i(\tau_{i,l-1}^k)$.
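The previous-tick rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper names `subgrid` and `previous_tick` are hypothetical, the observation times are assumed sorted, and $n = Km$ is assumed to hold exactly.

```python
import numpy as np

def subgrid(k, K, m):
    """Grid tau_l^k = l/m + (k-1)/n for l = 0..m-1, assuming n = K * m."""
    n = K * m
    return np.arange(m) / m + (k - 1) / n

def previous_tick(times, values, grid):
    """Return, for each grid point, the value at the last observation time <= it."""
    # searchsorted(..., side="right") counts observations with time <= grid point.
    idx = np.searchsorted(times, grid, side="right") - 1
    if np.any(idx < 0):
        raise ValueError("a grid point precedes the first observation")
    return values[idx]
```

For example, with observation times `[0.0, 0.1, 0.35, 0.8]`, values `[1, 2, 3, 4]`, `K = 2` and `m = 4`, the first sub-grid is `[0, 0.25, 0.5, 0.75]` and the synchronized values are `[1, 2, 3, 3]`.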
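Thus, under the model (1)–(3), we denote
$$\hat\Gamma_{ij}^T = \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} \left[\Delta_l^k Y_i\, \Delta_l^k Y_j\right] \mathbf{1}\{|\Delta_l^k Y_i| \le u_l,\ |\Delta_l^k Y_j| \le u_l\},$$
where $u_l$ satisfies $u_l / \Delta_l^{\varpi_1} \to 0$ and $u_l / \Delta_l^{\varpi_2} \to \infty$, for some $0 \le \varpi_1 < \varpi_2 < 1/6$.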
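The off-diagonal estimator can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes both series have already been synchronized onto the common fine grid $j/n$ with $n = Km$ points (so sub-grid $k$ is every $K$-th point), and it uses one threshold `u` for all $l$, which is reasonable here only because $\Delta_l = 1/m$ does not depend on $l$. The function name `thresholded_covol` is hypothetical.

```python
import numpy as np

def thresholded_covol(y_i, y_j, K, u):
    """Average over K sub-grids of the thresholded realized co-volatility.

    y_i, y_j: arrays of length n = K * m, already synchronized (assumption).
    u: threshold; increment pairs with either |increment| > u are dropped.
    """
    n = len(y_i)
    m = n // K
    total = 0.0
    for k in range(K):
        # Sub-grid k: indices k, k + K, k + 2K, ... (m points, m - 1 increments).
        d_i = np.diff(y_i[k::K][:m])
        d_j = np.diff(y_j[k::K][:m])
        keep = (np.abs(d_i) <= u) & (np.abs(d_j) <= u)
        total += np.sum(d_i * d_j * keep)
    return total / K
```

A threshold such as `u = (1.0 / m) ** 0.1` is one choice of the form $\Delta_l^{\varpi}$ with $\varpi$ below $1/6$; the exponent `0.1` is an arbitrary value picked for the example.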
Since the noise of $\hat\Gamma_{ii}^T$ cannot be easily ignored, we need to redefine $\hat\Gamma_{ii}^T$ as
$$\hat\Gamma_{ii}^T = \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} \left[\Delta_l^k Y_i\right]^2 \mathbf{1}\{|\Delta_l^k Y_i| \le u_l\} - 2m\,\hat\eta_{ii},$$
where
$$\hat\eta_{ii} = \frac{1}{2 n_i} \sum_{r=1}^{n_i} \left[Y_i(t_{i,r}) - Y_i(t_{i,r-1})\right]^2,$$
is the estimator of $\eta_{ii}$ [31].
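The noise-variance estimator is a one-line computation on the raw (unsynchronized) observations of one asset. A minimal sketch, with the hypothetical function name `noise_variance`:

```python
import numpy as np

def noise_variance(y):
    """Half the mean squared first difference of all observations of one asset.

    y holds Y_i(t_{i,0}), ..., Y_i(t_{i,n_i}), so there are n_i increments.
    """
    d = np.diff(y)
    return np.sum(d ** 2) / (2 * len(d))
```

The factor 2 reflects that, for i.i.d. noise with variance $\eta_{ii}$, the difference of two consecutive noise terms has variance $2\eta_{ii}$, while the contribution of the price increments is negligible at the highest frequency.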
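Then, we can define $\tilde\Gamma_{ij}^T$ from $\hat\Gamma_{ij}^T$ by
$$\tilde\Gamma_{ij}^T = \begin{cases} \hat\Gamma_{ij}^T, & \text{if } i \ne j, \\ \hat\Gamma_{ii}^T - 2m\,\hat\eta_{ii}, & \text{if } i = j. \end{cases}$$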
Remark 2.
First, we use $\Delta_l$ to sub-sample the high-frequency data, and the purpose of sub-sampling is to reduce the effect of noise. Second, we use the threshold to discard the increments larger than $u_l$, and the purpose of the threshold is to remove the effect of jumps. Finally, we take their average, and the purpose of averaging is to yield a better convergence rate.
Remark 3.
After the noise is removed, the increments from the jump part are equal to or larger than $\Delta_l^{1/2}$, while the increments from the diffusion part are still smaller than $\Delta_l^{1/2}$. The threshold level $u_l$ is chosen such that those increments ($|\Delta_l^k Y_i|$) smaller than $u_l$ are included in the calculation of the integrated volatility matrix, while those increments equal to or larger than $u_l$ are gradually excluded.
We denote $\tilde\Gamma = (\tilde\Gamma_{ij}^T)_{1 \le i,j \le p}$; for small $p$, $\tilde\Gamma$ provides a good estimator for $\Gamma$. However, when $p$ is large, even going to infinity as $n \to \infty$, $\tilde\Gamma$ performs poorly. To estimate a large integrated volatility matrix, one of the key assumptions is that the target matrix of interest is sparse, which is a regularization condition in many studies [12,13,14,32,33,34,35].
Assumption 2.
Sparsity condition: $M$ is a positive random variable, $\pi(p)$ is a deterministic function of $p$ that grows very slowly in $p$, and $0 \le \delta < 1$. Assume that $\Gamma$ satisfies
$$\sum_{j=1}^p |\Gamma_{ij}|^\delta \le M\,\pi(p), \quad i = 1, \ldots, p; \qquad E[M] \le C. \tag{14}$$
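As a quick illustration of what the sparsity condition measures, the left-hand side can be evaluated for a given matrix. The sketch below is restricted to $0 < \delta < 1$ (an assumption of the example, since NumPy evaluates `0 ** 0` as 1), and the function name `sparsity_measure` is hypothetical.

```python
import numpy as np

def sparsity_measure(gamma, delta):
    """Largest row sum of |gamma_ij| ** delta, the quantity bounded by M * pi(p)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("this sketch assumes 0 < delta < 1")
    return np.max(np.sum(np.abs(gamma) ** delta, axis=1))
```

For a banded matrix the measure stays bounded as $p$ grows, whereas for a dense matrix with entries of constant size it grows linearly in $p$.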
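Under the sparsity condition (14), we regularize $\tilde\Gamma_{ij}^T$ as follows,
$$\hat\Gamma_{ij}^T(\varpi) = \tilde\Gamma_{ij}^T\, \mathbf{1}(|\tilde\Gamma_{ij}^T| \ge \varpi), \quad \text{for } i, j = 1, \ldots, p,$$
where $\varpi$ is a threshold parameter. Denote $\hat\Gamma = (\hat\Gamma_{ij}^T(\varpi))_{i,j = 1, \ldots, p}$; then the $(i,j)$-th element $\hat\Gamma_{ij}^T(\varpi)$ of $\hat\Gamma$ is equal to $\tilde\Gamma_{ij}^T$ if its absolute value equals or exceeds $\varpi$ and is zero otherwise. The thresholded target matrix $\Gamma(\varpi) = (\Gamma_{ij}(\varpi))_{1 \le i,j \le p}$ is defined in the same way.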
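The regularization step is plain hard thresholding of the matrix entries. A minimal sketch, with the hypothetical function name `hard_threshold`:

```python
import numpy as np

def hard_threshold(gamma_tilde, varpi):
    """Keep entries with absolute value >= varpi and set all others to zero."""
    return np.where(np.abs(gamma_tilde) >= varpi, gamma_tilde, 0.0)
```

Note that the rule is applied entry by entry, so the thresholded matrix stays symmetric whenever the input is symmetric, but it is not guaranteed to be positive semi-definite.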
Remark 4.
In this part, for the case in which both the number of assets and the sample size approach infinity, we provide a new estimator of the large-dimensional integrated volatility matrix in the presence of jumps, micro-structural noise, and non-synchronization. First, we adopt sub-sampling to synchronize the high-frequency data. Second, to handle noise, we use the two-time-scale realized co-volatility on the off-diagonal elements of Γ and correct the bias on the diagonal elements of Γ. Finally, we employ the threshold parameters to handle jumps and sparsity in two steps.

4. Asymptotic Properties

In this section, the $\alpha$-th moment convergence rate and the minimax bound of the suggested estimator are provided, which are the main results of this paper. To state the theory, some notation is needed. Let $x = (x_1, \ldots, x_p)^T$ denote a vector and $U = (U_{ij})_{p \times p}$ denote a matrix; we define their $\ell_d$-norms as follows:
$$\|x\|_d = \left( \sum_{i=1}^p |x_i|^d \right)^{1/d}, \qquad \|U\|_d = \sup\{\|Ux\|_d,\ \|x\|_d = 1\}, \qquad d = 1, 2, \infty.$$
Then,
$$\|U\|_1 = \max_{1 \le j \le p} \sum_{i=1}^p |U_{ij}|, \qquad \|U\|_\infty = \max_{1 \le i \le p} \sum_{j=1}^p |U_{ij}|,$$
and
$$\|U\|_2^2 \le \|U\|_1\, \|U\|_\infty.$$
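The norm inequality can be checked numerically with NumPy, whose `numpy.linalg.norm` returns the maximum column sum for `ord=1`, the maximum row sum for `ord=np.inf`, and the largest singular value for `ord=2` when given a matrix. The helper name `norm_bound_holds` and the tolerance are assumptions of this sketch.

```python
import numpy as np

def norm_bound_holds(U, tol=1e-12):
    """Check ||U||_2^2 <= ||U||_1 * ||U||_inf for a 2-D array U."""
    l1 = np.linalg.norm(U, 1)         # maximum absolute column sum
    linf = np.linalg.norm(U, np.inf)  # maximum absolute row sum
    l2 = np.linalg.norm(U, 2)         # spectral norm (largest singular value)
    return l2 ** 2 <= l1 * linf + tol
```

For a symmetric matrix the row and column sums coincide, so the inequality reduces to $\|U\|_2 \le \|U\|_\infty$, which is the form used for the error matrix in Theorem 2.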
Furthermore, we need the following assumptions for models (1)–(3).
Assumption 3.
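For some $\alpha \ge 2$,
$$\max_{1 \le i \le p} \max_{0 \le t \le 1} E[|\sigma_{ii}(t)|^\alpha] < \infty, \qquad \max_{1 \le i \le p} \max_{0 \le t \le 1} E[|b_i(t)|^\alpha] < \infty, \qquad \max_{1 \le i \le p} E[|\epsilon_i(t_{i,r})|^{2\alpha}] < \infty.$$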
Assumption 4.
We assume that each of the $p$ assets has at least one observation between $\tau_{l-1}^k$ and $\tau_l^k$, and
$$C_1 \le \min_{1 \le i \le p} \frac{n_i}{n} \le \max_{1 \le i \le p} \frac{n_i}{n} \le C_2, \qquad \max_{1 \le i \le p} \max_{1 \le r \le n_i} |t_{i,r} - t_{i,r-1}| = O(n^{-1}), \qquad m = o(n).$$
Theorem 1.
Under models (1)–(3) and Assumptions 1–4, we have, for all $1 \le i, j \le p$,
$$E\left(|\hat\Gamma_{ij}^T(\varpi) - \Gamma_{ij}|^\alpha\right) \le C_n^\alpha,$$
where $C_n^\alpha = C\left((K n^{-1/2})^{-\alpha} + K^{-\alpha/2} + (n/K)^{-\alpha/2} + K^{-\alpha} + n^{-\alpha/2}\right)$ and $C$ is a generic constant. When $K = n/m \sim n^{2/3}$, the convergence rate $C_n$ of the estimator is $n^{-1/6}$.
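The choice $K \sim n^{2/3}$ can be verified with exact arithmetic. The sketch below reads the five terms of $C_n$ as $(Kn^{-1/2})^{-1}$, $K^{-1/2}$, $(n/K)^{-1/2}$, $K^{-1}$, and $n^{-1/2}$ (the reading under which $K \sim n^{2/3}$ gives the stated $n^{-1/6}$ rate), writes $K = n^{\kappa}$, and returns the exponent of $n$ in the slowest-decaying term. The function name `rate_exponent` and the parametrization by `kappa` are assumptions of the example.

```python
from fractions import Fraction

def rate_exponent(kappa):
    """Exponent of n in the dominant term of C_n when K = n ** kappa."""
    half = Fraction(1, 2)
    exponents = [
        half - kappa,      # (K n^{-1/2})^{-1}
        -kappa / 2,        # K^{-1/2}
        -(1 - kappa) / 2,  # (n / K)^{-1/2}
        -kappa,            # K^{-1}
        -half,             # n^{-1/2}
    ]
    # The term with the largest exponent decays most slowly and sets the rate.
    return max(exponents)
```

With `kappa = Fraction(2, 3)` the first and third terms balance at exponent $-1/6$; smaller or larger values of `kappa` make one of the two decay more slowly, for instance `kappa = Fraction(3, 4)` gives $-1/8$.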
Theorem 2.
Under models (1)–(3) and Assumptions 1–4, if $h_{n,p}$ is any sequence converging to infinity arbitrarily slowly, we have
$$\|\hat\Gamma - \Gamma\|_2 \le \|\hat\Gamma - \Gamma\|_\infty = O_P\left(\pi(p)\left[C_n\, p^{2/\alpha}\, h_{n,p}\right]^{1-\delta}\right),$$
where $C_n \sim n^{-1/6}$ is given in Theorem 1, and $\varpi = C_n\, p^{2/\alpha}\, h_{n,p}$.
Remark 5.
In order to make $C_n\, p^{2/\alpha}$ go to zero, $p$ needs to grow more slowly than $n^{\alpha/12}$. The convergence rate in Theorem 2 is nearly equal to $\pi(p)\left[C_n\, p^{2/\alpha}\right]^{1-\delta}$, since $C_n \sim n^{-1/6}$.
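The growth condition on $p$ follows from a short calculation, written out here as a worked step; the value $\alpha = 4$ in the last line is only an illustrative choice, not one made in the paper.

```latex
C_n\,p^{2/\alpha} \sim n^{-1/6}\,p^{2/\alpha} \to 0
\iff p^{2/\alpha} = o\!\left(n^{1/6}\right)
\iff p = o\!\left(n^{\alpha/12}\right).
% Illustration (assumed value): for \alpha = 4 this reads p = o(n^{1/3}).
```

A larger $\alpha$, that is, more finite moments in Assumption 3, therefore allows the number of assets to grow faster relative to the sample size.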

5. Proofs

Proof of Theorem 1.
Under the above assumptions, $X_i^d$, $i = 1, \ldots, p$, are of finite variation, i.e., $\beta_i < 1$. We use $X'$ as the continuous part of $X$ and $X''$ as the discontinuous martingale or jump part, i.e.,
$$X'_{it} = X_0 + \int_0^t b'_{is}\,ds + \int_0^t \sigma_s\,dW_s, \qquad X''_{it} = X_{it} - X'_{it},$$
where $b'_{it} = b_{it} - \int_{|x| > 1} F_t(dx)$ and $X''_i = \sum_{s \le t} \Delta X_i(s)$. We also define $Y'_{it} = X'_{it} + \epsilon_{it}$, and hence $Y_{it} = Y'_{it} + X''_{it}$.
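We can decompose $\tilde\Gamma_{ij}^T - \Gamma_{ij}$ as follows: if $i \ne j$,
$$
\begin{aligned}
\tilde\Gamma_{ij}^T - \Gamma_{ij}
&= \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k Y_i\, \Delta_l^k Y_j)\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l, |\Delta_l^k Y_j| \le u_l\} - \Gamma_{ij} \\
&= \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k Y_i\, \Delta_l^k Y_j)\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l, |\Delta_l^k Y_j| \le u_l\} - \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k Y'_i\, \Delta_l^k Y'_j) \\
&\quad + \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k Y'_i\, \Delta_l^k Y'_j) - \Gamma_{ij} \\
&= \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} \left[\Delta_l^k Y'_i\, \Delta_l^k Y'_j\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l, |\Delta_l^k Y_j| \le u_l\} - \Delta_l^k Y'_i\, \Delta_l^k Y'_j\right] \\
&\quad + \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} \Delta_l^k Y'_i\, \Delta_l^k X''_j\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l, |\Delta_l^k Y_j| \le u_l\} \\
&\quad + \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} \Delta_l^k X''_i\, \Delta_l^k Y'_j\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l, |\Delta_l^k Y_j| \le u_l\} \\
&\quad + \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} \Delta_l^k X''_i\, \Delta_l^k X''_j\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l, |\Delta_l^k Y_j| \le u_l\} \\
&\quad + \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k Y'_i\, \Delta_l^k Y'_j) - \Gamma_{ij} \\
&=: \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} \left[A_{ij}^1 + A_{ij}^2 + A_{ij}^3 + A_{ij}^4 + A_{ij}^5\right].
\end{aligned}
$$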
From [6], we have $E|A_{ij}^5|^\alpha \le C\left(n^{\alpha/2} K^{-\alpha} + n^{-\alpha/2} K^{\alpha/2}\right)$, and for $A_{ij}^s$, $s = 1, 2, 3, 4$, we can consider the following disjoint cases of $|\Delta_l^k Y'_i|$ and $|\Delta_l^k Y'_j|$. Here, we use $C$, $l_1$, $l_2$, $r_1$, and $r_2$ to denote any positive real numbers that may vary from place to place.
  • If $|\Delta_l^k Y'_i| > u_l/2$ and $|\Delta_l^k Y'_j| > u_l/2$, we have that
    $$|A_{ij}^1| \le C\, \frac{|\Delta_l^k Y'_i|^{1+l_1}\, |\Delta_l^k Y'_j|^{1+l_2}}{u_l^{l_1} u_l^{l_2}}, \qquad |A_{ij}^2| \le C\, \frac{|\Delta_l^k Y'_i|^{1+l_1}\, |\Delta_l^k Y'_j|^{l_2}\, |\Delta_l^k X''_j|}{u_l^{l_1} u_l^{l_2}},$$
    $$|A_{ij}^3| \le C\, \frac{|\Delta_l^k Y'_i|^{l_1}\, |\Delta_l^k X''_i|\, |\Delta_l^k Y'_j|^{1+l_2}}{u_l^{l_1} u_l^{l_2}}, \qquad |A_{ij}^4| \le C\, \frac{|\Delta_l^k Y'_i|^{l_1}\, |\Delta_l^k X''_i|\, |\Delta_l^k Y'_j|^{l_2}\, |\Delta_l^k X''_j|}{u_l^{l_1} u_l^{l_2}}.$$
  • If $|\Delta_l^k Y'_i| \le u_l/2$ and $|\Delta_l^k Y'_j| \le u_l/2$, we obtain
    $$|A_{ij}^1| \le C\, \frac{|\Delta_l^k Y'_i|\, |\Delta_l^k Y'_j|\, |\Delta_l^k X''_i|^{r_1}\, |\Delta_l^k X''_j|^{r_2}}{u_l^{r_1} u_l^{r_2}}, \qquad |A_{ij}^2| \le C\, \frac{|\Delta_l^k Y'_i|\, |\Delta_l^k X''_i|^{r_1}\, |\Delta_l^k X''_j|^{1+r_2}}{u_l^{r_1} u_l^{r_2}},$$
    $$|A_{ij}^3| \le C\, \frac{|\Delta_l^k X''_i|^{1+r_1}\, |\Delta_l^k Y'_j|\, |\Delta_l^k X''_j|^{r_2}}{u_l^{r_1} u_l^{r_2}}, \qquad |A_{ij}^4| \le C\, \frac{|\Delta_l^k X''_i|^{1+r_1}\, |\Delta_l^k X''_j|^{1+r_2}}{u_l^{r_1} u_l^{r_2}}.$$
  • If $|\Delta_l^k Y'_i| > u_l/2$ and $|\Delta_l^k X''_j| > u_l/2$, we obtain
    $$|A_{ij}^1| \le C\, \frac{|\Delta_l^k Y'_i|^{1+l_1}\, |\Delta_l^k Y'_j|\, |\Delta_l^k X''_j|^{r_2}}{u_l^{l_1} u_l^{r_2}}, \qquad |A_{ij}^2| \le C\, \frac{|\Delta_l^k Y'_i|^{1+l_1}\, |\Delta_l^k X''_j|}{u_l^{l_1}},$$
    $$|A_{ij}^3| \le C\, \frac{|\Delta_l^k X''_i|\, |\Delta_l^k Y'_i|^{l_1}\, |\Delta_l^k Y'_j|}{u_l^{l_1}}, \qquad |A_{ij}^4| \le C\, \frac{|\Delta_l^k X''_i|\, |\Delta_l^k X''_j|\, |\Delta_l^k Y'_i|^{l_1}}{u_l^{l_1}}.$$
    The remaining case for $|\Delta_l^k Y'_i|$ and $|\Delta_l^k X''_j|$ is similar to the above.
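Next, by Hölder's and Burkholder's inequalities, we can bound the moments of $|\Delta_l^k Y'_i|$ and $|\Delta_l^k X''_i|$ as follows,
$$E\left(|\Delta_l^k X''_i|^2\right) \le C \Delta_l, \quad \text{and} \quad E\left(|\Delta_l^k Y'_i|^s\right) \le C_s (\Delta_l)^{s/6}.$$
Without loss of generality, we let $u_l = \Delta_l^{\epsilon + \varpi_1}$, where $0 < \epsilon < \varpi_2 - \varpi_1$. Then, we have
$$
\begin{aligned}
E\left[\left|\frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} \left(A_{ij}^1 + A_{ij}^2 + A_{ij}^3 + A_{ij}^4 + A_{ij}^5\right)\right|^\alpha\right]
&\le C K^{-\alpha} (Km)^{\alpha/2 - 1} \sum_{k=1}^K \sum_{l=1}^{m-1} \left[E|A_{ij}^1|^\alpha + E|A_{ij}^2|^\alpha + E|A_{ij}^3|^\alpha + E|A_{ij}^4|^\alpha + E|A_{ij}^5|^\alpha\right] \\
&\le C\left(K^{-\frac{\alpha}{2}}\, m^{\frac{\alpha}{2}}\, \Delta_l^{\alpha\left[\frac{1}{2} + l_1\left(\frac{1}{6} - \epsilon - \varpi_1\right)\right]} + n^{\frac{\alpha}{2}} K^{-\alpha} + n^{-\frac{\alpha}{2}} K^{\frac{\alpha}{2}}\right) \\
&\le C\left(K^{-\frac{\alpha}{2}}\, m^{-l_1 \alpha\left[\frac{1}{6} - (\epsilon + \varpi_1)\right]} + n^{\frac{\alpha}{2}} K^{-\alpha} + n^{-\frac{\alpha}{2}} K^{\frac{\alpha}{2}}\right).
\end{aligned}
$$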
We have $l_1 \alpha\left[\frac{1}{6} - (\epsilon + \varpi_1)\right] > 0$, because $\varpi_1 < \epsilon + \varpi_1 < \varpi_2 < \frac{1}{6}$. Thus,
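$$E\left|\tilde\Gamma_{ij}^T - \Gamma_{ij}\right|^\alpha \le C\left(n^{\frac{\alpha}{2}} K^{-\alpha} + n^{-\frac{\alpha}{2}} K^{\frac{\alpha}{2}}\right).$$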
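If $i = j$, we have
$$
\begin{aligned}
\tilde\Gamma_{ii}^T - 2m\,\hat\eta_{ii} - \Gamma_{ii}
&= \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k Y_i)^2\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l\} - 2m\,\hat\eta_{ii} - \Gamma_{ii} \\
&= \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} \left((\Delta_l^k X'_i)^2\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l\} - (\Delta_l^k X'_i)^2\right) \\
&\quad + \frac{2}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k X'_i)(\Delta_l^k \epsilon_i)\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l\} \\
&\quad + \left(\frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k \epsilon_i)^2\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l\} - 2m\,\hat\eta_{ii}\right) \\
&\quad + \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k X''_i)^2\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l\} \\
&\quad + \frac{2}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k Y'_i)(\Delta_l^k X''_i)\, \mathbf{1}\{|\Delta_l^k Y_i| \le u_l\} \\
&\quad + \frac{1}{K} \sum_{k=1}^K \sum_{l=1}^{m-1} (\Delta_l^k X'_i)^2 - \Gamma_{ii}.
\end{aligned}
$$
Through a long and tedious calculation, we also have
$$E\left|\tilde\Gamma_{ii}^T - 2m\,\hat\eta_{ii} - \Gamma_{ii}\right|^\alpha \le C\left(n^{\frac{\alpha}{2}} K^{-\alpha} + n^{-\frac{\alpha}{2}} K^{\frac{\alpha}{2}}\right).$$
Now, let $\frac{K^2}{n} = O\left(\frac{n}{K}\right)$; then $K = n/m \sim n^{2/3}$, and the convergence rate $C_n$ of the estimator is $n^{-1/6}$. This completes the proof of Theorem 1. □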
Proof of Theorem 2.
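From the triangle inequality, we have
$$\|\hat\Gamma(\varpi) - \Gamma\|_2 \le \|\hat\Gamma(\varpi) - \Gamma(\varpi)\|_2 + \|\Gamma(\varpi) - \Gamma\|_2 \le \|\hat\Gamma(\varpi) - \Gamma(\varpi)\|_\infty + \|\Gamma(\varpi) - \Gamma\|_\infty.$$
Next, Lemma 2 implies
$$\|\Gamma(\varpi) - \Gamma\|_\infty = \max_{1 \le i \le p} \sum_{j=1}^p |\Gamma_{ij}|\, \mathbf{1}(|\Gamma_{ij}| \le \varpi) = O_P\left(\pi(p)\, \varpi^{1-\delta}\right).$$
By Lemmas 1 and 2,
$$
\begin{aligned}
\|\hat\Gamma(\varpi) - \Gamma(\varpi)\|_\infty
&\le \max_{1 \le i \le p} \sum_{j=1}^p |\tilde\Gamma_{ij}^T - \Gamma_{ij}|\, \mathbf{1}\{|\tilde\Gamma_{ij}^T| \ge \varpi, |\Gamma_{ij}| \ge \varpi\} + \max_{1 \le i \le p} \sum_{j=1}^p |\tilde\Gamma_{ij}^T|\, \mathbf{1}\{|\tilde\Gamma_{ij}^T| \ge \varpi, |\Gamma_{ij}| < \varpi\} \\
&\quad + \max_{1 \le i \le p} \sum_{j=1}^p |\Gamma_{ij}|\, \mathbf{1}\{|\tilde\Gamma_{ij}^T| < \varpi, |\Gamma_{ij}| \ge \varpi\} \\
&\le \max_{1 \le i,j \le p} |\tilde\Gamma_{ij}^T - \Gamma_{ij}| \max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}\{|\Gamma_{ij}| \ge \varpi\} + \max_{1 \le i \le p} \sum_{j=1}^p |\Gamma_{ij}|\, \mathbf{1}\{|\Gamma_{ij}| < \varpi\} \\
&\quad + \max_{1 \le i,j \le p} |\tilde\Gamma_{ij}^T - \Gamma_{ij}| \max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}\{|\tilde\Gamma_{ij}^T| \ge \varpi, |\Gamma_{ij}| < \varpi\} + \varpi \max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}\{|\Gamma_{ij}| \ge \varpi\} \\
&= o_P(\varpi)\, O_P\left(\pi(p)\, \varpi^{-\delta}\right) + O_P\left(\pi(p)\, \varpi^{1-\delta}\right) + o_P(\varpi)\, O_P\left(\pi(p)\, \varpi^{-\delta}\right) + \varpi\, O_P\left(\pi(p)\, \varpi^{-\delta}\right) \\
&= O_P\left(\pi(p)\, \varpi^{1-\delta}\right),
\end{aligned}
$$
which immediately shows that $\|\hat\Gamma(\varpi) - \Gamma(\varpi)\|_\infty$ is of order $\varpi^{1-\delta} \pi(p)$. □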
Lemma 1.
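Under models (1)–(3) and Assumptions 1–4,
$$\max_{1 \le i,j \le p} |\tilde\Gamma_{ij}^T - \Gamma_{ij}| = O_P\left(C_n\, p^{2/\alpha}\right) = o_P(\varpi), \tag{36}$$
$$P\left(\max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}\{|\tilde\Gamma_{ij}^T - \Gamma_{ij}| \ge \varpi/2\} > 0\right) = o(1), \tag{37}$$
$$\max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}(|\tilde\Gamma_{ij}^T| \ge \varpi, |\Gamma_{ij}| < \varpi) \le 2^\delta M \pi(p)\, \varpi^{-\delta} + o_P(1) = O_P\left(\pi(p)\, \varpi^{-\delta}\right), \tag{38}$$
where $\varpi$ is chosen as in Theorem 2.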
Proof of Lemma 1.
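Applying the Markov inequality and Theorem 1, and letting $d = d_1\, p^{2/\alpha}\, C_n$, we obtain that, as $n, p \to \infty$,
$$P\left(\max_{1 \le i,j \le p} |\tilde\Gamma_{ij}^T - \Gamma_{ij}| > d\right) \le \sum_{i,j=1}^p P\left(|\tilde\Gamma_{ij}^T - \Gamma_{ij}| > d\right) \le \frac{C p^2 C_n^\alpha}{d^\alpha} = C d_1^{-\alpha} \to 0,$$
as $d_1 \to \infty$. This proves (36).
Since $h_{n,p} \to \infty$ as $n, p \to \infty$, using the above inequality,
$$P\left(\max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}\{|\tilde\Gamma_{ij}^T - \Gamma_{ij}| \ge \varpi/2\} > 0\right) \le P\left(\max_{1 \le i,j \le p} |\tilde\Gamma_{ij}^T - \Gamma_{ij}| \ge \varpi/2\right) \le \frac{2^\alpha C p^2 C_n^\alpha}{\varpi^\alpha} = 2^\alpha C h_{n,p}^{-\alpha} \to 0,$$
which proves (37).
Similarly, the inequality (38) can be obtained by
$$
\begin{aligned}
\max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}(|\tilde\Gamma_{ij}^T| \ge \varpi, |\Gamma_{ij}| < \varpi)
&\le \max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}(|\tilde\Gamma_{ij}^T| \ge \varpi, |\Gamma_{ij}| \le \varpi/2) + \max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}(|\tilde\Gamma_{ij}^T| \ge \varpi, \varpi/2 < |\Gamma_{ij}| < \varpi) \\
&\le \max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}(|\tilde\Gamma_{ij}^T - \Gamma_{ij}| \ge \varpi/2) + \max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}(|\Gamma_{ij}| > \varpi/2) \\
&\le o_P(1) + 2^\delta M \pi(p)\, \varpi^{-\delta} = O_P\left(\pi(p)\, \varpi^{-\delta}\right).
\end{aligned}
$$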
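Lemma 2 ([12]).
Under models (1)–(3) and Assumptions 1–4, let $\varpi$ be chosen as in Theorem 2. Then, for any fixed $a > 0$,
$$\max_{1 \le i \le p} \sum_{j=1}^p |\Gamma_{ij}|\, \mathbf{1}(|\Gamma_{ij}| \le a\varpi) \le a^{1-\delta} M \pi(p)\, \varpi^{1-\delta} = O_P\left(\pi(p)\, \varpi^{1-\delta}\right),$$
$$\max_{1 \le i \le p} \sum_{j=1}^p \mathbf{1}(|\Gamma_{ij}| \ge a\varpi) \le a^{-\delta} M \pi(p)\, \varpi^{-\delta} = O_P\left(\pi(p)\, \varpi^{-\delta}\right).$$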
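Both bounds in Lemma 2 follow directly from the sparsity condition $\sum_{j=1}^p |\Gamma_{ij}|^\delta \le M \pi(p)$. The short derivation below is a sketch of that elementary step, written for $0 < \delta < 1$; it is included for the reader's convenience and is not quoted from [12].

```latex
% On the event |\Gamma_{ij}| \le a\varpi, split off a factor |\Gamma_{ij}|^{1-\delta}:
\sum_{j=1}^p |\Gamma_{ij}|\,\mathbf{1}(|\Gamma_{ij}| \le a\varpi)
  = \sum_{j=1}^p |\Gamma_{ij}|^{\delta}\,|\Gamma_{ij}|^{1-\delta}\,\mathbf{1}(|\Gamma_{ij}| \le a\varpi)
  \le (a\varpi)^{1-\delta} \sum_{j=1}^p |\Gamma_{ij}|^{\delta}
  \le a^{1-\delta} M \pi(p)\,\varpi^{1-\delta}.

% On the event |\Gamma_{ij}| \ge a\varpi, the ratio |\Gamma_{ij}|/(a\varpi) is at least 1:
\sum_{j=1}^p \mathbf{1}(|\Gamma_{ij}| \ge a\varpi)
  \le \sum_{j=1}^p \left(\frac{|\Gamma_{ij}|}{a\varpi}\right)^{\delta}
  \le a^{-\delta} M \pi(p)\,\varpi^{-\delta}.
```

Taking the maximum over $i$ and using $E[M] \le C$ gives the stated $O_P$ orders.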

6. Conclusions

In this work, for cases when both the number of assets and the sample size approach infinity, we provide a new estimator for the large integrated volatility matrix, in the presence of jumps, micro-structural noise, and non-synchronization. First, we adopt the sub-sampling to synchronize the high-frequency data. Second, we use the two-time-scale realized co-volatility on the off-diagonal elements of Γ but correcting the bias on the diagonal elements of Γ , to handle noise. Finally, we employed the threshold parameters to remove the effect of jumps and sparsity in two steps. Both the minimax bound and the convergence rate were investigated. There are still some problems that we are eager to solve in the future research. First, we assume that entries of the integrated volatility matrix are homogeneous under the sparse condition; however, the volatility of financial assets usually has entries with a very wide range of variability, which motivates us to extend the current work to a general frame. Second, heavy-tailed data are often encountered in financial engineering, which motivates us to develop estimation procedures to solve these issues.

Author Contributions

E.G. and C.L. conceived and designed the study; E.G., C.L. and F.T. developed the mathematical model; E.G., C.L. and F.T. conducted the computational implementation of the model and post-processing of the results; E.G., C.L. and F.T. wrote, reviewed, and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Jiangsu Province, grant number BY2022768 (Erlin Guo); the Qing Lan Project of Jiangsu Province and Jiangsu Province, grant number BY2022743 (Cuixia Li); and NSF China, grant number 12201235, and Anhui Province, grant number 2108085QA14 (Fengqin Tang).

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the editor, associate editor, and referees.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Andersen, T.G.; Bollerslev, T.; Meddahi, N. Correcting the errors: Volatility forecast evaluation using high-frequency data and realized volatilities. Econometrica 2005, 73, 279–296.
  2. Barndorff-Nielsen, O.E.; Shephard, N. Econometric analysis of realized covariation: High frequency based covariance, regression, and correlation in financial economics. Econometrica 2004, 72, 885–925.
  3. Todorov, V. Estimation of continuous-time stochastic volatility models with jumps using high-frequency data. J. Econom. 2009, 148, 131–148.
  4. Brogaard, J.A.; Hendershott, T.; Riordan, R. High frequency trading and price discovery. Rev. Financ. Stud. 2014, 27, 2267–2306.
  5. Dai, C.; Lu, K.; Xiu, D. Knowing factors or factor loadings, or neither? Evaluating estimators of large covariance matrices with noisy and asynchronous data. J. Econom. 2019, 208, 43–79.
  6. Black, F. Noise. J. Financ. 1986, 41, 529–543.
  7. Hayashi, T.; Yoshida, N. On covariance estimation of non-synchronously observed diffusion processes. Bernoulli 2005, 11, 359–379.
  8. Zhang, L. Estimating covariation: Epps effect, microstructure noise. J. Econom. 2011, 160, 33–47.
  9. Aït-Sahalia, Y.; Fan, J.; Xiu, D. High-frequency covariance estimates with noisy and asynchronous financial data. J. Am. Stat. Assoc. 2010, 105, 1504–1517.
  10. Barndorff-Nielsen, O.; Hansen, P.; Lunde, A.; Shephard, N. Multivariate realized kernels: Consistent positive semi-definite estimators of the covariation of equity prices with noise and nonsynchronous trading. J. Econom. 2011, 162, 149–169.
  11. Christensen, K.; Kinnebrock, S.; Podolskij, M. Pre-averaging estimators of the ex-post covariance matrix in noisy diffusion models with non-synchronous data. J. Econom. 2010, 159, 116–133.
  12. Wang, Y.; Zou, J. Vast volatility matrix estimation for high-frequency financial data. Ann. Stat. 2010, 38, 943–978.
  13. Tao, M.; Wang, Y.; Zhou, H. Optimal sparse volatility matrix estimation for high-dimensional Itô process with measurement error. Ann. Stat. 2013, 41, 1816–1864.
  14. Kim, D.; Wang, Y.; Zou, J. Asymptotic theory for large volatility matrix estimation based on high-frequency financial data. Stoch. Proc. Appl. 2016, 126, 3527–3577.
  15. Kim, D.; Kong, X.; Li, C.; Wang, Y. Adaptive thresholding for large volatility matrix estimation based on high-frequency financial data. J. Econom. 2018, 203, 69–79.
  16. Jing, B.; Liu, Z.; Kong, X. On the estimation of integrated volatility with jumps and microstructure noise. J. Bus. Econ. Stat. 2014, 3, 457–467.
  17. Jing, B.; Kong, X.; Liu, Z. Modeling high-frequency financial data by pure jump processes. Ann. Stat. 2012, 40, 759–784.
  18. Kong, X.; Liu, Z.; Jing, B. Testing for pure-jump processes for high-frequency data. Ann. Stat. 2015, 43, 847–877.
  19. Li, C.; Chen, J.; Liu, Z.; Jing, B. On integrated volatility of Itô semimartingales when sampling times are endogenous. Commun. Stat.-Theor. Methods 2014, 43, 5263–5275.
  20. Liu, Z. Jump-robust estimation of volatility with simultaneous presence of microstructure noise and multiple observations. Financ. Stoch. 2017, 21, 427–469.
  21. Li, C.; Guo, E. Estimation of the integrated volatility using noisy high-frequency data with jumps and endogeneity. Commun. Stat.-Theor. Methods 2018, 3, 521–531.
  22. Jing, B.; Li, C.; Liu, Z. On estimating the integrated co-volatility using noisy high-frequency data with jumps. Commun. Stat.-Theor. Methods 2013, 42, 3889–3901.
  23. Yang, C.; Zhang, J.; Huang, Z. Numerical study on cavitation-vortex-noise correlation mechanism and dynamic mode decomposition of a hydrofoil. Phys. Fluids 2022, 34, 125105.
  24. Li, R.; Zhang, H.; Chen, Z.; Yu, N.; Kong, W.; Li, T.; Wang, E.; Wu, X.; Liu, Y. Denoising method of ground-penetrating radar signal based on independent component analysis with multifractal spectrum. Measurement 2022, 192, 110886.
  25. Yu, J.; Lu, L.; Chen, Y.; Zhu, Y.; Kong, L. An indirect eavesdropping attack of keystrokes on touch screen through acoustic sensing. IEEE Trans. Mob. Comput. 2021, 20, 337–351.
  26. Lu, S.; Yang, B.; Xiao, Y.; Liu, S.; Liu, M.; Yin, L.; Zheng, W. Iterative reconstruction of low-dose CT based on differential sparse. Biomed. Signal. Proces. 2023, 79, 104204.
  27. Ye, R.; Liu, P.; Shi, K.; Yan, B. State damping control: A novel simple method of rotor UAV with high performance. IEEE Access 2020, 8, 214346–214357.
  28. Jin, C.; Tsai, F.; Gu, Q.; Wu, B. Does the Porter hypothesis work well in the emission trading schema pilot? Exploring moderating effects of institutional settings. Res. Int. Bus. Financ. 2022, 62, 101732.
  29. Zhong, T.; Wang, W.; Lu, S.; Dong, X.; Yang, B. RMCHN: A residual modular cascaded heterogeneous network for noise suppression in DAS-VSP records. IEEE Geosci. Remote Sens. 2023, 20, 7500205.
  30. Delbaen, F.; Schachermayer, W. A general version of the fundamental theorem of asset pricing. Math. Ann. 1994, 300, 463–520.
  31. Zhang, L.; Mykland, P.; Aït-Sahalia, Y. A tale of two time scales: Determining integrated volatility with noisy high-frequency data. J. Am. Stat. Assoc. 2005, 100, 1394–1411.
  32. Bickel, P.; Levina, E. Covariance regularization by thresholding. Ann. Stat. 2008, 36, 2577–2604.
  33. Lam, C.; Fan, J. Sparsistency and rates of convergence in large covariance matrix estimation. Ann. Stat. 2009, 37, 4254–4278.
  34. El Karoui, N. High-dimensionality effects in the Markowitz problem and other quadratic programs with linear constraints: Risk underestimation. Ann. Stat. 2010, 38, 3487–3566.
  35. Rigollet, P.; Tsybakov, A. Estimation of covariance matrices under sparsity constraints. Stat. Sin. 2012, 22, 1319–1378.

Guo, E.; Li, C.; Tang, F. The Convergence Rates of Large Volatility Matrix Estimator Based on Noise, Jumps, and Asynchronization. Mathematics 2023, 11, 1425. https://doi.org/10.3390/math11061425
