Article

High-Frequency Quote Volatility Measurement Using a Change-Point Intensity Model

1
Center for Economics, Finance and Management Studies, Hunan University, Changsha 410006, China
2
Department of Applied Math, Stony Brook University, Stony Brook, New York, NY 11790, USA
*
Author to whom correspondence should be addressed.
Mathematics 2022, 10(4), 634; https://doi.org/10.3390/math10040634
Submission received: 31 December 2021 / Revised: 13 February 2022 / Accepted: 15 February 2022 / Published: 18 February 2022
(This article belongs to the Special Issue Mathematical and Statistical Methods Applications in Finance)

Abstract

Quote volatility is important in determining the cost of demand in a high-frequency (HF) order market. This paper proposes a new model to measure quote volatility based on the point process and price-change duration. Specifically, we build a change-point intensity (CPI) model to describe the dynamics of price-change events for a given threshold level. The instantaneous volatility of the quote price can be calculated at any time from the price-change intensities. Based on this, we can quantify the cost of demanding liquidity for traders with different trading latencies by using integrated variances. Furthermore, we use the autoregressive conditional intensity (ACI) model proposed by Russell (1999) as a benchmark. The results suggest that our model performs better in both in-sample fit and out-of-sample prediction.
JEL Classification:
C58; G10; G17

1. Introduction

Currently, the Limit Order Book (LOB) is widely used in financial markets to help traders manage their orders and implement transactions. In the LOB, the existing limit orders for a financial asset can be viewed as the up-to-date liquidity provision in the market, while a market order issued by any trader is an instant demand for that liquidity. A liquidity demander who wants to buy shares (or sell shares) immediately will take the price at the best ask (or at the best bid) to complete the transaction. However, because of trading latency, traders may face price uncertainty (risk), as the quoted prices at the best bid and the best ask can fluctuate between the time an order is issued and the time it is matched. The problem is critical for traders with relatively high trading latency and becomes severe in periods when the quote price changes quickly.
Because of the development of algorithmic trading and high-frequency trading in the last two decades, some traders have a speed advantage in executing their orders. The volatility of the quoted price and the latency of trades determine the uncertainty of the purchasing (selling) price for liquidity demanders. Hasbrouck [1] has pointed out that low-latency traders incur a lower cost when demanding liquidity; however, that paper assumes that the price path is already given at the time of order submission. In practice, the dynamics of the quote price are quite unpredictable, with some periods being much more volatile than others. If a trader arrives at a time when the quote price is more volatile, she will encounter a higher risk in transaction pricing.
Different traders have different latency, from microseconds for high-frequency traders to dozens of seconds for ‘slow’ traders; see [2]. To provide a framework that is suitable for us to quantify quote volatility and the cost of demanding liquidity for different traders, we resort to the instantaneous volatility of quote price at any time point and further construct an integrated variance for different time horizons. However, the estimation of instantaneous volatility becomes particularly difficult for irregular LOB data in HF.
With the development of information technology and electronic trading, the updating frequency of order events has reached the level of microseconds, while there are also silent periods during which the LOB stays unchanged for a couple of seconds. To deal with such irregular data, the theory of point processes is employed. According to it, the arrivals of a certain type of market event can be viewed as ordered points occurring in time. In the past, a number of econometric models were built to describe the occurrence of limit order submissions, trade arrivals, price changes, etc. [3,4], and many of them directly study the financial durations between those market events [5,6,7].
Recently, Yang et al. [8] applied the autoregressive conditional duration (ACD) model by [5] and Markov-switching multifractal inter-trade duration (MSMD) model by [9] to study LOB and find their limitations in fitting HF transaction data. Abergel and Jedidi [10] introduced the Hawkes process to model LOB, in which past events can influence the occurrence intensity of a current event. Then, Swishchuk and Huffman [11] constructed general compound Hawkes processes and investigated their properties in LOB. Later, Morariu-Patrichi and Pakkanen [12] applied state-dependent Hawkes processes to HF LOB data and built a novel model that captures the feedback loop between the order flow and the shape of the LOB. In addition, Li et al. [13] used a time-varying Markov regime switching model to study the arrival time of trades in LOB and captured the bimodal distribution of intertrade durations.
Although the point process has been extensively exploited to model HF financial data, short-term volatility measurement using price duration is not common in the literature. Cho and Frees [14] initiated a discussion of using price duration as a volatility measurement, and Gerhard and Hautsch [15] formally proposed a volatility estimator based on price durations. Tse and Yang [16] developed duration-based variance estimators using ACD specifications, and recently, Hong et al. [17] proposed a non-parametric duration-based estimator and concluded that duration-based volatility estimators are more efficient than noise-robust realized volatility estimators. Furthermore, some papers focus on modeling the intensity function itself. By defining the intensity in continuous time and allowing the intensity process to be updated whenever required, the instantaneous volatility can be calculated directly from the intensity. Russell [18] first proposed a univariate dynamic intensity model, the autoregressive conditional intensity (ACI) model, which follows an autoregressive structure that is updated at the time of new market events. Then, the ACI model was extended to stochastic conditional intensity processes and multivariate processes [19,20,21].
In this paper, we propose a new change-point model to measure quote volatility. The seminal work on change-point models can be traced back to Box and Tiao [22], who addressed a common modeling problem for time series whose parameters may undergo occasional changes. It arises in many applications, e.g., engineering, econometrics, and biomedicine. However, a general statistical method for estimation was not developed until Lai et al. [23], who used BCMIX (bounded complexity mixture) to reduce the computational complexity. In our framework, we view a jump in the intensity of the quote-price movement as a change point. Moreover, the domain set for renewing the intensity is infinite, and the renewing distribution is continuous. Thus, our model can generate a much larger state space than the traditional ACI model. The empirical results demonstrate that our model performs much better in terms of fitting current HF data. Moreover, the change-point intensity model can be used not only to measure the cost of demanding liquidity for traders with different latencies but also to test for volatility jumps in HF environments.
The paper is organized as follows. Section 2 introduces the method of volatility measurement using the price-change duration and our change-point intensity model. Section 3 provides the estimation procedure and the simulation results. Section 4 presents the data we used. Section 5 shows the in-sample fitness of our model and the measurement of HF quote volatility, with a benchmark comparison with the ACI model. Section 6 further implements the out-of-sample test and evaluates the model’s predictive power. Section 7 provides the conclusions.

2. Volatility Measurement Using Price Duration

2.1. Instantaneous Volatility Measurement Using Price Duration

According to [5], the instantaneous volatility can be measured by the conditional instantaneous variance of returns, which is defined as follows:
$$\sigma^2(t) := \lim_{\Delta \to 0} E\left[ \frac{1}{\Delta} \left( \frac{p(t+\Delta) - p(t)}{p(t)} \right)^2 \,\middle|\, \mathcal{F}_t \right],$$
where $\{p(t), t \ge 0\}$ is the price process of a financial security, and $\mathcal{F}_t$ denotes the information set up to and including $t$. Following [15,16], duration-based variance estimators rely on a relationship between the conditional intensity function and the conditional instantaneous variance of a point process. Specifically, for price volatility, we can treat the price-change process as a point process. We define $\delta$ as the threshold of the price-change event, and $\{t_i^{\delta}\}_{i=1,2,\ldots,n}$ are the times when these price-change events occur. Clearly, the number of events $n$ depends on the value of $\delta$.
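As an illustration of how the event times $t_i^{\delta}$ might be extracted from a quote series, the following minimal sketch marks an event whenever the price has moved by at least `delta` since the previous event. The source does not spell out this rule; measuring the move from the last event price, and the function names `price_change_events` and `durations`, are assumptions made for illustration.

```python
def price_change_events(times, prices, delta):
    """Return the times at which the price has moved by at least `delta`
    (in absolute value) since the previous event, or since the first quote.
    Assumed rule for illustration; the source does not state it explicitly."""
    if len(times) != len(prices):
        raise ValueError("times and prices must have the same length")
    events = []
    if not prices:
        return events
    ref = prices[0]                      # reference price at the last event
    for t, p in zip(times[1:], prices[1:]):
        if abs(p - ref) >= delta:
            events.append(t)
            ref = p                      # measure the next move from this level
    return events


def durations(event_times, start=0):
    """Price durations x_i = t_i - t_{i-1}, taking t_0 = start."""
    out, prev = [], start
    for t in event_times:
        out.append(t - prev)
        prev = t
    return out
```

A larger `delta` yields fewer events, which is the dependence of $n$ on $\delta$ noted above.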
If we define $x_i^{\delta} := t_i^{\delta} - t_{i-1}^{\delta}$ as the price duration between two consecutive price-change events, then the conditional variance per unit time over the price duration is as follows.
$$\sigma^2(t_i^{\delta}) = E\left[ \frac{1}{x_{i+1}^{\delta}} \left( \frac{\delta}{p(t_i^{\delta})} \right)^2 \,\middle|\, \mathcal{F}_{t_i} \right] = E\left[ \frac{1}{x_{i+1}^{\delta}} \,\middle|\, \mathcal{F}_{t_i} \right] \left( \frac{\delta}{p(t_i^{\delta})} \right)^2.$$
The above calculation requires either specifying a stochastic process for $1/x_{i+1}^{\delta}$ or computing the distribution of $1/x_{i+1}^{\delta}$ using a transformation of the conditional distribution of $x_{i+1}^{\delta}$.
Here, we introduce more information on point process theory, which leads to the formulation of instantaneous volatility using the price-change intensity function. Let $\{t_i\}_{i=1,2,\ldots,n}$ be a sequence of event arrival times with $0 \le t_i \le t_{i+1}$; then an orderly point process is associated with a counting process $N(t)$, where $N(t) = \sum_{i \ge 1} \mathbb{1}_{\{t_i \le t\}}$ is the number of events up to and including time $t$. A point process can be characterized by an intensity function $\lambda(t; \mathcal{F}_t)$, which is described as follows.
$$\lambda(t; \mathcal{F}_t) = \lim_{\Delta \to 0} \frac{1}{\Delta} \Pr\left[ N(t+\Delta) > N(t) \mid \mathcal{F}_t \right].$$
It represents the probability per unit time of a new arrival of the event in an infinitesimal time interval. In many applications, this is equivalent to the hazard function, particularly in traditional duration or survival analysis, where cross-sectional duration data are analyzed [5,24], while the intensity function is mostly defined in continuous time and conditions on a possibly continuously varying information set $\mathcal{F}_t$.
In particular, for the price-change event with the price threshold $\delta$, the price variation in a small time interval $\Delta$ can only be $\delta$ or $-\delta$. Hence, the instantaneous variance of returns at time $t$ can be derived in terms of the following expression.
$$\sigma^2(t) = \lim_{\Delta \to 0} E\left[ \frac{1}{\Delta} \left( \frac{p(t+\Delta) - p(t)}{p(t)} \right)^2 \,\middle|\, \mathcal{F}_t \right] = \lim_{\Delta \to 0} \frac{1}{\Delta} \Pr\left[ |p(t+\Delta) - p(t)| \ge \delta \mid \mathcal{F}_t \right] \left( \frac{\delta}{p(t)} \right)^2 = \lim_{\Delta \to 0} \frac{1}{\Delta} \Pr\left[ N^{\delta}(t+\Delta) > N^{\delta}(t) \mid \mathcal{F}_t \right] \left( \frac{\delta}{p(t)} \right)^2 = \lambda^{\delta}(t; \mathcal{F}_t) \left( \frac{\delta}{p(t)} \right)^2. \tag{1}$$
Therefore, the measurement of instantaneous volatility reduces to the estimation of the intensity function associated with the process of $\delta$-price changes, i.e., $\lambda^{\delta}(t; \mathcal{F}_t)$. A similar result is obtained in [4,5,17].
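The final relation, instantaneous variance equals intensity times the squared relative threshold, is simple enough to state as a one-line function. This is only a sketch; the function name and the numerical inputs in the usage note are illustrative.

```python
def instantaneous_variance(intensity, delta, price):
    """sigma^2(t) = lambda^delta(t) * (delta / p(t))^2.

    intensity : price-change intensity, in events per unit time
    delta     : price-change threshold, in the same units as price
    price     : current quote price p(t)
    """
    if price <= 0:
        raise ValueError("price must be positive")
    return intensity * (delta / price) ** 2
```

For example, `instantaneous_variance(2.0, 0.03, 255.07)` gives the variance per second implied by two 3-cent moves per second at a quote of $255.07; doubling the intensity doubles the variance.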
Another important property, concerning the integrated intensity function, forms the basis for constructing the likelihood function of an intensity-based model and leads to the mixture-of-exponential representation that is essential for our change-point intensity (CPI) model. According to the random time change theorem of [25], which transforms a wide class of point processes to a unit-rate Poisson process, both Barndorff-Nielsen and Shiryaev [26] and Hautsch [4] have shown that, if the event arrival times of a point process are $t_1, t_2, \ldots, t_n$, then the integrated intensity function satisfies the following:
$$\Lambda(t_{i-1}, t_i) \triangleq \int_{t_{i-1}}^{t_i} \lambda(s)\, ds \overset{\text{i.i.d.}}{\sim} \mathrm{Exp}(1), \tag{2}$$
where $\mathrm{Exp}(1)$ is the exponential distribution with rate parameter 1.
If we further assume that the intensity function is constant between two consecutive events, i.e., $\lambda(t) = \lambda_i$ between $t_{i-1}$ and $t_i$, then the above property becomes $y_i \lambda_i \overset{\text{i.i.d.}}{\sim} \mathrm{Exp}(1)$, where $y_i = t_i - t_{i-1}$ is the event duration between $t_{i-1}$ and $t_i$. Rearranging the equation, we obtain the mixture-of-exponential representation.
$$y_i = \frac{\epsilon_i}{\lambda_i}, \qquad \epsilon_i \overset{\text{i.i.d.}}{\sim} \mathrm{Exp}(1). \tag{3}$$
Chen et al. [9] also arrive at the same result by interpreting a point process as a dynamic, uncountable set of independent Bernoulli trials.
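The mixture-of-exponential representation can be checked numerically: durations are generated as a unit exponential draw divided by the prevailing intensity, and multiplying back by the intensity should recover residuals with mean close to 1. The sketch below uses only the Python standard library; the two-regime intensity path in the usage note is an arbitrary choice for illustration.

```python
import random


def simulate_durations(intensities, seed=0):
    """Draw y_i = eps_i / lambda_i with eps_i ~ Exp(1), one per intensity."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) / lam for lam in intensities]


def time_changed_residuals(durations, intensities):
    """Integrated intensity over each duration when lambda is constant
    between events: Lambda_i = lambda_i * y_i."""
    return [lam * y for lam, y in zip(intensities, durations)]
```

With, say, `intensities = [0.5] * 5000 + [50.0] * 5000`, the raw durations differ by two orders of magnitude across the regimes, while the residuals from `time_changed_residuals` should have a sample mean near 1 in both.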

2.2. The Benchmark ACI Model

The benchmark autoregressive conditional intensity (ACI) model was proposed by [18]. It presents dynamic parameterizations of the intensity function in continuous time, which allows updating the intensity process whenever required. The intensity is characterized in the following form:
$$\lambda(t, \mathcal{F}(t)) = \psi(t)\, \lambda_0(t)\, s(t),$$
which is driven by three components: a component $\psi(t)$ capturing the dynamic structure, a baseline intensity component $\lambda_0(t)$, and a seasonal periodicity component $s(t)$.
The core part is to model the dynamic component $\psi(t)$, which is given by the following:
$$\psi(t) = \exp\left( \tilde{\psi}_{N(t)+1} + z_{N(t)}^{T} \gamma \right)$$
where $z_{N(t)}$ is the vector of covariates collected at the time of the preceding event (say $t_{i-1}$), $\gamma$ is the vector of coefficients for these covariates, and $\tilde{\psi}_{N(t)+1}$ is a dynamic component that is piecewise constant between the $(i-1)$-th and $i$-th events. This piecewise-constant component follows an ARMA(1,1) form:
$$\tilde{\psi}_i = c + \alpha\, \tilde{\varepsilon}_{i-1} + \beta\, \tilde{\psi}_{i-1},$$
where $\beta$ is the persistence parameter, and $\alpha$ is the coefficient associated with the innovation term $\tilde{\varepsilon}_{i-1}$. The innovation term is specified as follows.
$$\tilde{\varepsilon}_i := 1 - \varepsilon_i = 1 - \Lambda(t_{i-1}, t_i).$$
Based on the theory of point processes, the likelihood of events occurring at times $t_1, t_2, \ldots, t_n$ is $\prod_{i=1}^{n} \lambda(t_i) \cdot \exp\left( -\int_{t_{i-1}}^{t_i} \lambda(t)\, dt \right)$. Hence, the log-likelihood of the observed events is as follows.
$$\ln L = \sum_{i=1}^{n} \left[ \ln \lambda(t_i) - \int_{t_{i-1}}^{t_i} \lambda(t)\, dt \right].$$
Specifically, for this quote volatility measurement, we set the baseline intensity to $\lambda_0(t) = \exp(\omega)\, x(t)^{a-1}$, where $x(t)$ is the backward recurrence time at $t$, defined as $x(t) = t - t_{N(t)}$. Here, $t_{N(t)}$ is the nearest preceding event time, and $x(t_i) = t_i - t_{i-1}$. For the seasonal factor $s(t)$, we can use 1 h intervals and set it as piecewise linear within each interval: $s(t) = s(\hat{t}_{k-1}) + b_k (t - \hat{t}_{k-1})$ for $\hat{t}_{k-1} \le t \le \hat{t}_k$, where $\hat{t}_k$, $k = 1, \ldots, 6$, are the interval cutting times, and we set $s(\hat{t}_0) = 1$.
Therefore, the parameters in this simple ACI model without covariates are as follows:
$$c, \alpha, \beta, \omega, a, b_1, \ldots, b_6$$
and we can use maximum likelihood estimation (MLE) to estimate them.
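To make the ACI recursion and log-likelihood concrete, here is a simplified sketch that omits covariates and, as a loudly stated simplification not made in the source, fixes the seasonal factor at $s(t) = 1$. Under that simplification the integrated intensity over a duration $x_i$ has the closed form $\exp(\tilde{\psi}_i + \omega)\, x_i^{a} / a$. The starting value `psi0` for the recursion is also an assumption.

```python
import math


def aci_loglik(durations, c, alpha, beta, omega, a, psi0=0.0):
    """Log-likelihood of a simplified ACI model (no covariates, s(t) = 1).

    Intensity within spell i: lambda(t) = exp(psi_i + omega) * x(t)**(a - 1),
    where x(t) is the time elapsed since the previous event.
    Durations must be strictly positive; `psi0` is an assumed start value.
    """
    psi, loglik = psi0, 0.0
    for x in durations:
        log_lam_at_event = psi + omega + (a - 1.0) * math.log(x)
        integrated = math.exp(psi + omega) * x ** a / a   # Lambda(t_{i-1}, t_i)
        loglik += log_lam_at_event - integrated
        # ARMA(1,1)-type update with innovation 1 - Lambda(t_{i-1}, t_i)
        psi = c + alpha * (1.0 - integrated) + beta * psi
    return loglik
```

Passing the negative of this function to a numerical optimizer would give maximum likelihood estimates of $(c, \alpha, \beta, \omega, a)$; handling the piecewise-linear seasonal factor requires integrating $s(t)$ as well and is left out here.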

2.3. The Change-Point Model for Quote Volatility

For the quote volatility measurement in an HF environment, we propose a new change-point intensity (CPI) model, following [22,23]. As in other intensity-based duration models, we assume that the price duration follows an exponential distribution that has the price-change intensity as its rate parameter $\lambda$. In our case, a shift of the intensity is a change point, which may respond to market environment fluctuations, liquidity shocks, and a myriad of perceived changes of information on the stock. Allowing the intensity to change over time can account for the relatively long periods punctuated by extremely short periods observed in the time series.
According to the mixture-of-exponential representation in Equation (3), the duration is as follows:
$$y_t = \frac{\varepsilon_t}{\lambda_t},$$
where $\varepsilon_t$ is i.i.d. $\mathrm{Exp}(1)$. The conditional intensity follows a Markov change-point process with the renewing distribution $G(\cdot)$.
$$\begin{cases} \lambda_{t+1} = \lambda_t & \text{w.p. } 1-p, \\ \lambda_{t+1} \sim G(\cdot) & \text{w.p. } p. \end{cases}$$
It belongs to the class of change-point processes because the underlying intensity $\lambda_t$ undergoes occasional changes. It is also a Markov process because the value of the future intensity depends only on the current state, i.e., it either keeps the same value as the current intensity with probability $1-p$ or is randomly drawn from a fixed distribution $G(\cdot)$ with probability $p$.
Given that $G(\cdot)$ is defined on a continuous and infinite space, our model can generate much greater flexibility with a very small number of parameters. Specifically, in our model, we assume that the renewing distribution is a Gamma distribution, and the p.d.f. is as follows:
$$G(\lambda_t) = \mathrm{Gamma}(\lambda_t; \alpha, \beta) = \frac{\beta^{\alpha} \lambda_t^{\alpha-1} e^{-\lambda_t \beta}}{\Gamma(\alpha)} = Z(\alpha, \beta) \cdot \lambda_t^{\alpha-1} e^{-\lambda_t \beta}$$
where $\alpha$ and $\beta$ are the shape and rate parameters of the Gamma distribution, respectively, and $Z(\alpha, \beta) = \beta^{\alpha} / \Gamma(\alpha)$ is the normalizing constant, a function of $\alpha$ and $\beta$.
This model can also be viewed as a conditional mean duration model (similar to the traditional ACD model) in terms of the following:
$$y_t = M_t\, \varepsilon_t,$$
where $M_t = 1/\lambda_t$ is the conditional mean level for the inter-trade duration, and $M_t$ also follows a Markov change-point process:
$$\begin{cases} M_{t+1} = M_t & \text{w.p. } 1-p, \\ M_{t+1} \sim F(\cdot) & \text{w.p. } p, \end{cases}$$
where $F(\cdot)$ is the renewing distribution of $M_t$.
The economic intuition of our CPI model is as follows. In a period without a change point (with probability $1-p$), it is plausible to think that the market is behaving consistently and the price-change intensity remains unchanged. During this time, quote volatility is constant. However, when the market environment changes or there is some liquidity/informational shock (with probability $p$) on the stock, the evolution of the quote price enters a new state, the price-change intensity changes, and correspondingly, the quote volatility takes a new value.
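The data-generating process of the CPI model can be sketched in a few lines of standard-library Python. Note that `random.gammavariate` takes a scale parameter, so the rate $\beta$ enters as `1.0 / beta`. Marking the first observation as a change point is a convention chosen for this sketch, and the function name is illustrative.

```python
import random


def simulate_cpi(n, alpha, beta, p, seed=0):
    """Simulate n durations from the change-point intensity model.

    With probability p the intensity is redrawn from Gamma(shape=alpha,
    rate=beta); otherwise it keeps its previous value. Each duration is
    exponential with the current intensity as its rate.
    Returns (intensities, durations, is_change).
    """
    rng = random.Random(seed)
    lam = rng.gammavariate(alpha, 1.0 / beta)    # scale = 1 / rate
    intensities, durations, is_change = [], [], []
    for t in range(n):
        changed = (t == 0)                       # convention: first point is a change
        if t > 0 and rng.random() < p:
            lam = rng.gammavariate(alpha, 1.0 / beta)
            changed = True
        intensities.append(lam)
        durations.append(rng.expovariate(lam))   # same law as Exp(1) / lam
        is_change.append(changed)
    return intensities, durations, is_change
```

Calling `simulate_cpi(7000, 5.0, 2.0, 0.018)` produces long stretches of constant intensity separated by occasional jumps, which is the pattern the economic intuition above describes.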

2.4. Model Comparison

In this part, we compare the structure of our CPI model with that of the ACI model, which serves as the benchmark.
  • The price-change intensity in our model is set to be constant within the interval between price updates, while it may jump suddenly to a new level for subsequent price-change events. We think that this setting is reasonable for ultra-high-frequency data, as the time interval between quote-price updates is short enough that we can ignore the dynamics in between; more importantly, it helps us to circumvent the complex estimation of the time structure of the underlying intensity. Nevertheless, when there is some exogenous shock (information or liquidity shock), the quote-price change intensity can take a new value, and correspondingly, quote volatility changes.
  • Although the quote-price change duration is short in general, it may also have a large dispersion, as the shortest duration can be on the order of microseconds while the longest duration could be a couple of seconds. This is a potential problem for the ACI model, as the evolution of the price-change intensity in the ACI model is relatively smooth. (In the ACI model, in addition to the baseline component $\lambda_0(t)$ and the seasonal component $s(t)$, the dynamic component $\psi(t)$ follows an ARMA structure.) However, in our change-point intensity model, the new level of intensity is drawn from a continuous distribution $G(\lambda)$. Hence, it allows drastic changes of intensity, depending on the values of the distribution parameters.
  • Our model can also be extended to incorporate other factors that influence changes in quote volatility. For example, we can allow the intensity renewal probability $p$ to be time varying and driven by some other factors:
    $$\log \frac{p_t}{1 - p_t} = x_{t-1}^{T} \beta$$
    where $x$ is a $k \times 1$ vector of influencing factors, and $\beta$ is a $k \times 1$ vector of the corresponding factor loadings (or regression coefficients). With this structure, we can further analyze whether some factors result in a high probability of a change in quote volatility. Nonetheless, this extension is beyond the scope of this paper and deserves further development.
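The logit link in the last item can be inverted to obtain $p_t$ from the lagged factors. The sketch below is only an illustration of that inversion, since the source leaves this extension undeveloped; the function name and inputs are hypothetical.

```python
import math


def renewal_probability(x_prev, loadings):
    """Invert log(p_t / (1 - p_t)) = x_{t-1}^T beta to get p_t.

    x_prev   : sequence of k lagged factor values x_{t-1}
    loadings : sequence of k factor loadings beta
    """
    if len(x_prev) != len(loadings):
        raise ValueError("x_prev and loadings must have the same length")
    z = sum(x * b for x, b in zip(x_prev, loadings))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)                      # avoids overflow for very negative z
    return e / (1.0 + e)
```

A positive loading means that a larger value of the corresponding factor raises the probability that the intensity, and hence the quote volatility, is renewed at the next event.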

3. Model Estimation and Simulation

3.1. Model Estimation

For the change-point model introduced in Section 2.3, the complete log-likelihood function is as follows:
$$l_c(\{y_1^n, \lambda_1^n\}) = \log P(\lambda_1) + \sum_{t=1}^{n} \log f(y_t \mid \lambda_t) + \sum_{t=2}^{n} \log P(\lambda_t \mid \lambda_{t-1}),$$
where $y_1^n$ is the sequence of observed event durations $\{y_t\}_{t=1,\ldots,n}$, and $\lambda_1^n$ is the sequence of underlying intensities $\{\lambda_t\}_{t=1,\ldots,n}$. $P(\lambda_1)$ is the probability that the initial intensity is $\lambda_1$, $f(y_t \mid \lambda_t)$ is the probability density of $y_t$ given that the current intensity is $\lambda_t$, and $P(\lambda_t \mid \lambda_{t-1})$ is the conditional probability of $\lambda_t$ given that the previous intensity is $\lambda_{t-1}$. In our model, the above equation is equivalent to the following:
$$l_c(\{y_1^n, \lambda_1^n\}) = \log G(\lambda_1) + \sum_{t=1}^{n} \log f(y_t \mid \lambda_t) + \sum_{t=2}^{n} \left[ \log G(\lambda_t) \cdot \mathbb{1}(I_t = 1) + \log p \cdot \mathbb{1}(I_t = 1) + \log(1-p) \cdot \mathbb{1}(I_t = 0) \right],$$
because $\log P(\lambda_1) = \log G(\lambda_1)$, since we assume the initial state is independently drawn from the Gamma distribution, and the conditional probability $P(\lambda_t \mid \lambda_{t-1})$ can be derived from two possible situations: with probability $p$, the intensity renews from $G(\cdot)$, and with probability $1-p$, it stays unchanged. $I_t = 1$ means there is a change point at the $t$-th event, i.e., $\lambda_t \ne \lambda_{t-1}$, and $\mathbb{1}(I_t = 1)$ is an indicator function that returns 1 when $I_t = 1$ and 0 otherwise. Similarly, $\mathbb{1}(I_t = 0)$ equals 1 if there is no change point at the $t$-th event and 0 if there is one.
The parameters to be estimated in the change-point model are $\alpha$, $\beta$, and $p$. The sequence of price-change durations $y_1^n$ is known to us; however, we cannot observe the actual values of the hidden variables $\lambda_1^n$. Thus, directly maximizing the complete log-likelihood is infeasible, and we need to use the Expectation Maximization (EM) method for model estimation.
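If the hidden intensities were observed, the complete log-likelihood could be evaluated directly. The sketch below does this for a given intensity path, using $\log f(y_t \mid \lambda_t) = \log \lambda_t - \lambda_t y_t$ for the exponential density. Treating any change in value between consecutive intensities as a change point is an assumption that is harmless here because the renewing distribution is continuous, so ties occur with probability zero.

```python
import math


def gamma_logpdf(lam, alpha, beta):
    """log of the Gamma(shape=alpha, rate=beta) density at lam > 0."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1.0) * math.log(lam) - beta * lam)


def complete_loglik(ys, lams, alpha, beta, p):
    """Complete log-likelihood of durations ys and intensities lams."""
    loglik = gamma_logpdf(lams[0], alpha, beta)        # log G(lambda_1)
    for t, (y, lam) in enumerate(zip(ys, lams)):
        loglik += math.log(lam) - lam * y              # log f(y_t | lambda_t)
        if t >= 1:
            if lam != lams[t - 1]:                     # change point, I_t = 1
                loglik += gamma_logpdf(lam, alpha, beta) + math.log(p)
            else:                                      # no change, I_t = 0
                loglik += math.log(1.0 - p)
    return loglik
```

In practice the intensities are latent, which is why the estimation replaces each term by its conditional expectation rather than evaluating this function.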

3.1.1. Expected Likelihood

In the E-step, the expected log-likelihood, conditional on $D = \{y_1^n, \text{parameters of the last iteration}\}$, is as follows:
$$E\left[ l_c(\{y_1^n, \lambda_1^n\}) \mid D \right] = E\left[ \log G(\lambda_1) \mid D \right] + \sum_{t=1}^{n} E\left[ \log f(y_t \mid \lambda_t) \mid D \right] + \sum_{t=2}^{n} E\left[ \log G(\lambda_t) \cdot \mathbb{1}(I_t = 1) \mid D \right] + \sum_{t=2}^{n} \left[ \log p \cdot P(I_t = 1 \mid D) + \log(1-p) \cdot P(I_t = 0 \mid D) \right], \tag{15}$$
where the expectation is taken over the posterior distribution of the hidden variables $\lambda_t$.
As $G(\lambda)$ is a Gamma distribution, it is a conjugate prior for the exponential distribution. Hence, the posterior distribution of $\lambda_t$ is a mixture of Gamma distributions. We show in Appendix A that the posterior distribution of $\lambda_t$ is as follows:
$$f(\lambda_t \mid D) = \sum_{1 \le i \le t \le j \le n} \Pi_{itj} \cdot g_{ij}(\lambda_t),$$
where $g_{ij}(\lambda) \sim \mathrm{Gamma}\left( \alpha + (j - i + 1),\ \beta + \sum_{s=i}^{j} y_s \right)$ and $\Pi_{itj}$ is the posterior probability that the last change point occurs at $i$ and the next change point occurs at $j+1$, which means the current state of $\lambda_t$ starts at $i$ and ends at $j$. The steps for calculating $\Pi_{itj}$ are also shown in Appendix A. Moreover, the following holds:
$$P(I_{t+1} = 1 \mid D) = \sum_{1 \le i \le t} \Pi_{itt}, \qquad P(I_{t+1} = 0 \mid D) = 1 - P(I_{t+1} = 1 \mid D),$$
where $t \in [1, n-1]$, and we also set $P(I_1 = 1 \mid D) \equiv 1$.
Therefore, in Equation (15), we have the following.
$$E\left( \log G(\lambda_1) \mid D \right) = \int_{\lambda_1} \log G(\lambda_1) \cdot f(\lambda_1 \mid D)\, d\lambda_1 = \int_{\lambda_1} \left[ \log Z(\alpha, \beta) + (\alpha - 1) \log \lambda_1 - \beta \lambda_1 \right] \cdot \sum_{1 \le i \le 1 \le j \le n} \Pi_{i1j} \cdot g_{ij}(\lambda_1)\, d\lambda_1$$
$$E\left( \log f(y_t \mid \lambda_t) \mid D \right) = \int_{\lambda_t} \log f(y_t \mid \lambda_t) \cdot f(\lambda_t \mid D)\, d\lambda_t = \int_{\lambda_t} \left[ \log \lambda_t - \lambda_t y_t \right] \cdot \sum_{1 \le i \le t \le j \le n} \Pi_{itj} \cdot g_{ij}(\lambda_t)\, d\lambda_t$$
$$E\left( \log G(\lambda_t) \cdot \mathbb{1}(I_t = 1) \mid D \right) = \int \log G(\lambda_t) \cdot f(\lambda_t, I_t = 1 \mid D)\, d\lambda_t$$
Since $f(\lambda_t \mid D) = \sum_{1 \le i \le t \le j \le n} \Pi_{itj} \cdot g_{ij}(\lambda_t)$, restricting to the terms with $i = t$ gives $f(\lambda_t, I_t = 1 \mid D) = \sum_{t \le j \le n} \Pi_{ttj} \cdot g_{tj}(\lambda_t)$. Hence, we have the following.
$$E\left( \log G(\lambda_t) \cdot \mathbb{1}(I_t = 1) \mid D \right) = \sum_{t \le j \le n} \Pi_{ttj} \int \log G(\lambda_t) \cdot g_{tj}(\lambda_t)\, d\lambda_t$$

3.1.2. Maximization and Parameters’ Update

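Once we have the expected log-likelihood in (15) and write it as a function of the arguments $(\alpha, \beta, p)$:
$$l_{EC}(\alpha, \beta, p) \triangleq E\left[ l_c(\{y_1^n, \lambda_1^n\}) \mid D \right],$$
we can perform the maximization step of EM and update our estimates of the model parameters.
As only the last two terms in (15) contain the parameter $p$, the first-order condition gives the following.
$$\frac{\partial l_{EC}(\alpha, \beta, p)}{\partial \hat{p}} = \frac{1}{\hat{p}} \sum_{t=2}^{n} P(I_t = 1 \mid D) - \frac{1}{1 - \hat{p}} \sum_{t=2}^{n} P(I_t = 0 \mid D) = 0$$
Since $P(I_t = 1 \mid D) + P(I_t = 0 \mid D) = 1$, solving for $\hat{p}$ gives its new value:
$$\hat{p} = \frac{\sum_{t=2}^{n} P(I_t = 1 \mid D)}{n - 1}. \tag{22}$$
The first and third terms in $l_{EC}(\alpha, \beta, p)$ contain the parameters $(\alpha, \beta)$:
$$\begin{aligned}
& E\left[ \log G(\lambda_1) \mid D \right] + \sum_{t=2}^{n} E\left[ \log G(\lambda_t) \cdot \mathbb{1}(I_t = 1) \mid D \right] \\
&= \int_{\lambda_1} \log G(\lambda_1) \cdot \sum_{1 \le i \le 1 \le j \le n} \Pi_{i1j} \cdot g_{ij}(\lambda_1)\, d\lambda_1 + \sum_{t=2}^{n} \int_{\lambda_t} \log G(\lambda_t) \cdot \sum_{t \le j \le n} \Pi_{ttj} \cdot g_{tj}(\lambda_t)\, d\lambda_t \\
&= \sum_{t=1}^{n} \int_{\lambda_t} \log G(\lambda_t) \cdot \sum_{t \le j \le n} \Pi_{ttj} \cdot g_{tj}(\lambda_t)\, d\lambda_t \\
&= \sum_{t=1}^{n} \int_{\lambda_t} \left[ \log Z(\alpha, \beta) + (\alpha - 1) \log \lambda_t - \beta \lambda_t \right] \cdot \sum_{t \le j \le n} \Pi_{ttj} \cdot g_{tj}(\lambda_t)\, d\lambda_t \\
&= \sum_{t=1}^{n} \sum_{t \le j \le n} \Pi_{ttj} \int_{\lambda_t} \left[ \log Z(\alpha, \beta) + (\alpha - 1) \log \lambda_t - \beta \lambda_t \right] g_{tj}(\lambda_t)\, d\lambda_t \\
&= \log Z(\alpha, \beta) \cdot A + (\alpha - 1) \cdot B - \beta \cdot C \\
&= \left[ \alpha \log \beta - \log \Gamma(\alpha) \right] \cdot A + (\alpha - 1) \cdot B - \beta \cdot C,
\end{aligned} \tag{23}$$
where
$$A = \sum_{t=1}^{n} \sum_{t \le j \le n} \Pi_{ttj} = \sum_{t=1}^{n} P(I_t = 1 \mid D)$$
$$B = \sum_{t=1}^{n} \sum_{t \le j \le n} \Pi_{ttj} \cdot \int_{\lambda_t} \log \lambda_t \cdot g_{tj}(\lambda_t)\, d\lambda_t$$
$$C = \sum_{t=1}^{n} \sum_{t \le j \le n} \Pi_{ttj} \cdot \int_{\lambda_t} \lambda_t \cdot g_{tj}(\lambda_t)\, d\lambda_t$$
Moreover, the integral in $B$ has a closed form.
$$\sum_{t \le j \le n} \Pi_{ttj} \cdot \int_{\lambda_t} \log \lambda_t \cdot g_{tj}(\lambda_t)\, d\lambda_t = \sum_{t \le j \le n} \Pi_{ttj} \cdot E_{g_{tj}}[\log \lambda_t].$$
Because $g_{tj}(\lambda) \sim \mathrm{Gamma}\left( \alpha + (j - t + 1),\ \beta + \sum_{s=t}^{j} y_s \right)$, we have $E_{g_{tj}}[\log \lambda_t] = \psi(\alpha_{old} + j - t + 1) - \log\left( \beta_{old} + \sum_{s=t}^{j} y_s \right)$, where $\psi(\cdot)$ is the digamma function (the first-order derivative of the log-gamma function):
$$\psi(\alpha) = \frac{d \log \Gamma(\alpha)}{d\alpha} = \frac{\Gamma'(\alpha)}{\Gamma(\alpha)} = -\gamma - \sum_{k=0}^{\infty} \left( \frac{1}{\alpha + k} - \frac{1}{k + 1} \right)$$
and the constant $\gamma \approx 0.5772156649$.
Similarly, in $C$, we have the following.
$$\sum_{t \le j \le n} \Pi_{ttj} \cdot \int_{\lambda_t} \lambda_t \cdot g_{tj}(\lambda_t)\, d\lambda_t = \sum_{t \le j \le n} \Pi_{ttj} \cdot E_{g_{tj}}[\lambda_t].$$
Because $g_{tj}(\lambda) \sim \mathrm{Gamma}\left( \alpha + (j - t + 1),\ \beta + \sum_{s=t}^{j} y_s \right)$, we have $E_{g_{tj}}[\lambda_t] = \dfrac{\alpha_{old} + j - t + 1}{\beta_{old} + \sum_{s=t}^{j} y_s}$.
Then, setting the partial derivatives of (23) with respect to $\beta$ and $\alpha$ to zero, we obtain the following,
$$\frac{\partial l_{EC}(\alpha, \beta, p)}{\partial \hat{\beta}} = A \cdot \frac{\hat{\alpha}}{\hat{\beta}} - C = 0 \ \Rightarrow\ \hat{\alpha} = \frac{C}{A} \hat{\beta}, \tag{24}$$
and
$$\frac{\partial l_{EC}(\alpha, \beta, p)}{\partial \hat{\alpha}} = 0 \ \Rightarrow\ A \cdot \log \hat{\beta} - A \cdot \psi(\hat{\alpha}) + B = 0. \tag{25}$$
Thus, combining Equations (24) and (25), we can solve for the new values of $\hat{\alpha}$ and $\hat{\beta}$. Together with the result for $\hat{p}$ in (22), we can perform the next EM iteration until the estimators converge.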
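One way to solve the two first-order conditions numerically is to substitute $\hat{\beta} = A \hat{\alpha} / C$ into the second condition, which leaves the single equation $\log \hat{\alpha} - \psi(\hat{\alpha}) = \log(C/A) - B/A$. The left-hand side is strictly decreasing in $\hat{\alpha}$, so bisection applies. The source does not say which solver it uses; the bisection below and the hand-rolled digamma approximation (recurrence plus asymptotic series) are choices made for this sketch, and the quantities $A$, $B$, $C$ are taken as given inputs.

```python
import math


def digamma(x):
    """Approximate psi(x) for x > 0: shift x up to at least 6 with
    psi(x) = psi(x + 1) - 1/x, then apply the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 * inv - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result


def m_step(A, B, C, change_probs, lo=1e-8, hi=1e8, iters=200):
    """Update (alpha, beta, p) from A, B, C and P(I_t = 1 | D), t = 2..n."""
    p_hat = sum(change_probs) / len(change_probs)
    target = math.log(C / A) - B / A
    if target <= 0:
        raise ValueError("no solution: need log(C/A) > B/A")
    for _ in range(iters):                    # bisection on a log scale
        mid = math.sqrt(lo * hi)
        if math.log(mid) - digamma(mid) > target:
            lo = mid                          # left side too large: raise alpha
        else:
            hi = mid
    alpha_hat = math.sqrt(lo * hi)
    beta_hat = alpha_hat * A / C
    return alpha_hat, beta_hat, p_hat
```

The positivity check on `target` reflects Jensen's inequality: the logarithm of a weighted mean of $\lambda$ exceeds the weighted mean of $\log \lambda$ unless the posterior is degenerate.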

3.1.3. Inference of λ t

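We cannot observe the exact value of $\lambda_t$; however, we can use the posterior mean as its estimator, denoted $\hat{\lambda}_t$. It is computed as follows.
$$\hat{\lambda}_t = \int \lambda_t \cdot f(\lambda_t \mid D)\, d\lambda_t = \int \lambda_t \cdot \sum_{1 \le i \le t \le j \le n} \Pi_{itj}\, g_{ij}(\lambda_t)\, d\lambda_t = \sum_{1 \le i \le t \le j \le n} \Pi_{itj} \cdot \int \lambda_t \cdot g_{ij}(\lambda_t)\, d\lambda_t = \sum_{1 \le i \le t \le j \le n} \Pi_{itj} \cdot E_{g_{ij}}[\lambda_t]$$
Since $g_{ij}(\lambda) \sim \mathrm{Gamma}\left( \hat{\alpha} + (j - i + 1),\ \hat{\beta} + \sum_{s=i}^{j} y_s \right)$, we have $E_{g_{ij}}[\lambda_t] = \dfrac{\hat{\alpha} + j - i + 1}{\hat{\beta} + \sum_{s=i}^{j} y_s}$. Thus, the estimator is as follows.
$$\hat{\lambda}_t = \sum_{1 \le i \le t \le j \le n} \Pi_{itj} \cdot \frac{\hat{\alpha} + j - i + 1}{\hat{\beta} + \sum_{s=i}^{j} y_s}, \qquad t = 1, \ldots, n \tag{26}$$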
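Given the posterior segment weights for a fixed $t$, the posterior mean is a weighted average of Gamma means. The sketch below assumes the weights are supplied as a dictionary keyed by the 1-based segment endpoints $(i, j)$; computing those weights is the subject of the appendix and is not attempted here, so the input format is hypothetical.

```python
def posterior_mean_intensity(weights, ys, alpha, beta):
    """Posterior mean of lambda_t as a mixture of Gamma means.

    weights : dict mapping (i, j) -> Pi_{itj} for one fixed t, with
              1-based inclusive indices i <= t <= j (hypothetical format)
    ys      : observed durations y_1, ..., y_n
    alpha, beta : current Gamma shape and rate estimates
    """
    if abs(sum(weights.values()) - 1.0) > 1e-8:
        raise ValueError("segment weights must sum to 1")
    mean = 0.0
    for (i, j), w in weights.items():
        shape = alpha + (j - i + 1)          # prior shape plus number of durations
        rate = beta + sum(ys[i - 1:j])       # prior rate plus total time in segment
        mean += w * shape / rate
    return mean
```

Segments that contain many short durations have a large shape relative to their rate and therefore pull the estimated intensity upward.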

3.2. Simulation

In order to validate the change-point model and the corresponding estimation algorithm, we first simulated data, which provide true values of the model parameters for comparison. We simulated 7000 price-change events with the parameters $\alpha = 5.0$ (shape parameter of the Gamma distribution), $\beta = 2.0$ (rate parameter of the Gamma distribution), and $p = 0.018$ (the change-point probability). Given this data-generating process of the price-change intensity, we plot the simulated durations of quote-price changes in Figure 1. We also plot the inverse of the simulated intensity $\lambda_t$, i.e., the mean duration $m_t = 1/\lambda_t$, in Figure 1. From the graph, we can observe that there is a large variation in event duration: the longest duration is more than 12 s, while the shortest duration is about $1 \times 10^{-3}$ s. This means that sometimes it takes a couple of seconds for the quote price to change, while sometimes it takes only a millisecond. Clearly, the volatility of this simulated quote price is not constant.
We use the estimation method shown in Section 3.1 to estimate the model parameters. We performed the estimation using the first 1000 points, the first 4000 points, and the entire sample, respectively. The results are presented in Table 1, and we observe that the estimates are close to the true values, especially for the large sample. Moreover, we plot the simulated intensities and the estimated intensities, together with the simulated inter-trade durations, in Figure 2 for $N_{sample} = 7000$, and we find that the estimated intensities of price changes are also close to those in the simulated path.

4. Data Environment

Our data are downloaded from LOBSTER (https://lobsterdata.com/ accessed on 2 February 2018), which provides high-quality LOB data of all Nasdaq stocks from June 2007. The LOB data reconstructed by LOBSTER are based on Nasdaq’s Historical TotalView-ITCH data, i.e., the historic record of what Nasdaq calls. LOBSTER simultaneously generates two files for each active trading day of a selected ticker. One is a ‘message’ file, which contains indicators for the type of event causing an update of LOB in the requested price range. The other is an ‘orderbook’ file, which records the ask and bid quotes of LOB at the time when the ‘message’ file is updated.
Table 2 shows a sample of LOB ‘messages’ and ‘orderbook’ files of AMZN on 2 January 2013. We show three events of the LOB. In panel A of Table 2, which is the ‘messages’ file, type 3 event and type 1 event represent a deletion and submission of a limit order, respectively. The direction ‘−1’ means the order event from the ask side, and ‘1’ denotes the bid-side order. In the meantime, the ‘orderbook’ file in panel B of Table 2 records the ‘shape’ of LOB after these three events. From this, we can observe that after submission of a new order at the bid side with a higher price than the existing best bid, the new best bid changes from 2550700 to 2550800, i.e., $255.07 to $255.08.
Figure 3 plots the evolution of quote price at the best bid in the first 50 s of AMZN stock on 2 January 2013. From this, we can observe that there is non-constant variation of bid price, with some intervals being much more volatile than others. A liquidity demander with 1 s of latency will encounter a high uncertainty and cost when she posts a (sell-side) market order at t = 20 or t = 30 , compared with the entering time at t = 40 . Therefore, we need a method to effectively quantify the volatility of the quote price at any time and also the cost of demand for traders with different trading latencies.

5. Model Fitness and In-Sample Analysis

5.1. Model Fit and Instantaneous Volatility

We use AMZN LOB data on 2 January 2013 to illustrate model fitness and to obtain the measurement of quote volatility. We choose the threshold of price change as 3 cents, which is three times the minimum unit of quote price. In Figure 4, we plot the series of quote-price duration (at the best bid level) of AMZN stock on 2 January 2013, given the threshold of price change as 3 cents. There is a large variation for this level of price duration. The shortest duration is 10 5 s, while the longest duration is about 80 s.
We first use our change-point intensity model (CPI) for estimation. The results are α ^ = 0.25 , β ^ = 0.006 , and p ^ = 0.34 . Moreover, we can infer underlying intensities { λ t } 1 , , N according to Equation (26). The instantaneous volatility can be derived according to Equation (1). Specifically, for time t [ t i , t i + 1 ) , we have the following:
σ 2 ( t ) = λ ( t i ) δ p ( t i ) 2 t [ t i , t i + 1 )
because, in the change-point model, intensity is assumed to be constant between two events. In Figure 5, we plot the fitted instantaneous quote volatility of the best bid price for the first 50 s of AMZN on the trading day of 2 January 2013. From the graph, we can observe that instantaneous quote volatility jumps to a high level when the quote price changes dramatically in a short time. On the other hand, volatility stayed at a low level if the quote price maintains fixed or changes steadily.
Moreover, we supplement the quote volatility estimation results with those of the ACI model in Figure 6, from which we can observe that ACI volatility exhibits strong spikes at the points where the quote price suddenly changes by more than the threshold. Furthermore, we compare the models' fitted residuals in Figure 7 and Figure 8. According to the property of the integrated intensity function shown in Equation (2) and the mixture-of-exponentials expression in Equation (3), the fitted residuals of a well-specified model should follow an Exponential(1) distribution. From the results, we can clearly observe that the CPI model fits better.
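The residual check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the intensity is constant over each duration (as in the CPI model), so that the residual is the product of intensity and duration, and it uses simulated data rather than the AMZN sample. The function names `residuals` and `ks_statistic_exp1` are hypothetical.

```python
import math
import random

def residuals(durations, intensities):
    """Integrated intensity over each duration; with piecewise-constant
    intensity this is simply lambda_i * y_i."""
    return [lam * y for lam, y in zip(intensities, durations)]

def ks_statistic_exp1(eps):
    """Kolmogorov-Smirnov distance between the empirical CDF of eps
    and the Exponential(1) CDF, F(x) = 1 - exp(-x)."""
    xs = sorted(eps)
    n = len(xs)
    d = 0.0
    for k, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x)
        d = max(d, abs((k + 1) / n - cdf), abs(cdf - k / n))
    return d

# Simulated example (assumption: durations are exponential given the intensity).
rng = random.Random(0)
# random.gammavariate takes (shape, scale), so rate 2 corresponds to scale 0.5.
lam = [rng.gammavariate(5.0, 0.5) for _ in range(2000)]
y = [rng.expovariate(l) for l in lam]
eps = residuals(y, lam)
print(ks_statistic_exp1(eps))  # small when the intensities are correctly specified
```

A large distance would indicate that the fitted intensities do not rescale the durations to unit-rate exponentials, which is the visual comparison made in the residual figures.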

5.2. Integrated Variance and Cost of Demand

Denote by $p_t$ the logarithm of the price of the best bid quote or the best ask quote. Under the assumption of no arbitrage, we suppose that it follows a continuous semi-martingale process:
$$p_t = p_0 + \int_0^t \mu(\tau)\,d\tau + \int_0^t \sigma(\tau)\,dW_\tau,$$
where $W$ denotes a standard Wiener process, $\mu(\tau)$ is a finite càdlàg drift process, and $\sigma(\tau)$ is an adapted càdlàg volatility process; these are associated with the instantaneous conditional mean and volatility of the corresponding return.
The integrated variance over an interval $[0, t]$ is as follows:
$$IV(0, t) := \int_0^t \sigma^2(\tau)\,d\tau,$$
which equals the quadratic variation of the process, that is, the sum of its squared increments measured over infinitesimal intervals. Hence, it is a natural quantity reflecting the riskiness of an asset over a given time span. Thus, we can use the derived instantaneous volatility to calculate the integrated variance.
For liquidity demanders with different trading latencies $\Delta_i$, the integrated variances are as follows.
$$IV(t, t + \Delta_i) := \int_t^{t+\Delta_i} \sigma^2(\tau)\,d\tau.$$
Therefore, the trading cost depends not only on the latency $\Delta_i$ but also on the magnitude of the (instantaneous) volatility over the interval. In Table 3, we calculate the integrated variance for three types of traders, with trading latencies of 0.01 s, 1 s, and 5 s, respectively. Clearly, a low-latency trader who has a speed advantage in sending orders faces lower risk when demanding liquidity. For example, the mean integrated variance for a trader whose trading latency is 5 s is 0.035, while the value for a trader whose trading latency is 0.01 s is just $7.6 \times 10^{-5}$.
Moreover, in Figure 9, Figure 10 and Figure 11, we plot the fitted standard deviation (the square root of the integrated variance) of the bid price for the AMZN stock on 2 January 2013 in the first 50 s for three types of traders, whose trading latencies are 0.01 s, 1 s, and 5 s, respectively. Compared with the ACI model, the price standard deviation estimated from our CPI model is relatively smooth, especially in the low-latency setting, which makes it more effective for evaluating the dynamics of quote volatility for HF traders. As the evaluation window increases, for example to 5 s, both models produce similar results.
We can quantify the cost of demand for a specific type of trader at the different time points at which she may initiate her orders. We use the price standard deviation as the cost of demanding liquidity. For example, a trader whose trading latency is 1 s faces price uncertainty of USD 0.089 when she enters the market at $t = 10$ s. When the same trader enters the market at $t = 20$ s, the price uncertainty of her transaction price is about USD 0.289.
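Because the CPI model holds the intensity, and hence the instantaneous variance, constant between price-change events, the integrated variance over a latency window reduces to a sum of rectangle areas. The following is a minimal sketch under that assumption; the function name `integrated_variance` and the numbers in the example are hypothetical and are not taken from the AMZN sample.

```python
import bisect
import math

def integrated_variance(event_times, sigma2, t, delta):
    """Integrate a right-continuous step function sigma2 over [t, t + delta].

    event_times[i] is the start of the i-th segment and sigma2[i] the
    instantaneous variance on [event_times[i], event_times[i + 1]);
    the last value is held constant after the final event.
    """
    end = t + delta
    i = max(bisect.bisect_right(event_times, t) - 1, 0)
    total, left = 0.0, t
    while left < end:
        right = event_times[i + 1] if i + 1 < len(event_times) else math.inf
        seg_end = min(right, end)
        total += sigma2[i] * (seg_end - left)
        left = seg_end
        i += 1
    return total

# Hypothetical step function: sigma2[i] applies on [event_times[i], event_times[i + 1]).
event_times = [0.0, 0.4, 1.5]
sigma2 = [0.01, 0.09, 0.04]

for latency in (0.01, 1.0, 5.0):
    iv = integrated_variance(event_times, sigma2, t=0.2, delta=latency)
    # latency, integrated variance, and its square root (the cost proxy)
    print(latency, iv, math.sqrt(iv))
```

The square root in the last column corresponds to the price standard deviation that the text uses as the cost of demanding liquidity.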

6. Out-of-Sample Performance and Model Prediction Power

Finally, we examine the out-of-sample performance of our CPI model. However, it is hard to predict volatility directly and to assess the model's performance on that basis, because actual volatility is unobserved. Therefore, we only predict the duration until the next price change, as the actual price-change duration is known to us. When the duration between changes of the quote price is long, quote volatility is low. Conversely, when the duration is short, quote volatility should be high.
We use one-step-ahead forecasting based on the model’s in-sample estimation results. Specifically, for the quote prices (at the best bid) of the AMZN stock on 2 January 2013, we use the first 4000 data points (which are the 4000 events of price changes) for parameter estimation and perform a one-step-ahead prediction for the remaining observations of price durations. The expected price-change duration for the quote price can be derived as follows.
$$E_t(y_{t+1}) = \frac{1}{E_t(\lambda_{t+1})} = \frac{1}{\hat{p} \cdot \hat{\lambda}_t + (1 - \hat{p}) \cdot \dfrac{\hat{\alpha}}{\hat{\beta}}}.$$
This is because, in the CPI model, the intensity for the next price change either retains its past value $\lambda_t$ (with probability $\hat{p}$) or renews from the Gamma distribution (with probability $1 - \hat{p}$), and the mean value of the Gamma distribution is $\hat{\alpha}/\hat{\beta}$.
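The forecast expression above translates directly into a one-line computation. The sketch below follows the weights exactly as written in the text; the function name `expected_duration` and the value `lam_t = 10.0` are hypothetical, while the parameter values are the estimates reported in Section 5.1.

```python
def expected_duration(lam_t, p_hat, alpha_hat, beta_hat):
    """One-step-ahead duration forecast, following the expression in the text:
    weight p_hat on the current intensity estimate and 1 - p_hat on the
    Gamma mean alpha_hat / beta_hat, then invert the expected intensity."""
    expected_intensity = p_hat * lam_t + (1.0 - p_hat) * alpha_hat / beta_hat
    return 1.0 / expected_intensity

# Parameter estimates reported in Section 5.1; lam_t = 10.0 is a made-up value.
forecast = expected_duration(lam_t=10.0, p_hat=0.34, alpha_hat=0.25, beta_hat=0.006)
print(forecast)
```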
We test the model's out-of-sample performance by using the Mincer–Zarnowitz ordinary least squares (OLS) regression:
$$y_{t+1} = \beta_0 + \beta_1 E_t(y_{t+1}) + \mu_t,$$
where $\beta_0$ is the regression intercept, $\beta_1$ is the regression coefficient for $E_t(y_{t+1})$, and $\mu_t$ is the regression error term.
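For readers who want to reproduce this kind of evaluation, a Mincer–Zarnowitz regression is an ordinary simple regression of realized values on forecasts. The following is a minimal sketch using closed-form OLS; the function name `mincer_zarnowitz` and the toy data are hypothetical and unrelated to the AMZN results.

```python
def mincer_zarnowitz(actual, forecast):
    """OLS of actual on a constant and the forecast.
    Returns (beta0, beta1, r_squared)."""
    n = len(actual)
    mean_x = sum(forecast) / n
    mean_y = sum(actual) / n
    sxx = sum((x - mean_x) ** 2 for x in forecast)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(forecast, actual))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    ss_res = sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(forecast, actual))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return beta0, beta1, 1.0 - ss_res / ss_tot

# Toy data: an unbiased forecast would give beta0 near 0 and beta1 near 1.
forecast = [0.5, 1.0, 1.5, 2.0, 2.5]
actual = [0.6, 0.9, 1.6, 1.9, 2.6]
b0, b1, r2 = mincer_zarnowitz(actual, forecast)
print(b0, b1, r2)
```

The R-squared returned here is the quantity used to compare the predictive power of competing forecasts.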
Moreover, we compare the model's performance with that of the ACI model, and the results are shown in Table 4. From the results, we can observe that the coefficients $\beta_1$ are significantly positive in both models, suggesting that the predictions from both the CPI and ACI models can explain the variation in realized price-change durations. Nevertheless, our CPI model has considerably greater prediction power: the R-squared of the CPI regression is 0.676, compared with 0.101 for the ACI model.

7. Conclusions

This paper has proposed a new method to measure the volatility of quote prices in the limit order market, which is important for quantifying the cost of demanding liquidity in the HF trading environment. We use a point process to describe price-change events that occur at the best quote level (at the best bid or the best ask), and volatility is measured by inferring the price-change intensity from realized price-change durations. In particular, we adopt the change-point model proposed by Lai et al. [23] to describe the dynamics of price-change intensity and call it the change-point intensity (CPI) model. In the model, the underlying price-change intensity follows a Markov process, i.e., it either maintains its past value or renews from a Gamma distribution. Thus, we can use the data on price-change durations to infer the underlying price-change intensity and further calculate quote volatility based on the method proposed by Engle and Russell [5].
We apply the CPI model to study the quote volatility of the AMZN stock on 2 January 2013. Specifically, we set the price-change threshold to 3 cents to define the price-change event and construct the series of price-change durations. The instantaneous quote volatility at any time of the trading day can be derived from the price-change intensity estimated by our CPI model. Furthermore, we have calculated the cost of demand for traders with different trading latencies based on the integrated variance. In addition, we compare both the in-sample fit and the out-of-sample prediction power with the benchmark ACI model by Russell [18], and the results suggest that our model performs better.
Our work has made progress in modeling HF quote volatility. Nonetheless, it leaves much room for future development. The current CPI model has a univariate structure that studies the dynamics of quote volatility itself. It can be extended by incorporating other factors that determine changes in quote volatility. This can be done by allowing the intensity renewal probability $p$ to be time-varying and driven by other factors.

Author Contributions

Conceptualization, Z.L. and H.X.; methodology, H.X.; software, Z.L.; validation, Z.L. and H.X.; formal analysis, Z.L.; investigation, Z.L.; resources, H.X.; data curation, Z.L.; supervision, H.X.; writing, original draft, Z.L.; writing, review and editing, Z.L. and H.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data were obtained from LOBSTER (https://lobsterdata.com/, accessed on 2 February 2018).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. The Posterior Distribution of λ t in EM

Appendix A.1. Forward–Backward Filter

Here, we first show the general filter used to update the posterior distribution of $\lambda_t$ given $\{y_1, \dots, y_n\}$.

Appendix A.1.1. Forward Filter

Denote $R_t = \max\{k \mid I_k = 1,\ k \le t\}$; then, given $(y_{1,t}, R_t = s)$, the density of the posterior distribution of $\lambda_t$ is
$$g_{st}(\lambda_t) = f(\lambda_t \mid y_{1,t}, R_t = s) \propto \prod_{i=s}^{t} f(y_i \mid \lambda_t)\, G(\lambda_t). \tag{A1}$$
Proposition A1.
The posterior distribution of $\lambda_t$ given $y_{1,t}$ can be expressed as follows:
$$f(\lambda_t \mid y_{1,t}) = \sum_{i=1}^{t} p_{it} \cdot g_{it}(\lambda_t),$$
where $p_{it} = P(R_t = i \mid y_{1,t})$. The weight coefficients of the above mixture can be calculated recursively by the following:
$$p_{it} = \frac{p^*_{it}}{\sum_{s=1}^{t} p^*_{st}} \quad \text{for } i = 1, \dots, t,$$
with
$$p^*_{it} = \begin{cases} p \cdot f_{tt}/f_{00}, & i = t, \\ (1 - p)\, p_{i,t-1} \cdot f_{it}/f_{i,t-1}, & i < t. \end{cases}$$
Proof.
We note the following.
$$\begin{aligned}
f(\lambda_t \mid y_{1,t}) &\propto f(\lambda_t, y_t \mid y_{1,t-1}) \\
&= P(I_t = 1) \cdot f(\lambda_t, y_t \mid y_{1,t-1}, I_t = 1) + P(I_t = 0) \cdot f(\lambda_t, y_t \mid y_{1,t-1}, I_t = 0) \\
&= p \cdot f(\lambda_t, y_t) + (1 - p) \cdot f(\lambda_t, y_t \mid y_{1,t-1}, I_t = 0) \\
&= p\, f(y_t) \cdot f(\lambda_t \mid y_t) + (1 - p) \sum_{i=1}^{t-1} f(\lambda_t, y_t, R_t = i \mid y_{1,t-1}, I_t = 0) \\
&= p\, f(y_t) \cdot f(\lambda_t \mid y_t) + (1 - p) \sum_{i=1}^{t-1} P(R_t = i \mid y_{1,t-1}, I_t = 0) \cdot f(\lambda_t, y_t \mid y_{1,t-1}, I_t = 0, R_t = i) \\
&= p\, f(y_t) \cdot f(\lambda_t \mid y_t) + (1 - p) \sum_{i=1}^{t-1} p_{i,t-1}\, f(y_t \mid y_{1,t-1}, I_t = 0, R_t = i)\, f(\lambda_t \mid y_{1,t}, I_t = 0, R_t = i).
\end{aligned}$$
From definition (A1), we observe that $f(\lambda_t \mid y_t) = g_{tt}(\lambda_t)$ and $f(\lambda_t \mid y_{1,t}, I_t = 0, R_t = i) = g_{it}(\lambda_t)$. Substituting these into the last equation, we have the following.
$$f(\lambda_t \mid y_{1,t}) \propto p\, f_{tt} \cdot g_{tt}(\lambda_t) + (1 - p) \sum_{i=1}^{t-1} p_{i,t-1}\, f(y_t \mid y_{1,t-1}, I_t = 0, R_t = i)\, g_{it}(\lambda_t)$$
Moreover, by integrating out $\lambda$, we have the following:
$$f(y_t) = \int f(y_t \mid \lambda)\, G(\lambda)\, d\lambda,$$
$$f(y_t \mid y_{1,t-1}, I_t = 0, R_t = i) = \frac{f(y_{1,t} \mid I_t = 0, R_t = i)}{f(y_{1,t-1} \mid I_t = 0, R_t = i)} = \frac{\int \prod_{s=i}^{t} f(y_s \mid \lambda) \cdot G(\lambda)\, d\lambda}{\int \prod_{s=i}^{t-1} f(y_s \mid \lambda) \cdot G(\lambda)\, d\lambda} = \frac{f_{it}}{f_{i,t-1}},$$
where $f_{it} = \int \prod_{s=i}^{t} f(y_s \mid \lambda) \cdot G(\lambda)\, d\lambda$, and $f_{00} = 1$ is the normalizing term. Substituting these back into $f(\lambda_t \mid y_{1,t})$ in expression (A4), we can prove Proposition A1. □
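The forward recursion in Proposition A1 can be sketched in a few lines. This is an illustrative implementation under stated assumptions rather than the authors' code: it assumes exponential durations with a Gamma($\alpha$, $\beta$) prior on the intensity (rate parametrization), uses the closed form of $f_{ij}$ derived in Appendix A.2, works in logarithms for numerical stability, and uses 0-based indices. The names `log_f` and `forward_weights` are hypothetical.

```python
import math

def log_f(alpha, beta, k, s):
    """log f_{ij} for a segment of k durations summing to s, using the closed
    form Gamma(alpha + k) beta^alpha / (Gamma(alpha) (beta + s)^(alpha + k))."""
    return (math.lgamma(alpha + k) - math.lgamma(alpha)
            + alpha * math.log(beta) - (alpha + k) * math.log(beta + s))

def forward_weights(y, p, alpha, beta):
    """Return weights[t][i] = p_{it} for 0-based indices i <= t."""
    prefix = [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)

    def lf(i, t):  # segment y[i..t], inclusive
        return log_f(alpha, beta, t - i + 1, prefix[t + 1] - prefix[i])

    weights = []
    for t in range(len(y)):
        log_star = [0.0] * (t + 1)
        # A new segment starts at t (change point, probability p).
        log_star[t] = math.log(p) + lf(t, t)
        # The segment that started at i < t continues (probability 1 - p).
        for i in range(t):
            prev = max(weights[t - 1][i], 1e-300)  # guard against log(0)
            log_star[i] = (math.log(1.0 - p) + math.log(prev)
                           + lf(i, t) - lf(i, t - 1))
        m = max(log_star)
        unnorm = [math.exp(v - m) for v in log_star]
        total = sum(unnorm)
        weights.append([u / total for u in unnorm])
    return weights

# Toy durations: two short ones followed by a long one.
weights = forward_weights([0.1, 0.2, 5.0], p=0.3, alpha=2.0, beta=1.0)
for row in weights:
    print(row)
```

Each row is a probability vector over the possible locations of the most recent change point, so the first row is always `[1.0]`.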

Appendix A.1.2. Backward Filter

Next, we consider the case given the information after $t$. We define $\tilde{R}_{t+1} = \min\{k \mid I_{k+1} = 1,\ k \ge t + 1\}$, i.e., the nearest change point in the backward direction is $k$ ($\lambda_k \neq \lambda_{k+1} \Leftrightarrow I_{k+1} = 1$). Moreover, we consider the change-point probability $q_{t+1,j} = P(\tilde{R}_{t+1} = j \mid y_{t+1,n})$. Similarly, we have the proposition below.
Proposition A2.
The posterior distribution of $\lambda_t$ given $y_{t+1,n}$ can be expressed as follows:
$$f(\lambda_t \mid y_{t+1,n}) = p \cdot G(\lambda_t) + (1 - p) \sum_{j=t+1}^{n} q_{t+1,j} \cdot g_{t+1,j}(\lambda_t),$$
where
$$q_{t+1,j} = \frac{q^*_{t+1,j}}{\sum_{s=t+1}^{n} q^*_{t+1,s}} \quad \text{for } j = t + 1, \dots, n,$$
and $q^*_{t+1,j}$ can be calculated recursively as follows.
$$q^*_{t+1,j} = \begin{cases} p \cdot f_{t+1,t+1}/f_{00}, & j = t + 1, \\ (1 - p)\, q_{t+2,j} \cdot f_{t+1,j}/f_{t+2,j}, & j > t + 1. \end{cases}$$
Proof.
We first show the following.
$$f(\lambda_{t+1} \mid y_{t+1,n}) = \sum_{j=t+1}^{n} q_{t+1,j} \cdot g_{t+1,j}(\lambda_{t+1})$$
The steps are similar to those in the forward filter.
$$\begin{aligned}
f(\lambda_{t+1} \mid y_{t+1,n}) &\propto f(\lambda_{t+1}, y_{t+1} \mid y_{t+2,n}) \\
&= \sum_{j=t+1}^{n} f(\lambda_{t+1}, y_{t+1}, \tilde{R}_{t+1} = j \mid y_{t+2,n}) \\
&= P(\tilde{R}_{t+1} = t + 1 \mid y_{t+2,n}) \cdot f(\lambda_{t+1}, y_{t+1} \mid y_{t+2,n}, \tilde{R}_{t+1} = t + 1) + \sum_{j=t+2}^{n} f(\lambda_{t+1}, y_{t+1}, \tilde{R}_{t+1} = j \mid y_{t+2,n}) \\
&= p\, f_{t+1,t+1} \cdot g_{t+1,t+1}(\lambda_{t+1}) + (1 - p) \sum_{j=t+2}^{n} q_{t+2,j} \cdot \frac{f(y_{t+1,n} \mid \tilde{R}_{t+1} = j)}{f(y_{t+2,n} \mid \tilde{R}_{t+1} = j)} \cdot g_{t+1,j}(\lambda_{t+1}) \\
&\propto \sum_{j=t+1}^{n} q_{t+1,j} \cdot g_{t+1,j}(\lambda_{t+1}).
\end{aligned}$$
The last step is based on the definition and calculation of $q_{t+1,j}$.
Then, since $\lambda_t = \lambda_{t+1}$ unless a change point occurs, we have the following:
$$f(\lambda_t \mid y_{t+1,n}) = p \cdot G(\lambda_t) + (1 - p) \cdot f(\lambda_{t+1} \mid y_{t+1,n}),$$
and we obtain the result in Proposition A2. □
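The backward recursion mirrors the forward one, running from the last observation toward the first. The sketch below makes the same assumptions as the closed form in Appendix A.2 (exponential durations, Gamma($\alpha$, $\beta$) prior in rate parametrization) and uses 0-based indices; `log_f` and `backward_weights` are hypothetical names, not the authors' code.

```python
import math

def log_f(alpha, beta, k, s):
    """log f_{ij} for a segment of k durations summing to s (closed form)."""
    return (math.lgamma(alpha + k) - math.lgamma(alpha)
            + alpha * math.log(beta) - (alpha + k) * math.log(beta + s))

def backward_weights(y, p, alpha, beta):
    """Return q[s][j - s] = q_{s,j} for 0-based indices s <= j < n, where s
    plays the role of t + 1 in Proposition A2."""
    n = len(y)
    prefix = [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)

    def lf(i, j):  # segment y[i..j], inclusive
        return log_f(alpha, beta, j - i + 1, prefix[j + 1] - prefix[i])

    q = [None] * n
    q[n - 1] = [1.0]                      # base case: q_{nn} = 1
    for s in range(n - 2, -1, -1):
        # j = s: the segment ends immediately after s (probability p).
        log_star = [math.log(p) + lf(s, s)]
        # j > s: the segment starting at s + 1 is extended back to s.
        for j in range(s + 1, n):
            prev = max(q[s + 1][j - (s + 1)], 1e-300)  # guard against log(0)
            log_star.append(math.log(1.0 - p) + math.log(prev)
                            + lf(s, j) - lf(s + 1, j))
        m = max(log_star)
        unnorm = [math.exp(v - m) for v in log_star]
        total = sum(unnorm)
        q[s] = [u / total for u in unnorm]
    return q

q = backward_weights([0.1, 0.2], p=0.3, alpha=2.0, beta=1.0)
print(q)
```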

Appendix A.1.3. Combination (Forward–Backward Algorithm)

Proposition A3.
The posterior distribution of $\lambda_t$ given $y_{1,n}$ can be expressed as follows:
$$f(\lambda_t \mid y_{1,n}) = \sum_{1 \le i \le t \le j \le n} \Pi_{itj} \cdot g_{ij}(\lambda_t),$$
where $\Pi_{itj} = \Pi^*_{itj} \big/ \sum_{1 \le s \le t \le k \le n} \Pi^*_{stk}$, and
$$\Pi^*_{itj} = \begin{cases} p \cdot p_{it}, & j = t,\ 1 \le i \le t, \\ (1 - p)\, p_{it} \cdot q_{t+1,j} \cdot \dfrac{f_{ij}}{f_{it}\, f_{t+1,j}}, & j > t,\ 1 \le i \le t. \end{cases}$$
Moreover, we have the following.
$$\Pi_{itj} = P(I_i = 1,\ I_{i+1} = \cdots = I_j = 0,\ I_{j+1} = 1 \mid y_{1,n})$$
Proof.
We use Bayes' theorem to combine the forward and backward filters.
$$\begin{aligned}
f(\lambda_t \mid y_{1,n}) &\propto G(\lambda_t) \cdot f(y_{1,n} \mid \lambda_t) \\
&\propto G(\lambda_t) \cdot f(y_{1,t} \mid \lambda_t) \cdot f(y_{t+1,n} \mid \lambda_t) \\
&\propto f(\lambda_t \mid y_{1,t}) \cdot f(\lambda_t \mid y_{t+1,n}) \,/\, G(\lambda_t) \\
&= \sum_{i=1}^{t} p_{it}\, g_{it}(\lambda_t) \cdot \frac{p\, G(\lambda_t) + (1 - p) \sum_{j=t+1}^{n} q_{t+1,j} \cdot g_{t+1,j}(\lambda_t)}{G(\lambda_t)} \\
&= \sum_{i=1}^{t} p \cdot p_{it} \cdot g_{it}(\lambda_t) + (1 - p) \sum_{i=1}^{t} \sum_{j=t+1}^{n} p_{it}\, q_{t+1,j} \cdot \frac{g_{it}(\lambda_t)\, g_{t+1,j}(\lambda_t)}{G(\lambda_t)}
\end{aligned}$$
Then, it is easy to show the following.
$$\frac{g_{it}(\lambda_t)\, g_{t+1,j}(\lambda_t)}{G(\lambda_t)} = \frac{f(\lambda_t \mid y_{i,t}) \cdot f(\lambda_t \mid y_{t+1,j})}{G(\lambda_t)} = \frac{f_{ij}}{f_{it}\, f_{t+1,j}}\, g_{ij}(\lambda_t)$$
Therefore, according to the definition of $\Pi_{itj}$, we derive the result in Proposition A3. □

Appendix A.2. The Calculation Steps of Posterior Distributions

In the expression of the posterior distribution (16), we first calculate $g_{ij}(\lambda)$. According to the forward–backward filter derived above, given $(y_{1,n}, R_t = i, R_{t+1} = j)$, the density of the posterior distribution of $\lambda$ is
$$g_{ij}(\lambda) \propto \prod_{t=i}^{j} f(y_t \mid \lambda)\, G(\lambda).$$
Substituting the Gamma density into the last equation, we have the following.
$$\prod_{t=i}^{j} f(y_t \mid \lambda)\, G(\lambda) = \frac{\beta^{\alpha} \lambda^{\alpha - 1} e^{-\lambda \beta}}{\Gamma(\alpha)} \cdot \lambda e^{-\lambda y_i} \cdot \lambda e^{-\lambda y_{i+1}} \cdots \lambda e^{-\lambda y_j} = \frac{\beta^{\alpha} \lambda^{\alpha + (j - i)} e^{-\lambda (\beta + y_i + y_{i+1} + \cdots + y_j)}}{\Gamma(\alpha)}.$$
Therefore, it is easy to observe that $g_{ij}(\lambda) \sim \mathrm{Gamma}\left(\alpha + (j - i + 1),\ \beta + \sum_{t=i}^{j} y_t\right)$.
Next, we calculate $\Pi_{itj}$. From the forward–backward filter, we have $\Pi_{itj} = \Pi^*_{itj} \big/ \sum_{1 \le s \le t \le k \le n} \Pi^*_{stk}$, and
$$\Pi^*_{itj} = \begin{cases} p \cdot p_{it}, & j = t,\ 1 \le i \le t, \\ (1 - p)\, p_{it} \cdot q_{t+1,j} \cdot \dfrac{f_{ij}}{f_{it}\, f_{t+1,j}}, & j > t,\ 1 \le i \le t, \end{cases}$$
where $f_{ij}$ is defined as follows:
$$f_{ij} = \int \prod_{t=i}^{j} f(y_t \mid \lambda) \cdot G(\lambda)\, d\lambda.$$
Since $f(\cdot)$ and $G(\cdot)$ are conjugate, and we have already calculated that
$$\prod_{t=i}^{j} f(y_t \mid \lambda) \cdot G(\lambda) = \frac{\beta^{\alpha} \lambda^{\alpha + (j - i + 1) - 1} e^{-\lambda \left(\beta + \sum_{s=i}^{j} y_s\right)}}{\Gamma(\alpha)},$$
it follows that
$$f_{ij} = \frac{\Gamma(\alpha + j - i + 1) \cdot \beta^{\alpha}}{\Gamma(\alpha)} \cdot \left(\beta + \sum_{t=i}^{j} y_t\right)^{-(\alpha + j - i + 1)}.$$
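As a sanity check on the closed form of $f_{ij}$, one can compare it with a direct numerical integration of the product of exponential densities and the Gamma prior. The sketch below does this with a simple midpoint rule; the function names, the integration bounds, and the toy durations are assumptions made for illustration only.

```python
import math

def f_closed(ys, alpha, beta):
    """Closed-form f_{ij} for the durations ys under a Gamma(alpha, beta) prior."""
    k, s = len(ys), sum(ys)
    return math.exp(math.lgamma(alpha + k) - math.lgamma(alpha)
                    + alpha * math.log(beta) - (alpha + k) * math.log(beta + s))

def f_numeric(ys, alpha, beta, upper=40.0, steps=100_000):
    """Midpoint-rule approximation of the same integral over lambda in (0, upper)."""
    h = upper / steps
    total = 0.0
    for m in range(steps):
        lam = (m + 0.5) * h
        log_prior = (alpha * math.log(beta) + (alpha - 1.0) * math.log(lam)
                     - beta * lam - math.lgamma(alpha))
        # Exponential density of each duration: lam * exp(-lam * y).
        log_lik = sum(math.log(lam) - lam * y for y in ys)
        total += math.exp(log_prior + log_lik) * h
    return total

ys = [0.5, 1.0, 0.2]  # toy durations
print(f_closed(ys, alpha=2.0, beta=1.0), f_numeric(ys, alpha=2.0, beta=1.0))
```

The two printed values should agree to many decimal places; the truncation at `upper` is harmless here because the integrand decays exponentially in $\lambda$.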
Moreover, $p_{it} = p^*_{it} \big/ \sum_{s=1}^{t} p^*_{st}$ and $q_{t+1,j} = q^*_{t+1,j} \big/ \sum_{s=t+1}^{n} q^*_{t+1,s}$ can be computed recursively.
$$p^*_{it} = \begin{cases} p \cdot f_{tt}/f_{00}, & i = t, \\ (1 - p)\, p_{i,t-1} \cdot f_{it}/f_{i,t-1}, & i < t, \end{cases}$$
$$q^*_{t+1,j} = \begin{cases} p \cdot f_{t+1,t+1}/f_{00}, & j = t + 1, \\ (1 - p)\, q_{t+2,j} \cdot f_{t+1,j}/f_{t+2,j}, & j > t + 1. \end{cases}$$
For the forward filter $p_{it}$, the recursive calculation can be executed in the following manner:
  • First, when $t = 1$, $p_{11} = 1$;
  • When $t = 2$, $p^*_{12}$ can be derived from $p_{11}$, and $p^*_{22}$ can be calculated directly. Thus, by normalization, we have $p_{12}$ and $p_{22}$;
  • When $t = 3$, $p^*_{13}$ can be derived from $p_{12}$, $p^*_{23}$ can be derived from $p_{22}$, and $p^*_{33}$ can be calculated directly;
  • Continuing in this way up to $t = n$, we obtain the values $p_{1n}, \dots, p_{nn}$.
For the backward filter $q_{t+1,j}$, the recursive calculation can be executed in the following manner:
  • When $t = n - 1$, $q_{t+1,j} = q_{nn} = 1$;
  • When $t = n - 2$, $q^*_{n-1,n-1}$ can be calculated directly, and $q^*_{n-1,n}$ can be derived from $q_{nn}$. Thus, by normalization, we have $q_{n-1,n-1}$ and $q_{n-1,n}$;
  • When $t = n - 3$, $q^*_{n-2,n-2}$ can be calculated directly, $q^*_{n-2,n-1}$ can be derived from $q_{n-1,n-1}$, and $q^*_{n-2,n}$ can be derived from $q_{n-1,n}$;
  • Continuing in this way down to $t = 1$, we obtain the values $q_{22}, \dots, q_{2n}$.
Therefore, once we have $f_{ij}$, $p_{it}$, and $q_{t+1,j}$, we can obtain the value of $\Pi_{itj}$ and, further, the expression of $f(\lambda_t \mid D)$ in (16).
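One practical consequence of the mixture form is worth spelling out. Since each component $g_{ij}$ is a Gamma density with shape $\alpha + (j - i + 1)$ and rate $\beta + \sum_{s=i}^{j} y_s$, and the mean of a Gamma distribution is its shape divided by its rate, the smoothed intensity estimate is available in closed form. The expression below is derived here from the results above as an illustration; it is not quoted from the main text.

```latex
E(\lambda_t \mid y_{1,n})
  = \sum_{1 \le i \le t \le j \le n} \Pi_{itj}\,
    \frac{\alpha + (j - i + 1)}{\beta + \sum_{s=i}^{j} y_s}
```

Each term is the posterior mean of the intensity on the candidate segment $[i, j]$, weighted by the posterior probability $\Pi_{itj}$ that this segment is the one containing $t$.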

References

  1. Hasbrouck, J. High-frequency quoting: Short-term volatility in bids and offers. J. Financ. Quant. Anal. 2018, 53, 613–641.
  2. Hasbrouck, J.; Saar, G. Low-latency trading. J. Financ. Mark. 2013, 16, 646–679.
  3. Bauwens, L.; Hautsch, N. Modelling financial high frequency data using point processes. Handb. Financ. Time Ser. 2009, 1, 953–979.
  4. Hautsch, N. Econometrics of Financial High-Frequency Data; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2011.
  5. Engle, R.F.; Russell, J.R. Autoregressive conditional duration: A new model for irregularly spaced transaction data. Econometrica 1998, 1, 1127–1162.
  6. Hautsch, N. Modelling Irregularly Spaced Financial Data: Theory and Practice of Dynamic Duration Models; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2011.
  7. Žikeš, F.; Baruník, J.; Shenai, N. Modeling and forecasting persistent financial durations. Econom. Rev. 2017, 36, 1081–1110.
  8. Yang, J.; Li, Z.; Chen, X.; Xing, H. Modeling inter-trade durations in the limit order market. New Adv. Stat. Data Sci. 2017, 1, 259–276.
  9. Chen, F.; Diebold, F.X.; Schorfheide, F. A Markov-switching multifractal inter-trade duration model, with application to US equities. J. Econom. 2013, 177, 320–342.
  10. Abergel, F.; Jedidi, A. Long-time behavior of a Hawkes process–based limit order book. SIAM J. Financ. Math. 2015, 6, 1026–1043.
  11. Swishchuk, A.; Huffman, A. General compound Hawkes processes in limit order books. Risks 2020, 8, 28.
  12. Morariu-Patrichi, M.; Pakkanen, M.S. State-dependent Hawkes processes and their application to limit order book modelling. Quant. Financ. 2021, 1, 1–21.
  13. Li, Z.; Xing, H.; Chen, X. A multifactor regime-switching model for inter-trade durations in the limit order market. arXiv 2019, arXiv:1912.00764.
  14. Cho, D.C.; Frees, E.W. Estimating the volatility of discrete stock prices. J. Financ. 1988, 43, 451–466.
  15. Gerhard, F.; Hautsch, N. Volatility estimation on the basis of price intensities. J. Empir. Financ. 2002, 9, 57–89.
  16. Tse, Y.-K.; Yang, T.T. Estimation of high-frequency volatility: An autoregressive conditional duration approach. J. Bus. Econ. Stat. 2012, 30, 533–545.
  17. Hong, S.Y.; Nolte, I.; Taylor, S.; Zhao, V. Volatility estimation and forecasts based on price durations. J. Financ. Econom. 2021, nbab032.
  18. Russell, J.R. Econometric Modeling of Multivariate Irregularly-Spaced High-Frequency Data. Manuscript, GSB, University of Chicago. 1999. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.202.486&rep=rep1&type=pdf (accessed on 1 December 2021).
  19. Bauwens, L.; Hautsch, N. Stochastic conditional intensity processes. J. Financ. Econom. 2006, 4, 450–493.
  20. Hall, A.D.; Hautsch, N. Modelling the buy and sell intensity in a limit order book market. J. Financ. Mark. 2007, 10, 249–286.
  21. Bowsher, C.G. Modelling security market events in continuous time: Intensity based, multivariate point process models. J. Econom. 2007, 141, 876–912.
  22. Box, G.E.; Tiao, G.C. Intervention analysis with applications to economic and environmental problems. J. Am. Stat. Assoc. 1975, 70, 70–79.
  23. Lai, T.L.; Liu, H.; Xing, H. Autoregressive models with piecewise constant volatility and regression parameters. Stat. Sin. 2005, 1, 279–301.
  24. Lancaster, T. The Econometric Analysis of Transition Data; Cambridge University Press: Cambridge, UK, 1990.
  25. Daley, D.J.; Vere-Jones, D. An Introduction to the Theory of Point Processes: Volume I: Elementary Theory and Methods; Springer: New York, NY, USA, 2003.
  26. Barndorff-Nielsen, O.E.; Shiryaev, A.N. Change of Time and Change of Measure; World Scientific Publishing Company: Singapore, 2015.
Figure 1. The simulated quote-price change durations and price-change intensities (first 1000 points).
Figure 2. Comparison between actual intensity in the simulation and estimated intensity from model estimation.
Figure 3. The evolution of best bid price in the first 50 s of AMZN stock on 2 January 2013.
Figure 4. The quote price duration of the best bid price for the AMZN stock on 2 January 2013.
Figure 5. The instantaneous quote volatility of the best bid price for AMZN stock on 2 January 2013 calculated by the change-point model.
Figure 6. The instantaneous quote volatility of the best bid price for AMZN stock on 2 January 2013 calculated by the ACI model.
Figure 7. The distribution of duration residuals of CPI.
Figure 8. The distribution of duration residuals of ACI.
Figure 9. The fitted 0.01 s price standard deviation of the best bid price for AMZN stock on 2 January 2013.
Figure 10. The fitted 1 s price standard deviation of the best bid price for AMZN stock on 2 January 2013.
Figure 11. The fitted 5 s price standard deviation of the best bid price for AMZN stock on 2 January 2013.
Table 1. Estimation results for the simulated data, which has $\alpha = 5.0$, $\beta = 2.0$, and $p = 0.018$.

| | $\hat{\alpha}$ | $\hat{\beta}$ | $\hat{p}$ |
|---|---|---|---|
| 1000 points | 3.47 | 1.51 | 0.030 |
| 4000 points | 3.91 | 1.67 | 0.021 |
| 7000 points | 4.10 | 1.77 | 0.019 |

Table 2. The message file and order book file of LOBSTER data.

Panel A: message file

| Time (s) | Event Type | Order ID | Size | Price | Direction |
|---|---|---|---|---|---|
| 35,101.685250 | 3 | 24,832,000 | 100 | 2,555,000 | −1 |
| 35,101.685251 | 1 | 24,836,387 | 100 | 2,554,700 | −1 |
| 35,101.685879 | 1 | 24,836,403 | 100 | 2,550,800 | 1 |

Panel B: order book file

| Ask Price 1 | Ask Size 1 | Bid Price 1 | Bid Size 1 | Ask Price 2 | Ask Size 2 | Bid Price 2 | Bid Size 2 |
|---|---|---|---|---|---|---|---|
| 2,555,000 | 300 | 2,550,700 | 100 | 2,555,100 | 100 | 2,550,500 | 100 |
| 2,554,700 | 100 | 2,550,700 | 100 | 2,555,000 | 300 | 2,550,500 | 100 |
| 2,554,700 | 100 | 2,550,800 | 100 | 2,555,000 | 300 | 2,550,700 | 100 |

Table 3. The integrated variances for different traders, calculated for AMZN stock on 2 January 2013.

| Latency | Mean | S.D. | Min. | 25% | Median | 75% | Max. |
|---|---|---|---|---|---|---|---|
| 0.01 s | $7.6 \times 10^{-5}$ | $1.1 \times 10^{-4}$ | $3.0 \times 10^{-6}$ | $2.0 \times 10^{-5}$ | $2.6 \times 10^{-5}$ | $8.1 \times 10^{-5}$ | $8.0 \times 10^{-4}$ |
| 1 s | $7.5 \times 10^{-3}$ | $1.0 \times 10^{-2}$ | $3.0 \times 10^{-6}$ | $2.2 \times 10^{-3}$ | $2.6 \times 10^{-3}$ | $9.7 \times 10^{-3}$ | $5.1 \times 10^{-2}$ |
| 5 s | $3.5 \times 10^{-2}$ | $3.8 \times 10^{-2}$ | $3.0 \times 10^{-6}$ | $1.1 \times 10^{-2}$ | $1.5 \times 10^{-2}$ | $4.9 \times 10^{-2}$ | $1.3 \times 10^{-1}$ |

Table 4. Mincer–Zarnowitz OLS results for CPI and ACI models.

| | CPI Model | ACI Model |
|---|---|---|
| $\beta_0$ | −0.241 | −5.568 |
| | (−2.735) | (−10.953) |
| $\beta_1$ | 4.236 | 12.614 |
| | (78.800) | (18.310) |
| $R^2$ | 0.676 | 0.101 |
| N | 2974 | 2974 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Li, Z.; Xing, H. High-Frequency Quote Volatility Measurement Using a Change-Point Intensity Model. Mathematics 2022, 10, 634. https://doi.org/10.3390/math10040634
