Information Entropy and Measures of Market Risk

In this paper we investigate the relationship between the information entropy of the distribution of intraday returns and intraday and daily measures of market risk. Using data on the EUR/JPY exchange rate, we find a negative relationship between entropy and intraday Value-at-Risk, and also between entropy and intraday Expected Shortfall. This relationship is then used to forecast daily Value-at-Risk, using the entropy of the distribution of intraday returns as a predictor.


Introduction
Entropy, as a measure of uncertainty of a system, is widely used in many applications, from physics to social sciences. As stated by the second law of thermodynamics, "this entropy cannot decrease in any process in which the system remains adiabatically isolated, i.e., shielded from heat exchange with its environment" [1].
From this point of view, the stock market could be regarded as a non-isolated system, subject to a constant information exchange process with the real economy. Using the terminology of information theory (Avery [2]), the information entropy of the system cannot increase other than by exchanging information with the external environment. The impact of incoming information on stock market entropy can be illustrated by the collective behaviour triggered by extremely bad news: most traders will tend to sell, thus reducing the overall market entropy.
There is a large body of theoretical and empirical work dealing with the relationship between entropy and financial markets. Entropy has been used as a measure of stock market efficiency in Zunino et al. [3], since high values of entropy are related to randomness in the evolution of stock prices, according to the Efficient Market Hypothesis. A variant of entropy called normalized entropy, being a dimensionless measure, can be used to assess the relative efficiency of a stock market. Risso [4], on the other hand, relates entropy to stock market crashes, his main result being that for some markets the probability of a crash increases as the market efficiency, measured by entropy, decreases. An application in the foreign exchange markets is that of Oh et al. [5], who use the approximate entropy as a measure of the relative efficiency of the FX markets. Their results suggest that market efficiency measured by approximate entropy is correlated with the liquidity level of foreign exchange markets.
Considering the stock market a complex open system far from equilibrium, Wang et al. [6] analyse the interactions among agents based on generalized entropy. Using nonlinear evolutionary dynamic equations for the stock markets, derived from the Maximum Generalized Entropy Principle, the structural evolution of the stock market system is demonstrated.
A related use of entropy is to study the predictability of stock market returns, as in Maasoumi and Racine [7], who employ an entropy measure for the dependence between stock returns. They find that the entropy is capable of detecting nonlinear dependence between the return series. Billio et al. [8] use entropy to construct an early warning indicator for the systemic risk of the banking system. They estimate the entropy of marginal systemic risk measures like Marginal Expected Shortfall, Delta CoVaR and network connectedness. By using various definitions of entropy (Shannon, Tsallis and Rényi), they show that entropy measures have the ability to forecast banking crises.
Dionisio et al. [9] provide a comparison between the theoretical and empirical properties of the entropy and the variance as measures of uncertainty. (Although volatility can be considered a measure of risk in finance, it is a measure of uncertainty in statistical terms. Here, we call measures that are symmetric by nature measures of uncertainty, and tail measures that consider certain negative outcomes measures of risk.) They conclude that the entropy is a more general measure of uncertainty than the variance or standard deviation, as originally suggested by Philippatos and Wilson [10] and Ebrahimi et al. [11]. An explanation is that the entropy may be related to higher-order moments of a distribution, unlike the variance, so it could be a better measure of uncertainty. Furthermore, Dionisio et al. [9] argue that both measures, the entropy and the variance, reflect concentration but use different metrics: while the variance measures the concentration around the mean, the entropy measures the dispersion of the density irrespective of the location of the concentration (see also [12,13]). Moreover, as we will show in the paper, the entropy of a distribution function is strongly related to its tails, and this feature is more important for distributions with heavy tails or with an infinite second-order moment (like the non-Gaussian alpha-stable distribution), for which an estimator of the variance is not meaningful.
Entropy-based measures have been compared to the classical coefficient of correlation as well. A measure called cross-sample entropy has been used by Liu et al. [14] to assess the degree of asynchrony between foreign exchange rates, concluding that their measure is superior to the classical correlation measure as a descriptor of the relationship between time series.
Entropy can be applied in the area of risk management, as described in Bowden [15]. The author proposes a new concept called directional entropy and uses it to improve the performance of classical measures like value-at-risk (VaR) in capturing regime changes. An interesting application is the use of measures based on the Tsallis entropy as a warning indicator of financial crises, as in Gradojevic and Gencay [16], Gencay and Gradojevic [17] and Gradojevic and Caric [18]. A further application of entropy is option pricing, as in Stutzer [19] and Stutzer and Kitamura [20]. Also, entropy-based risk measures have been used in a decision-making model context in Yang and Qiu [21].
Besides [5,14] above, other innovative approaches involving entropy and FX markets can be found in Ishizaki and Inoue [22], who show how entropy can be a signal of turning points for exchange rate regimes. Furthermore, Bekiros and Marcellino [23] and Bekiros [24] use entropy in wavelet analysis, revealing the complex dynamics across different timescales in the FX markets.
The main objective of this paper is (1) to study the link between the entropy of the distribution function of intraday returns and intraday and daily measures of market risk, namely VaR and Expected Shortfall (ES); and then (2) to demonstrate the VaR-forecasting ability of the entropy. The entropy is considered to have more informational content than the standard measures of risk and it is also more reactive to new information. This paper uses the concept of entropy of a function (Lorentz [25]) in order to estimate the entropy of a distribution function, using a non-parametric approach, with an application to the FX market. The main advantage of this approach is that the entropy can be estimated for any distribution, without any prior knowledge about its functional form, which is especially important for distributions with no closed form for the probability distribution function.
The rest of this paper is organized as follows: in Section 2 we provide the theoretical background, defining the entropy of a distribution function and measures of market risk and uncertainty. Section 3 presents the results of the empirical analysis, whilst Section 4 concludes.

The Entropy and Intraday Measures of Market Risk and Uncertainty
The entropy, as a measure of uncertainty, can be defined using different metrics (Shannon entropy, Tsallis entropy, Rényi entropy, etc.), based on the informational content of a discrete or continuous random variable (see Zhou et al. [26] for a comprehensive review of entropy measures used in finance). The most common entropy metric, the Shannon information entropy, quantifies the expected value of the information contained in a discrete distribution [27].
Definition 1 (Shannon information entropy). If X is a discrete random variable with probability distribution
$$X : \begin{pmatrix} x_1 & \dots & x_n \\ p_1 & \dots & p_n \end{pmatrix},$$
where $p_i = P(X = x_i)$, $0 \le p_i \le 1$ and $\sum_i p_i = 1$, then the Shannon information entropy is defined as follows:
$$H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i.$$
It reaches its maximum value of $H(X) = \log_2 n$ for the uniform distribution, while the minimum of 0 is attained for a distribution where one of the probabilities $p_i$ is 1 and the rest are 0. In other words, high (low) levels of entropy are obtained for probability distributions with high (low) levels of uncertainty.
If X is a continuous random variable with probability density function f(x), then we can define the differential entropy as:
$$H(f) = -\int f(x) \log_2 f(x)\,dx.$$
Unlike the Shannon entropy, the differential entropy does not possess certain desirable properties: invariance to linear transformations and non-negativity [25,28]. However, an analogue of the Shannon entropy for a function can be defined through a transformation called quantization. We present this transformation as in Lorentz [25].
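As a quick numerical check of Definition 1, the following minimal sketch computes the Shannon entropy of a discrete distribution in bits. The function name `shannon_entropy` and the two example distributions are illustrative assumptions, not part of the original paper.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution given by its probabilities."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # By convention 0 * log2(0) = 0, so zero-probability outcomes are skipped.
    return sum(-p * math.log2(p) for p in probs if p > 0)

uniform = [1 / 8] * 8                # maximum uncertainty for n = 8 outcomes
degenerate = [1.0, 0.0, 0.0, 0.0]    # one certain outcome, no uncertainty

print(shannon_entropy(uniform))      # equals log2(8) = 3 bits
print(shannon_entropy(degenerate))   # equals 0 bits
```

The two cases correspond to the maximum and minimum of the entropy described above.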
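Definition 2 (sampled function). Let $f : I = [a, b] \to \mathbb{R}$ be a real-valued continuous function, let $n \in \mathbb{N}^*$ be fixed and let $x_i = a + (i + 1/2)h$ for $i = 0, \dots, n - 1$, where $h = (b - a)/n$. Then the sampled function for f is:
$$S_n(f)(i) = f(x_i), \quad i = 0, \dots, n - 1. \tag{3}$$
If f is measurable and essentially bounded on I, then the sampled function for f is:
$$S_n(f)(i) = \frac{1}{h} \int_{x_i - h/2}^{x_i + h/2} f(x)\,dx, \quad i = 0, \dots, n - 1. \tag{4}$$
The sampling defined in (3) is called point sampling, whilst the one in (4) is called mean sampling.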

Definition 3 (quantization). The quantization of a function refers to creating a simple function that approximates the original function. Let q > 0 be a quantum. Then the following function defines a quantization of f:
$$Q_q f(x) = (i + 1/2)\,q \quad \text{if } f(x) \in [iq, (i + 1)q). \tag{5}$$
Definition 4 (entropy of a function at a quantization level q). Let f be a measurable and essentially bounded real-valued function defined on [a, b] and let q > 0. Also let $I_i = [iq, (i + 1)q)$ and $B_i = f^{-1}(I_i)$. Then the entropy of f at quantization level q is
$$H_q(f) = -\sum_i \mu(B_i) \log_2 \mu(B_i),$$
where µ is the Lebesgue measure.
In light of this definition, we can calculate the entropy of any continuous function on a compact interval. The following theorem provides a conceptual framework for defining an estimator of the entropy of a continuous function.

Theorem (Lorentz [25]). "Let f be continuous for point sampling, measurable and essentially bounded for mean sampling. The sampling spacing is 1/n. Let $S_n(f)$ be the corresponding sampling of (3) and respectively (4). Fix q > 0 and let $Q_q S_n$ be the quantization of the samples with resolution q as given in (5). Denote the number of occurrences of the value $(i + 1/2)q$ in $Q_q S_n$ by $c_n(i) = \mathrm{card}\{(i + 1/2)q \in Q_q S_n\}$ and denote the relative probability of the occurrence of the value i by $p_n(i) = c_n(i)/n$. Then we have the following result:
$$\lim_{n \to \infty} \left( -\sum_i p_n(i) \log_2 p_n(i) \right) = H_q(f)."$$
The above theorem assures us that, regardless of the sampling and quantization, we obtain a consistent estimator of the entropy of a function. As such, we can define the entropy of a distribution function for a continuous random variable on a compact interval. In general, continuous random variables do not have a compact support with finite Lebesgue measure. In order to meet the assumptions of Lorentz's theorem, we therefore define the entropy of the distribution function on a compact interval. In what follows we assume that we are dealing with a continuous random variable X whose support is the interval [0, 1]. Then its distribution function F: [0, 1] → [0, 1] is continuous and the conditions of Lorentz's theorem are fulfilled, so we can define the entropy of the distribution function $H_q(F)$ at the quantization level q > 0.
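The estimator in the theorem can be sketched in a few lines: sample the function at the midpoints of Definition 2, map each sample to its quantization cell and compute the entropy of the relative frequencies. This is a minimal sketch for a function on [0, 1] with point sampling; the name `entropy_of_function` and the default n = 10,000 are illustrative assumptions.

```python
import math
from collections import Counter

def entropy_of_function(f, q, n=10_000):
    """Estimate H_q(f) for f defined on [0, 1] by point sampling and quantization."""
    counts = Counter()
    for i in range(n):
        x_i = (i + 0.5) / n                    # midpoint sampling, spacing 1/n
        cell = math.floor(f(x_i) / q)          # index i of the cell [iq, (i+1)q)
        counts[cell] += 1
    # Relative frequencies p_n(i) = c_n(i) / n, then the Shannon entropy in bits.
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

# The identity function spreads its values evenly over the 1/q cells,
# while a constant function occupies a single cell.
print(entropy_of_function(lambda x: x, q=0.25))
print(entropy_of_function(lambda x: 0.3, q=0.25))
```

For the identity function with q = 0.25 the four cells are equally populated, so the estimate equals log2(4) = 2 bits; the constant function gives 0.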

Entropy of a Distribution Function
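Let $I_i = [iq, (i + 1)q)$ and $B_i = F^{-1}(I_i)$. Then, according to Lorentz's theorem, the entropy of F with the quantum q is $H_q(F) = -\sum_i \mu(B_i) \log_2 \mu(B_i)$, where µ is the Lebesgue measure. In fact, when F is continuous and strictly increasing, each $B_i$ is the interval $[F^{-1}(iq), F^{-1}((i + 1)q))$, so that
$$H_q(F) = -\sum_i \left[ F^{-1}((i + 1)q) - F^{-1}(iq) \right] \log_2 \left[ F^{-1}((i + 1)q) - F^{-1}(iq) \right].$$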

Note: In general, for any distribution function F defined on a set A, not necessarily of finite measure, we can consider the restriction of this function to some compact interval [a, b] ⊂ A; this restriction satisfies the conditions of Lorentz's theorem, so we can define the entropy measure.
The framework described above can be applied to estimate the entropy of the distribution function of a continuous random variable X. The cumulative distribution function (CDF) is $F(x) = P(X < x)$. The distribution function is defined on the support set of X, takes values in [0, 1], is non-decreasing and tends to 0 and 1 at the lower and upper ends of the support, respectively. If in addition F is absolutely continuous, then there is a Lebesgue integrable function f(x) such that
$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$
In practice, there are cases when the analytical form of the distribution function is unknown. In such cases, a robust approach is to estimate it with a nonparametric method.

Empirical Distribution Function
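The cumulative distribution function can be estimated in a simple way by using the histogram estimator of a probability density function [29]. The basic algorithm to obtain the estimator of the CDF from an ordered sample $X_1 < \dots < X_n$ is as follows:
Step 1. Let $x_0$ be a fixed point (the origin) and let h > 0 be the bin width;
Step 2. Define the bins as $B_j = [x_0 + (j - 1)h, x_0 + jh)$, for j = 1, ..., m;
Step 3. Count the number of observations $n_j$ falling in each bin $B_j$;
Step 4. Compute the relative frequencies $f_j = n_j / n$;
Step 5. The empirical estimator of the distribution function (CDF) is:
$$\hat{F}_n(x) = \sum_{j :\, x_0 + jh \le x} f_j.$$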
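The histogram-based construction can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name `histogram_cdf` is hypothetical, the origin `x0` is assumed to lie at or below the smallest observation, and the estimate at x accumulates the relative frequencies of all bins lying entirely at or below x.

```python
import numpy as np

def histogram_cdf(sample, x0, h):
    """Return a step-function CDF estimate built from histogram bin frequencies."""
    sample = np.asarray(sample, dtype=float)
    # Number of bins of width h, starting at x0, needed to cover the sample.
    m = int(np.floor((sample.max() - x0) / h)) + 1
    edges = x0 + h * np.arange(m + 1)
    counts, _ = np.histogram(sample, bins=edges)
    # cum0[k] is the share of observations in the first k bins (cum0[0] = 0).
    cum0 = np.concatenate(([0.0], np.cumsum(counts) / sample.size))

    def F(x):
        # k = number of complete bins lying at or below x, clipped to [0, m].
        k = np.floor((np.asarray(x, dtype=float) - x0) / h).astype(int)
        return cum0[np.clip(k, 0, m)]

    return F

F = histogram_cdf([0.1, 0.3, 0.6, 0.9], x0=0.0, h=0.25)
print(F(0.5), F(2.0), F(-1.0))
```

With one observation in each of the four bins, the estimate rises by 0.25 at each bin edge, from 0 below the origin to 1 above the last bin.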

Kernel Density Estimator
To estimate the distribution function, we can also use Kernel Density Estimation (KDE) methods. If $X_1, \dots, X_n$ is a sample of i.i.d. observations, then an estimator of the distribution function is:
$$\hat{F}_n(x) = \frac{1}{nh} \sum_{i=1}^{n} \int_{-\infty}^{x} K\!\left(\frac{t - X_i}{h}\right) dt,$$
where K is a real function with the following properties: $K(x) \ge 0$ for all x and $\int_{-\infty}^{\infty} K(x)\,dx = 1$. Such a function is called the kernel and is usually chosen from the known probability density functions. The parameter h is the scale parameter (also called the smoothing parameter or bandwidth), the choice of which determines the smoothness of the estimate. The asymptotic properties of the kernel estimator above have been studied in numerous papers [30,31], establishing the uniform convergence and convergence in probability to the theoretical distribution function, regardless of the form of the kernel used.
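For a Gaussian kernel the integral of the kernel is the standard normal CDF Φ, so the kernel estimator of the distribution function reduces to the average of Φ((x − X_i)/h) over the sample. The sketch below illustrates this special case; the name `kernel_cdf` is hypothetical, and the default bandwidth (the common rule of thumb 1.06·s·n^(−1/5)) is an assumption of this example, not a choice taken from the paper.

```python
import math
import numpy as np

def kernel_cdf(sample, h=None):
    """Gaussian-kernel estimator of the CDF: F(x) = mean of Phi((x - X_i) / h)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    if h is None:
        # Rule-of-thumb bandwidth (an assumption of this sketch).
        h = 1.06 * sample.std(ddof=1) * n ** (-1 / 5)

    def F(x):
        z = (x - sample) / h
        # Standard normal CDF written with the error function.
        phi = [0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z]
        return float(np.mean(phi))

    return F

rng = np.random.default_rng(0)
F = kernel_cdf(rng.normal(size=500))
print(F(-1.0), F(0.0), F(1.0))
```

Unlike the histogram-based estimator, this estimate is a smooth, strictly increasing function of x.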
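As a special case, consider the uniform distribution. Let X be a uniformly distributed random variable on the interval [0, 1], with distribution function:
$$F(x) = \begin{cases} 0, & x < 0 \\ x, & 0 \le x \le 1 \\ 1, & x > 1. \end{cases}$$
Then each set $B_i = F^{-1}([iq, (i + 1)q))$ has Lebesgue measure q and, when 1/q is an integer, there are 1/q such sets, so the entropy of the distribution function is $H_q(F) = -\sum_i q \log_2 q = -\log_2 q$, the maximum value of the entropy. Next, we present the estimation methodology of the entropy of a distribution function.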

Estimation of the Entropy of a Distribution Function
Let $X_0, \dots, X_{n-1}$ be a sample of i.i.d. observations drawn from the distribution F. In order to ensure the comparability of results between various estimates, we assume that the observed values are normalized to the interval [0, 1] through the transformation $X_i \to (X_i - X_{\min}) / (X_{\max} - X_{\min})$. The following steps present the estimation of the entropy of a distribution function (a similar approach, but in a different context, can be found in [27,28,32]):
Step 1. Estimate the distribution function, obtaining values $\hat{F}_n(X_i)$ for i = 0, ..., n − 1;
Step 2. Sample from the distribution function, using the sampled function $S_n(\hat{F}_n)$ of Definition 2;
Step 3. Quantize the samples with the quantum q, obtaining $Q_q S_n(\hat{F}_n)$ as in (5);
Step 4. Compute the relative frequencies $p_n(i) = c_n(i)/n$, where $c_n(i)$ is the number of occurrences of the value $(i + 1/2)q$ in $Q_q S_n(\hat{F}_n)$;
Step 5. Estimate the entropy of the distribution function:
$$H_q(\hat{F}_n) = -\sum_i p_n(i) \log_2 p_n(i).$$
As previously shown, the entropy of the distribution function reaches its maximum value for the uniform distribution. One can define a dimensionless measure of uncertainty, the normalized entropy, as the ratio between the entropy of the distribution function and the entropy of the uniform distribution:
$$NH_q(\hat{F}_n) = \frac{H_q(\hat{F}_n)}{-\log_2 q}.$$
In the following sections, "the entropy of the distribution function" refers to this normalized entropy.
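The five steps can be sketched end to end as below. This is one possible reading of the procedure, under assumptions that are not in the original text: the distribution function is estimated by the plain empirical CDF, it is sampled on the midpoint grid (i + 1/2)/n of [0, 1], and the name `normalized_entropy_of_cdf` is hypothetical.

```python
import numpy as np

def normalized_entropy_of_cdf(sample, q=0.05):
    """Normalized entropy of the empirical distribution function at quantum q."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    # Normalize the observations to [0, 1] (requires at least two distinct values).
    u = np.sort((x - x.min()) / (x.max() - x.min()))
    # Steps 1-2: evaluate the empirical CDF on the midpoint grid of [0, 1].
    grid = (np.arange(n) + 0.5) / n
    F = np.searchsorted(u, grid, side="right") / n
    # Step 3: quantize; the small offset guards against floating-point error
    # when F falls exactly on a cell boundary.
    cells = np.floor(F / q + 1e-9).astype(int)
    # Step 4: relative frequency of each occupied cell.
    p = np.bincount(cells) / n
    p = p[p > 0]
    # Step 5: Shannon entropy in bits, divided by the maximum -log2(q).
    return float(-(p * np.log2(p)).sum() / np.log2(1 / q))

rng = np.random.default_rng(1)
evenly_spread = np.linspace(0.0, 1.0, 1000)
with_outliers = np.concatenate(([-100.0], rng.normal(size=998), [100.0]))
print(normalized_entropy_of_cdf(evenly_spread))
print(normalized_entropy_of_cdf(with_outliers))
```

An evenly spread sample gives a value close to 1, whereas a sample whose range is dominated by two extreme observations gives a much lower value, because the estimated distribution function sits in the lowest and highest quantization cells over most of [0, 1].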

Properties and Asymptotic Behaviour of the Entropy of a Distribution Function
To illustrate the properties of the entropy estimator, we performed a Monte Carlo experiment, estimating the entropy for simulated distributions, using a sample of 400 observations and replicating the experiment 1000 times. We simulated several α-stable distributions, allowing for higher probabilities in the tails. Stable distributions have some important properties: they allow for heavy tails and, moreover, any linear combination of independent stable variables follows a stable distribution, up to a scale and location parameter [33]; the Gaussian distribution is a particular case of a stable distribution.
In the literature there are several parameterizations of α-stable distributions. In this paper we use the S1 parameterization [33]: a random variable X follows an α-stable distribution S(α, β, γ, δ; 1) if its characteristic function is:
$$E\,e^{itX} = \begin{cases} \exp\!\left( -\gamma^{\alpha} |t|^{\alpha} \left[ 1 - i\beta\,\mathrm{sign}(t) \tan\frac{\pi\alpha}{2} \right] + i\delta t \right), & \alpha \ne 1 \\ \exp\!\left( -\gamma |t| \left[ 1 + i\beta\,\frac{2}{\pi}\,\mathrm{sign}(t) \ln|t| \right] + i\delta t \right), & \alpha = 1. \end{cases}$$
In the above notation α ∈ (0, 2] is the characteristic parameter (for a normal distribution α = 2), β ∈ [−1, 1] is the skewness parameter, γ ∈ (0, ∞) is the scale parameter and δ ∈ R is the location parameter. To simulate an α-stable distribution S(α, β, γ, δ; 1) we used the algorithm from [34] (see Appendix A). The results of the simulations are presented in Table 1. The entropy reaches its maximum value for the uniform distribution and, as the α parameter decreases, the entropy of the stable distribution decreases too. As expected, low entropy values are associated with heavy-tailed distributions; as the tail probability increases, the expected value of the entropy goes down. Figure 1 presents the relationship between the α parameter of a stable distribution and entropy. Before turning our attention to the link between entropy and measures of risk and uncertainty, we consider the optimal sampling frequency to efficiently compute returns (ignoring noise).
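A reduced version of this experiment can be sketched as follows. Several choices here are assumptions of the sketch rather than the paper's setup: only the symmetric case β = 0 with γ = 1, δ = 0 is simulated, using the Chambers-Mallows-Stuck formula instead of the algorithm in Appendix A; 200 replications are used instead of 1000; and the helper names are hypothetical. The entropy estimator assumes the empirical CDF sampled on a midpoint grid.

```python
import numpy as np

def symmetric_stable(alpha, size, rng):
    """Chambers-Mallows-Stuck draws from the symmetric stable law S(alpha, 0, 1, 0; 1)."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
            * (np.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))

def normalized_entropy(sample, q=0.05):
    """Normalized entropy of the empirical CDF (sampled on a midpoint grid)."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    u = np.sort((x - x.min()) / (x.max() - x.min()))
    grid = (np.arange(n) + 0.5) / n
    F = np.searchsorted(u, grid, side="right") / n
    cells = np.floor(F / q + 1e-9).astype(int)
    p = np.bincount(cells) / n
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum() / np.log2(1 / q))

def mean_entropy(alpha, n=400, reps=200, seed=0):
    """Average normalized entropy over `reps` simulated samples of size n."""
    rng = np.random.default_rng(seed)
    draws = symmetric_stable(alpha, (reps, n), rng)
    return float(np.mean([normalized_entropy(row) for row in draws]))

for alpha in (2.0, 1.5, 1.0):
    print(alpha, round(mean_entropy(alpha), 3))
```

For α = 2 the generator reduces to a Gaussian draw with variance 2, and for α = 1 to a Cauchy draw, so the loop moves from light to heavy tails.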

Optimal Sampling Frequency
When dealing with intraday data, one problem is to separate the fundamental dynamics from market noise. Assuming that the trading price can be decomposed into an efficient component and a noise component reflecting market microstructure frictions, one way to distinguish between the informational content of these components is to choose an optimal sampling frequency for the intraday data. Following Bandi and Russell [35], we assume that the observed log-price is given by:
$$\tilde{p}_i = p_i + \eta_i, \quad i = 1, \dots, n,$$
where n is the number of trading days, $p_i$ is the efficient log-price and $\eta_i$ is the microstructure noise. Now we divide the trading day into M subperiods and define the observed intraday log-returns as:
$$\tilde{r}_{j,i} = \tilde{p}_{(i-1)+j\delta} - \tilde{p}_{(i-1)+(j-1)\delta}, \quad j = 1, \dots, M,$$
where δ = 1/M is the sampling frequency of the intraday returns, used to estimate the daily entropy. Then the intraday returns can be decomposed into an unobserved efficient return and a market microstructure disturbance as:
$$\tilde{r}_{j,i} = r_{j,i} + \varepsilon_{j,i}.$$
In terms of the probability density function, if the unobserved efficient return and the market microstructure disturbance are independent, then the probability density function of the observed returns is the convolution of the probability density functions of unobserved returns and microstructure noise:
$$f_{\tilde{r}} = f_{r} * f_{\varepsilon}.$$
Assuming that the entropy of the distribution function is estimated using intraday data with a quantum q = 0.05, while the distribution function is estimated using the empirical distribution function, the following measures of risk and uncertainty can be defined, for an α ∈ (0, 1):
(1) Intraday VaR at significance level α computed from observations at frequency ν, obtained from the α-quantile of the distribution of intraday returns $r_{\nu}$, so that the following is satisfied (with VaR reported as a positive number, as for the daily VaR): $P(r_{\nu} < -IVaR_{\alpha,\nu}) = \alpha$;
(2) Intraday ES at significance level α computed from observations at frequency ν, defined as: $IES_{\alpha,\nu} = \frac{1}{\alpha} \int_0^{\alpha} IVaR_{\gamma,\nu}\,d\gamma$;
(3) Intraday Realized Volatility, computed from the squared intraday returns at frequency ν.
If [1, T] is the time horizon of daily data, then we can compute daily estimates of the following four measures: entropy $H_{\delta;t}$, intraday Value at Risk $IVaR_{\alpha,\nu;t}$, intraday Expected Shortfall $IES_{\alpha,\nu;t}$ and Intraday Realized Volatility $IRV_{\nu;t}$.
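The three intraday measures can be computed from one day of intraday returns along the following lines. This is a sketch under labelled assumptions: VaR and ES are reported as positive numbers (minus the lower-tail quantile), the ES integral is approximated by averaging the quantile over a fine grid of levels in (0, α), realized volatility is taken as the square root of the sum of squared returns, and the function name and simulated returns are purely illustrative.

```python
import numpy as np

def intraday_risk_measures(returns, alpha=0.01, n_levels=200):
    """Return (IVaR, IES, IRV) for one day of intraday returns."""
    r = np.asarray(returns, dtype=float)
    # Intraday VaR: minus the alpha-quantile (sign convention is an assumption).
    ivar = -np.quantile(r, alpha)
    # Intraday ES: (1/alpha) * integral of VaR_gamma over gamma in (0, alpha),
    # approximated by the mean over a midpoint grid of levels.
    gammas = (np.arange(n_levels) + 0.5) / n_levels * alpha
    ies = -np.mean(np.quantile(r, gammas))
    # Realized volatility: square root of the sum of squared returns (assumption).
    irv = np.sqrt(np.sum(r ** 2))
    return float(ivar), float(ies), float(irv)

rng = np.random.default_rng(2)
one_day = rng.normal(0.0, 0.001, size=1440)   # simulated 1-minute returns
print(intraday_risk_measures(one_day))
```

Because the ES averages VaR over levels below α, it is never smaller than the VaR at level α.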
In order to assess the relationship between the entropy of the distribution of intraday returns and intraday measures of market risk, we estimate static and dynamic linear regression models using entropy as (one of) the explanatory variable(s):

Static Models
The first class of models studies the power of the (daily) entropy to explain different measures of market risk and uncertainty: daily observations of Intraday ES, Intraday VaR and Intraday Realized Volatility estimated at different time scales ν, running the following regressions:
$$Y_t = \beta_0 + \beta_1 H_{\delta;t} + \varepsilon_t, \quad Y_t \in \{ IES_{\alpha,\nu;t},\ IVaR_{\alpha,\nu;t},\ IRV_{\nu;t} \}. \tag{9}$$

Dynamic Models
The second class of models aims to check whether the entropy of the distribution of intraday returns can provide additional information to that contained in the latest observation of the risk measures, by estimating regressions (10) of each risk measure (IES, IVaR and IRV) on the entropy and on the lagged value of the risk measure itself.
Entropy 2017, 19, 226

Quantile Regressions
Classical linear regression is used to estimate the conditional mean of a dependent variable, given the values of an explanatory variable. However, the presence of outliers and/or heteroskedasticity can affect the results. Also, in many situations it is not just the conditional mean of a variable that is required, but its entire conditional distribution, in particular the conditional quantiles. For a random variable Y with distribution function F, the τth quantile is defined as the inverse of the distribution function, Q(τ) = inf{y : F(y) ≥ τ}, where τ ∈ (0, 1).
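Quantile regression is usually fitted by minimizing the asymmetric "check" (pinball) loss of Koenker and Bassett instead of the squared error. The sketch below shows the simplest case, a model with only a constant, where the minimizer is the sample τ-quantile as defined above; the function names are hypothetical, and the brute-force search over data points is chosen for clarity rather than efficiency.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def constant_quantile_fit(y, tau):
    """Minimize the average check loss over constants; a minimizer is a data point."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau).mean() for c in y]
    return float(y[int(np.argmin(losses))])

y = np.arange(1, 102)                 # the values 1, 2, ..., 101
print(constant_quantile_fit(y, 0.5))  # the median, 51
print(constant_quantile_fit(y, 0.1))  # inf{y : F(y) >= 0.1} = 11
```

Replacing the constant by a linear function of an explanatory variable and minimizing the same loss over the coefficients gives the conditional quantile model.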

Quantile Regressions for Intraday VaR
To better understand the effect of the entropy on IVaR, we estimate a quantile regression model using the entropy of the distribution of intraday returns as explanatory variable. The model is:
$$Q_{IVaR}(\tau \mid H_{\delta;t}) = \beta_0(\tau) + \beta_1(\tau) H_{\delta;t}, \tag{11}$$
where, for a given time scale ν and significance level α: $Q_{IVaR}(\tau) = \inf\{ IVaR_{\alpha,\nu;t} : F(IVaR_{\alpha,\nu;t}) \ge \tau \}$.

Quantile Regressions for Daily Returns
Quantile regressions can be used to assess the relationship between extreme values of daily returns and the entropy of the distribution of intraday returns. We estimate the following model:
$$Q_{R}(\alpha \mid H_{\delta;t}) = \beta_0 + \beta_1 H_{\delta;t}, \tag{12}$$
where, denoting the daily log-returns by $R_t$, the quantile of the returns is denoted by: $Q_{R}(\alpha) = \inf\{ R_t : F(R_t) \ge \alpha \}$.

Forecasting Daily VaR Using Entropy
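The daily VaR at probability level α can be defined by the following equation:
$$P(R_t < -VaR_{\alpha;t}) = \alpha.$$
VaR can be forecasted using various methods. However, many VaR measures and forecasts fail to react fast enough to new information (market shocks) and so often underestimate risk. Entropy, on the other hand, is very sensitive to new information, so it can be used to update and improve VaR forecasts. In order to forecast the daily entropy-based VaR, with t ∈ {k + 1, ..., k + w}, k ∈ {0, ..., T − w + 1} and T the number of daily returns, Equation (13) below is estimated using a rolling window of length w (we tried adding extra lags of the entropy to the regression, but the extra lags were not significant; the optimal lag length was found to be one):
$$Q_{R_t}(\alpha \mid H_{t-1}) = \beta_0 + \beta_1 H_{t-1}. \tag{13}$$
Estimating this on the time interval [k + 1, k + w], the parameter estimates $\hat{\beta}_0^k$ and $\hat{\beta}_1^k$ are obtained. Then the forecast of VaR for the next trading day is given by the following:
$$\widehat{VaR}_{\alpha;k+w+1} = -\left( \hat{\beta}_0^k + \hat{\beta}_1^k H_{k+w} \right). \tag{14}$$
Equation (13) can be extended to include an autoregressive term for the VaR as below (it is not required to add extra lags of the quantile to the regression, as the quantile is highly autocorrelated):
$$Q_{R_t}(\alpha \mid H_{t-1}, Q_{R_{t-1}}) = \beta_0 + \beta_1 H_{t-1} + \beta_2 Q_{R_{t-1}}(\alpha). \tag{15}$$
Then the forecast of VaR for the next trading day can be computed as:
$$\widehat{VaR}_{\alpha;k+w+1} = -\left( \hat{\beta}_0^k + \hat{\beta}_1^k H_{k+w} + \hat{\beta}_2^k Q_{R_{k+w}}(\alpha) \right). \tag{16}$$
The results obtained from models (13) and (15) and forecasting formulae (14) and (16), respectively, are compared with the VaR forecasting results obtained from a historical VaR forecasting model and from a VaR forecasting model based on the GARCH(1,1) model below:
$$R_t = \mu + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t, \quad \sigma_t^2 = \omega + a\,\varepsilon_{t-1}^2 + b\,\sigma_{t-1}^2. \tag{17}$$
Model (17) is estimated using the same rolling windows as above, with the error term $z_t$ being a standard normal variable or following a Student's t distribution, and the forecast of the VaR for the next trading day is given by the formula below:
$$\widehat{VaR}_{\alpha;k+w+1} = -\left( \hat{\mu} + q_{\alpha}\,\hat{\sigma}_{k+w+1} \right), \tag{18}$$
where $q_{\alpha}$ is the Gaussian quantile or the Student-t quantile with the degrees of freedom estimated. Thus, we compare the VaR forecasting ability of the following five models:
1. Historical VaR forecasts, estimated using a rolling window of length w;
2. Normal GARCH(1,1) VaR forecasts, from (17) and (18) with Gaussian errors;
3. t-GARCH(1,1) VaR forecasts, from (17) and (18) with Student's t errors;
4. Entropy-based VaR forecasts, from (13) and (14);
5. Entropy-based AR VaR forecasts, from (15) and (16).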
In order to test the forecasting ability of the above models, Christoffersen's [36] tests are used: the LR test of Unconditional Coverage, the LR test of Independence and the joint LR test of Independence and Conditional Coverage. Also, we employed the forecast performance tests of Diebold and Mariano [37] and West [38], using the loss function of Giacomini and White [39], referred to below as Equation (19) (see Appendixes B and C).
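The first of these tests, the likelihood ratio test of unconditional coverage, compares the observed frequency of VaR violations with the nominal level. A minimal sketch is given below; the function name and the example violation sequences are illustrative assumptions. Under the null hypothesis of correct coverage the statistic is asymptotically chi-squared with one degree of freedom, whose 5% critical value is 3.841.

```python
import math

def lr_unconditional_coverage(hits, p):
    """LR statistic for H0: P(violation) = p, given a 0/1 sequence of violations."""
    n = len(hits)
    n1 = sum(hits)          # number of violations (return below minus VaR)
    n0 = n - n1             # number of non-violations
    pi_hat = n1 / n         # observed violation frequency

    def loglik(prob):
        # Bernoulli log-likelihood; terms with a zero count contribute nothing.
        ll = 0.0
        if n0:
            ll += n0 * math.log(1.0 - prob)
        if n1:
            ll += n1 * math.log(prob)
        return ll

    return -2.0 * (loglik(p) - loglik(pi_hat))

exact = [1] * 10 + [0] * 990      # 1% violations, matching the nominal level
too_many = [1] * 30 + [0] * 970   # 3% violations against a 1% nominal level
print(lr_unconditional_coverage(exact, 0.01))
print(lr_unconditional_coverage(too_many, 0.01))
```

The order of the violations does not matter for this statistic; clustering of violations is what the independence test addresses.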

Empirical Analysis
In order to illustrate the application of the entropy of the distribution of intraday returns in financial risk management, we consider the EUR/JPY exchange rate (sourced from Disk Trading). FX rates are mostly symmetric, so the two tails of the distributions are similar, and so the entropy is more closely related to the VaR estimate. For highly asymmetrical distributions, like stocks and commodities, the link between entropy and VaR could be weaker. The time period considered is 1999-2005. The database used for estimation has two components: (1) intraday prices (2025 transaction days and 2,340,624 minute-by-minute intraday observations); and (2) daily prices (2025 daily observations). Using the methodology from Bandi and Russell [35], the optimal sampling frequency for intraday data was estimated at 10 minutes on average (δ = δ* = 10); this frequency is used to compute the entropy of the distribution of intraday returns. We use the frequencies ν ∈ {1, 10, 15} (minutes) to compute intraday measures of risk and uncertainty, and compute the risk measures at the 1% significance level.

Entropy and Intraday Measures of Market Risk and Uncertainty
Figure 2 presents a comparison of the entropy and the intraday ES (estimated at the 1% level). The two series show a strong negative correlation; the entropy has a similar relationship with IVaR and IRV as well. Next, the models given in (9) and (10) are estimated, using the statistical software SAS 9.3.
Panel A in Table 2 presents the results of the regressions specified in (9), using significance level α = 1% for the risk measures. The R² estimates show that the entropy is strongly linked with intraday VaR, intraday ES and intraday Realized Volatility and, as expected, the coefficients are all negative and significant. Panel B in Table 2 presents the results of the dynamic regression models (10), using significance level α = 1% for the risk measures. The coefficients of the entropy remain negative and significant in all cases, showing that the entropy has forecasting power for intraday measures of risk and uncertainty, even after taking past values of intraday measures of risk and uncertainty into account.

Quantile Regression Results
We use quantile regressions to see the effect of the entropy on the quantiles of IVaR, estimating Equation (11) for frequencies of 1, 10 and 15 min and α = 1%. Panel A of Table 3 presents the estimation results; as expected, there is a positive correlation between the entropy and the upper tails of the distribution of IVaR. Also, IVaR is more sensitive to entropy in the upper tail of its distribution, meaning that high values of IVaR have a stronger relationship with entropy. Furthermore, the relationship is strongest for the ν = 15 min frequency. As an example, Figure 3 gives a visual presentation of the dependence of the 99% quantile on entropy.
Regarding the relationship between extreme values of daily returns and the entropy of the distribution of intraday returns, Panel B of Table 3 presents the results of regression (12), whilst Figure 4 presents the scatter plot and the regression line of the estimation. As expected, the relationship between the quantile (equal to minus VaR) and entropy is positive and significant. Low (high) values of the entropy of the distribution of intraday returns generally correspond to high (low) absolute VaR estimates for daily returns, in line with the results in the previous section.
Note to Table 3: Quantile regressions for IVaR (Panel A) and daily log-returns (Panel B), illustrating the effect of entropy on the quantiles of IVaR and daily log-returns; the models estimated are (11) and (12). *** signifies significance at 1%.

Forecasting Daily VaR Using Entropy
Entropy can be used to forecast VaR; for this purpose, Equation (13) is estimated on a rolling basis, using estimation windows of length w = 1000 days. Figures 5 and 6 plot the time series of daily returns against the entropy-based 1% VaR forecasts and the GARCH-based 1% VaR forecasts. As expected, the GARCH-based VaR is more stable, whilst the entropy-based VaR forecast reacts faster to new information.
The backtesting results of the VaR forecasts for the five models given in Section 2.3, for α = 1%, are presented in Table 4. The second column gives, for each model, the probability that the returns fall below the negative of the VaR; it can be seen that the Historical VaR model (with a probability of 0.29) provides the highest VaR forecasts on average. Looking at the unconditional test results, the entropy-based AR VaR has the smallest test statistic and the Historical VaR model marginally fails the test. Considering the independence and full test results, the Historical VaR model performs the worst, failing the tests, whilst the normal GARCH model passes all three tests. We conclude that the best results overall are obtained by the entropy-based AR VaR forecast model.
Additionally, we employed the VaR forecast comparison tests of Diebold and Mariano [37] and West [38]. The test statistic only considers the unconditional forecasting ability of VaR models and is highly asymmetric: the loss function in (19) favours models which overestimate VaR and strongly penalizes models which, even very mildly, underestimate it. Our results in Table 5 show that the Historical VaR is the best performer, whilst the t-GARCH(1,1) based VaR and the entropy-based VaR models are favoured over the normal GARCH(1,1) VaR. Furthermore, the difference between the performance of the entropy-based AR VaR model and the entropy-based VaR is not statistically significant. However, these results depend very strongly on the loss function. Looking at the overall picture, we conclude that the entropy has good forecasting power for VaR.
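The unconditional, independence and joint backtests referred to above can be sketched in a few lines. This is a minimal illustration that assumes the standard likelihood-ratio forms of the coverage tests and a VaR quoted as a positive number; the function names (`hit_sequence`, `lr_unconditional`, `lr_independence`, `lr_conditional`, `chi2_sf`) are hypothetical and not taken from the paper, and the joint statistic is approximated as the sum of the other two.

```python
import math


def xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def hit_sequence(returns, var_forecasts):
    """1 when the return falls below the negative of the (positive) VaR."""
    return [1 if r < -v else 0 for r, v in zip(returns, var_forecasts)]


def lr_unconditional(hits, alpha):
    """LR statistic for correct unconditional coverage (chi-square, 1 df)."""
    n, x = len(hits), sum(hits)
    pi_hat = x / n  # observed violation frequency
    ll_null = xlogy(n - x, 1 - alpha) + xlogy(x, alpha)
    ll_alt = xlogy(n - x, 1 - pi_hat) + xlogy(x, pi_hat)
    return -2.0 * (ll_null - ll_alt)


def lr_independence(hits):
    """LR statistic for first-order independence of violations (1 df)."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, cur in zip(hits[:-1], hits[1:]):
        counts[(prev, cur)] += 1
    n00, n01 = counts[(0, 0)], counts[(0, 1)]
    n10, n11 = counts[(1, 0)], counts[(1, 1)]
    # Violation probabilities conditional on the previous day's state.
    pi01 = n01 / (n00 + n01) if (n00 + n01) else 0.0
    pi11 = n11 / (n10 + n11) if (n10 + n11) else 0.0
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)
    ll_null = xlogy(n00 + n10, 1 - pi) + xlogy(n01 + n11, pi)
    ll_alt = (xlogy(n00, 1 - pi01) + xlogy(n01, pi01)
              + xlogy(n10, 1 - pi11) + xlogy(n11, pi11))
    return -2.0 * (ll_null - ll_alt)


def lr_conditional(hits, alpha):
    """Joint statistic, approximated as the sum of the two tests (2 df)."""
    return lr_unconditional(hits, alpha) + lr_independence(hits)


def chi2_sf(stat, df):
    """Chi-square upper-tail probability, closed form for df = 1 or 2."""
    stat = max(stat, 0.0)
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or 2 are supported in this sketch")
```

A model fails a test at a given significance level when the corresponding p-value, for example `chi2_sf(lr_unconditional(hits, 0.01), 1)`, falls below that level. Clustered violations raise the independence statistic even when the overall violation frequency matches α.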
Note (Table 5): The DM test statistics reported are for comparisons of the models on the left side against the models in the column titles. ** and *** signify significance at 5% and 1%, respectively. N = 689.
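A pairwise forecast comparison of this kind can be sketched as follows. The loss function (19) is not reproduced in this section, so the sketch substitutes the quantile (tick) loss as a stand-in; this is an assumption, not the paper's loss. The names `tick_loss`, `dm_statistic` and `two_sided_pvalue` are hypothetical, VaR is assumed to be quoted as a positive number, and the variance of the loss differential is estimated without autocovariance terms, which presumes one-step-ahead forecasts.

```python
import math


def tick_loss(ret, var, alpha):
    """Quantile (tick) loss; an assumed stand-in for the paper's loss (19)."""
    q = -var  # forecast alpha-quantile of the return distribution
    indicator = 1.0 if ret < q else 0.0
    return (alpha - indicator) * (ret - q)


def dm_statistic(returns, var_a, var_b, alpha=0.01):
    """Diebold-Mariano statistic for model A against model B.

    Negative values indicate a lower average loss for model A.
    """
    d = [tick_loss(r, a, alpha) - tick_loss(r, b, alpha)
         for r, a, b in zip(returns, var_a, var_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    if var_d == 0.0:
        raise ValueError("loss differential has zero variance")
    return mean_d / math.sqrt(var_d / n)


def two_sided_pvalue(dm):
    """Two-sided p-value under the asymptotic standard normal distribution."""
    return math.erfc(abs(dm) / math.sqrt(2.0))
```

Swapping the two forecast series flips the sign of the statistic, which matches the row-against-column reading of a comparison table. Because the ranking depends on the loss, replacing `tick_loss` with a more asymmetric function can change which model is preferred.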

Conclusions
This paper investigates the link between entropy and various measures of market risk such as Value-at-Risk or Expected Shortfall. Based on the result of Lorentz [25], we developed the concept of entropy of a distribution function and applied it to estimate the entropy of the distribution of intraday returns. Using Monte-Carlo simulations, we showed that there is an inverse relationship between entropy and the probability in the tails of a distribution: high levels of entropy are characteristic of a distribution with light tails, like the normal distribution, whereas low entropy values are associated with heavy-tailed distributions.
Furthermore, we investigated the relationship between risk measures and the entropy of the distribution of intraday returns, in a static and a dynamic setting. The entropy of a distribution function has more informational content than the classical measures of market risk, as it takes into account the entire distribution. We found evidence of a strong, negative relationship between entropy and intraday Value-at-Risk, intraday Expected Shortfall and intraday Realized Volatility. From a dynamic point of view, the entropy proves to be a strong predictor for IVaR, IES and IRV, with R² values up to 41%. Our quantile results confirm that the entropy has strong explanatory power for the quantiles of the intraday VaR as well as the quantiles of the daily returns. The final part of our empirical study compares entropy-based VaR estimates, which are generally more reactive to new information than standard VaR models, with competing VaR forecasts. Whilst the Historical VaR model is the preferred model based on the Diebold-Mariano (unconditional) test results, it comes out as the worst performer, failing the tests, when Christoffersen's unconditional, conditional and joint test results are considered. Christoffersen's three tests favour the entropy-based AR VaR model, and we conclude that the entropy is a strong predictor of daily VaR, performing better than the competing VaR models. As it takes into account the extreme events that happen at an intraday level, it is able to provide reliable VaR forecasts.

Figure 1. The entropy of distribution functions of simulated alpha-stable distributions, as a function of α.

Figure 2. Intraday ES and the entropy of the distribution of intraday returns.

Figure 3. Quantile regression results for the 1% quantile of IVaR as a function of the entropy, for ν = 1 min.

Figure 4. Quantile regression results for the 1% quantile of the returns as a function of the entropy.

) values of the entropy of the distribution of intraday returns generally correspond to high (low) absolute VaR estimates for daily returns; in line with the results in the previous section.

Table 1. The entropy of simulated distributions.
Note: This table presents the average value of the entropy of the distribution function and its standard deviation, estimated by simulating a sample of 400 cases and repeating the experiment 1000 times.

Table 2. The relationship between intraday measures of risk and uncertainty and entropy.

Table 5. The Diebold and Mariano test results for VaR forecasts.