1. Introduction
In finance, risk management is the activity of identifying, analyzing, estimating and controlling the risk of losing money. For our purposes, risk management is a procedure for shaping the loss distribution of an investment. The value at risk (VaR) is the most popular risk measure: it represents how much an investment might lose under normal market conditions, with a given probability, over a given time interval. In other words, VaR is a percentile of a loss distribution. Another very popular risk measure is the conditional value at risk (CVaR), also known as the expected shortfall. CVaR is a risk measure for investments reintroduced in the literature by Rockafellar and Uryasev [1]; for an earlier reference, see Love et al. [2]. According to Sarykalin et al. [3], CVaR approximately (or exactly, under certain conditions) equals the average of some percentage of the worst-case loss scenarios.
In terms of their definitions, VaR and CVaR are closely related. For instance, Yamai and Yoshiba [4] defined CVaR as the conditional expectation of loss given that the loss is beyond the VaR level. Consequently, at the same confidence level, VaR is a lower bound for CVaR. In particular, Rockafellar and Uryasev [1,5] showed that CVaR is superior to VaR in applications related to investment portfolio optimization. In practice, the choice between VaR and CVaR rests on differences in mathematical properties, stability of statistical estimation, simplicity of optimization procedures, acceptance by regulators, and so on [3]. For instance, in terms of mathematical properties, the CVaR of a portfolio is a continuous and convex function of the positions in instruments, whereas the VaR may even be a discontinuous function.
The volatility is the standard deviation of the distribution of logreturns and is one of the simplest and earliest measures of financial risk. The corresponding variance is a natural measure of statistical uncertainty, but it captures only a small portion of the informational content of the distribution of the logreturns. On the other hand, the entropy is a more general measure of uncertainty than the variance because it may be related to higher-order moments of a distribution [6,7,8]. According to Dionisio et al. [8], the variance measures the concentration around the mean, while the entropy measures the dispersion of the density irrespective of the location of the concentration. Finally, for Pele et al. [9], the entropy of a distribution function is strongly related to its tails, and this feature is more important for distributions with heavy tails or with an infinite second-order moment, for which the variance does not make sense.
In the literature, there are empirical papers showing that entropy has good predictive power for risk. For instance, Billio et al. [10] showed that entropy can forecast banking crises by directly using the entropy of systemic risk measures. In addition, Pele et al. [9] showed that the entropy of the intraday distribution of logreturns is a strong predictor of daily VaR, performing better than the classical GARCH models, for a time series of EUR/JPY exchange rates. Similarly, Pele and Mazurencu-Marinescu-Pele [11], instead of using the entropy of the intraday distribution of logreturns, defined the entropy using symbolic time series analysis (STSA) and showed, using high-frequency data for Bitcoin, that their entropy is a strong predictor of daily VaR, performing better than the classical GARCH models.
There is recent interest in the statistical properties and risk behavior of cryptocurrencies [12,13,14] and, in particular, of Bitcoin [15]. Consequently, in this paper, we estimate the entropy of the symbolic intraday distribution of Bitcoin’s logreturns through the STSA [11], and we model and forecast Bitcoin’s daily CVaR using the estimated entropy. The main contribution of this paper is the extension of the study performed by Pele and Mazurencu-Marinescu-Pele [11] to include the CVaR. The rest of the paper is organized as follows: in Section 2, we present the details of the methodology; in Section 3, we present our empirical study, describing the dataset, the results and the corresponding comments; finally, in Section 4, we conclude the paper.
2. Methodology
In this section, we review the methodology published by Pele and Mazurencu-Marinescu-Pele [11]: the estimation of the entropy of the symbolic intraday distribution of logreturns through the STSA, a logistic model connecting the daily VaR and the entropy, and a forecasting model for the daily VaR using the entropy based on a quantile regression. In addition, we introduce the two main contributions of this paper: a logistic model connecting the daily CVaR and the entropy, and a forecasting model for the daily CVaR using the entropy based on a modified quantile regression model. It is also important to mention that the Bitcoin exchange rate is hereinafter referred to as the Bitcoin price.
2.1. Entropy of Symbolic Intraday Logreturns
In the intraday context, it is usual to consider a set of days $d \in \{1, \ldots, D\}$, with each day equally partitioned into $M$ time bins. Consequently, for a day $d$ and a time bin $m \in \{1, \ldots, M\}$, we associate a price $P_{d,m}$ and a logprice $p_{d,m} = \log P_{d,m}$. Then, the intraday logreturn of an asset is defined as follows:
$$r_{d,m} = \begin{cases} p_{d,m} - p_{d,m-1}, & m = 2, \ldots, M, \\ p_{d,1} - p_{d-1,M}, & m = 1. \end{cases}$$
For the empirical study of this paper, it is possible to define $r_{d,1}$ because Bitcoin is continuously traded. However, it is important to point out that, for other kinds of assets, it would be better to ignore the logreturn $r_{d,1}$. In addition, $r_{1,1}$ is not defined.
The intraday logreturns are usually very noisy. The idea behind the STSA technique [16] is to produce low-resolution data from high-resolution data. In particular, STSA is a transformation of a real number sequence into a binary sequence. In our case, the STSA transformation is applied to the intraday logreturn $r_{d,m}$ to obtain the symbolic intraday logreturn $s_{d,m}$. The symbolic intraday logreturn is defined as follows:
$$s_{d,m} = \begin{cases} 1, & r_{d,m} < 0, \\ 0, & \text{otherwise}. \end{cases}$$
Basically, the symbolic intraday logreturn is a binary sequence of 0s representing increasing prices and 1s representing decreasing prices.
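Based on the Shannon entropy definition [17], the entropy of the symbolic intraday logreturns is defined as follows:
$$H_d = -\pi_d \log_2 \pi_d - (1 - \pi_d) \log_2 (1 - \pi_d),$$
where $\pi_d = P(s_{d,m} = 1)$ and $1 - \pi_d = P(s_{d,m} = 0)$. Note that the entropy of the symbolic intraday logreturns is a daily entropy. In addition, we estimate $\pi_d$ using the sample frequency $\hat{\pi}_d = \frac{1}{M} \sum_{m=1}^{M} s_{d,m}$ and $1 - \pi_d$ using the sample frequency $1 - \hat{\pi}_d$.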
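The symbolization and entropy steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the base-2 logarithm is an assumption (it makes the entropy range from 0 to 1), and a zero logreturn is mapped to the symbol 0.

```python
import math


def intraday_logreturns(prices, prev_close=None):
    """Logreturns between consecutive time bins of one day.

    If prev_close (last price of the previous day) is given, the first
    bin is measured against it, which is reasonable for a continuously
    traded asset. Otherwise the first bin has no logreturn.
    """
    logp = [math.log(p) for p in prices]
    if prev_close is not None:
        logp = [math.log(prev_close)] + logp
    return [b - a for a, b in zip(logp, logp[1:])]


def symbolize(logreturns):
    """Map each logreturn to 1 (price decreased) or 0 (otherwise)."""
    return [1 if r < 0 else 0 for r in logreturns]


def binary_entropy(symbols):
    """Shannon entropy (in bits) of a 0/1 sequence, using sample frequencies."""
    n = len(symbols)
    if n == 0:
        raise ValueError("need at least one symbol")
    p_down = sum(symbols) / n  # sample frequency of the symbol 1
    h = 0.0
    for q in (p_down, 1.0 - p_down):
        if q > 0:  # the term q*log(q) is taken as 0 when q == 0
            h -= q * math.log2(q)
    return h


# Toy prices for one day (made-up numbers, for illustration only).
prices = [100.0, 101.0, 100.5, 100.7, 100.2]
daily_entropy = binary_entropy(symbolize(intraday_logreturns(prices)))
```

A day in which up and down moves are equally frequent gives the maximum entropy of 1 bit, while a day in which prices only move in one direction gives 0.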
2.2. Entropy and Daily VaR and CVaR
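Intuitively, the entropy of the symbolic intraday logreturns is higher in the presence of higher uncertainty in the returns and lower in the presence of lower uncertainty in the returns. Consequently, the likelihood of extreme negative daily logreturns is explained by higher values of entropy. In [11], it was verified that the entropy is positively correlated with the likelihood of extreme negative daily logreturns, and the relation between VaR and entropy was modeled using the following logistic regression model:
$$P(Y_d = 1 \mid H_d) = \frac{1}{1 + \exp\left(-(\beta_0 + \beta_1 H_d)\right)}, \qquad Y_d = \begin{cases} 1, & R_d < \mathrm{VaR}_\alpha, \\ 0, & \text{otherwise}, \end{cases} \qquad R_d = \log P_d - \log P_{d-1},$$
where $\beta_0$ and $\beta_1$ are constants to be estimated; $H_d$ is the entropy of the symbolic intraday logreturns of day $d$; $Y_d$ are the indicators of the lower tails of the daily logreturns; $R_d$ are the daily logreturns; $P_d$ is the closing price of day $d$; and $\mathrm{VaR}_\alpha$ is the daily value at risk at the significance level $\alpha$ defined by
$$P(R_d < \mathrm{VaR}_\alpha) = \alpha$$
or, alternatively,
$$\mathrm{VaR}_\alpha = F^{-1}(\alpha),$$
where $F$ is the cumulative distribution function of the daily logreturns.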
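As an illustration of how a logistic model of this kind can be fitted, the sketch below estimates an intercept and a slope by plain gradient ascent on the mean log-likelihood. It is only a hedged example: the function name `fit_logistic`, the learning rate, the number of steps and the toy data are all assumptions made here, and a statistical package would normally be used instead.

```python
import math


def fit_logistic(x, y, lr=0.5, steps=5000):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*x))) by gradient ascent.

    x: explanatory values (e.g. daily entropies), y: 0/1 tail indicators.
    Returns the estimated (b0, b1).
    """
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p          # gradient w.r.t. the intercept
            g1 += (yi - p) * xi   # gradient w.r.t. the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Made-up data: tail events (1s) are more frequent on high-entropy days.
entropy = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 1.0]
tail = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]
b0, b1 = fit_logistic(entropy, tail)
```

A positive estimated slope `b1` corresponds to the hypothesis that higher entropy increases the likelihood of an extreme negative daily logreturn.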
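In this paper, the hypothesis is also that the entropy is positively correlated with the likelihood of extreme negative daily logreturns, and we model the relation between CVaR and entropy using the following logistic regression model:
$$P(Z_d = 1 \mid H_d) = \frac{1}{1 + \exp\left(-(\beta_0 + \beta_1 H_d)\right)}, \qquad Z_d = \begin{cases} 1, & R_d < \mathrm{CVaR}_\alpha, \\ 0, & \text{otherwise}, \end{cases}$$
where $\beta_0$ and $\beta_1$ are constants to be estimated; $Z_d$ are the indicators of the lower tails of the daily logreturns $R_d$; and $\mathrm{CVaR}_\alpha$ is the daily conditional value at risk at the significance level $\alpha$ defined by
$$\mathrm{CVaR}_\alpha = \frac{1}{\alpha} \int_{-\infty}^{\mathrm{VaR}_\alpha} x f(x)\, dx,$$
where $f$ is the continuous probability density function of the daily logreturns and $\mathrm{VaR}_\alpha$ is the daily value at risk at the same significance level.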
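For a finite sample of daily logreturns, the VaR and CVaR can be approximated empirically: the VaR by a lower-tail order statistic and the CVaR by the average of the observations in that tail. The sketch below is one simple convention among several, not the estimator used in the paper; the function names, the rounding rule for the tail size and the strict inequality in the indicator are assumptions.

```python
import math


def _tail_size(n, alpha):
    """Number of worst observations in the alpha-tail (at least one)."""
    return max(1, math.ceil(alpha * n - 1e-9))


def empirical_var(returns, alpha):
    """Empirical lower-tail quantile of the logreturns at level alpha."""
    xs = sorted(returns)
    return xs[_tail_size(len(xs), alpha) - 1]


def empirical_cvar(returns, alpha):
    """Average of the logreturns in the alpha-tail (at or below the VaR)."""
    xs = sorted(returns)
    k = _tail_size(len(xs), alpha)
    return sum(xs[:k]) / k


def tail_indicator(returns, threshold):
    """1 when the logreturn is strictly below the threshold, else 0."""
    return [1 if r < threshold else 0 for r in returns]


# Made-up daily logreturns, for illustration only.
returns = [-0.10, -0.05, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03, 0.04, 0.05]
var_20 = empirical_var(returns, 0.20)
cvar_20 = empirical_cvar(returns, 0.20)
```

Because both quantities are expressed here in terms of logreturns rather than losses, the CVaR is never above the VaR at the same level, so the CVaR indicator flags a subset of the days flagged by the VaR indicator.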
2.3. Forecasting Model for Daily VaR and CVaR
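Pele et al. [9] and Pele and Mazurencu-Marinescu-Pele [11] considered a quantile regression model to forecast the daily VaR using the entropy as the explanatory variable. The forecasting model for the daily VaR $\widehat{\mathrm{VaR}}_{\alpha,d+1}$ at day $d+1$ using the entropy $H_d$ of the day $d$ is given by:
$$\widehat{\mathrm{VaR}}_{\alpha,d+1} = \beta_0 + \beta_1 H_d,$$
where $\beta_0$ and $\beta_1$ are estimated using a quantile regression model between the dependent variable $R_{k+1}$ (the daily logreturn) and the independent variable $H_k$ for the days $k$ in the estimation time window.
Based on Koenker and Bassett [18], we consider the following optimization problem for the quantile regression estimation:
$$\min_{\beta_0, \beta_1} \sum_{k} \rho_\alpha\left(R_{k+1} - \beta_0 - \beta_1 H_k\right),$$
where $\rho_\alpha(u) = u\left(\alpha - I(u < 0)\right)$ is the asymmetric absolute loss function and $I(\cdot)$ is the indicator function.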
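To make the quantile regression step concrete, the sketch below implements the asymmetric absolute loss and a brute-force fit for one explanatory variable. It relies on the fact that a linear quantile regression with an intercept and a slope has an optimal line passing through at least two observations, so trying every pair of points is exact but only practical for small samples. The function names and the toy data are assumptions, and this is not the solver used in the paper; in practice a linear-programming-based routine would be used.

```python
from itertools import combinations


def pinball_loss(u, alpha):
    """Asymmetric absolute loss: u * (alpha - 1[u < 0])."""
    return u * (alpha - (1.0 if u < 0 else 0.0))


def quantile_regression(x, y, alpha):
    """Exact simple quantile regression by searching all point pairs.

    Returns (b0, b1) minimizing sum_k pinball_loss(y_k - b0 - b1*x_k).
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical pair does not define a line
        b1 = (y[j] - y[i]) / (x[j] - x[i])
        b0 = y[i] - b1 * x[i]
        loss = sum(pinball_loss(yk - b0 - b1 * xk, alpha)
                   for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, b0, b1)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


def forecast_quantile(b0, b1, entropy_today):
    """Next-day quantile forecast from today's entropy."""
    return b0 + b1 * entropy_today


# Made-up pairs (entropy of day k, logreturn of day k+1) lying on a line.
h = [0.0, 0.5, 1.0, 1.5]
r_next = [1.0, 0.0, -1.0, -2.0]
b0, b1 = quantile_regression(h, r_next, alpha=0.05)
```

With the fitted coefficients, `forecast_quantile(b0, b1, h_today)` gives the forecast for the next day from the entropy observed today.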
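Our forecasting model for the daily CVaR $\widehat{\mathrm{CVaR}}_{\alpha,d+1}$ at day $d+1$ using the entropy $H_d$ of the day $d$ is given by:
$$\widehat{\mathrm{CVaR}}_{\alpha,d+1} = \beta_0 + \beta_1 H_d,$$
where $\beta_0$ and $\beta_1$ are estimated using a quantile regression model between the dependent variable $R_{k+1}$ (the daily logreturn) and the independent variable $H_k$ for the days $k$ in the estimation time window. We consider the following optimization problem for the quantile regression estimation:
$$\min_{\beta_0, \beta_1} \sum_{k} \rho_{\alpha^{*}}\left(R_{k+1} - \beta_0 - \beta_1 H_k\right), \qquad \alpha^{*} = \hat{F}\left(\frac{1}{\alpha} \int_{-\infty}^{\hat{F}^{-1}(\alpha)} x \hat{f}(x)\, dx\right),$$
where $\alpha$ is the significance level, $\hat{F}$ is the empirical cumulative distribution function of the logreturns estimated using the same time window, and $\hat{f}$ is the empirical density function of the logreturns estimated using the same time window.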