In this section, we present our calculations and their outcomes, aiming to reveal the effect of the sample size on the ES and how the resulting inaccuracies can be mitigated.
4.1. Fitting to Probability Distributions
We have considered four different probability distributions, including three fat-tailed ones, to fit the time series of returns. These are the well-known normal distribution (Gaussian probability density function), the non-centered (and potentially skewed) Student's t distribution, the generalized hyperbolic distribution, and the Lévy stable distribution. Some authors have already applied parametric estimators based on these fat-tailed distributions to the calculation of the Expected Shortfall (
Chen & Chen, 2012;
Hellmich & Kassberger, 2011). This set of distributions was chosen for illustrative purposes, i.e., to present the properties and usefulness of the method, which consists of fitting the returns to fat-tailed distributions. Finance practitioners can also include other distributions and select them for the calculation of the ES if the observed data fit them better.
For the normal distribution, the location (loc) and scale parameters are simply the average and standard deviation of the returns. Concerning the fat-tailed distributions, we fitted them using the maximum likelihood method, minimizing a loss function defined as follows:

\mathcal{L}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \ln f(r_i; \theta) \qquad (2)

where N stands for the number of datapoints (returns) and f(·; θ) is the tested probability density function. Therefore, the lower the loss function, the more accurate the fitting of the probability density function to the dataset. We call θ the set of two, four, or five parameters of the distribution. The minimization of the loss function was carried out using the gradient descent method with step size à la Barzilai–Borwein (Barzilai & Borwein, 1988), including some randomness in the calculations (to reach different local minima of the loss function) and using different initial values of the parameters of the distributions. Further explanations of the fitting to distributions can be found in (García-Risueño et al., 2023; Johnson et al., 1995) or in the shared source code used in our calculations.
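The fitting step can be sketched as follows. The paper's code minimizes the loss with Barzilai–Borwein gradient descent; the sketch below instead relies on scipy's built-in maximum-likelihood fit, and uses a synthetic Student's t series as a stand-in for the real return data. The generalized hyperbolic and Lévy stable fits follow the same pattern via scipy.stats.genhyperbolic and scipy.stats.levy_stable, but are considerably slower.

```python
import numpy as np
from scipy import stats

def loss(dist, params, data):
    """Equation (2): mean negative log-density of the data under the fit."""
    return -np.mean(dist.logpdf(data, *params))

# Synthetic stand-in for a return series (the real inputs are the
# bond/stock returns of Table 1).
rng = np.random.default_rng(0)
returns = stats.t.rvs(df=3, scale=0.01, size=1000, random_state=rng)

candidates = {
    "normal": stats.norm,
    "student_t": stats.t,   # scipy.stats.nct for the non-centered variant
}
losses = {}
for name, dist in candidates.items():
    params = dist.fit(returns)   # built-in MLE fit (not Barzilai-Borwein)
    losses[name] = loss(dist, params, returns)

# The distribution with the lowest loss is kept for the ES calculation.
best = min(losses, key=losses.get)
```

On fat-tailed data such as this, the Student's t loss comes out below the Gaussian one, mirroring the ordering reported in the tables below.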
In
Figure 1, we present an example of the fitting of the analyzed probability density functions to one of the chosen datasets (returns of the BASF bond, in this case); similar plots for the other analyzed products are presented in the Supplementary Information. In
Figure 1, the dark blue bars represent the histogram of the absolute returns of the bond price, and the continuous lines represent the probability density functions. In this case, the loss function is minimal for the generalized hyperbolic distribution; this happens in 3 out of the 4 analyzed products. The non-centered Student's t distribution has the minimal loss function in the remaining case. The parameters from our fitting, as well as their corresponding loss functions, are presented in
Table 2,
Table 3,
Table 4 and
Table 5. Our results indicate that fat-tailed distributions fit the histogram of the data much better than the normal (Gaussian) distribution, which agrees with results from previous research works (
Chen & Chen, 2012). In the
Supplementary Information, we present example Q-Q plots of the analyzed probability distributions for all the analyzed financial products.
We will use the fitted distributions to generate synthetic data, which will be used for inferring properties of the ES of the returns of financial products. The usage of synthetic data has been encouraged by prestigious authors (López de Prado, 2018). In order to avoid excessively extreme values, we establish truncation values for the synthetic data. If a generated random number (which corresponds to an absolute return of a bond) is above +30 or below −30 (note that bond prices are usually measured so that they are about 100 currency units), then the generated synthetic datum is discarded, and another random number is generated. We deem these round values reasonable for bonds; for example, the bond with ISIN US33616CAB63 (First Republic Bank) fell by over 23 price units in a single day in spring 2023. For the stocks of Shell, we set maximum and minimum truncation limits of log(1.4) and log(0.6), which are consistent with historical extreme values of oil companies (+36% for BP on 4 June 1969 and −47% for Marathon Oil Corporation on 3 March 2020). For the stock of Apple Inc., we establish limits of log(0.48) and log(1.33), consistent with its historical extrema. These truncation limits will also be used when we perform numerical integration to calculate the Expected Shortfall of probability density functions (i.e., using Equation (3) below with the truncation limit as the lower integration limit instead of −∞).
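The truncation described above amounts to simple rejection sampling. A minimal sketch, assuming a fitted Student's t distribution as the generator and the ±30 bond limits (the helper name truncated_sample is ours, not from the paper's code):

```python
import numpy as np
from scipy import stats

def truncated_sample(dist, params, size, lo, hi, rng):
    """Draw `size` synthetic returns; values outside [lo, hi] are
    discarded and redrawn, as described for the bond returns (+/-30)."""
    out = np.empty(0)
    while out.size < size:
        draw = dist.rvs(*params, size=size, random_state=rng)
        out = np.concatenate([out, draw[(draw >= lo) & (draw <= hi)]])
    return out[:size]

rng = np.random.default_rng(1)
# (df, loc, scale) as returned by a fit; the bond limits are +/-30.
sample = truncated_sample(stats.t, (3, 0.0, 1.0), 10_000, -30.0, 30.0, rng)
```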
4.2. How the Expected Shortfall from Historical Simulation Depends on the Size of the Dataset
The Expected Shortfall is a risk measure equal to the conditional expectation of a return r (financial loss) given that r is below a specified quantile (Brazauskas et al., 2008). If the return r is considered to be a continuous random variable, then the Expected Shortfall can be defined as follows:

\mathrm{ES}_\alpha = \frac{1}{\alpha} \int_{-\infty}^{r_\alpha} r \, f(r) \, dr \qquad (3)

where f(r) is the probability density function of the return, and r_α is the value of the return such that \int_{-\infty}^{r_\alpha} f(r)\,dr = \alpha, the confidence level α being the number which represents the total probability of the most adverse returns considered in the calculation of the ES. The 1/α multiplicative factor is sometimes omitted from the definition of the ES; see, for instance, Equation (8.52) of (McNeil et al., 2005). Other equivalent definitions exist, e.g.,

\mathrm{ES}_\alpha = \mathbb{E}\left[ \, r \mid r \le -\mathrm{VaR}_\alpha \, \right]

where E[·] indicates expected value, and ES_α and VaR_α are the Expected Shortfall and the Value-at-Risk for confidence level α of the analyzed return. To sum up, the Expected Shortfall provides the average expected loss for the worst-case scenarios (Acerbi & Tasche, 2002), with those scenarios depending on a predetermined threshold (for example, the 5% worst possible losses).
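Equation (3) can be evaluated directly by numerical quadrature once a density is chosen. A sketch follows (the function name es_from_density is ours); for a standard normal, the result can be checked against the closed form −φ(z_α)/α ≈ −2.34 at α = 2.5%:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def es_from_density(dist, params, alpha=0.025, lower=-np.inf):
    """Equation (3): (1/alpha) * integral of r*f(r) from `lower` to the
    alpha-quantile r_alpha. `lower` may be a finite truncation limit."""
    r_alpha = dist.ppf(alpha, *params)
    integral, _ = quad(lambda r: r * dist.pdf(r, *params), lower, r_alpha)
    return integral / alpha

es_normal = es_from_density(stats.norm, (0.0, 1.0))   # approx. -2.34
```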
Despite definition (3), the Expected Shortfall is frequently calculated by considering just observed values of a given return r. For example, the European Banking Authority (European Banking Authority, 2020) specifies the formula that follows for its calculation:

\mathrm{ES}_\alpha = \frac{1}{\lfloor N\alpha \rfloor} \sum_{i=1}^{\lfloor N\alpha \rfloor} r_{(i)} \qquad (4)

(this formula can be found in Peracchi and Tanase (2008)) where the ⌊x⌋ signs indicate the integer part (floor) of x and the indices (i) of r_{(i)} are ordered so that the returns are monotonically increasing; we set α = 0.025 (2.5%). This way of proceeding ignores altogether other non-observed values of the returns. Such values were possible but did not take part in the calculation; hence, potentially important information is discarded. On the one hand, Equation (4) has the advantage that it does not require any assumption on the actual distribution of the returns (i.e., whether it is normal, generalized hyperbolic, etc.). On the other hand, the cost of avoiding such an assumption may be a strong distortion of the calculated Expected Shortfall. The arbitrariness of the choice of the underlying probability distribution can be partly mitigated by analyzing several different ones, as we do in this research work.
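Equation (4) takes only a few lines of code; the sketch below is our own rendering of the formula, with the returns sorted in increasing order so that the ⌊Nα⌋ worst ones are averaged:

```python
import numpy as np

def es_historical(returns, alpha=0.025):
    """Equation (4): average of the floor(N*alpha) lowest (worst) returns."""
    r = np.sort(np.asarray(returns))    # ascending: worst returns first
    k = int(np.floor(r.size * alpha))   # floor(N * alpha) tail observations
    if k == 0:
        raise ValueError("sample too small for the chosen confidence level")
    return r[:k].mean()

# With N = 1000 and alpha = 2.5%, the 25 lowest returns are averaged.
```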
To quantify whether the usage of Historical Simulation (Equation (4)) severely distorts the calculation of the ES, we proceed as follows. We fit the collection of returns of a given financial product for a given time range (see Table 1) to four different probability density functions. Among them, we choose the one whose loss function (Equation (2)) is minimal, i.e., which has the maximum likelihood (see the numbers in bold in Table 3 and Table 4). We then generate synthetic datasets using the parameters of the chosen distribution. Each synthetic dataset consists of N points. For each value of N, we generate half a million synthetic datasets; for each of them, we calculate the ES using Equation (4). We then calculate the mean, median, standard deviation, and 95% confidence interval of this collection of 500k ESs for each N. For the returns of the analyzed BASF bond, these quantities are presented in Figure 2-top. The solid lines of the upper subplots clearly indicate that the average Expected Shortfall tends to increase monotonically with the sample size. It converges to steady values (plateau) for high values of N but requires relatively high numbers of returns to approach it. For example, the mean and median of the ES of the synthetic data are still far from their converged values for small values of N.
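The Monte Carlo experiment can be reproduced at small scale as follows. This is a sketch with our own stand-in distribution, fewer sample sizes, and 2000 rather than 500,000 datasets per size; in this sign convention the ES values are negative, so growth with N shows up as growth in magnitude.

```python
import numpy as np
from scipy import stats

def es_historical(returns, alpha=0.025):
    # Equation (4): average of the floor(N*alpha) worst returns.
    r = np.sort(returns)
    return r[: int(np.floor(r.size * alpha))].mean()

rng = np.random.default_rng(2)
dist, params = stats.t, (3, 0.0, 1.0)  # stand-in for the best-fit distribution

summary = {}
for n in (100, 400, 1600):
    # 2000 synthetic datasets of size n, one HS ES per dataset.
    es = np.array([es_historical(dist.rvs(*params, size=n, random_state=rng))
                   for _ in range(2000)])
    summary[n] = {"mean": es.mean(), "median": np.median(es),
                  "std": es.std(), "ci95": np.percentile(es, [2.5, 97.5])}
```

Plotting these summaries against N gives curves of the kind shown in Figure 2-top.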
Such a monotonic increase is a consequence of the highly nonlinear definition of the Expected Shortfall. For example, if we calculate the average of the synthetic data instead of the ES, we notice that there is no trend with the size of the dataset. This can be viewed in Figure 2-bottom, which presents the mean, median, standard deviation, and confidence interval of the average of each synthetic dataset that was used in the calculations for Figure 2-top. Here, the solid lines are horizontal; the differences between the means of the averages for different values of N are far smaller than the standard deviation (orange curve in Figure 2-top, left).
In the
Supplementary Information, we present graphs similar to
Figure 2, which correspond to the other financial products listed in
Table 1. They all confirm that the average ES monotonically increases with the sample size until it reaches a plateau.
4.3. Expected Shortfall from Fitting to Small Datasets
The results presented in the previous section indicate that the usage of Historical Simulation (i.e., using observed values only) in the calculation of the ES is potentially inaccurate and prone to systematic errors (underestimation of the ES). This drawback can be mitigated by increasing the sample size. However, such a solution may not always be feasible or accurate. Simply increasing the number of returns used in the HS calculation would probably require using older data, which may be stale. The economic conditions, as well as the inner operation of a company, tend to change over time; hence, the obsolescence of data may lead to inaccurate results. Moreover, there exist illiquid products, like many corporate bonds, for which the returns are unknown for many dates. In those cases, one is forced to perform calculations with few returns (datasets of small size), thus leading to potentially severe inaccuracies, as indicated by
Figure 2 (which corresponds to absolute returns of a bond price) and by figures presented in the
Supplementary Information. Can this inconvenience be overcome? In this section, we present a method to mitigate the problem.
Since the inaccuracies in the Expected Shortfall calculation are a consequence of not considering non-observed, though possible, extreme values, the impact of their neglect can be eased by finding an approximation to them. We can infer the probability density function from the analyzed dataset, as indicated above. Even if this dataset consists of a few points, the fitting will provide probabilities for extreme values, which can be used in the calculation of the Expected Shortfall.
We exemplify this procedure with the results displayed in Figure 3. In the plots on top, which correspond to the BASF bond, the dash-dotted lines represent the mean and median of the 2000 values of the ES of different sets of synthetic data using HS (from Equation (
4)). The dashed lines represent the mean and median of the ES obtained by fitting each dataset to a generalized hyperbolic probability density function and then using such continuous functions to calculate the ES. Since the continuous probability density functions are less prone to underestimate the extreme values (
tails) than Historical Simulation, the values of the dashed line are always above the values of the dash-dotted line. These results indicate that if we take the ES of the fitting to the large (whole) observed dataset (gray horizontal line in
Figure 3-top) as our baseline, then fitting to probability distributions provides more accurate results than HS. The effect is especially strong for low values of the sample size.
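The comparison behind Figure 3-top can be sketched in a few lines: fit a small sample, then compute the ES both from the empirical tail (Equation (4)) and by integrating the fitted density (Equation (3)). We use a Student's t fit as a stand-in for the generalized hyperbolic one used in the paper:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def es_historical(returns, alpha=0.025):
    # Equation (4): average of the floor(N*alpha) worst returns.
    r = np.sort(returns)
    return r[: int(np.floor(r.size * alpha))].mean()

def es_fitted(dist, params, alpha=0.025):
    # Equation (3): tail integral of the fitted continuous density.
    r_alpha = dist.ppf(alpha, *params)
    integral, _ = quad(lambda r: r * dist.pdf(r, *params), -np.inf, r_alpha)
    return integral / alpha

rng = np.random.default_rng(3)
sample = stats.t.rvs(df=3, size=120, random_state=rng)  # a small dataset

es_hs = es_historical(sample)                     # uses 3 observed points only
es_fit = es_fitted(stats.t, stats.t.fit(sample))  # uses the whole fitted tail
```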
The analyzed bonds (BASF and Charles Schwab, which are liquid) can be considered proxies of other corporate bonds, which can be illiquid. We prefer to use such liquid bonds (proxies) in our analysis to account for illiquid bonds in order to avoid distortions in the price due to low supply and demand. Such distortions, often noticeable as large bid-ask spreads of illiquid bonds, would increase the complexity of the products and thus obscure the ES-vs-sample-size analysis, whose clear presentation is among the main goals of this paper. For illiquid bonds, just a few returns are known; hence, taking the values of a proxy bond (e.g., BASF's) is expected to give a reasonable account of the modeled product (illiquid bond). However, this is not the case for stocks (shares of companies traded on stock exchanges). Though there exist many small companies whose stocks have very low trading volume (and may not be purchased even once a day), the vast majority of the volume of traded stocks corresponds to very liquid products, which are traded numerous times a day. Therefore, fitting observed data to calculate an accurate ES may, in principle, not seem necessary for stocks, because their daily prices are known. Nevertheless, we think that the fitting procedure is also worthwhile for stocks. This is because the number of days to be used for calculating the ES (N in Equation (4)) is arbitrary. If one wants to calculate risks with a given horizon (e.g., 126 days, the usual number of trading days in a semester), the market conditions may have changed during that time, making the oldest data stale. In that case, it may be more appropriate to choose, e.g., 63 days instead. However, from our previous analysis (Figure 2-top), we know that such a small number of datapoints would distort the ES if calculated through Historical Simulation, leading to a non-converged value (below the plateau). Therefore, a wiser way to proceed would be to use recent data, fit them to a fat-tailed distribution, and finally calculate the ES of that distribution.
In Figure 3-bottom, we present data analogous to those of Figure 3-top, yet with the corresponding calculations performed differently. For every sample size N (x-axis of the figures), we no longer randomly generated N returns of the price of the analyzed product. Instead, we took the last N returns of the stock price. This was carried out for every single trading date of the analyzed time interval (see Table 1). Therefore, each point displayed in Figure 3-bottom is not the average of 2000 trials, but the average of a number of trials equal to the total number of returns of the time interval (5 years) minus N.
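The rolling-window procedure just described can be sketched as follows (the helper rolling_es is our own name; the trial count differs from "total returns minus N" by at most one, depending on the indexing convention):

```python
import numpy as np

def es_historical(returns, alpha=0.025):
    # Equation (4): average of the floor(N*alpha) worst returns.
    r = np.sort(returns)
    return r[: int(np.floor(r.size * alpha))].mean()

def rolling_es(returns, n, alpha=0.025):
    """For every trading date with at least n prior returns, compute the
    HS ES of the last n returns; return the average over dates and the
    number of trials."""
    es = [es_historical(returns[t - n:t], alpha)
          for t in range(n, len(returns) + 1)]
    return float(np.mean(es)), len(es)

# Roughly 5 years of daily returns (about 252 trading days per year).
rng = np.random.default_rng(4)
returns = rng.standard_t(3, size=1260)
mean_es, n_trials = rolling_es(returns, 252)
```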
In Figure 3-bottom, we also present a horizontal gray line that indicates the ES from the distribution fitted to the whole dataset. Note that this is just a reference; it does not need to be equal to the calculated values (blue lines). This is because the ES may abruptly change after sudden strong price drops. For example, if N corresponds to one year of returns, the ES may be 0.05 the first year, 0.01 the second year, 0.03 the third year, etc., and the average of these numbers does not need to be the ES of the whole (5-year) dataset, which may be closer to 0.05 if the strongest price dips are concentrated in the first year.
Fitting a dataset to probability density functions is more computationally demanding than a simple calculation of the ES using Historical Simulation. However, due to the capabilities of present-day computing facilities (
García-Risueño & Ibáñez, 2012), such calculations are affordable. The calculations whose results are presented in this article were performed on a personal laptop (MacBook Pro 13-inch M1 2020, i.e., not on a cluster or other supercomputing facility) and were carried out within a few weeks. The code was run in
Python 3.11, using recent versions of
numpy and
scipy.stats modules (
Harris et al., 2020;
Virtanen et al., 2020).