Forecasts of Value-at-Risk and Expected Shortfall in the Crude Oil Market: A Wavelet-Based Semiparametric Approach

We propose the use of wavelet-based semiparametric models for forecasting the value-at-risk (VaR) and expected shortfall (ES) in the crude oil market. We compared the forecast outcomes across different time scales for three semiparametric models, three nonparametric, distribution-based generalized autoregressive conditional heteroskedasticity (GARCH) models, and three rolling-window models. We found that the GARCH model estimated by Fissler and Ziegel (FZ) zero loss minimization (GARCH-FZ) performs best at forecasting the VaR and ES in the short term, whereas the hybrid model performs best for mid- and long-term time scales. Thus, long-term investors should consider the hybrid model, and short-term investors should employ the GARCH-FZ model in their risk management processes. Overall, our proposed wavelet-based semiparametric models outperform the other models tested for all time scales and market conditions. As such, we suggest that these models be considered for the management of crude oil price risk and in the development of energy policy.


Introduction
Crude oil is one of the most volatile financial assets and one of the most important industrial inputs in the global economy. As such, crude oil risk has been well researched. From an economic perspective, a highly volatile crude oil price can cause great economic uncertainty, as was the case during the great recession due to the sharp increase in crude oil prices. Then, during the 2008 global financial crisis, the sharp decline in crude oil spot prices caused great uncertainty in the financial market. For example, the average annualized standard deviation of the West Texas Intermediate (WTI) crude oil price in our sample period is 39.186, whereas that of the S&P 500 market index is 12.873. (This ratio is based on the authors' own calculations using the sample period of the present study.) In other words, the volatility of crude oil is approximately three times that of the U.S. stock market.
Given this high volatility, managing and forecasting the risk associated with the crude oil market has become increasingly important. For investors, the high level of risk causes extreme price movements, which can damage their portfolios [1]. For policymakers, the highly volatile crude oil market makes the future state of the economy less certain, thus making it more difficult to formulate appropriate policies. Therefore, the main goal of this study was to develop an alternative approach to forecast the crude oil risk, especially in cases of extreme risk over time, thus helping investors and policymakers to quantify such risk.
where $Y_t$ denotes the crude oil return at time $t$; hence, $\mathrm{Quantile}\{Y_s\}_{s=t-m}^{t-1}$ denotes the sample quantile of the returns over the periods $s$ from $t-m$ to $t-1$, and $m$ denotes the window size. In general, $m$ is set to 125, 250, or 500 days. In this study, we employed the rolling-window approach as our benchmark model.
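The rolling-window benchmark can be illustrated with a minimal sketch. The function name `rolling_var_es` is hypothetical, and the ES estimator used here (the mean of the window returns that do not exceed the sample quantile) is an assumption made for illustration, since the exact estimator is not reproduced above.

```python
import numpy as np

def rolling_var_es(returns, window=250, alpha=0.05):
    """Rolling-window VaR and ES forecasts (illustrative sketch).

    VaR at time t is the sample alpha-quantile of the previous `window`
    returns (s = t-m, ..., t-1). ES is taken here as the mean of the
    window returns at or below that quantile (an assumption).
    """
    r = np.asarray(returns, dtype=float)
    n = len(r)
    var = np.full(n, np.nan)
    es = np.full(n, np.nan)
    for t in range(window, n):
        past = r[t - window:t]          # observations t-m, ..., t-1
        q = np.quantile(past, alpha)    # sample quantile of the window
        var[t] = q
        es[t] = past[past <= q].mean()  # average of the tail observations
    return var, es
```

The first `window` entries are left as NaN because no full window is available there; window sizes of 125, 250, and 500 correspond to the three benchmark settings mentioned in the text.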

Wavelet Approach
In addition to the rolling-window approach, we employed the wavelet transform approach to analyze the dynamic VaR and ES values over different time scales. Wavelet analyses have been widely employed to decompose return series by time scale. In particular, we followed the methods of Percival and Walden [25] and Yang et al. [26], employing the maximal overlap discrete wavelet transform (MODWT) to obtain information components across different time scales by decomposing the original series. As a modified version of the discrete wavelet transform (DWT), the MODWT is shift invariant and can be applied to a series of any length.
We decomposed the daily crude oil return series $Y = \{Y_t, t = 1, 2, \ldots, N\}$ into $J$ frequencies. Then, by combining the MODWT wavelet and scaling filters, we obtained the wavelet and scaling coefficients at each level $j$ by filtering $Y$ in a circular manner, as follows:

$$W_{j,t} = \sum_{l=0}^{L_j - 1} h_{j,l}\, Y_{t-l \bmod N}$$

and

$$V_{j,t} = \sum_{l=0}^{L_j - 1} g_{j,l}\, Y_{t-l \bmod N},$$

where $h_{j,l}$ denotes the $j$th-level MODWT wavelet filter, and $g_{j,l}$ denotes the $j$th-level scaling filter, each of width $L_j = (2^j - 1)(L - 1) + 1$.

Energies 2020, 13, 3700

To maximize the number of boundary-free coefficients, we employed the Haar filter to implement the MODWT for out-of-sample forecasts. (We applied the Haar filter mainly because it is reversible, in contrast to other wavelets. By doing so, we were able to make our empirical results more robust by considering data of different frequencies.) For example, $h_{1,0} = 1/2$, $h_{1,1} = -1/2$, $g_{1,0} = 1/2$, and $g_{1,1} = 1/2$ in the initial step. Therefore, $g_{j,l}$ measures the localized moving average, and $h_{j,l}$ measures the deviations from this average. We used the matrix notation, as shown by Percival and Walden [25]:

$$W_j = \omega_j Y$$

and

$$V_j = \upsilon_j Y,$$

where $\omega_j$ and $\upsilon_j$ represent $N \times N$ matrices, with elements of circularly shifted $h_{j,l}$ and $g_{j,l}$, respectively.

To avoid seasonality in the crude oil return series, we used a multi-resolution analysis (MRA) in the DWT. Then, we expressed the crude oil return series as the sum of the smoothed version of the deconstruction and all detail coefficients, in matrix form:

$$Y = \sum_{j=1}^{J} D_j + S_J,$$

where the wavelet detail coefficients at level $j$ are described by $D_j = \omega_j^{\top} W_j$, and the smooth coefficients are described by $S_J = \upsilon_J^{\top} V_J$. Specifically, $D_{j,t}$ captures the variance of the crude oil return series at different levels of $j$, whereas $S_{J,t}$ denotes the smooth part of the crude oil return series. Importantly, we rescaled the wavelet and scaling coefficients to retain the variance-preserving property of the DWT.
Moreover, because the detail and smooth components are associated with zero-phase filters, the features in the crude oil return series, for any sample size, can be aligned with those of the MODWT-based MRA.
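The Haar-based MODWT multi-resolution analysis described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: it assumes the standard pyramid algorithm with circular filtering and the Haar MODWT filters $(1/2, 1/2)$ and $(1/2, -1/2)$ quoted in the text, and the function name `haar_modwt_mra` is hypothetical.

```python
import numpy as np

def haar_modwt_mra(y, levels):
    """Haar MODWT multi-resolution analysis with circular boundaries.

    Returns (details, smooth), where details[j-1] is D_j and smooth is S_J,
    so that details.sum(axis=0) + smooth reproduces y.
    """
    y = np.asarray(y, dtype=float)
    zeros = np.zeros_like(y)

    # Forward pyramid: at level j the Haar filter taps are 2**(j-1) apart.
    v = y.copy()
    wavelet_coeffs = []
    for j in range(1, levels + 1):
        shift = 2 ** (j - 1)
        lagged = np.roll(v, shift)                  # lagged[t] = v[(t - shift) mod N]
        wavelet_coeffs.append(0.5 * (v - lagged))   # W_j: deviation from the average
        v = 0.5 * (v + lagged)                      # V_j: localized moving average

    def inverse_step(w, s, j):
        # One level of the inverse pyramid: rebuilds V_{j-1} from (W_j, V_j).
        shift = 2 ** (j - 1)
        return 0.5 * (w - np.roll(w, -shift)) + 0.5 * (s + np.roll(s, -shift))

    # Detail D_j: invert with every coefficient except W_j set to zero.
    details = []
    for j in range(1, levels + 1):
        x = inverse_step(wavelet_coeffs[j - 1], zeros, j)
        for k in range(j - 1, 0, -1):
            x = inverse_step(zeros, x, k)
        details.append(x)

    # Smooth S_J: invert with all wavelet coefficients set to zero.
    smooth = v
    for k in range(levels, 0, -1):
        smooth = inverse_step(zeros, smooth, k)

    return np.array(details), smooth
```

Because every step is linear and the Haar pair reconstructs perfectly, the details and the smooth add back to the original series for any sample length, which is the additive decomposition the MRA relies on.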
As prior empirical studies suggest, crude oil data have a multi-scale, nonlinear, chaotic nature, which breaks the strict assumptions of the parametric model [27][28][29]. At the same time, the assumptions of the distribution are time- and location-invariant, which means that the nonparametric model results differ from the actual results in practice. By combining nonparametric techniques with parametric approaches, we compromised by using the semiparametric model to strike a balance between relaxing the strict assumptions and improving the estimation efficiency. Indeed, prior studies have discussed and explored the linkages between wavelet analyses and extreme value theory [5,17]. In the following sections, we discuss these two approaches in detail.

Nonparametric Models
For the second type of dynamic model, we modeled the crude oil returns process using a GARCH model, by incorporating a wavelet analysis. Specifically, we employed the autoregressive moving average (ARMA) (p, q) process to model the mean equation for both the original crude oil returns ($Y_t$) and their wavelet details ($D_{j,t}$), which we can specify as the following:

$$R_t = \varphi_0 + \sum_{i=1}^{p} \phi_i R_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \varepsilon_t,$$

where $\phi_i$ and $\theta_i$ denote the autoregressive parameter and the moving average parameter, respectively, $\varepsilon_t$ is the white noise error term, $\varphi_0$ is the conditional mean, and $R_t$ is a simplified notation representing both the original crude oil return series ($Y_t$) and its wavelet detail series ($D_{j,t}$). The optimal p- and q-values were selected based on the Bayesian information criterion (BIC). Following Bollerslev [30], we used the standard GARCH(1, 1) process to model the daily volatility in risk measurement forecasting:

$$\sigma_t^2 = \omega + \beta \sigma_{t-1}^2 + \gamma \varepsilon_{t-1}^2,$$

where $\beta$ denotes the persistence parameter. In particular, we used three common types of distribution: the normal distribution, the skew-t distribution [31], and an empirical distribution.
The last of these is an empirical distribution function based on a filtered historical simulation [20]. Following Equation (1), we were able to specify the VaR and ES as risk measures for a given time horizon, as follows: where $z_\alpha$ is the left-tailed quantile at value $\alpha = 0.05$. The conditional distribution $F_t$ is based on $R_t$.
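The distribution-based forecasts can be illustrated with a short sketch. It assumes a zero conditional mean (the ARMA step is omitted), takes the GARCH(1, 1) parameters as given rather than estimating them, and covers only the normal and empirical (filtered historical simulation) cases; all function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def garch11_variance(eps, omega, beta, gamma, sigma2_init=None):
    """Conditional variance recursion: s2[t] = omega + beta*s2[t-1] + gamma*eps[t-1]**2."""
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.empty(len(eps))
    # Assumption: start the recursion at the sample variance unless told otherwise.
    sigma2[0] = np.var(eps) if sigma2_init is None else sigma2_init
    for t in range(1, len(eps)):
        sigma2[t] = omega + beta * sigma2[t - 1] + gamma * eps[t - 1] ** 2
    return sigma2

def normal_var_es(sigma, alpha=0.05):
    """VaR and ES for zero-mean returns with normal standardized residuals."""
    nd = NormalDist()
    z = nd.inv_cdf(alpha)                    # left-tail quantile z_alpha
    var = sigma * z
    es = -sigma * nd.pdf(z) / alpha          # mean of the normal tail below z_alpha
    return var, es

def empirical_var_es(sigma, eta, alpha=0.05):
    """VaR and ES from the empirical distribution of standardized residuals eta."""
    eta = np.asarray(eta, dtype=float)
    a = np.quantile(eta, alpha)              # empirical alpha-quantile
    b = eta[eta <= a].mean()                 # empirical tail mean
    return sigma * a, sigma * b
```

In practice the standardized residuals `eta` would be obtained as `eps / np.sqrt(sigma2)` from the fitted model; the skew-t case is left out here because it requires a density that is not part of the standard library.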

Semiparametric Models
Furthermore, we employed the GAS model proposed by Creal et al. [16] and Yuan and Yang [32] to update the current value of the variable from its lagged information. We considered three dynamic models for the comparison. Moreover, to better compare our forecasting results, we only considered the parameter structure with a single variable, because volatility can serve as a useful time-varying risk measure in risk forecasting. The first model is the one-factor GAS model driven by a single variable, $\kappa_t$:

$$\varphi_t = a \exp(\kappa_t), \qquad e_t = b \exp(\kappa_t),$$

where $\varphi_t$ denotes $\mathrm{VaR}_t$, and $e_t$ denotes $\mathrm{ES}_t$. Following Creal et al. [16] and Patton et al. [33], we were able to specify the evolution equation using the FZ loss function, as follows:

$$\kappa_t = \omega + \beta \kappa_{t-1} + \gamma H_{t-1}^{-1} s_{t-1},$$

where $s_{t-1}$ is the score of the FZ loss function with respect to $\kappa_{t-1}$ and $H_{t-1}$ is the corresponding scaling term. The FZ loss function only identifies $\varphi_t$ and $e_t$, whereas the generalized autoregressive process determines the parameters $\omega$, $\beta$, and $\gamma$. For simple identification, we set $\omega$ to zero.

The second model that was used was the GARCH model with FZ loss minimization. Importantly, if the true conditional distribution is known, then the ARMA−GARCH model performs best. Otherwise, the model's performance can be improved by estimating its parameters using FZ loss minimization. Similarly to Equation (7), $R_t$ is the product of the conditional standard deviation $\sigma_t$ and the standardized innovation $\eta_t$, where $\sigma_t$ follows the GARCH process and $\eta_t \sim \mathrm{iid}\ F_\eta(0, 1)$. We obtained an analogous structure to the one-factor GAS model and were able to write the dynamic VaR and ES as the following:

$$\varphi_t = a\,\sigma_t, \qquad e_t = b\,\sigma_t.$$

Therefore, we only needed to estimate the parameter vector $(\beta, \gamma, a, b)$. This approach allowed us to obtain VaR and ES forecasts, rather than volatility forecasts, with the best fit.
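The final model was a direct combination of the GAS and GARCH models, which we refer to as the hybrid model, specified as follows:

$$\varphi_t = a \exp(\kappa_t), \qquad e_t = b \exp(\kappa_t),$$

$$\kappa_t = \omega + \beta \kappa_{t-1} + \gamma H_{t-1}^{-1} s_{t-1} + \delta \log |R_{t-1}|,$$

where the latent variable $\kappa_t$ is the log of volatility, and $s_{t-1}$ and $H_{t-1}$ denote the score of the FZ loss function and its scaling term, as in the one-factor GAS model. To ensure that the units remain stable in the evolution equation, we used the lagged log absolute return instead of the lagged squared return. Five parameters ($\beta$, $\gamma$, $\delta$, $a$, and $b$) must be estimated in this model.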

Estimation Methodology
In this section, we describe the estimation of the dynamic VaR and ES measures following the FZ approach [13]. Indexed by the functions $G_1$ and $G_2$, the scoring function proposed by FZ [13] is consistent for VaR and ES; thus, we used this to address the issue of non-elicitability in the ES measure. Based on these loss functions, the scoring functions provide the true VaR and ES values only by minimizing the expected loss; hence, we were able to construct the following FZ loss functions:

$$L_{FZ}(Y, \varphi, e; \alpha, G_1, G_2) = \left(\mathbf{1}\{Y \le \varphi\} - \alpha\right)\left(G_1(\varphi) - G_1(Y) + \frac{1}{\alpha} G_2(e)\,\varphi\right) - G_2(e)\left(\frac{1}{\alpha}\mathbf{1}\{Y \le \varphi\}\,Y - e\right) - \mathcal{G}_2(e),$$

where $G_1$ is increasing, $G_2$ is increasing and positive, and we have $\mathcal{G}_2' = G_2$. The VaR and ES were derived by minimizing any member of the class:

$$(\mathrm{VaR}_t, \mathrm{ES}_t) = \arg\min_{(\varphi, e)} E_{t-1}\left[L_{FZ}(Y_t, \varphi, e; \alpha, G_1, G_2)\right].$$

To obtain the minimized value of Equation (17), we selected $G_1$ and $G_2$ under the FZ loss function. We first defined the loss difference for the two forecasts $(\varphi_{1,t}, e_{1,t})$ and $(\varphi_{2,t}, e_{2,t})$ as $\Delta L_{FZ}(Y, \varphi_{1,t}, e_{1,t}, \varphi_{2,t}, e_{2,t}) = L_{FZ}(Y, \varphi_{1,t}, e_{1,t}) - L_{FZ}(Y, \varphi_{2,t}, e_{2,t})$. Following Patton and Sheppard [34] and Patton et al. [33], we chose $G_1$ and $G_2$ such that the resulting loss function generates a loss difference $\Delta L_{FZ}$ that is homogeneous of degree zero.
After identifying the uniqueness of the FZ zero (FZ0) loss function, we propose the semiparametric dynamic ES and VaR measures:

$$(\varphi_t, e_t) = \left(\varphi(Z_{t-1}; \theta),\ e(Z_{t-1}; \theta)\right),$$

where the information set $Z_{t-1} \in \mathcal{F}_{t-1}$ contains the elements of the specified parametric functions for the true VaR and ES. We estimated the model parameters using the following:

$$\hat{\theta}_T = \arg\min_{\theta} \frac{1}{T} \sum_{t=1}^{T} L_{FZ0}\left(Y_t, \varphi(Z_{t-1}; \theta), e(Z_{t-1}; \theta); \alpha\right).$$

As Patton et al. [33] suggested, the asymptotic theory supports the effective estimation of $\theta$ in both linear and nonlinear models. Although Equation (12) imposes parametric structures on the lagged information, the dynamic VaR and ES in Equation (1) are free of regularity conditions and assumptions on the conditional distribution. In other words, these models are semiparametric and thus can be used in comparisons with the nonparametric models detailed in Section 2.3.
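To make the estimation criterion concrete, the following sketch evaluates the FZ0 loss. It assumes the closed form reported by Patton et al. [33], obtained from the general FZ class with $G_1(x) = 0$ and $G_2(x) = -1/x$, namely $L_{FZ0}(Y, \varphi, e; \alpha) = -\frac{1}{\alpha e}\mathbf{1}\{Y \le \varphi\}(\varphi - Y) + \frac{\varphi}{e} + \log(-e) - 1$; the function names are hypothetical.

```python
import numpy as np

def fz0_loss(y, var, es, alpha=0.05):
    """Elementwise FZ0 loss for returns y and forecasts (var, es); requires es < 0."""
    y, var, es = (np.asarray(x, dtype=float) for x in (y, var, es))
    hit = (y <= var).astype(float)           # indicator of a VaR violation
    return -hit * (var - y) / (alpha * es) + var / es + np.log(-es) - 1.0

def average_fz0_loss(y, var, es, alpha=0.05):
    """Sample average of the FZ0 loss, the quantity minimized over the parameters."""
    return float(np.mean(fz0_loss(y, var, es, alpha)))
```

Rescaling the return and both forecasts by a constant $c > 0$ shifts the loss by $\log c$ for every forecast alike, so loss differences between two competing forecasts are unchanged; this is the degree-zero homogeneity that motivates the choice of $G_1$ and $G_2$.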

Goodness-of-Fit Test
To identify the performance of the VaR and ES measures for the different types of models, we considered a goodness-of-fit test for the out-of-sample forecasts. Given the correct specifications for the VaR and ES measures, this is as follows: The variables $\varphi_t$ and $e_t$ are strictly negative; hence, $E_{t-1}[\lambda_{\varphi,t}]$ and $E_{t-1}[\lambda_{e,t}]$ must equal zero to satisfy Equation (21). Therefore, we considered $\lambda_{\varphi,t}$ and $\lambda_{e,t}$ to represent an expression of the "generalized residual" in the model. We took the standardized forms of $\lambda_{\varphi,t}$ and $\lambda_{e,t}$ by eliminating the effect of serial correlation due to the persistence of $\varphi_t$ and $e_t$, as follows: These standardized generalized residuals are also i.i.d. (0,1) under the correct specifications. As Christoffersen [35] and Engle and Russell [36] showed, the standardized residual for the VaR simplifies to a well-known test for the VaR. Moreover, we employed the dynamic quantile test detailed by Engle and Manganelli [18] to select the models with the best fit in our study. We examined the following regression: where $a = [a_0, a_1, \ldots]$ are the regression parameters, and $u_{\varphi,t}$ and $u_{e,t}$ are the residuals of the regression. We employed the common two-sided alternative to examine the forecast optimality under the null hypothesis that all parameters in these regressions are zero.
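The VaR part of such a regression-based check can be sketched in the spirit of the dynamic quantile test of Engle and Manganelli [18]. This is an illustration only: it covers the VaR hit sequence and not the ES residual, the choice of regressors (a constant, the lagged hit, and the current VaR forecast) is an assumption, and the function name `dq_statistic` is hypothetical.

```python
import numpy as np

def dq_statistic(y, var_forecast, alpha=0.05):
    """Dynamic-quantile-style statistic for a sequence of VaR forecasts.

    Regresses the demeaned hit, 1{y_t <= VaR_t} - alpha, on a constant,
    its own lag, and the VaR forecast. Under correct specification all
    coefficients are zero and the statistic is approximately chi-squared
    with 3 degrees of freedom.
    """
    y = np.asarray(y, dtype=float)
    v = np.asarray(var_forecast, dtype=float)
    hit = (y <= v).astype(float) - alpha
    h = hit[1:]                                              # dependent variable
    X = np.column_stack([np.ones(len(h)), hit[:-1], v[1:]])  # regressors
    coef, *_ = np.linalg.lstsq(X, h, rcond=None)             # OLS coefficients
    stat = float(h @ X @ coef) / (alpha * (1.0 - alpha))     # h'X (X'X)^-1 X'h / (a(1-a))
    return coef, stat
```

A large statistic signals that violations are too frequent, too rare, or predictable from the regressors; converting it into a p-value requires the chi-squared distribution function, which is omitted here to keep the sketch dependency-free.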

Data
The data used in our study were the WTI and Brent crude oil daily prices from 4 January 1988 to 28 December 2018 (this is presumably the longest series of daily data for the crude oil prices in the database), and the same period was used for both to provide a fair comparison. After deleting the nontrading days, our sample consisted of 7709 observations for the WTI crude oil price and 7641 observations for the Brent crude oil price. Table 1 presents the descriptive statistics of the crude oil price log returns. The mean return for the WTI crude oil price was lower than that for the Brent crude oil price for the full sample. However, the standard deviations, skewness, and kurtosis were greater for WTI crude oil returns than for Brent crude oil returns, which indicates that extreme returns are more likely to occur for the WTI crude oil prices. The negative skewness also suggests a greater tendency towards extreme negative returns in the crude oil market. Furthermore, we employed the first 12 years as the data for the in-sample estimations for our models. After deleting the nontrading days, the in-sample data from 4 January 1988 to 30 December 1999 included 2962 and 2875 observations for the WTI and Brent crude oil returns, respectively. Therefore, the out-of-sample period was 4 January 2000 to 28 December 2018. (We followed the methods used in the study of Patton et al. [33] to separate the sample.) We do not present the out-of-sample periods because we only used them to validate and compare our models. The statistics for the in-sample periods were similar to those for the full-sample period, except for the higher returns on the WTI crude

VaR and ES for the Raw Data
We present the estimated parameters (January 1988 to December 1999) for the distribution-based model in Panel A of Table 2, which were determined using a stepwise estimation methodology. First, we estimated the mean equation based on the ARMA process. Then, we estimated the GARCH process. Finally, we estimated the parameters using the standardized residuals. The empirical results are reported in a similar manner. In the first panel of Table 2, the estimations for the ARMA(p, q) model are provided. We selected the optimal choices of p and q using the BIC, yielding zero for p and three for q for both crude oil returns. Moreover, the optimal ARMA model included a constant value that was considered as the unpredictability of the crude oil returns. After obtaining the innovations from the ARMA model, we determined the estimation results of the GARCH(1, 1) model, and these are provided in the second panel. We observed high persistence in the GARCH processes. The estimated β parameters were equal to 0.878 and 0.883, which are quite similar to those presented in the majority of previous studies [9,10]. The last row contains the parameters for the skew-t distribution with significant skewness, consistent with the work of Lyu et al. [10].
Similarly, we report the empirical results of the three ES and VaR measures based on the one-factor models in Panel B of Table 2. The empirical results for the one-factor GAS model are provided in the first column. The second and third columns show the GARCH model estimated using FZ0 loss minimization and the hybrid model. Specifically, we constructed the hybrid model by incorporating a GARCH-type forcing variable (σ) into an augmented one-factor GAS model. In the last row, we provide the in-sample average loss for each specified model. The GARCH forcing variable in the hybrid model was significantly different from zero, which indicates that the GARCH process is important for modeling the crude oil VaR and ES values. Moreover, the persistence parameter β was found to be significant, with a value close to one, implying similar persistence processes to the GARCH model.
To save space, we report a summarized version of the descriptive statistics for the in-sample VaR and ES when α = 0.05. We provide all data for the VaR and ES after identifying the model with the best fit. We present the empirical results for the nonparametric model in Panel A of Table 3, which shows a slight difference between the means and standard deviations for VaR and ES. However, the skewness and kurtosis values are shown to be the same. Moreover, a higher risk is presented for the WTI crude oil than for the Brent crude oil, as indicated for both the VaR and the ES.

Table 2. Estimations of the autoregressive moving average (ARMA)−generalized autoregressive conditional heteroskedasticity (GARCH)−skew-t model and the generalized autoregressive score (GAS) one-factor models.

Note: ... for three GAS models of the WTI and Brent crude oil over the in-sample period from January 1988 to December 1999. The table presents the results of the one-factor GAS model (GAS-1F), a GARCH model estimated by Fissler and Ziegel zero (FZ0) loss minimization (GARCH-FZ), and the hybrid one-factor model. AL denotes the average loss. ***, **, and * denote significance at the 1%, 5%, and 10% levels, respectively.
Panel B provides the descriptive statistics for the in-sample VaR and ES when α = 0.05, based on the semiparametric models. All the descriptive statistics were found to differ across the models and were quite different from those in Panel A. Moreover, for both the WTI and Brent crude oil returns, the values of the descriptive statistics for the parameter-driven (non-distribution-based) model were slightly lower than those of the non-parameter-driven (distribution-based) model.
In this section, we describe the use of out-of-sample forecasting to validate the fit of the distribution- and non-distribution-based models. For simplicity, we only discuss the results when α = 0.05. We used three types of models to forecast the VaR and ES: the rolling-window, distribution-based, and semiparametric dynamic models. We considered the rolling-window method (125, 250, and 500 days) as the benchmark case. We employed the ARMA−GARCH model as the distribution-based model. To forecast the VaR and ES nonparametrically, we used the normal distribution, the skew Student's-t distribution, and an empirical distribution based on the estimated standardized residuals. The semiparametric dynamic ES and VaR measures used were the one-factor GAS model, the GARCH model estimated using FZ0 loss minimization, and the hybrid GAS/GARCH model. As shown in the previous section, we estimated the parameters of these models to forecast the VaR and ES based on a 12-year sample period. In Table 4, we show the average loss of the model forecasts based on the FZ0 loss function. The bold text highlights the lowest value and the italics denote the second-lowest value in each column. For the VaR and ES, GARCH-FZ was found to perform best for both the WTI and the Brent crude oils. Moreover, GARCH-SKT and GAS-1F performed second-best for the WTI and Brent crude oil prices, respectively. The worst-performing model was the rolling-window model (500 days). In addition, the p-values from the goodness-of-fit test presented in the second and third columns of Table 4 were used to evaluate the VaR and ES forecasts. Here, entries greater than 0.1 are shown in bold, and those between 0.05 and 0.1 are shown in italics. For the WTI crude oil, four models passed the test for the VaR, whereas three models passed the test for the ES. In contrast, only one dynamic model passed the test for the VaR and ES for the Brent crude oil.
Unfortunately, the three rolling-window models and the hybrid models failed in all cases. The GARCH-FZ model performed the best, as usual, in the goodness-of-fit test. To further confirm our results, we determined the Diebold−Mariano (DM) t-statistics for the VaR and ES of the WTI crude oil on the loss difference, using the "row model minus column model" calculation, and these are shown in Table 5. A positive result indicates that the row model underperformed compared to the column model. As Table 5 shows, the GARCH-FZ model outperformed all competing models: its entries were all positive and were significant relative to the three rolling-window, GARCH-N, and hybrid models. The exceptions were the GARCH-SKT, GARCH-EDF, and GAS-1F models, with DM t-statistics of 1.382, 1.637, and 0.678, respectively. The DM t-statistics for the Brent crude oil VaR and ES values were similar.

Table 5. The Diebold−Mariano t-statistics of the average out-of-sample loss differences (α = 0.05).

Note: ... December 2018 for 10 forecasting models. A positive value indicates that the row model has a higher average loss than that of the column model. Absolute values greater than 1.96 indicate that the average loss is significantly different from zero at the 95% confidence level. The values along the main diagonal are all zero and are omitted for interpretability. The first three rows correspond to the rolling-window forecasts, the next three rows to the GARCH forecasts based on different models for the standardized residuals, and the last three rows correspond to the VaR and ES GAS model forecasts.

Finally, we provide the fitted 5% ES and VaR values for the three types of models by selecting the best-performing model in the subgroup, that is, the rolling-window model (125 days), the GARCH-SKT model, and the GARCH-FZ model. As Figures 2 and 3 show, the WTI crude oil VaR and ES values were found to be more volatile than those of the Brent crude oil. The most extreme ES values were observed during the Gulf War, with an estimated 35% decrease for the WTI crude oil and a 30% decrease for the Brent crude oil. The second most extreme ES values occurred during the 2008 global financial crisis, with a decrease of approximately 15% for the WTI crude oil and a 10% decrease for the Brent crude oil, which are similar to the findings of Do and Bhatti [37], Wu et al. [38] and Parker and Bhatti [39].
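The "row model minus column model" comparison behind the Diebold-Mariano t-statistics can be sketched as follows. This minimal version uses the plain sample variance of the loss differences; whether a serial-correlation-robust variance estimator should be used instead is not stated in the text, so the unadjusted form is an assumption, and the function name `dm_tstat` is hypothetical.

```python
import numpy as np

def dm_tstat(loss_row, loss_col):
    """Diebold-Mariano t-statistic on per-period losses (row minus column).

    A positive value means the row model has the higher average loss,
    i.e. it underperforms the column model.
    """
    d = np.asarray(loss_row, dtype=float) - np.asarray(loss_col, dtype=float)
    n = d.size
    return float(d.mean() / (d.std(ddof=1) / np.sqrt(n)))
```

Here the inputs would be the per-period FZ0 losses of two competing forecasts over the out-of-sample period; absolute values above 1.96 correspond to the 95% threshold used in the table notes.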

VaR and ES for the Wavelet
By following the same procedures for the VaR and ES as those of the original crude oil return, we considered the VaR and ES for the wavelet. Specifically, we decomposed the original crude oil return into eight wavelet details (i.e., D1 to D8). We considered the time scales from D1 to D4 as the short-term scale, from D5 to D6 as the mid-term scale, and from D7 to D8 as the long-term scale. (Specifically, D1 denotes a 2 day scale, D2 denotes a 4 day scale, D3 denotes an 8 day scale, D4 denotes a 16 day scale, D5 denotes a 32 day scale, D6 denotes a 64 day scale, D7 denotes a 128 day scale, and D8 denotes a 256 day scale.) For simplicity, we report the average loss of the model forecasts based on the FZ0 loss function for these wavelet details in Table 6. Panel A summarizes the average loss for the WTI crude oil, and Panel B summarizes the average loss for the Brent crude oil. The lowest values in each column are denoted in bold. Panel A shows that the GARCH-FZ model is the best choice for D1 and D2; the GARCH-EDF model is the best choice for D3; the GAS-1F model is the best choice for D5, D6, and D7; the hybrid model is the best choice for D4 and D8.
Panel B shows that the GARCH-FZ model is the best choice for D1 to D4; the hybrid model is the best choice for D5, D7, and D8; the GAS-1F model is the best choice for D6. Overall, regardless of whether we examined the WTI or the Brent crude oil, the semiparametric models significantly outperformed the nonparametric models.
Similarly, the DM t-statistics for the VaR and ES of the WTI and Brent crude oils are presented in Tables 7 and 8. The DM t-statistics were found to be positive for the models with the best fit, as discussed above. For example, the GARCH-FZ model contains all positive entries for D1 and D2, implying that it outperformed all the competing models. This is significant when comparing the three parametric GARCH-N models. The GARCH-EDF model contains all positive entries for D3, with the only exception being the hybrid model, with a DM t-statistic of 0.508. We obtained the same results for the DM t-statistics of the Brent crude oil VaR and ES. Overall, we confirmed that the semiparametric model performs better than the nonparametric models.
Note: The left panel of this table presents the average losses using the FZ0 loss function for the WTI and Brent crude oil return series over the out-of-sample period from January 2000 to December 2018 for 10 different forecasting models. The lowest average loss in each column is highlighted in bold; the second lowest is highlighted in italics. The first three rows correspond to the GARCH forecasts based on different models for the standardized residuals, and the last three rows correspond to the VaR and ES GAS model forecasts.
Table 7. The Diebold-Mariano t-statistics for the average out-of-sample loss differences (α = 0.05, WTI). Columns: GCH-N, GCH-SKT, GCH-EDF, GAS-1F, GCH-FZ, Hybrid.
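A minimal sketch of a Diebold-Mariano t-statistic for a pair of models is given below. Two assumptions are made that this section does not state: the loss difference is taken as row model minus column model, so that a positive value means the column model has the lower average loss, and the long-run variance uses a Newey-West estimator with Bartlett weights. The function name dm_tstat and the lags parameter are hypothetical.

```python
import numpy as np

def dm_tstat(loss_row, loss_col, lags=0):
    """Diebold-Mariano t-statistic for equal average loss.

    d_t = loss_row_t - loss_col_t (assumed convention): a positive statistic
    means the column model has the lower average loss.
    lags = 0 uses the plain sample variance of d_t; lags > 0 adds
    Bartlett-weighted autocovariances (Newey-West).
    """
    d = np.asarray(loss_row, dtype=float) - np.asarray(loss_col, dtype=float)
    n = d.size
    u = d - d.mean()
    lrv = u @ u / n  # lag-0 autocovariance
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)
        lrv += 2.0 * weight * (u[k:] @ u[:-k]) / n
    return d.mean() / np.sqrt(lrv / n)
```

The inputs would be the per-period FZ0 losses of two models over the same out-of-sample dates; swapping the two arguments flips the sign of the statistic.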
the WTI crude oil, and Figure 5 plots the forecasts for the Brent crude oil. The wavelet details for the WTI crude oil VaR and ES were found to be more volatile than those for the Brent crude oil. Moreover, the volatility of the VaR and ES decreased as the time scale increased. The results are similar to those of Fernandez [7]; that is, the variance in the VaR and ES stems mainly from the short-term wavelet components. Moreover, the GARCH-FZ model was found to be driven only by information when the VaR was violated, which indicates a deterministic reversion to the long-run mean on the following day. Thus, the behaviors of the VaR and ES for the GARCH-FZ model are smoother than those of the other models.

Further Robustness Tests
To ensure the robustness of our results, we considered the average loss across four different tails: α values of 0.10, 0.05, 0.025, and 0.01. We used this procedure to investigate the performance of the models in their response to changes based on the depths of the tails (see Table 9). For the original WTI crude oil series, the best-performing model was the GARCH-FZ model, followed by the GARCH-SKT model, across different values of α. In other words, the GARCH estimated by the FZ0 loss minimization parametrically outperformed the GARCH estimated by the nonparametric residuals. However, for the Brent crude oil, the best-performing model was the GARCH-FZ, followed by GAS-1F, when α was equal to 0.05, 0.025, and 0.01. For α = 0.10, the GAS-FZ model outperformed the GARCH-FZ model, which occurred because the GAS model depends on the observed VaR violation.
When the values of α were very small, these violations rarely occurred. In contrast, the squared residual from the GARCH model provided information flows to move the risk measures, regardless of the VaR violation. If the values of α were not too small, then the GAS model with the FZ0 loss function began to perform well. This is likely because the Brent crude oil price data are less volatile than the WTI crude oil price data. Overall, we took the average of the ranks in the final column. Clearly, the distribution-based GARCH model is always the second or lower choice.
Since the out-of-sample period ran from January 2000 to December 2018, it included the 2008 global financial crisis, which presented a significant structural change in VaR and ES forecasting. To address this structural break, we further examined how the performance of the models responded to changes in the depth of the tails before and after the 2008 global financial crisis (see Table 10). For simplicity, we only looked at the forecast results based on the WTI price. As shown in Table 10, although there were slight changes in the rankings of the individual models, the average rankings remained the same. In this sense, our model provided a consistent forecast.
To further explore the performance differences as the time scale changed, we also determined the performance rankings for both the nonparametric models and the semiparametric models. Once again, the semiparametric models always outperformed the nonparametric models. The distribution-based GARCH model was always the second or lower choice for the short-term scales. Specifically, the GARCH-FZ model performed the best for the short-term scales, whereas the hybrid model performed the best for the long-term scales.
Moreover, as the time scale increased, the semiparametric model estimated using FZ0 loss minimization parametrically outperformed the GARCH model estimated using nonparametric residuals, regardless of the type of model. Since the short-term wavelet components are more volatile than the long-term wavelet components, the hybrid model began to perform well for the mid- and long-term scales. In addition, we also identified a similar forecast performance by considering two market regimes across the time scales, as shown in Table 10. Only the short-term scale exhibited variations in the performance rankings of the individual models, and these did not change the average rankings. For example, for the D6 component of the WTI crude oil, the hybrid model was ranked first, followed by the GAS-1F and GARCH-FZ models. Moreover, as the time scale increased, changes in rank across the values of α rarely occurred for these models, especially for the semiparametric models. At the same time, the wavelet components for the Brent crude oil exhibited similar patterns to those for the WTI crude oil. The results also indicate that the semiparametric models are the best choice when estimating the VaR and ES, regardless of the time scale.
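The rank averaging used in the robustness tables can be sketched as follows: each model is ranked by its average loss within every α column (1 = lowest loss), and the ranks are then averaged across columns. The function name average_ranks is hypothetical, the numbers in the usage lines are made-up placeholders rather than values from Table 9 or Table 10, and ties are not handled.

```python
import numpy as np

def average_ranks(avg_loss):
    """Rank models within each alpha column and average the ranks.

    avg_loss: 2-D array, rows = models, columns = alpha levels.
    Returns (ranks, mean_rank); rank 1 is the lowest average loss.
    Ties are broken by row order (an assumption of this sketch).
    """
    loss = np.asarray(avg_loss, dtype=float)
    ranks = loss.argsort(axis=0).argsort(axis=0) + 1
    return ranks, ranks.mean(axis=1)

# Placeholder losses for three hypothetical models at three alpha levels.
demo_loss = [[0.9, 0.8, 1.0],
             [0.7, 0.6, 0.9],
             [1.1, 0.9, 0.8]]
demo_ranks, demo_mean_rank = average_ranks(demo_loss)
```

In this placeholder example the second model has the lowest mean rank, even though the third model is ranked first in the last column, which mirrors how the average rank can stay stable while individual rankings shift.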

Conclusions
Understanding the risk of the crude oil market across time scales is essential for risk management and asset pricing. Conventional analyses of market risk rely on an assumed conditional distribution of returns. However, the accuracy of this approach depends on choosing the correct distribution and assumptions. To overcome the problem of distribution-dependent forecasts of the VaR and ES, we developed a wavelet-based approach built on a semiparametric methodology that does not depend on distributional assumptions. Moreover, we drew on statistical decision theory to jointly model the dynamic ES and VaR values in order to address the problem of elicitability for ES. We proposed three types of dynamic VaR and ES models based on parametric structures and compared their performance with that of the nonparametric forecast models. Our study relaxed the strict distributional assumption of the nonparametric distribution-based model, yielding more robust results that will help financial practitioners, energy policymakers, and energy economists to improve their forecasts of VaR and ES in the crude oil market.
In the empirical analysis, we employed a full sample of the daily WTI and Brent crude oil prices from January 1988 to December 2018 to investigate whether the semiparametric models outperform the nonparametric models in terms of forecasting the VaR and ES in the crude oil market for different time horizons. We used the first 12 years (before 2000) to estimate the parameters and the last 19 years (from 2000 onward) to forecast the VaR and ES under different confidence levels of α for different time scales. The empirical results show that the VaR and ES forecasts based on the GARCH-FZ model outperform those based on the other models for the original crude oil return series. For the wavelet-based VaR and ES forecasts, we found that the GARCH-FZ model performs best in the short term, whereas the hybrid model performs best in the mid and long term. Moreover, as the time scale increases, the semiparametric models outperform the nonparametric distribution-based model, and market risk in the crude oil market decreases. Overall, we found that our proposed semiparametric model outperforms the nonparametric model, regardless of the time scale.
Our findings have several implications. Firstly, investors should choose their distributions or investment horizons carefully when building their risk forecast models. A semiparametric model provides more robust results in the crude oil market than a parametric model, regardless of the time scale. It also improves the accuracy of risk and portfolio management for different time horizons. Secondly, by introducing a wavelet analysis into the semiparametric VaR and ES measures, our proposed models allow risk managers to evaluate extreme crude oil risk under different investment horizons in highly volatile environments, using the GARCH-FZ model in the short term and the hybrid model in the long term. The ES measure is especially effective for measuring extreme risk in the crude oil market. Thirdly, with fewer assumptions about the conditional distribution in the VaR and ES model, policymakers can create appropriate crude oil policies, because they can more easily model the influence of crude oil risk on the real economy. Specifically, using an appropriate energy policy to stabilize oil price fluctuations would alleviate the negative effects of oil shocks on the economy.