Influencing Factors Analysis of Crude Oil Futures Price Volatility Based on Mixed-Frequency Data



Introduction
The quantity and quality of statistical data are the cornerstones of macroeconomic model construction, estimation, and forecasting. Extracting more information from economic data has always been a worthwhile goal. As a commodity, crude oil plays an important role in the real economy and the financial market. As a major source of energy and of raw materials for many industrial and agricultural sectors, it has also gradually become an investment instrument and can even affect the business cycle through the financial market [1].
The commodity boom that began in 2004 saw international oil prices rise steadily until 2008, accompanied by an increase in the frequency and magnitude of oil price fluctuations. When the subprime crisis broke out in 2008, speculative funds left the market one after another and the dollar strengthened, which caused commodity prices around the world to fall sharply at the same time. The price of crude oil dropped by 70% from its peak in the latter half of 2008, far exceeding the 40% decline in copper, corn, and soybean futures, which indicates that oil prices are more volatile. After the financial crisis, the oil price fluctuated more and reacted more strongly to external factors. The psychological expectations of market participants and the presence of speculation widened the range of oil price fluctuations, and the resulting risks in the crude oil spot and futures markets also grew [2].
Volatility shocks in the crude oil price have an important impact on all markets [3,4]. Measuring and predicting volatility is also of great significance for asset pricing and risk management, and the price volatility of commodities, and changes in it, also affect the pricing of commodity derivatives and the construction of the optimal hedge ratio [5]. The factors behind these fluctuations have therefore become a focus of research. Simple fundamental factors can no longer fully explain the volatility of oil prices. Existing studies have explored some factors that drive the volatility of crude oil prices and crude oil futures prices. These factors include OPEC's market and demand structure and demand elasticity [6], OPEC and U.S. Strategic Petroleum Reserve (SPR) announcements for the crude oil spot and futures markets [7], the information efficiency of crude oil inventory positions [8,9], speculative factors [10], irregular extreme events [2], natural disasters [11], geopolitical risks (GPR) [12], and even the "long memory" of oil price fluctuations [13]. For oil futures, macroeconomic and financial data have become more important drivers of prices than supply and demand fundamentals [14,15]. Although traditional research methods are effective to some extent, they still have obvious disadvantages. The previous literature on crude oil price volatility mainly relied on a few methods. One was the class of GARCH (Generalized Autoregressive Conditional Heteroskedasticity) models [16,17]. A comparison of nine different GARCH models (RiskMetrics, GARCH, IGARCH, GJR, EGARCH, APARCH, FIGARCH, FIAPARCH, and HYGARCH (Hyperbolic Memory Generalized Autoregressive Conditional Heteroscedasticity)) found that none of them outperformed the others [18].
Although adding implied volatility based on the oil volatility index (OVX) can greatly improve the forecast accuracy for crude oil prices relative to simple GARCH-type volatility models [17], most of these models have difficulty handling the different sampling frequencies of crude oil return volatility and macroeconomic covariates. Another approach predicts the realized fluctuation of the oil price from historical data [19]. This method ignores the various external factors and long-term conditions that affect oil price fluctuations. Previous studies have shown that large fluctuations in the WTI (West Texas Intermediate) crude oil market are highly unstable in both the short and long term, that is, the crude oil market is not weak-form efficient [20], while adding macroeconomic uncertainties can improve the accuracy of oil price volatility assessment. The realized volatility model extracts forecast information from historical volatility or prices [21]. Macro variables are not included in this model, and past and current oil price fluctuations cannot reflect all available information [20].
The mixed-frequency form of the data on influencing factors needs to be emphasized and studied in risk modeling. Many factors affect crude oil price fluctuations, the time series involved are complicated, and the sampling frequencies differ; moreover, data on supply and demand fundamentals, the macroeconomy, and the financial market are released at different frequencies because the efficiency and difficulty of the statistical work vary. Research must therefore consider the mixed-frequency form of the data [20]. However, current research mostly models data at a single frequency, obtained by aggregation or substitution. When high-frequency data are converted into low-frequency data, sample information in the high-frequency data is lost, the fluctuation of high-frequency financial data is artificially reduced to a certain extent, and timeliness requirements cannot be met. When high-frequency data are instead obtained by interpolating low-frequency data, the mathematical methods used often lack support in economic theory and cannot ensure accuracy. At the same time, current mixed-frequency modeling often ignores the statistical properties of financial data, such as sharp peaks and thick tails, which may lead to the use of inappropriate models to analyze financial phenomena. Therefore, given the complex characteristics of macroeconomic and financial market data, research needs to pay full attention to the different data characteristics of different types of influencing factors. This article uses the GARCH-MIDAS method, taking into account the mixed-frequency form of crude oil futures data and the sharp peaks and thick tails of returns, to construct a mixed-frequency model and to explore the influence of factors sampled at different frequencies on crude oil futures price volatility.

Literature Review
International crude oil price fluctuations cannot be attributed solely to a single factor or a certain leading factor. Oil price fluctuations are a complex and changeable process and result from a combination of multiple factors. Among the factors selected to explain the fluctuation of international crude oil prices, some time series are daily, such as the US dollar exchange rate, while others are low-frequency monthly data, such as US crude oil inventories and OPEC crude oil production, so high-frequency and low-frequency data appear simultaneously in the crude oil price volatility model. Most scholars have preprocessed the data so that all series share the same frequency. For example, the mixed-frequency data are converted into same-frequency data by summing or interpolation, but the converted data can no longer express all the information, which ultimately affects the model results. Mixed-frequency data modeling can therefore be used to reduce the information loss caused by data processing.
In the study of oil price fluctuations and forecasts, some researchers believe that the oil futures market itself can serve as a forecasting tool. Using four evaluation criteria, the futures market and the expert system of the EIA (Energy Information Administration) were compared in terms of prediction error, and the NYMEX futures market was found to be no worse at forecasting prices [22]. However, subsequent studies have reached the opposite result. The Granger causality test has been used to confirm that the forward curve is useful one week ahead in the daily forecast and one month ahead in the weekly forecast, but almost useless in the long-term forecast [23].
More researchers have used a variety of modeling methods. Some studies chose a generalized regime-switching model, which performs noticeably better than nonswitching models regardless of the evaluation criteria [5]. Using the dynamic Nelson-Siegel model to explain the term structure of crude oil prices, together with a generalized regression framework based on neural networks, can forecast oil prices more accurately [24]. Three new HAR (Heterogeneous Autoregressive)-type models, developed by incorporating investor sentiment and/or the leverage effect into the corresponding original HAR-type models, can be used to examine the predictive effect of investor sentiment and leverage on daily, weekly, and monthly fluctuations in the crude oil futures market [25]. Some scholars believe that using a simple autoregressive specification to capture long memory, and using 11 predictive HAR-type time-series models to forecast realized variance, can achieve the same effect as more complex models [26]. More specifically, the TGARCH (Threshold Generalized Autoregressive Conditional Heteroskedasticity) model suits the volatility of heating oil and natural gas, while the GARCH model suits the volatility of crude oil and unleaded gasoline. Simple moving average models can also work well if the order is chosen correctly. Some complex models are even less effective than single-equation GARCH models. In backtesting, nonparametric models for calculating risk measures outperform parametric models in terms of the number of exceedances [27]. Some researchers studied the relationship between crude oil and natural gas prices, examined the behavior of natural gas and crude oil price volatility, and measured the extent to which crude oil volatility shows up in long-term gas prices [28-31]. However, other researchers presented evidence of causality running from gas to oil [32].
The mixed data sampling method (MIDAS) was first proposed in 2007 [33]. Subsequently, a generalized autoregressive conditional heteroscedasticity model with mixed data sampling, namely GARCH-MIDAS, was proposed in 2013 [34]. The GARCH-MIDAS model divides the conditional variance into long-term and short-term components, and low-frequency variables affect the conditional variance through the long-term component. The effectiveness of the GARCH-MIDAS model in predicting stock market volatility has been demonstrated, and the model has gradually been applied in other studies [35]. Past research has used the GARCH-MIDAS framework to re-examine the relationship between oil and the stock market by analyzing the long-term macroeconomic determinants of daily stock market and crude oil price returns in the United States [36]. Some scholars also used the GARCH-MIDAS model to study the effects of the levels and fluctuations of oil supply and demand on the price fluctuations of WTI and Brent crude oil from 1986 to 2015 [37].
The GARCH-MIDAS model is also increasingly used to predict oil prices. Some scholars forecasted oil market volatility using the probability (or intensity) of jump occurrences, the sign (e.g., positive or negative) of jumps, and the concurrence with stock market jumps, all within the mixed data sampling (MIDAS) modeling framework. The empirical results showed that this model can markedly improve the accuracy of oil market volatility forecasting [38]. Factors that have been used to study and forecast changes in the oil market include speculation, economic fundamentals [39,40], basic (physical) predictors, financial predictors [41], uncertainty of geopolitical risk [42], economic policy uncertainty (EPU), and indicator variables [43,44]. In order to improve the accuracy of forecasts, scholars have suggested that high-frequency financial data are more helpful in predicting oil prices [45]. Some scholars used MIDAS models and high-frequency data from four stock market indexes to predict WTI and Brent crude oil prices at lower frequencies. The results showed that the high-frequency stock market indexes have a certain advantage over low-frequency data in predicting the monthly crude oil price, and that the MIDAS model with high-frequency data is superior to the common model [46].
However, most existing studies are based on the assumption of a normal distribution, which does not match the sharp peaks and thick tails of financial data, so their results cannot fully reflect market conditions. Compared with the normal distribution, a t-distribution can more accurately describe the "heavy tail" of the data and is thus closer to the actual distribution of financial data. At present, the t-distribution is mostly used in research on the return and volatility of financial data such as stocks, but it is also of considerable reference value for the study of crude oil price fluctuations.
In summary, current research on the volatility of the crude oil market is still not thorough, which shows in two respects: ① The traditional single-frequency econometric model often ignores information in data of other frequencies, which reduces the model's explanatory power. ② From the perspective of the overall characteristics of financial data, existing work does not consider the complex characteristics of financial market data. In the modeling process, the statistical properties of the financial data themselves, such as sharp peaks and thick tails, are often ignored, which may lead to the use of inappropriate models to analyze financial phenomena.

Improved GARCH-MIDAS Volatility Model
In order to re-examine the relationship between stock market volatility and macroeconomic activities and their volatility, the GARCH-MIDAS model was constructed by introducing the MIDAS method into the GARCH model [34]. Different information may have different effects on financial markets, depending on whether its effects are short-term or long-term. In the GARCH-MIDAS model, the volatility of the high-frequency dependent variable is decomposed into the product of a long-term low-frequency volatility component and a short-term high-frequency volatility component; the low-frequency volatility component is then explained by the weighted level or weighted volatility of the low-frequency variable. The high-frequency volatility component is described by the traditional GARCH structure.
According to the research of Engle and Rangel, the GARCH-MIDAS model constructed in this paper is expressed as follows:
$$r_{i,t} = \mu + \sqrt{\tau_t \, g_{i,t}} \; \varepsilon_{i,t}, \quad \forall i = 1, \ldots, N_t, \tag{1}$$
where $r_{i,t}$ represents the logarithmic return of crude oil futures prices on the ith day of month t, $N_t$ is the number of trading days in month t, $\varepsilon_{i,t} \mid \Phi_{i-1,t} \sim N(0,1)$, and $\Phi_{i-1,t}$ is the set of information available on the (i−1)th day of month t. Volatility has two components: the short-term dynamic component of daily volatility $g_{i,t}$ and the long-term component of volatility $\tau_t$. $g_{i,t}$ is related to daily liquidity and other possible short-term factors, and $\tau_t$ is primarily related to expected future cash flows and future discount rates. Assume that the component $g_{i,t}$ follows the (daily) GARCH(1,1) process:
$$g_{i,t} = (1 - \alpha - \beta) + \alpha \frac{(r_{i-1,t} - \mu)^2}{\tau_t} + \beta \, g_{i-1,t}. \tag{2}$$
Furthermore, the MIDAS regression method is used to characterize the component $\tau_t$ based on the realized volatility (RV) of the return:
$$\tau_t = m + \theta \sum_{k=1}^{K} \varphi_k(\omega) \, RV_{t-k}, \tag{3}$$
$$RV_t = \sum_{i=1}^{N_t} r_{i,t}^2, \tag{4}$$
where K represents the maximum lag order of the low-frequency variable, and $RV_t$ represents the realized volatility in month t. The weighting function in the equation adopts the Beta function, and the specific form is
$$\varphi_k(\omega) = \frac{(1 - k/K)^{\omega - 1}}{\sum_{j=1}^{K} (1 - j/K)^{\omega - 1}}. \tag{5}$$
According to the foregoing analysis, the long-term component of crude oil futures price fluctuations has a strong correlation with the fluctuations of its influencing factors. Equation (3) is therefore replaced, and GARCH-MIDAS models based on the level and on the volatility of the influencing factors are established, respectively. For the level effect,
$$\tau_t = m_l + \theta_l \sum_{k=1}^{K_l} \varphi_k(\omega_l) \, X^{l}_{t-k}, \tag{6}$$
where $K_l$ is the maximum lag order of the measured level of a certain macroeconomic variable and $X^{l}_{t-k}$ represents the level value of the macroeconomic variable lagged k periods relative to period t. For the volatility effect, the volatility value of the explanatory variable must first be obtained. This article uses an AR(p) autoregressive model to obtain the residuals, and the squared residuals are used to measure the volatility of macroeconomic variables. Thus,
$$\tau_t = m_v + \theta_v \sum_{k=1}^{K_v} \varphi_k(\omega_v) \, X^{v}_{t-k}, \tag{7}$$
where $K_v$ is the maximum lag order of the measured fluctuation value of a certain macroeconomic variable and $X^{v}_{t-k}$ represents the fluctuation value of the macroeconomic variable lagged k periods relative to period t.
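The weighting step of the long-term component can be illustrated with a minimal Python sketch. It assumes the one-parameter (restricted) Beta lag polynomial, in which the weight on lag k is proportional to (1 − k/K)^(ω−1); the function names `beta_weights` and `long_run_component` and the numbers in the usage lines are illustrative and do not come from the paper.

```python
def beta_weights(K, omega):
    """Restricted Beta lag weights for lags k = 1..K (assumed one-parameter form).

    The weight on lag k is proportional to (1 - k/K)**(omega - 1), so the
    weights sum to one and decay with the lag when omega > 1.
    """
    if K < 1 or omega < 1.0:
        raise ValueError("need K >= 1 and omega >= 1")
    raw = [(1.0 - k / K) ** (omega - 1.0) for k in range(1, K + 1)]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("all weights vanish; use a larger K or omega = 1")
    return [w / total for w in raw]


def long_run_component(m, theta, omega, x_lags):
    """tau_t = m + theta * sum_k phi_k(omega) * X_{t-k}.

    x_lags[0] is X_{t-1}, x_lags[1] is X_{t-2}, and so on.
    """
    weights = beta_weights(len(x_lags), omega)
    return m + theta * sum(w * x for w, x in zip(weights, x_lags))


# Illustrative usage with made-up monthly values of an explanatory variable.
weights = beta_weights(12, 3.0)
tau = long_run_component(0.1, 2.0, 1.0, [1.0, 2.0, 3.0, 4.0])
```

With ω = 1 all lags receive equal weight; larger values of ω concentrate the weight on the most recent months, which is why the estimated ω is usually read as a measure of how quickly the influence of a factor dies out.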
This article no longer considers the influence of the low-frequency volatility of the dependent variable.
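When there are two or more influencing factors, the resulting mixed-frequency model is a multifactor GARCH-MIDAS model. The multifactor model differs from the single-factor model in the specification of the long-term component. It includes the level effect and the fluctuation effect of all n series at the same time:
$$\tau_t = m + \sum_{j=1}^{n} \left[ \theta_{l,j} \sum_{k=1}^{K_l} \varphi_k(\omega_{l,j}) \, X^{l}_{j,t-k} + \theta_{v,j} \sum_{k=1}^{K_v} \varphi_k(\omega_{v,j}) \, X^{v}_{j,t-k} \right], \tag{8}$$
where $X^{l}_{j,t-k}$ and $X^{v}_{j,t-k}$ are the lagged level and fluctuation values of the jth factor. In this way, Equations (1), (2), (5), and (8) together constitute the multifactor GARCH-MIDAS model.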
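Once a long-term component is available for each month, the daily conditional variance follows from the short-term GARCH(1,1) recursion. The sketch below is one possible way to run that filter in Python; the function name `conditional_variances`, the choice to start g at its unconditional mean of one, and the handling of the first day of a month (which uses the last return of the previous month scaled by the current month's long-term component) are assumptions made for illustration.

```python
def conditional_variances(returns_by_month, tau, mu, alpha, beta):
    """Daily conditional variances tau_t * g_{i,t} of a GARCH-MIDAS model.

    returns_by_month: list of lists, daily returns grouped by month.
    tau: list of long-term components, one value per month.
    """
    if alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need alpha >= 0, beta >= 0 and alpha + beta < 1")
    if len(returns_by_month) != len(tau):
        raise ValueError("one tau value is required per month")

    g = 1.0          # assumed start: unconditional mean of the short-term component
    prev_r = None    # previous day's return (None before the first observation)
    variances = []
    for month_returns, tau_t in zip(returns_by_month, tau):
        month_var = []
        for r in month_returns:
            if prev_r is not None:
                # g_{i,t} = (1 - a - b) + a * (r_{i-1,t} - mu)^2 / tau_t + b * g_{i-1,t}
                g = (1.0 - alpha - beta) + alpha * (prev_r - mu) ** 2 / tau_t + beta * g
            month_var.append(tau_t * g)
            prev_r = r
        variances.append(month_var)
    return variances


# Illustrative usage with made-up returns and a single month.
sigma2 = conditional_variances([[0.0, 0.02, 0.01]], [0.0004], 0.0, 0.1, 0.8)
```

Because g has an unconditional mean of one, the long-term component sets the average level of variance within a month, while the recursion only redistributes variance across days.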
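At the same time, in order to make full use of high-frequency influencing factor information, this paper improves the classic GARCH-MIDAS model: it adopts the realized volatility method and incorporates high-frequency macro-factors into the long-term component equation. Let $r^{h}_{i,t}$ be the logarithmic return of a high-frequency macro variable on day i of month t:
$$X^{l}_{h,t} = \sum_{i=1}^{N_t} r^{h}_{i,t}, \tag{9}$$
$$X^{v}_{h,t} = \sum_{i=1}^{N_t} \left( r^{h}_{i,t} \right)^2. \tag{10}$$
Here, because the realized volatility method is used to convert the high-frequency macroeconomic sequence into a low-frequency one, $X^{l}_{h,t}$ and $X^{v}_{h,t}$ represent the level value and volatility value of the high-frequency macro-variable returns, respectively. The corresponding long-term component equation is as follows:
$$\tau_t = m + \sum_{j=1}^{n} \left[ \theta_{l,j} \sum_{k=1}^{K_l} \varphi_k(\omega_{l,j}) \, X^{l}_{j,t-k} + \theta_{v,j} \sum_{k=1}^{K_v} \varphi_k(\omega_{v,j}) \, X^{v}_{j,t-k} \right] + \theta_{l,h} \sum_{k=1}^{K_l} \varphi_k(\omega_{l,h}) \, X^{l}_{h,t-k} + \theta_{v,h} \sum_{k=1}^{K_v} \varphi_k(\omega_{v,h}) \, X^{v}_{h,t-k}, \tag{11}$$
with one such pair of terms for each high-frequency variable. Therefore, Equations (1), (2), (5), and (9)-(11) together constitute a multifactor improved GARCH-MIDAS model incorporating high-frequency explanatory variables.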
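The conversion of a daily macro series into monthly inputs can be sketched as follows. This is a minimal Python illustration under two assumptions: the monthly volatility value is the realized volatility (sum of squared daily log returns), and the monthly level value is the sum of daily log returns within the month. The function name `monthly_level_and_rv` and the sample data are hypothetical.

```python
def monthly_level_and_rv(daily):
    """Aggregate daily log returns into monthly level and realized volatility.

    daily: iterable of (month_label, log_return) pairs in chronological order.
    Returns a dict {month_label: (level, realized_volatility)}.
    """
    out = {}
    for month, r in daily:
        level, rv = out.get(month, (0.0, 0.0))
        # Assumed aggregation: level = sum of returns, RV = sum of squared returns.
        out[month] = (level + r, rv + r * r)
    return out


# Illustrative usage with made-up daily returns of a high-frequency variable.
sample = [("2008-01", 0.01), ("2008-01", -0.02), ("2008-02", 0.03)]
monthly = monthly_level_and_rv(sample)
```

Both monthly series can then enter the long-term component with their own coefficient and weighting parameter, in the same way as the low-frequency variables.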
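In particular, since the macro data may contain negative numbers, the long-term component equation is taken in logarithmic form to ensure that the variance is nonnegative.
Thus, Equation (6) can be replaced by
$$\log \tau_t = m_l + \theta_l \sum_{k=1}^{K_l} \varphi_k(\omega_l) \, X^{l}_{t-k}; \tag{12}$$
Equation (7) can be replaced by
$$\log \tau_t = m_v + \theta_v \sum_{k=1}^{K_v} \varphi_k(\omega_v) \, X^{v}_{t-k}; \tag{13}$$
Equation (8) can be replaced by
$$\log \tau_t = m + \sum_{j=1}^{n} \left[ \theta_{l,j} \sum_{k=1}^{K_l} \varphi_k(\omega_{l,j}) \, X^{l}_{j,t-k} + \theta_{v,j} \sum_{k=1}^{K_v} \varphi_k(\omega_{v,j}) \, X^{v}_{j,t-k} \right]; \tag{14}$$
and Equation (11) can be replaced by
$$\log \tau_t = m + \sum_{j=1}^{n} \left[ \theta_{l,j} \sum_{k=1}^{K_l} \varphi_k(\omega_{l,j}) \, X^{l}_{j,t-k} + \theta_{v,j} \sum_{k=1}^{K_v} \varphi_k(\omega_{v,j}) \, X^{v}_{j,t-k} \right] + \theta_{l,h} \sum_{k=1}^{K_l} \varphi_k(\omega_{l,h}) \, X^{l}_{h,t-k} + \theta_{v,h} \sum_{k=1}^{K_v} \varphi_k(\omega_{v,h}) \, X^{v}_{h,t-k}. \tag{15}$$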
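The effect of the logarithmic form can be seen in a small numerical sketch. The Python functions `tau_linear` and `tau_log` below are hypothetical helpers, and the weights and data are made up; they only illustrate that a linear specification can yield a negative long-term variance when the explanatory variable is negative, whereas the exponentiated form cannot.

```python
import math


def tau_linear(m, theta, weights, x_lags):
    """Linear long-term component: m + theta * weighted sum of lagged values."""
    return m + theta * sum(w * x for w, x in zip(weights, x_lags))


def tau_log(m, theta, weights, x_lags):
    """Log-form long-term component: exp of the same linear index, always positive."""
    return math.exp(tau_linear(m, theta, weights, x_lags))


# Made-up example: a negative lagged value pushes the linear form below zero.
weights = [0.6, 0.4]
x_lags = [-1.0, 0.5]
linear_value = tau_linear(0.1, 0.5, weights, x_lags)   # about -0.1
log_value = tau_log(0.1, 0.5, weights, x_lags)         # about exp(-0.1)
```

Under the log form the sign of the coefficient keeps its interpretation: a positive coefficient still means that a higher value of the factor raises long-term volatility.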

GARCH-MIDAS-Skewed-t Model
In the traditional GARCH-MIDAS model, the residual terms $\varepsilon_{i,t}$ obey the standard normal distribution, that is, $\varepsilon_{i,t} \mid \Phi_{i-1,t} \sim N(0,1)$. The normal distribution does not deal well with the "spikes and thick tails" and the skewness that exist in financial time series data. Hansen (1994) introduced the skewed-t distribution into the ARCH family of models and found that, when the residual term is assumed to follow a skewed-t distribution, the fit significantly outperforms the fit obtained under a normal distribution [47]. Following Hansen, this paper introduces the skewed-t distribution as the distribution of the residual terms in the GARCH-MIDAS model. The density function of the standard skewed-t distribution is as follows. When $Z < -a/b$, the density function of the skewed-t distribution is
$$f(Z \mid \eta, \lambda) = bc \left[ 1 + \frac{1}{\eta - 2} \left( \frac{bZ + a}{1 - \lambda} \right)^2 \right]^{-(\eta + 1)/2}.$$
When $Z \ge -a/b$, the density function of the skewed-t distribution is
$$f(Z \mid \eta, \lambda) = bc \left[ 1 + \frac{1}{\eta - 2} \left( \frac{bZ + a}{1 + \lambda} \right)^2 \right]^{-(\eta + 1)/2},$$
with
$$a = 4 \lambda c \, \frac{\eta - 2}{\eta - 1}, \qquad b^2 = 1 + 3\lambda^2 - a^2, \qquad c = \frac{\Gamma\left( \frac{\eta + 1}{2} \right)}{\sqrt{\pi (\eta - 2)} \; \Gamma\left( \frac{\eta}{2} \right)},$$
and where $\eta$ ($2 < \eta < \infty$) is the degree of freedom and $\lambda$ ($-1 < \lambda < 1$) is the skewness parameter. When $\lambda$ is greater than 0, the density function is skewed to the right; when $\lambda$ is less than 0, the density function is skewed to the left; if $\lambda$ is equal to 0, the skewed-t distribution reduces to the standardized t distribution, with mean 0 and variance 1. As far as the characteristics of the distribution function are concerned, although the skewed-t distribution can characterize the "spikes and thick tails" of financial data and their skewness better than the standard normal distribution, this conclusion has not yet been verified in research on the GARCH-MIDAS model. Therefore, it is necessary to introduce the skewed-t distribution of the residual term into the GARCH-MIDAS model.
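The skewed-t density can be evaluated directly from its piecewise definition. The following Python sketch follows Hansen's (1994) standardized form, writing η for the degrees of freedom and λ for the skewness parameter; the function name `skewed_t_pdf` is illustrative, and the parameter values in the usage lines are made up.

```python
import math


def skewed_t_pdf(z, eta, lam):
    """Density of Hansen's standardized skewed-t distribution at z.

    eta: degrees of freedom, must exceed 2.
    lam: skewness parameter, must lie strictly between -1 and 1.
    """
    if not (eta > 2.0 and -1.0 < lam < 1.0):
        raise ValueError("need eta > 2 and -1 < lam < 1")
    # Normalising constants a, b, c of the standardized density.
    c = math.exp(math.lgamma((eta + 1.0) / 2.0) - math.lgamma(eta / 2.0))
    c /= math.sqrt(math.pi * (eta - 2.0))
    a = 4.0 * lam * c * (eta - 2.0) / (eta - 1.0)
    b = math.sqrt(1.0 + 3.0 * lam ** 2 - a ** 2)
    # The left branch uses (1 - lam), the right branch (1 + lam).
    scale = (1.0 - lam) if z < -a / b else (1.0 + lam)
    u = (b * z + a) / scale
    return b * c * (1.0 + u * u / (eta - 2.0)) ** (-(eta + 1.0) / 2.0)


# Illustrative usage: a right-skewed, fat-tailed density evaluated at two points.
left = skewed_t_pdf(-2.0, 5.0, 0.3)
right = skewed_t_pdf(2.0, 5.0, 0.3)
```

In a likelihood-based estimation, the logarithm of this density, evaluated at the standardized residual and adjusted by the conditional standard deviation, would replace the Gaussian log-density term.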

Variables and Data Description
This section first defines the variables used in the empirical research of this article and describes their basic characteristics; the GARCH-MIDAS empirical model of the crude oil futures market is then constructed, and an empirical analysis is conducted. The results are compared across data types (same frequency and mixed frequency). A GARCH model based on same-frequency data is also constructed, and the volatility effect is analyzed. Finally, whether the analysis is based on mixed-frequency or same-frequency data, it proceeds from single factors to multiple factors.

Variable Definition
The volatility of crude oil futures prices refers to the rise and fall of crude oil futures prices within a certain period. The study of international crude oil price fluctuations should start from the factors driving volatility and then examine the volatility effect, that is, the volatility risk of price returns. At present, in the international crude oil pricing system, Brent crude oil futures and the Brent crude oil spot market have the most significant influence. About 70% of crude oil spot transactions take Brent crude oil as their price reference. Two-thirds of the global crude oil market uses the Brent crude oil price as its pricing benchmark, including China's refined oil price reference mechanism.
(1) Volatility variables. The paper selects Brent light crude oil futures (Brent) (USD/barrel), one of the most important international crude oil futures, as the representative of international crude oil futures. The data came from the website of the US Energy Information Administration (EIA, https://www.eia.doe.gov). The daily closing price of Brent is used to construct the volatility indicator of the oil market, and the logarithmic return is taken as the explained variable (RF). The sample period runs from January 2008 to December 2017 and covers multiple stages of the international crude oil price over the past 10 years (surge, plunge, recovery, high plateau, and continuous decline), which makes the research conclusions credible.
Regarding the choice of low-frequency and high-frequency explanatory variables, three aspects are mainly considered: theoretical relevance, existing literature practices, and data availability.
(2) Low-frequency explanatory variables. OPEC crude oil production (OPEC) and OECD crude oil consumption (OECD) were selected to represent crude oil supply and crude oil consumption demand, respectively [48,49]; US commercial crude oil inventory (Inventory, hereinafter referred to as US crude oil inventory) was selected to represent the crude oil inventory level [50]; the FOB spot price of thermal coal at Australia's Newcastle Port (Cspp, hereinafter referred to as coal price) and the Henry Hub natural gas spot price (GasSpotPrice, hereinafter referred to as natural gas price) were selected to represent alternative energy sources as explanatory variables [51]; WTI crude oil noncommercial arbitrage holdings (Wtioi, hereinafter referred to as noncommercial arbitrage positions) represent speculation as an explanatory variable [52].
Among the above variables, data on US crude oil inventories, OPEC crude oil production, OECD crude oil consumption, and the Henry Hub natural gas spot price came from the US Energy Information Administration (EIA) website. The FOB (Free On Board) spot price of thermal coal at Australia's Newcastle Port and the WTI crude oil noncommercial arbitrage holdings came from the Wind Database (https://www.wind.com.cn). The low-frequency indicators are all monthly data, and the sample period is from January 2008 to December 2017.
(3) High-frequency explanatory variables. In selecting explanatory variables, high-frequency variables (daily data) are considered in addition to low-frequency variables in order to make the explanation more complete. Note that high frequency and low frequency are relative concepts, not absolute ones. Considering the relevance of the indicators and the availability of data, we selected the closing price of the US dollar index [53], the trading volume of Brent crude oil futures [54], the open interest of Brent crude oil futures [55], the Henry Hub natural gas spot price (dollars per million Btu) [32], and the Europe Brent spot price FOB [56] as explanatory variables. Among them, the US dollar index (DI), the Brent crude oil spot price (PS), and the natural gas price (GP) represent high-frequency factors outside the futures market, and volume (V) and open interest (OI) represent high-frequency factors within the market. The reasons why the natural gas price might influence the oil price are worth mentioning. Researchers split the data into subsamples, before and after the start of the shale gas boom, and tested for cointegration in the second subsample, which covered the period from 2007 to 2013. The impulse response functions of the three fitted models show a relevant difference between the first and second subsamples: in the second subsample, the impact of gas (or gas quantity) shocks on oil (or the oil price) is persistent [32]. The data come from Wind. The high-frequency data are all daily data, and the sample period is from 1 January 2008 to 31 December 2017.
Variable descriptions and data sources are shown in Table 1. The return on oil futures prices is measured by the logarithmic return RF, that is, the logarithm of the current oil futures price minus the logarithm of the previous period's futures price. The calculation formula is
$$RF_t = \ln PF_t - \ln PF_{t-1}, \tag{21}$$
where $PF_t$ represents the Brent crude oil futures price on day t and $PF_{t-1}$ represents the Brent crude oil futures price on day t−1.
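The return calculation can be written in a few lines of Python. This is a minimal sketch; the function name `log_returns` and the prices in the usage line are made up for illustration.

```python
import math


def log_returns(prices):
    """Logarithmic returns RF_t = ln(PF_t) - ln(PF_{t-1}) of a price series."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(curr) - math.log(prev) for prev, curr in zip(prices, prices[1:])]


# Illustrative usage with made-up closing prices (USD/barrel).
rf = log_returns([100.0, 110.0, 99.0])
```

A convenient property of logarithmic returns is that they add up over time: the sum of the daily values equals the logarithmic return over the whole period.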
It can be seen from Figure 1 that the trends of Brent crude oil futures prices and spot prices are almost the same, showing good consistency. From January 2008 to July 2008, the Brent crude oil futures price and the spot price both rose and reached their highest point in history. From August 2008 to March 2009, they were in a downward trend. Until July 2014, prices rose with considerable turbulence; they then gradually fell, and rebounded after the end of 2017. Next, we discuss the basic statistical characteristics of Brent crude oil futures prices, spot prices, and futures returns. The results are shown in Table 2.
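Note: $JB = \frac{n}{6}\left[ S^2 + \frac{(K - 3)^2}{4} \right]$, where n is the number of observations, S is skewness, and K is kurtosis; the value in square brackets [ ] is the probability value (p value); coefficient of variation (CV) = Std. Dev./Mean.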
Judging from the basic statistical characteristics of each series in Table 2: ① The average value of Brent crude oil futures returns is negative. In other words, for a long futures investor, the average investment income during the sample period is negative. In fact, once transaction costs (such as transaction fees and the cost of funds tied up in margin) are considered, the futures return is even lower. The average values of the Brent crude oil spot and futures price series are, naturally, greater than zero. ② Nonnormality of the series distributions. A common assumption in financial theory and empirical analysis is that returns follow a normal distribution. However, actual financial returns often follow a nonnormal distribution. The Brent crude oil futures prices, spot prices, and futures returns are no exception, as the statistics in Table 2 illustrate. The skewness of a normal distribution equals 0. The skewness of the Brent crude oil futures return series is greater than 0, that is, the series is positively skewed, while the skewness of the Brent crude oil futures price and spot price series is less than 0, that is, they are negatively skewed. The kurtosis of spot prices and futures prices is less than 3, while the kurtosis of the Brent crude oil futures return series is greater than 3, indicating a leptokurtic, fat-tailed distribution. A fat-tailed distribution means that many sample values deviate from the sample mean by a large margin, that is, large absolute returns occur more often than a normal distribution would predict.
If returns were normally distributed, the Q-Q plot would be a straight line. In fact, the Q-Q plot is an S-shaped curve, which shows sharp peaks and thick tails relative to the normal distribution. The JB (Jarque-Bera) test statistics for all series are also much larger than the 5% critical value (5.991). Based on the above results, the series have sharp peaks and thick tails and do not obey the normal distribution.
③ From the perspective of the coefficient of variation (CV), the volatility of Brent crude oil spot prices is slightly higher than that of futures prices. In addition, because the mean value of Brent crude oil futures returns is negative, its CV is less than zero, but the absolute value of the CV is relatively large, indicating that the fluctuations are relatively severe. The stationarity of each original high-frequency series is then tested, and all of them are found to be integrated of order one, I(1). As a time series entering a GARCH or ARMA model must be stationary, this article takes the logarithmic difference of each series, in the same way as for the crude oil futures return series; that is, returns are modeled.
To extract volatility, this article adopts the Schwert (1989) method: an AR(p) model is fitted to the logarithmic difference series of each explanatory variable, and the resulting residuals are squared and used as the volatility of that variable, which captures its volatility dynamics. The lag order p of the AR(p) model is chosen by jointly considering the AIC and SC information criteria.
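The volatility extraction step can be sketched in plain Python as an ordinary least squares fit of an AR(p) model followed by squaring the residuals. The function names `solve_linear` and `ar_squared_residuals` are illustrative; the sketch takes the lag order p as given, whereas in practice it would be selected with the AIC and SC, and the input series in the usage lines is made up.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ar_squared_residuals(x, p):
    """Fit x_t = c + sum_j phi_j * x_{t-j} + e_t by OLS.

    Returns (coefficients [c, phi_1, ..., phi_p], squared residuals e_t**2).
    """
    if p < 1 or len(x) <= 2 * p + 1:
        raise ValueError("series too short for the requested lag order")
    rows = [[1.0] + [x[t - j] for j in range(1, p + 1)] for t in range(p, len(x))]
    y = x[p:]
    k = p + 1
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve_linear(xtx, xty)
    resid = [yi - sum(c * v for c, v in zip(coef, r)) for r, yi in zip(rows, y)]
    return coef, [e * e for e in resid]


# Illustrative usage on a made-up series of log differences.
series = [0.010, -0.004, 0.007, 0.002, -0.009, 0.005, 0.001, -0.003, 0.006, -0.002]
coef, vol = ar_squared_residuals(series, 1)
```

The squared-residual series is shorter than the input by p observations, so the sample of the volatility variable starts p periods after the sample of the level variable.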

Low-Frequency Explanatory Variables
This article analyzes the influence of various factors on price fluctuations in the crude oil market from several angles: the level effect, the volatility effect, and the level and volatility effects combined. Therefore, the data of each low-frequency explanatory variable must be processed first. In addition, because monthly data were selected, a seasonal adjustment (X-12 additive adjustment) was carried out before constructing the GARCH-MIDAS model.
To extract volatility, this paper adopts the Schwert (1989) method: AR(p) models are fitted to the original data of the explanatory variables (high-frequency variables) or to the seasonally adjusted data (low-frequency variables), and the resulting residuals are squared and used as the volatility of each variable to capture its volatility dynamics. The lag order p of the AR(p) model is chosen by jointly considering the AIC and the SC.
In addition, the JB test statistics of all series are significant at the 10% level, indicating that none of the variables conforms to the normal distribution. Therefore, when constructing the GARCH-MIDAS model, both a standard GARCH-MIDAS model based on the normal distribution and a GARCH-MIDAS-skewed-t model with multiple explanatory factors (hereinafter referred to as GARCH-MIDAS-t) are constructed, and the estimated results are compared.
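For reference, the Jarque-Bera statistic used in these normality checks can be computed from the sample skewness S and kurtosis K as JB = n/6 · [S² + (K − 3)²/4]. The Python sketch below is illustrative: the function name `jarque_bera` is hypothetical and the data in the usage line are made up.

```python
def jarque_bera(x):
    """Return (JB statistic, skewness S, kurtosis K) of a sample.

    Moments are computed with divisor n (population form), as is usual
    for the Jarque-Bera test.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    if m2 == 0.0:
        raise ValueError("series is constant")
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, skew, kurt


# Illustrative usage on a small made-up sample.
jb, s, k = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```

Under the null hypothesis of normality the statistic is asymptotically chi-squared with two degrees of freedom, so values far above roughly 5.99 reject normality at the 5% level.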
In order to further determine the appropriate form of the error distribution, this section first uses the GARCH(1,1)-n model to generate standardized residuals and examines them against a normal Q-Q plot. A comprehensive analysis shows that the t-distribution is the most suitable error distribution for the GARCH model: the GARCH(1,1)-t model fits the volatility of the Brent futures return series more accurately while remaining simple. It is therefore reasonable to choose the GARCH(1,1)-t model as the benchmark model for this empirical research.

Level Effect of Single Factor
(1) Level effect of low-frequency variables Tables 3 and 4 give the estimation results of the mixed-frequency model for the level value of a single factor. The level-effect coefficient (subscript l) reflects the level effect of each influencing factor on the long-term component of the high-frequency volatility, and the weight parameter (subscript l) reflects the optimal estimation weight in the single-factor mixed-frequency level model. The level-effect coefficient measures the impact of the level (or return) of an individual factor on the volatility of the crude oil market: when it is positive and statistically significant, an increase in the factor raises the volatility of the Brent crude oil futures market; when it is negative and statistically significant, an increase in the factor reduces that volatility. [...] crude oil production is less than zero.
② The crude oil inventory coefficient is significantly greater than zero; that is, the US crude oil inventory level is positively correlated with the volatility of crude oil futures prices. Crude oil inventory is a part of crude oil demand and the potential supply of crude oil for the next year. The higher its value, the higher the volatility of crude oil futures prices. Similarly, the OECD is the main representative of demand in the crude oil market. An increase in OECD crude oil consumption causes a significant increase in crude oil market volatility, which is consistent with economic theory.
③ As a clean energy source, natural gas is being promoted ever more widely. As it is one of the important substitutes for oil, an increase in natural gas prices is also one of the factors driving the fluctuation of crude oil prices. Coal, like natural gas, is an important substitute for oil and has a positive impact on the fluctuation of crude oil prices. However, coal can substitute for crude oil only under certain conditions, and its advantage as a substitute disappears when the price of coal gradually increases. The estimation results accordingly show that the coal price level does not have a sufficiently significant impact on the fluctuation of crude oil prices in the long run.
④ Crude oil noncommercial arbitrage holdings represent speculative factors in the crude oil futures market. From the estimation results, the level-effect coefficient of Wtioi is greater than zero but not statistically significant, indicating that speculative factors are not an important driver of international oil price fluctuations in the global crude oil market.
⑤ Unlike the other factors, the level-effect coefficient of OPEC crude oil production is less than zero and statistically significant, indicating that an increase in OPEC crude oil production weakens the volatility of Brent crude oil futures. OPEC, as the main supplier of the oil market, has a significant impact through its production on price fluctuations in the international crude oil market. When OPEC crude oil production is high, the global crude oil market is adequately supplied, which has the effect of stabilizing international oil prices.
Based on the above analysis, in terms of level effects, crude oil production, consumption, inventory, and natural gas prices, but not coal spot prices or speculative factors, are the main factors that have a major impact on the fluctuation of Brent crude oil futures prices after the financial crisis.
(2) Level effect analysis of high-frequency variables In selecting explanatory variables, the influence of high-frequency variables is considered alongside that of low-frequency variables in order to increase the explanatory power of the analysis.
This article combines indicator correlation and data availability and selects five high-frequency variables: the closing price of the US dollar index (USD), Brent crude oil futures trading volume, open interest, the natural gas price, and the Brent crude oil spot price. From the estimated results in Table 4, the following can be concluded: [...] The spot price is the basis of the futures price. Based on the estimation results, the return on the Brent crude oil spot price has a significant negative effect on futures price fluctuation; that is, the spot price is negatively correlated with the volatility of the futures market. In particular, when the spot price is high, the futures price is also high and the relative volatility is weaker. Therefore, among the high-frequency factors, the on-market trading volume and the off-market spot price have a significant explanatory effect on the price volatility of the Brent crude oil futures market. In addition, based on the estimation results in Tables 3 and 4, the Brent crude oil futures market has a strong GARCH effect, and the various GARCH-MIDAS models show that the mixed-frequency level model is well suited to describing the fluctuations of the Brent crude oil futures market.
Tables 5 and 6 give the estimation results of the volatility effect of low-frequency and high-frequency variables, respectively. The volatility-effect coefficient (subscript v) in each equation measures the impact of a single factor's volatility on the volatility of the crude oil futures market: when it is positive and statistically significant, an increase in the volatility of the factor raises the volatility of the crude oil futures market; when it is negative and statistically significant, an increase in the volatility of the factor reduces it.
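To make the roles of the level-effect and volatility-effect coefficients concrete, the following sketch computes a long-run volatility component in the form commonly used in the GARCH-MIDAS literature. It is written under stated assumptions rather than taken from this paper: the long-run component is assumed to be the exponential of an intercept plus slope coefficients times beta-weighted lags of a factor's level and volatility, with the restricted one-parameter beta weighting scheme (weight parameter at least 1). All names (`beta_weights`, `long_run_component`, `theta_l`, `theta_v`, `omega_l`, `omega_v`, `m`, `K`) are hypothetical.

```python
import numpy as np


def beta_weights(K, omega):
    """One-parameter beta lag weights for lags 1..K; they sum to one."""
    k = np.arange(1, K + 1)
    raw = (1.0 - k / K) ** (omega - 1.0)
    return raw / raw.sum()


def long_run_component(x_level, x_vol, m, theta_l, theta_v,
                       omega_l, omega_v, K):
    """Assumed long-run component:
    tau_t = exp(m + theta_l * sum_k w_k(omega_l) * X_{t-k}
                  + theta_v * sum_k w_k(omega_v) * V_{t-k})."""
    x_level = np.asarray(x_level, dtype=float)
    x_vol = np.asarray(x_vol, dtype=float)
    w_l = beta_weights(K, omega_l)
    w_v = beta_weights(K, omega_v)
    tau = np.full(len(x_level), np.nan)  # first K periods lack enough lags
    for t in range(K, len(x_level)):
        # Reverse so that index 0 is lag 1 (the most recent past value).
        past_level = x_level[t - K:t][::-1]
        past_vol = x_vol[t - K:t][::-1]
        tau[t] = np.exp(m + theta_l * (w_l @ past_level)
                        + theta_v * (w_v @ past_vol))
    return tau
```

Under this assumed form, a positive slope means that higher weighted past values of the factor raise the long-run component, and a negative slope lowers it, which matches the sign interpretation given in the text.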
(1) Analysis of volatility effects of low-frequency variables The volatility effects of low-frequency variables are estimated as shown in Table 5.

Single Factor Volatility Effect
From the estimated results in Table 5, we can see the following: ① The α + β values of each model are very close to 1, indicating that each model has a relatively strong ARCH effect and volatility persistence. [...] ⑤ An increase in the volatility of crude oil inventories, crude oil consumption, crude oil production, substitute prices, and speculative factors increases the volatility of the Brent crude oil futures market. Relatively speaking, the impact of substitutes is less pronounced.
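The link between α + β and persistence can be illustrated with a half-life calculation. The sketch assumes the standard GARCH(1,1) property that the effect of a shock on the conditional variance decays geometrically at rate α + β; the function name and the parameter values in the usage note are hypothetical and are not taken from Table 5.

```python
import math


def volatility_half_life(alpha, beta):
    """Periods needed for a variance shock to decay to half its size."""
    persistence = alpha + beta
    if not 0.0 < persistence < 1.0:
        raise ValueError("alpha + beta must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(persistence)
```

For example, with illustrative values α = 0.05 and β = 0.94, the persistence is 0.99 and the half-life is roughly 69 trading days, which shows how slowly shocks die out when α + β approaches 1.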
(2) Analysis of volatility effects of high-frequency variables The volatility effects of high-frequency variables are estimated as shown in Table 6. From the estimated results in Table 6, we can see the following:

Analysis of Combination Effects of Low-Frequency Variables
The combined effect analysis of low-frequency variables is shown in Tables 7 and 8. According to the estimated results in Table 8, the following conclusions can be drawn: ① The average BIC value of the low-frequency factor combination effect model is equal to [...] Overall, the level effect of natural gas prices is more significant when studied separately, and the level effect of coal prices is more significant when studied in combination. As these are two important substitutes for oil, there is a strong correlation between their price levels and the volatility of crude oil prices.
④ From the perspective of speculation: Generally speaking, speculative factors are one of the influencing factors of financial market price fluctuations. According to the hierarchical and combination analyses, the stronger the market speculation, the stronger the volatility of crude oil futures. When speculative positions increase, speculative power in the market grows, and this causes more frequent price changes to a certain extent, which intensifies volatility in the crude oil market. However, because crude oil is a global bulk commodity, speculative factors are not the main factor in its market fluctuations.
⑤ From the supply and demand level: From the perspective of the level effect coefficients l  and volatility effect coefficients v  of Inventory, OPEC, and OECD, it is mainly supply and demand that can affect the fluctuation of crude oil prices. Current consumption and current inventory both represent the current demand for crude oil, and OPEC production represents the current crude oil output, that is, the current supply. Regarding the value and significance of the estimated coefficient, whether it is a single analysis or a combined effect study, supply and demand are both significant factors, which is consistent with economic theory. Generally speaking, when the supply increases, the price of crude oil tends to fall. When the supply is high, the crude oil market has sufficient supply and the market is stable, so the volatility of crude oil prices is low. The increase in demand for crude oil will increase the price of crude oil. When the demand for crude oil is high, its changes will increase the volatility of crude oil prices. Based on the results of the hierarchical and combination analysis of low-frequency factors, the most important factor affecting the volatility of international crude oil futures returns is the supply and demand of crude oil. In the long run, other factors such as alternative energy sources, speculative factors, inventory changes and other long-term factors will eventually be reflected in the supply or demand level of the crude oil market; in the short term, on-market and off-market factors will affect short-term price fluctuations of the crude oil futures, but in fact they still affect the short-term supply and demand of crude oil. Therefore, supply and demand are the most important factors in the fluctuation of international crude oil prices.

Analysis of the Combined Effect of High-Frequency Variables
The analysis of high-frequency variables is shown in Tables 9 and 10. The estimated results in Table 10 are discussed below: [...] consistent with the results of the hierarchical analysis, but the significance of the parameters is different. The DI volatility effect changed from insignificant to significant at the 5% level, the GP level effect changed from insignificant to significant at the 1% level, and the PS level effect changed from significant to insignificant. In contrast, the DI level effect and the volatility effects of GP and PS are more stable. From the estimated results of the combined effect, the level value of GP is positively correlated with the volatility of crude oil futures. Natural gas is one of the substitutes for crude oil, so a higher natural gas price reflects greater market demand for natural gas or a decrease in its supply, which will cause a change in the price of crude oil. The volatility of natural gas prices (up or down) is positively correlated with the volatility of crude oil futures prices. After the level effect of DI is taken into account, the volatility effect of DI is significant in the combination analysis, indicating that the correlation between fluctuations in the US dollar index and fluctuations in crude oil is strengthened once the level of the US dollar index is included. Changes in the exchange rate of the US dollar cause fluctuations in the nominal price of crude oil, thereby increasing the volatility of crude oil futures. In both the hierarchical analysis and the combination effect analysis, the volatility effect of spot prices is always significantly positive, reflecting the close connection and linkage between the Brent crude oil futures market and the spot market.
Based on the results of the hierarchical analysis and combination analysis of high-frequency factors, both on-market and off-market high-frequency factors are important drivers of the volatility of international crude oil futures returns, especially the on-market trading volume, the off-market natural gas price, and the crude oil spot price.

Analysis of the Multi-factors Associative Effect
According to the results of the aforementioned single-layer and combination analysis, there is a big difference between the results of single-factor single-effect analysis and combination effect analysis. This result is related to the type and number of factors included in the model. The modeling in the previous part is based on similar factors (a low-frequency or high-frequency factor), without considering the influence of other factors. In order to comprehensively consider the influence of the combination of different types of factors on the price volatility of Brent crude oil futures, this article attempts to conduct a multifactor combination analysis. However, if multiple combinations of various factors are considered, the number of combinations will be very large. Therefore, a combination of low-frequency and high-frequency factors is considered in a model.
According to the previous analysis, low-frequency factors such as the US crude oil inventory (Inventory), OPEC crude oil production (OPEC), and OECD crude oil consumption (OECD), together with the high-frequency trading volume (V), have significant impacts on the volatility of Brent crude oil futures prices, and these conclusions are stable. Therefore, a combined effect model is constructed from these low-frequency factors (Inventory, OPEC, OECD) and the high-frequency factor (V). The estimation results of the combined effect model of high-frequency and low-frequency factors are shown in Tables 11 and 12.
Firstly, in order to judge intuitively the estimation ability of the above-mentioned comprehensive low-frequency and high-frequency mixed effect model and the corresponding same-frequency effect model, the realized volatility RVF estimated by each model was plotted against the realized volatility RV calculated from the original data (graphics omitted). The RV and RVF of the different models are basically consistent, indicating that the prediction performance of the models is acceptable.
Secondly, the MAE and RMSE values of each model are basically the same, indicating that the estimated RV and RVF errors of different models are basically the same, and the fluctuation trends are basically the same.
The out-of-sample prediction method was used to test the prediction accuracy of the GARCH-MIDAS model. The sample interval of the high-frequency variables in this article is from 1 January 2008 to 31 December 2017, and the total sample size is 2459 (excluding nontrading days). The sample comprises 2376 observations for modeling, from 1 January 2008 to 31 August 2017, and 83 observations for forecasting, from 1 September 2017 to 31 December 2017. Using the parameter estimates of the fitted GARCH-MIDAS model, the MAE and RMSE indicators were calculated inside and outside the sample to examine the prediction performance of the model. MAE and RMSE measure the error between the estimated realized volatility and the actual realized volatility; the smaller the value, the more accurate the model's estimate of the volatility. Table 13 shows the estimated results of the prediction indicators of each model. According to Table 13, the MAEout value of each model is lower than MAEin, and the RMSEout value is lower than RMSEin, indicating that the parameters estimated from the fitting sample perform well on data outside the fitting sample and that the model does not suffer from overfitting. The estimated parameters thus have a strong ability to predict the volatility of financial time series. In addition, as in the previous analysis, the in-sample MAEin and RMSEin values of the various models differ little, and the mixed-factor model slightly outperforms the same-frequency effect model. Judged by the out-of-sample MAEout and RMSEout values, the mixed-factor model also outperforms the same-frequency effect model.
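The in-sample and out-of-sample error measures described above can be sketched as follows. This is an illustration under assumptions, not the authors' code: `rv` and `rvf` stand for the actual and model-estimated realized volatility series, `n_in` is the number of modeling observations (2376 in the split described in the text), and the function names are hypothetical.

```python
import numpy as np


def mae(actual, forecast):
    """Mean absolute error between two equally long series."""
    return float(np.mean(np.abs(actual - forecast)))


def rmse(actual, forecast):
    """Root mean squared error between two equally long series."""
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))


def in_out_sample_errors(rv, rvf, n_in):
    """Split both series at n_in and report MAE and RMSE on each part."""
    rv = np.asarray(rv, dtype=float)
    rvf = np.asarray(rvf, dtype=float)
    return {
        "MAE_in": mae(rv[:n_in], rvf[:n_in]),
        "RMSE_in": rmse(rv[:n_in], rvf[:n_in]),
        "MAE_out": mae(rv[n_in:], rvf[n_in:]),
        "RMSE_out": rmse(rv[n_in:], rvf[n_in:]),
    }
```

Out-of-sample values that do not exceed the in-sample values are what the text reads as evidence against overfitting.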
Finally, the quantitative indicators of each model in Table 13 were used to compare the combined high- and low-frequency model with the models before combination. Judged by the LLC, BIC, and H criteria, the hybrid low-frequency and high-frequency combination model, which considers more types of factors affecting crude oil price fluctuations and contains more effective information about the long-term component of high-frequency volatility, shows an advantage in estimation accuracy over the same-frequency factor combination model. For example, the InventoryL+InventoryV+VL+VV model is composed of the Inventory level and volatility combined effect model and the volume combined effect model. Compared with the InventoryL+InventoryV model and the VL+VV model, its BIC value is −7.0465, lower than those of the two same-frequency combination effect models, and its H value is 0.017, which is also lower than those of the two same-frequency combination effect models. It can be seen that the combined effect model, which combines low-frequency and high-frequency factors, outperforms the models that only consider the combined effect of same-frequency factors. The comparison results of other models are basically the same, so they are not repeated here.
Regarding the impact of various variables on the volatility of Brent crude oil futures, comparing Table 13 with the estimated results of the corresponding model, although the significance of the individual models has changed, the conclusion of the study on the impact of crude oil futures volatility is not changed, and the parameter symbols are consistent with the research results of the comparison model, which reflects the advantages of the high and low frequency mixed effect model in the stability of the estimation results.

Summary and Contrast of Fluctuation Effects Estimation Results of Mixed Frequency and the Same Frequency Data Models
This section will summarize and compare the volatility effects of crude oil futures prices analyzed by GARCH and GARCH-MIDAS models.

Summary and Comparison of Estimation Results of Crude Oil Futures Price Volatility Effects
Because the GARCH model is a same-frequency data model and the GARCH-MIDAS model is a mixed-frequency data model, the two were compared using the estimation results of the high-frequency level and volatility combined effect model in the GARCH-MIDAS framework (Table 9) and of the independent effect model in the GARCH framework (Table 14); the comparison is given in Table 15. Regarding the direction of action, the GARCH and GARCH-MIDAS models differ only in the direction of the level effect of open interest. The GARCH estimate is negative, while the GARCH-MIDAS estimate is positive but not significant; the estimate from the open interest volatility effect model, however, agrees with the GARCH model in being negative (not significant). The directions of the level and volatility effects of the other variables are consistent.
In terms of significance, the volume volatility effect, the level and volatility effects of open interest, the dollar index level effect, and the spot price level effect differ markedly between the two models. There is no significant difference in the level and volatility effects of the other variables.

Comparison of Estimation Accuracy between GARCH and GARCH-MIDAS Models
Furthermore, in order to compare the difference in estimation accuracy between GARCH and GARCH-MIDAS-t models, the GARCH-MIDAS level and volatility effect combined model was compared with the GARCH model.
Firstly, the MAE and RMSE values were calculated from the estimation results of the GARCH model for the combined level and volatility effect of volume. Then, using the estimates of the corresponding GARCH-MIDAS-t model, the MAE (RMSE) of the GARCH-MIDAS-t model was divided by the MAE (RMSE) of the GARCH model for the same sample to obtain the RMAE (relative MAE) and RRMSE (relative RMSE), which serve as the evaluation indices. When the RMAE (RRMSE) value is less than 1, the GARCH-MIDAS model improves the accuracy of prediction and estimation relative to the GARCH(1,1) model.
According to the calculation results (Table 16), the RMAE and RRMSE values of trading volume, open interest, dollar index, natural gas price, and crude oil spot price are all less than 0.05. This shows that the GARCH-MIDAS-t model markedly improves the accuracy of prediction and estimation relative to the GARCH model, which reflects the advantage of the mixed-frequency model over the same-frequency model.
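The relative indices defined above reduce to two ratios. The following sketch is illustrative: the function name `relative_accuracy` and the argument names are hypothetical, and the inputs are assumed to be the actual realized volatility and the two models' estimates over the same sample.

```python
import numpy as np


def relative_accuracy(rv, rvf_garch, rvf_midas):
    """Return (RMAE, RRMSE): GARCH-MIDAS errors divided by GARCH errors."""
    rv = np.asarray(rv, dtype=float)
    err_garch = rv - np.asarray(rvf_garch, dtype=float)
    err_midas = rv - np.asarray(rvf_midas, dtype=float)
    rmae = np.mean(np.abs(err_midas)) / np.mean(np.abs(err_garch))
    rrmse = np.sqrt(np.mean(err_midas ** 2)) / np.sqrt(np.mean(err_garch ** 2))
    # Values below 1 mean the mixed-frequency model has the smaller error.
    return float(rmae), float(rrmse)
```

A ratio of 0.05, for instance, would mean that the mixed-frequency model's error is one twentieth of the benchmark's error on the same sample.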

Conclusions
Crude oil is the most important energy product in the world and one of the major bulk commodities in the international market. It is closely tied to the economic and social development of all countries and has attracted the attention and participation of a great deal of capital. Given the impact on the developed Western economies of the two oil shocks of the 1970s, which were led by OPEC as the holder of pricing power, the United States and Britain launched WTI crude oil futures trading in 1983 and North Sea Brent crude oil futures trading in 1988, respectively, in order to reduce fluctuations in the international crude oil spot price. They hoped to secure crude oil prices and to regain international crude oil pricing power through the two basic functions of the futures market, price discovery and hedging. As a result, the price of crude oil futures is affected not only by traditional factors but also by financial factors, which leads to frequent price fluctuations with very complex patterns and has become a widespread concern for government, academia, and industry. In this paper, a GARCH-MIDAS-Skewed-t model based on mixed-frequency data was constructed, and the effects of each single factor, of multiple factors, and of their combinations on the volatility of Brent crude oil futures, as well as the differences among them, were examined.
(1) The crude oil futures series and the high-frequency explanatory variable series do not follow a normal distribution; they exhibit sharp peaks, fat tails, and skewness. Therefore, this paper combines the skewed-t distribution with the GARCH-MIDAS model to construct a new GARCH-MIDAS-Skewed-t model. Empirical results show that, compared with the GARCH-MIDAS model based on the normal distribution, the GARCH-MIDAS-Skewed-t model better describes the volatility of the Brent crude oil futures return series.
(2) Regarding single-factor level effects, among the low-frequency factors, crude oil production, consumption, inventory, and natural gas prices, but not coal spot prices or speculative factors, are the main factors that have a major impact on the fluctuation of Brent crude oil futures prices after the financial crisis. Although coal is also one of the important substitutes for petroleum, the substitution of coal for crude oil requires certain conditions. When the price of coal gradually increases, its advantage as a substitute for crude oil disappears. In addition, from a long-term perspective, speculative factors are not the main factor in the price fluctuation of crude oil futures. Regarding the level effect of high-frequency factors, the on-market trading volume and the off-market spot price have a significant explanatory effect on the price volatility of the Brent crude oil futures market. Because trading volume reflects market liquidity and trading activity, and spot prices are the basis of futures prices, trading volume and spot prices have the more important impacts on price fluctuation in the crude oil futures market from a short-term perspective. This is consistent with existing theory.
(3) Regarding single-factor volatility effects, the volatility of every low-frequency factor is positively correlated with the volatility of Brent crude oil futures returns, but the effects in the coal and natural gas spot price models are not statistically significant. The volatility of crude oil inventories, crude oil consumption, crude oil production, substitute prices, and speculative factors increases the volatility of the Brent crude oil futures market. From a long-term perspective, the impact of substitutes is relatively less pronounced, indicating that the main factor affecting the volatility of Brent crude oil futures is the demand for and supply of crude oil itself. Regarding the high-frequency data, the volatility of all high-frequency factors, whether on-market or off-market, increases the volatility of Brent crude oil futures. Apart from open interest and the US dollar index, the other three factors, namely crude oil spot price, substitute price, and trading volume, are all significantly positively correlated with the volatility of Brent crude oil futures, indicating that the volatility of Brent crude oil futures is susceptible to both on-market and off-market factors.
(4) Regarding the combined effect of single-factor level and volatility, for the low-frequency data the most important factor influencing the volatility of international crude oil futures returns is the supply and demand of crude oil, while other factors, such as changes in alternative energy and speculative factors, will ultimately be reflected in the supply or demand level of the crude oil market. Therefore, supply and demand are the most important factors in the fluctuation of international crude oil prices. This shows that the financial crisis has not changed the long-term core position of supply and demand, which remain the basic cause of crude oil price fluctuations. However, supply and demand are quietly undergoing structural changes. Firstly, the financial crisis has accelerated the shift of crude oil pricing power from concentrated to decentralized. Before the financial crisis, international crude oil pricing power was basically concentrated in the hands of developed economies such as Europe and the United States. After the outbreak of the economic crisis in 2008, the economic strength and financial control of the developed countries were weakened. OPEC, Europe, America, Japan, China, and Russia now all have a certain say in international oil pricing. The United States and the Organization of the Petroleum Exporting Countries have both common and conflicting interests. Secondly, the shale oil and gas revolution in the United States, Russia's huge oil reserves, and China's huge oil imports are all affecting international crude oil prices. It can be seen that, after the financial crisis, the shift of international pricing power from concentration to decentralization is relatively clear, and it is concluded that crude oil production and consumption have a major influence on the price fluctuation of Brent crude oil futures.
Regarding high-frequency data, both on-market and off-market high-frequency factors are important drivers of the volatility of international crude oil futures returns, especially the on-market trading volume and the off-market natural gas price and crude oil spot price. There are two main theories on the relationship between price fluctuations and trading volume in financial markets: the sequential information arrival hypothesis and the mixture of distributions hypothesis. The sequential information arrival hypothesis holds that market information spreads step by step and that traders receive it one after another. Only when all traders have received the relevant information can the market reach its final equilibrium. Under this hypothesis there is a positive correlation between price fluctuations and trading volume, and volume information helps to forecast price fluctuations. The mixture of distributions hypothesis holds that trading volume and price fluctuations both depend on the same latent variable, which can be described as the rate at which information arrives in the market. When market information changes, price and trading volume respond to it at the same time; that is, all traders receive new information and respond simultaneously. The new equilibrium is reached at once, without passing through intermediate equilibria, and the price change and the adjustment of trading volume occur simultaneously. According to this theory, a serially correlated information arrival process causes the ARCH behavior in futures returns, namely the persistence of volatility, and the trading volume of futures contracts can be used as a proxy for information arrival that explains the time-varying variance of returns.
Investors can exploit the correlation between futures price fluctuations and trading volume: using the joint information in crude oil futures price fluctuations and trading volume can improve the accuracy and effectiveness of analysis and inference. It also helps to clarify the role of trading volume and open interest in futures price fluctuations and to strengthen risk management in the crude oil futures market.
Based on the above conclusions, the fundamental reason for crude oil price fluctuations is the increasing diversification of dominant forces, and the direct cause is that the contest for pricing power has triggered structural disorder in the crude oil supply and demand system. A historical investigation of the structural transformation of the crude oil market found that the following have all triggered violent and frequent fluctuations in crude oil prices: the scarcity of crude oil and the spatial separation of supply and demand; low demand elasticity and a supply that is vulnerable to monopolization; the contradiction between supply and demand caused by strong demand growth; the control of gaps among storage, supply, demand, and inventory in the transaction process; changes in the price of petrodollars; excessive speculation by traders; and the impact of emergencies. Crude oil futures prices reflect the combined action of natural laws (laws of sustainable development), micro-market laws, financial market laws, and emergencies and policy regulations at different times and places. According to their scope of action, degree of influence, and frequency of occurrence, the main controlling factors that affect crude oil futures price fluctuations fall into three categories. First, basic factors that affect the medium- and long-term trend, mainly crude oil supply, crude oil demand, and the settlement currency for crude oil transactions; these basic factors have a direct impact on international crude oil futures prices. Second, structural factors that affect the short-term trend, mainly crude oil inventories in the spot market, changes in the prices of substitutes (natural gas, coal), and changes in positions (arbitrage, speculation, etc.) in the financial (futures) market. Third, incidental factors that trigger short-term fluctuations, mainly international political turbulence, geopolitical events, wars, and sudden climate changes.
In the market, the various factors and categories that drive crude oil prices are both independent of and interactive with each other, showing integrity, organic correlation, dynamic development, orderliness, and complexity, and forming a "transaction-market-environment" system. In this system, supply, demand, and the settlement currency for crude oil transactions are the underlying factors of crude oil as a traded commodity and interact in the transaction subsystem; crude oil spot inventory in the commodity market, the substitutes coal and natural gas, and on-market factors such as hedging and speculation in the crude oil futures market carry out the structural activities of the market subsystem; and international political and economic turmoil, conflicts and wars in major oil-producing areas, and extreme abnormal climates constitute the external environmental subsystem. Information is fully exchanged among these subsystems, thereby triggering sharp fluctuations in crude oil futures prices.
(5) Regarding high-frequency and low-frequency combined effects, although the results of the single-factor and multifactor models differ in detail, there is no significant difference overall. The research results are basically the same, reflecting the advantage of the mixed effect model of high- and low-frequency data in the stability of its estimation results. This shows that, with the advancement of technology, the trend toward information integration will become more and more evident. Therefore, in managing crude oil futures volatility risk, attention should be paid to the rich information contained in high-frequency data, which can play a more important role, especially in risk hedging.