Portfolio Value at Risk Estimate for Crude Oil Markets: A Multivariate Wavelet Denoising Approach

In the increasingly globalized economy these days, the major crude oil markets worldwide are seeing higher level of integration, which results in higher level of dependency and transmission of risks among different markets. Thus the risk of the typical multi-asset crude oil portfolio is influenced by dynamic correlation among different assets, which has both normal and transient behaviors. This paper proposes a novel multivariate wavelet denoising based approach for estimating Portfolio Value at Risk (PVaR). The multivariate wavelet analysis is introduced to analyze the multi-scale behaviors of the correlation among different markets and the portfolio volatility behavior in the higher dimensional time scale domain. The heterogeneous data and noise behavior are addressed in the proposed multi-scale denoising based PVaR estimation algorithm, which also incorporatesthe mainstream time series to address other well known data features such as autocorrelation and volatility clustering. Empirical studies suggest that the proposed algorithm outperforms the benchmark ExponentialWeighted Moving Average (EWMA) and DCC-GARCH model, in terms of conventional performance evaluation criteria for the model reliability.


Introduction
With the increasing level of global economic integration and the development of financial market, in the crude oil markets we have seen a more volatile as well as more integrated environment [1,2].On the one hand, the market is influenced by increasingly complicated factors, aside from the fundamental causes such as the basic supply and demand relationship.This is exemplified by the recent intensifying financial crisis, which brought higher level of fluctuations as observed in the crude oil markets.The traditional risk measurement tools such as standard deviations and equilibrium analysis can no longer provide sufficient level of risk coverage when potential risk sources are increasingly differentiated and transient data behaviors are more prevalent.On the other hand, markets across the world are increasingly integrated with the development in both its technological infrastructure and deregulation wave.This is also evidenced from the high level of correlations between both WTI and Brent markets crude oil portfolio forms indispensable part of investment decision making process in the crude oil markets.When the multi-asset portfolio is constructed to reduce the risk level through diversification during portfolio allocation, the reliable and accurate risk measurement for the portfolio is essential for the portfolio management and risk management and affects critically the portfolio performance.
Although the risk measurement research in the crude oil markets has drawn significant research attempts over years, majority of innovative research methodologies are developed for the risk measurement for single asset portfolio.Till now there is only limited attempts to explore various theoretical and empirical issues for portfolio VaR estimation in the literature.One reason is that compared to the numerous attempts to model univariate time series, researches directed toward the understanding and modeling of multivariate time series are much more restricted due to the level of technical difficulties involved.Another reason is that portfolio VaR needs to deal with further complication of modeling second moments including correlation and covariance matrix among multiple assets than the single asset VaR, which involves exponentially increasing level of computational complexity.Typical models used in this field include the much restricted Exponential Weighted Moving Average (EWMA) and the multivariate VARMA-GARCH model for portfolio risk measurement.
Meanwhile, the wavelet based approach is also gaining momentum in the economics and finance literature [3][4][5].Firstly, wavelet analysis has been used to identify time varying behaviours of market behaviours, key economic variables and their correlation evolutions.A typical example would be the studies on multi-scale price movements and lead lag relationships in various financial markets [3,[6][7][8][9][10].Empirical studies in these markets reveal the multi-scale structure of prices and evolution of their correlation.Secondly, wavelet analysis is used to smooth data at finer scales.As the magnitude of noise is time varying and the signal to noise ratio (i.e., the threshold level set for the distinction) may be subject to investors' preferences, data smoothness would improve if smoothing is conducted at a finer time scale domain, and thus improved model fitness and performance would result.Empirical studies based on wavelet shrinkage techniques in both electricity and foreign exchange markets have confirmed the potential performance improvement.For instance, Meng et al. [11], Amjady and Keynia [12] and Aggarwal et al. [13] based their neural network forecasting model on wavelet pre-processed data series and obtained positive performance improvement [11][12][13].Thirdly, wavelet analysis has been used to decompose data series for a finer level of modelling accuracy.
Decomposed data series at lower scales are considered dominated by trends and are modelled using econometric approaches.Decomposed data series at higher scales are considered mostly dominated by price volatilities and are modelled using time series approaches.Results of experiments in several energy markets have confirmed the performance improvement [14,15].For instance, Xu et al. [16] attempted to combine wavelet analysis with support vector machine and obtained positive performance improvement in empirical studies on the Australian electricity market [16].Conejo et al. [17] combined wavelet analysis and ARMA model to analyze the Spanish electricity market and obtained positive results [17].
In the crude oil forecasting and risk measurement literature, we have only witnessed limited efforts.Most of them are accumulating quite recently.These attempts mostly use the wavelet analysis to pre-process the data before time series techniques and machine learning techniques are used to analyze the data and make forecasts.For instance, Yousefi et al. [15] used wavelet analysis to decompose crude oil price and extended them directly to make forecasts [15].de Souza e Silva et al. [18] uses wavelet analysis to remove high frequency noises, then uses HIdden Markov Model (HMM) to predict future price movement [18].Naccache [19] uses wavelet analysis analyze the relationship between macroeconomic conditions proxied by Morgan Stanley Capital International (MSCI) and the crude oil price at finer scales [19].We have conducted some exploratory studies on multi-scale based risk measure estimates.He et al. [20] used the Morphological Component Analysis with wavelets as bases to investigate the heterogeneous market microstructure [20].But the literature on the forecasting aspects of crude oil price is limited to this extent so far.
Meanwhile, risk measurement in the crude oil markets remains another important research topic, but has received much less attention compared to the progress made in the forecasting field.Most of current approaches use the multi-resolution analysis capability of wavelet analysis to look into the past price or correlation behaviours in the capital markets [21,22], e.g., it has been used to analyze the risk distribution across differences the scales [23][24][25][26][27][28][29][30].Some researches such as that by Karandikar et al. [31] has gone a step further to estimate VaR with this information taken into account.VaR estimated following the traditional approaches is adjusted by the proportion of risks that corresponds to the investment time horizon [31].These approaches are unique in that VaRs are tailored to investor's investment profile [24].However, backtesting these VaR estimates may involve additional complexities.Unless risks at all levels are taken into account, using these adjusted VaRs to calculate VaR exceedances may result in the underestimated exceedance levels.The reliability may also be statistically overestimated as a result.Aside from the backtesting issue, the performance of these approaches is also constrained by the accuracy of the traditional methodology used.Thus, when it comes to models' reliabilities and accuracies, these approaches are more useful as risk analysis tool than methodologies that bring further performance improvement.Recently He et al. [32] used wavelet analysis to analyze the risk structure and construct the VaR estimation algorithm based on it [32].It is univariate in nature and not directly applicable to the much more complicated multivariate environment.The risk measurement literature is limited to this extent so far concerning the portfolio risk measurement in the multi-scale domain.
Therefore, this paper proposes a novel multivariate wavelet denoising approach for estimating Value at Risk for portfolio in the crude oil markets.Following Heterogeneous Market Hypothesis (HMH), it is assumed that the underlying DGPs vary in their scales, corresponding to different scales of investment.Thus taking different investment scales (i.e., portfolio sizes) as the discriminatory factor, wavelet based denoising algorithm is used to gain insights into heterogeneous market structures categorized roughly on scales of investment.In this paper, we introduce the multivariate wavelet analysis in higher dimensional space to analyze the second moments evolution of crude oil portfolio data.The introduced multivariate wavelet denoising methodology is then used to extract data of distinct characteristics, which are modeled further incorporating the Multivariate GARCH (MGARCH) framework.
The theoretical motivation for the present research is that recognizing these complicated and heterogeneous nature of the crude oil markets, there exists diverse patterns of behavior of different participants, which is the target for the separation and extraction process.These participants need to be categorized and separated from the original data based on their unique behavioral patterns, resulting from different investment horizons and strategies.For example, the behavior of noise traders are temporal and nonlinear, usually resulting from highly speculative behavioral patterns, while the behavior of fundamental investors are characterized with features subject to the extensive studies over years, including auto correlation, heteroscedasticity, mean reverting and long memory, etc.
The practical and technical motivation for the present research is that the challenge to improve estimation reliability and accuracy has been difficult to improve beyond the traditional EWMA and MGARCH models.Recent empirical researches suggest that this is due to its complicated and nonlinear data nature, compared to the linear assumptions adopted before.Besides, crude oil markets typically represent a more volatile and risky environment than the traditional financial markets and demonstrate unique characteristics due to several reasons.Firstly, crude oil is traded less frequently than equities.There are considerable transaction and opportunity costs involved, resulting in higher levels of fluctuations and relatively lower levels of efficiency.Secondly, price movements in crude oil markets are frequently influenced by numerous factors, such as weather conditions, political stability, economic prospects, consumer expectations and business indicators, etc.The complex dynamics involved exhibit multi-scale non-linear characteristics, which are beyond explanatory capabilities of current models and methodologies for forecasting and risk measurement purposes.Despite its significance, forecasting of crude price and the measurement of risk in crude oil markets have not received sufficient attentions, as compared to equities markets.Research in this paper attempts to address the unique characteristics of crude oil markets when constructing useful models for forecasting and risk measurement purposes and to fill the gap in the relevant literature.Meanwhile, wavelet denoising in higher dimension involves directional projection of time series, i.e., multivariate time series can be projected into different directions, normally vertical, horizontal and diagonal.Projections into different directions are associated with different correlations among time series at different time lags.
The key findings and contributions include the following aspects.There have been limited attempts in the literature to model the conditional correlations among portfolio using multivariate GARCH model [33,34].The most related one is Vacha and Barunik [35].However, it ignores the dependence structure among portfolio assets, which can only be investigated in the higher dimension.Besides, it does not deal with the utilization of these time varying correlation information into the modeling of risk evolutions, which this paper addresses using the DCC model in the multi-scale framework.Meanwhile, most research focuses on the single variate VaR estimation, among which the most related one is Cifter [36] that is mainly based on univariate wavelet analysis.This paper instead introduces multivariate wavelet analysis in the estimation of portfolio Value at Risk.Few studies in the literature have investigated the the portfolio risk measurement with time varying correlation.Work in this paper represents one of few attempts to investigate the estimation of portfolio VaR using multivariate wavelet based techniques.
The rest of the paper proceeds as follows: Section 2 provides a brief account and also literature review on the relevant theories used to construct the proposed algorithm.This includes the portfolio value at risk, multivariate GARCH models and multivariate wavelet analysis theories.Section 3 proposes the multivariate wavelet denoising approach for estimating portfolio value at risk, with detailed economic rationale explained and numerical procedures.The performance of the proposed algorithm against benchmark alternatives are evaluated in Section 4. The empirical studies are conducted in the benchmark US WTI and Europe Brent markets and experiment results are reported to justify the superiority of the proposed algorithm.Section 5 provides some concluding remarks.

Portfolio Value at Risk Estimates
The Value at Risk (VaR) is the percentile based risk sensitive criteria [37].
It is the single statistics summarizing the maximal downside risk given the confidence level and holding horizon.
In the more general case of a portfolio of n assets with returns r t , (t = 1, 2, ..., n) and portfolio weight ω t , (t = 1, 2, ..., n), the VaR is defined for a portfolio worth X at the confidence level cl over the holding period h as in (1).
where α = 1 − cl and P (.) is the probability density function, V aR cl is the VaR threshold at the confidence level cl.Similar to the univariate case, the main approaches to estimate portfolio VaR can also be roughly categorized into three categories, i.e., parametric, non-parametric and semi-parametric, but with much less attention than in the univariate case [38,39].
The parametric approach usually refers to a series of models under the well-known variance-covariance framework.Assuming the multivariate normal distribution, the VaR is defined as in (2) [39].
where z a refers to the standard normal distribution, h is the holding period, is the variance-covariance matrix, ω is the weight matrix.
The µ and can be estimated in the conditional framework such as the Riskmetrics's multivariate Exponential Weighted Moving Average (EWMA) and Multivariate Generalized Autoregressive Conditional Heteroscedasticity (MGARCH).As is well known, EWMA can be viewed as a special case of MGARCH model and has served as the benchmark model in the literature in the past.However, unlike in the univariate case, multivariate GARCH has not been widely accepted yet as the benchmark model.
The multivariate time series models such as MGARCH are extension of univariate time series models in the higher dimensional space.However, it is much more complex.It suffers from a problem called "curve of dimensions", which refers to the fact that number of parameters to estimate and the associated computational complexity increase exponentially with the increasing number of variables.
There are mainly three approaches to model the multivariate conditional mean and conditional standard deviation in the literature.To model multivariate conditional mean, Vector Autoregressive (VAR) model and Vector Autoregressive Moving Average (VARMA) are often utilized.In practice the VARMA is too complicated to estimate.The VAR model is usually resorted to in practice to model the multivariate conditional means.There are many multivariate arch type models developed over years to model the multivariate conditional variance-covariance matrix [40].Typical examples include BEKK, Dynamic Conditional Correlation (DCC) and Constant Conditional Correlation (CCC) [41][42][43][44].Among them, DCC is the most popular because it relaxed the fixed correlation assumptions in the other two approaches.Both Engle [42] and Tse and Tsui [45] propose slightly different DCC MV-GARCH specifications respectively [42,45].Between them, The DCC MV-GARCH proposed by Engle [42] is used as it allows more flexible GARCH like specification of dynamic correlations than the simpler weighted sum specification of correlations proposed by Tse and Tsui [45].
where r t refers to the return at time t, r t ∼ N (0, D t+1 RD t+1 ), µ t is the N-dimensional vector of conditional mean, D 2 t can be defined as any univariate GARCH model as The dynamic conditional correlation matrix R t of ε t at time t is defined as in (4).
where the Q t is N × N symmetric positive definite unconditional correlation matrix of ε t as in (5).
The non-parametric approach includes Historical Simulation (HS) and Monte Carlo (MC) simulation, both directly extendable from univariate to multivariate case and serve as the benchmark in the literature [39].The Monte Carlo simulation relies on the repeated draws from the pre-specified distributional assumptions to simulate the real-life scenarios and estimate the portfolio VaR accordingly.However, MC is unconditional in nature and is biased toward normal distribution.It is also limited to the pre-specified distributional assumptions and has difficulty in replicating the extreme event scenario [38].Some recent attempts have been made to tackle these issues.For instance, Glasserman et al. [49] has relaxed the distributional assumption behind Monte Carlo simulation to multivariate t-distribution to improve the performance [49].The historical simulation approach uses the past information over the pre-specified window to infer directly the future portfolio VaR.This approach is sensitive to the choice of the window's timing and length and overestimate VaR because of the uniform treatment of risk level when transient events are rare in nature.When transient events are excluded from the window, it would lead to underestimated PVaR.
The semi-parametric approach includes some newly emerging techniques such as copula theory, fuzzy logic, genetic algorithm, etc. [48,[50][51][52].However, to the best of our knowledge, there has been no attempts in analyzing the multi-scale portfolio risk behaviors and constructing the PVaR estimation algorithm using multivariate wavelet analysis in the higher dimension.

Multivariate Wavelet Denoising Algorithm
In recent years, significant progress has been made in multivariate research in engineering fields, which has resulted in significant performance improvement that provides inspiration to the current research.Of more particular relevance to this research is multivariate denoising techniques in the image processing field, which consist of a series of algorithms including spatial filtering and transform domain filtering [53][54][55].There are several detailed review and surveys in the literature including Buades et al. [56], Mallat and Peyre [57].
In the literature, the traditional denoising algorithms are known to over-suppress high-frequency details [58].Thus there are several nonlinear alternatives proposed in the literature.For example, Pang and Yang [59] proposes the projection gradient method to solve the transformed image denoising problem using the duality strategy [59].Meanwhile, multivariate wavelet analysis has received considerable attentions in image denoising fields [53,55].Multivariate wavelet based image denoising has been explored extensively and positive performance has been reported in different disciplines.For example, in medical image processing, Chen et al. [60] proposes a homogeneity similarity based denoising algorithm, which can retain edges and contrast information while avoiding artifacts [60].Gorgel et al. [61] propose wavelet denoising algorithms with adaptive thresholding functions for enhancement of mammographic images in breast cancer detection [61].Kim et al. [58] proposes the use of filtered directional bases and frequency-adaptive shrinkage to reduce noise in edge regions, which is claimed to retain sharp details [58].Kumar and Agnihotri [62] have attempted discrete wavelet transform and stationary wavelet transform with different threshold techniques in denoising the ECG and MRI images and have found positive performance [62].Rabbani and Gazor [63] derive estimators using maximum a posteriori and minimum mean squared error methods in different sparse domains projected with discrete wavelet transform and its variant contourlet and curvelet transforms.It was found that the wavelet based method provides superior performance [63].Saeedi et al. [64] used fuzzy logic to incorporate the inter-relation between different channels during the wavelet denoising process and found improved performance compared with the standard single channel wavelet denoising methods [64].Tian et al. [65] proposed a nonparametric model to formulate the marginal distribution of wavelet coefficients in an adaptive manner and integrated it into a Bayesian inference framework to drive a maximum a posterior estimation based image denoising method with improved performance [65].Zifan et al. [66] used the mutliwavelet denoising methods to denoise microarray images and found that the results significantly outperformed wavelet and Wiener filter based methods [66].
Developed in the 1980s, wavelet based transform and its various extension functions have been proven useful for many different disciplines such as signal analysis, numerical analysis, image compression and other applied mathematics and engineering branches [67].The unique feature of wavelet analysis is its localization capability in both time and scale domain during the analysis.This is in contrast to the traditional spectrum analysis tool that signifies the patterns at different frequency at the cost of ignoring details at time scales.As financial data usually exhibit nonstationary time varying characteristics, wavelet analysis is appealing when it comes to financial data analysis.
To define wavelet analysis and multi-resolution analysis, in the one dimensional case, firstly a refinable (scale) function φ : R → C is defined as the solution to the two scale recursion difference equation in (6).
where h k,k∈Z ∈ L 2 (Z) are recursion coefficients.Then given some additional constraints, (6) gives rise to the wavelet functions ϕ which forms the orthonormal basis for L 2 (R) as in (7).
Utilizing the wavelets and their appealing time scale localization characteristics, the original data are projected into the multi-scale domain using wavelet transform as in (8).
where ψ( t−u s ) refers to the wavelet families translated by location parameter u, which corresponds to time scale localization, and dilated by scale parameter s, which corresponds to the frequency scale localization.In practice, wavelet analysis including wavelet transform and synthesis is performed at discrete time interval due to the data availability issue and to avoid the exponential increasing level of information redundancy.
Wavelet analysis in higher dimension are extensions to the wavelet analysis in one dimension, with additional complications.There are mainly two approaches to extend and define wavelet analysis and multi-resolution analysis in the higher dimensional case, i.e., tensor product multi-resolution analysis or the nonseparable scheme [67].Between them, tensor product based approach has both computational and intuitive appeals.In the tensor product scheme, the 2-dimensional wavelet is constructed by taking the tensor product functions generated by two one-dimensional bases as in (9).
where both the resulting basis Ψ j 1 ,k 1 ;j 2 ,k 2 (x 1 , x 2 ) and the original bases ψ j 1 ,k 1 (x 1 ) and ψ j 2 ,k 2 (x 2 ) are orthonormal wavelet bases for L 2 (R 2 ).For all f, g ∈ L 2 (R 2 ), the decomposition and reconstruction in the two dimensional case is defined as in (10) and ( 11) respectively. ) where With wavelet analysis, Multi-Resolution Analysis (MRA) can be conducted to analyze the time and frequency characteristics of the data.In the more general case in higher dimension, the MRA on L 2 (R d ) is defined as a doubly infinite nested sequence of subspaces V j of L 2 (R d ) as in (12) [69].
in which where is the tensor product operator.Then for each V j−1 in V j , the orthogonal complement space W j is defined as in (13).
where the orthogonal complement space W j consists of several pieces, i.e., one d-variate scaling function as in ( 14) and 2 d − 1 d-variate wavelet functions as in (15).
Here we take the two dimensional wavelet as a special case for illustration purpose.In the 2-dimensional case, three subspaces are generated with three orthonormal wavelet bases, i.e., horizontal wavelet Ψ horizontal (x, y), vertical wavelet Ψ vertical (x, y) and diagonal wavelet Ψ diagonal (x, y) [67].Over years, significant research interests have focused on using the wavelet denoising techniques in the univariate case, but much less attentions on the denoising of multivariate time series data.The basic procedure of multivariate wavelet denoising technique is as follows: 1.It firstly projects the original data series into different scales using wavelet transform as in (10).
The resulting coefficients would include approximation coefficients as well as horizontal, vertical and diagonal directions.
2. For approximation and direction coefficients at each direction, the threshold is chosen specifically at different scales for different directions and the wavelet coefficients are processed by either suppression or shrinkage.
3. Using the denoised wavelet coefficients and the scale chosen, the processed wavelet coefficients are reconstructed into the unified denoised data series using wavelet synthesis as in (11).
In the multivariate setting, given the multivariate time series X, the denoising algorithm assumes that it consists of both deterministic data D and undesirable stochastic noises in a linearly fashion.Applying the multivariate wavelet analysis, this relationship is defined as in (16).
where υ is an N ×N orthonormal matrix, O is the N dimensional vector of wavelet transform coefficients O l : l = 0, ..., N − 1.
The mainstream threshold selection rules include Universal and Steins Unbiased Risk Estimate (SURE) [70].Different threshold selection rules have different target and considerations when setting the noise reduction target.For instance, the universal threshold selection rule aims at reducing the noises at maximum level possible, so the threshold is selected as η U = σ √ 2logN , where N is the number of wavelet coefficients and σ is the estimate of the volatility level.Statistically this gives the upper bound to noises signals given significant sample size N and normally distributed noise series.However, this method achieves the maximal level of smoothness at the cost of lower goodness-of-fit.The denoised data risk losing some important data features.
The mainstream thresholding functions are hard and soft thresholding [70,71].The hard thresholding is the high pass filter, which suppresses the wavelet coefficients below the chosen threshold values and leaves the remaining coefficients intact as in (17).
where O t refers to the wavelet coefficients, η is the set threshold value.
The soft thresholding focus on the signal smoothing.It suppresses the wavelet coefficients below the set threshold value and subtracts the threshold value from the remaining wavelet coefficients.Compared with the hard thresholding, the data processing following soft threshold selection rules are smoother but lost the abrupt changes in the original data.It filters the signal as in (18): where

Multivariate Wavelet Denoising Model for Portfolio Value at Risk Estimates
In the mainstream literature the Efficient Market Hypothesis (EMH) underpins most econometric models in the crude oil markets, which depicts the market structure and price behavior based on the assumption of homogeneous agents with rational expectations and single time horizon.This approximates the market well when there is relatively stronger control and much regulations in place.However, with the deregulation waves, more empirical researches suggest that the market exhibits diverse individual characteristics not just in the frequency domain, but over multi-scale time horizons.Thus the Heterogeneous Market Hypothesis (HMH), initially proposed in the exchange rate and equity market, is proposed in this paper to complement the traditional EMH and recognize these new empirical stylized facts [72][73][74][75][76][77][78][79][80].
HMH can be viewed as a more restricted form of FMH, focusing on heterogeneous features of market microstructure, which comes natural when the market has fractal characteristics.It also complements the traditional EMH [74][75][76][77][78][79][80].It assumed that different actors in the heterogeneous market have different time horizons and dealing frequencies.The market is heterogeneous with a fractal structure of time horizons as it consists of short-term, medium-term and long-term components.Different actors are likely to settle for different prices and decide to execute their transactions in different market situations.In other words, they create volatility.The market is also heterogeneous in terms of geographic locations of the participants.Based on these assumptions, the HMH postulates that the market consists of heterogeneous agents with heterogeneous investment strategies and investment time horizons.Compared with the homogeneous reaction to the news shocks in the EMH, HMH states that these agents or investors react to news shocks differently based on their own characteristics.On one hand, their investment time horizon and dealing frequency are diversely different.Their time scale behaviors could range from short and temporal to long and stable characteristics.For instance, pensions funds and central banks tend to have low dealing frequency focusing on long time horizon while the market traders tend to have high dealing frequency with short time horizon.On the other hand, these agents or investors employ diversely different investment strategies or measures based on their own characteristics and focus.Their frequency behaviors could range from stationary to transient characteristics.
In the literature, these micro market structures were studied incorporating information at the operational and transaction level.However, results from these studies are constrained by the restricted access to the concerned data, some insider in nature.The data obtained are also significantly lagged in time due to the regulations on protection of sensitive information.Thus it would be difficult to test the theory and implement it for real time trading environment.
Meanwhile, a different alternative is to use the time series itself without dependence on unreliable external information to extract and separate the underlying components from the original data directly.Certainly it is a difficult problem as the underlying components are unknown latent variables, whose structures are vulnerable to inappropriate separation processes.The multivariate case is made even more complicated by the time varying correlations.The wavelet based approach is promising as it is supported by positive results for decomposing and denoising of univariate time series.However, using wavelet to denoise and separate data features in multivariate time series involves further complications that deserve further investigations.
Therefore, in this paper we propose a different approach to extract and separate the underlying components from the original data directly.
Since the accuracy of the noise and denoised data separation and extraction process affects critically the generalizability of the models fitted, appropriate tools need to be selected.Traditional data denoising techniques include spectrum analysis methods such as simple averages, Kalman filtering and Fourier transforms, etc.However, these methods did not receive significant attentions in the mainstream empirical research literature.This is due to the insignificant performance improvement during the application of these approaches, as they are mostly based on spectrum and frequency scale domain while ignoring the multi-scale heterogeneous structure in the data.More specifically these approaches would result in two critical issues: Firstly, the data may be distorted during the denoising process due to the single time horizon approach, and there are patterns left in the noises removed, which would reduce the forecasting accuracy if ignored.The denoising algorithm with different parameters further complicate this situation.Secondly, following HMH, the noises may vary in level in the multi-scale domain for multi-scale data structure, corresponding to different investment strategies and time horizon.To tackle these issues, the analysis and modeling of the individual components behaviors, including both the separated denoised data and noises, in the multi-scale domain need to be conducted during the denoising process, as it would reduce the risk of using biased estimates with lower goodness of fit and contributes to the overall performance improvement.For example, the behavior of transient data component is significantly different from the behavior of data component driven by normal market activities.Each of them need to be modeled separately to avoid the biased estimate due to violations of assumptions of the unsuitable models.
Thus the wavelet based denoising algorithm is proposed to serve as a promising alternative modeling tool to extract heterogeneous data features of distinct characteristics, in a real time manner without the need of insider information.This is a more refined approach compared with the normal spectrum or frequency domain approach taken by traditional data denoising techniques such as spectrum analysis methods, e.g., simple averages, Kalman filtering and Fourier transforms, etc.
Applying the wavelet based time frequency analysis in the higher dimensions, a two stage Multivariate Wavelet Denoising Portfolio Value at Risk Model (MWDNPVaR) is proposed to track portfolio risk evolution in the higher dimensions space and estimate portfolio VaR.
Accordingly the numerical procedure for the two stage hybrid estimation process is listed as follows.
In the first stage, the multivariate wavelet analysis based denoising algorithm is used to separate the noises matrix from the original data matrix, leaving the rest data matrix with more distinct characteristics.
Then the multivariate time series models are used to extract the denoised and noise matrix respectively, resulting in different model parameters tuned for distinct data characteristics and dependency structure.
In the second stage, since we assume that the denoised and noise data are independently distributed, the conditional means are aggregated from the individual conditional mean forecast matrix forecasted for both denoised and noise data components as in (19).
where R t is the aggregated conditional mean forecast at time t, R D,t and R N,t are conditional mean forecasts for both denoised and noise data respectively, at time t.
As the wavelet decorrelates the data series, suppose data series across different scales are independent.Thus the variance-covariance matrix can be aggregated from the individual variance-covariance matrix forecast at different scales as in (20).
where t is the aggregated conditional variance-covariance forecast matrix at time t, D,t and N,t are conditional variance-covariance forecast matrix for both denoised and noise data respectively, at time t.t = Cov(X t , Y t ) is the covariance matrix at time t, Cov(X D,t , Y N,t ) = 0 and Cov(X N,t , Y D,t ) = 0.Then, suppose a unit worth portfolio investment with ω weights and one day holding period with the forecasted conditional mean and covariance matrixes, we follow the traditional variance-covariance approach for estimating portfolio V aR as in (21).

Data and Descriptive Statistics
Empirical studies are conducted in the US West Texas Intermediate (WTI) and Europe Brent markets, which are the marker markets where most trading and transactions are conducted.Together they form essential part of the crude oil portfolio.The data source is the Energy Information Administration, Department of Energy, US.The data set covers the same date range from 2 January, 2002 to 1 October, 2009 for both markets.The starting date for the data set is chosen since the equilibrium price level has shifted to a new regime since 2002.The potential cause is that the balance between the demand and supply of crude oil is increasingly constrained by its limited and scarce nature in recent years, due to the increasing demand and steady level of supply, as well as the wave of liberalization and globalization.Meanwhile, this time period would reduce the impact of previous direct systematic market disruptions such as the Gulf war, the minimum.The end date for the data set is chosen based on data availability.However, this results in different number of daily observations for each market respectively, i.e., 1939 for WTI market and 1996 for Brent market, as normal trading are interrupted at different time points due to different trading regulations and holidays.For the purpose of constructing a valid portfolio in both market, only observations for the valid trading days in both markets are retained in the data set, i.e., if trading in either one market is disrupted with holiday or regulations, observations for this day are excluded in both markets as no valid portfolio can be constructed.This results in 1939 observations.We assume a one dollar investment portfolio with equal holding positions, i.e., 50% in the WTI market and another 50% in the Brent market.The data set is divided into three subsets, i.e., the training set for determination of model parameters (40%), the model tuning set to select and determine the empirically optimal model specifications (20%) and the test set for the out-of-sample test to evaluate the performance of different models (40%).This results in 776 daily observations for out-of-sample test, yielding statistically significant results.The one day ahead forecast is performed using rolling-window method.The size of the window is 775 to cover the relevant information available.As autocorrelation and partial autocorrelation function indicates the trend factors, the daily prices are log differenced at the first order to remove them as in y t = ln pt p t−1 .We assume one day holding period to conduct one step ahead PVaR estimations.
The descriptive statistics for both markets are listed in Table 1.  1 show some interesting stylized facts.On the aggregated level, the distribution of the original data approximates the normal distribution, as indicated by four moments.However, the rejection of the null hypothesis for Jarque-Bera (JB) test of normality suggests the deviation of return distribution from the normal distribution [81].The rejection of the null hypothesis for BDS test suggests the existence of unknown correlations, potentially nonlinear dynamics, in the data [82].However, compared to descriptive statistics of individually denoised data series in Table 1, the multivariate denoising algorithms extract correlated denoising data and noises data.The paired data in both markets exhibit similar descriptive statistics, suggesting that they are governed by correlated or the same underlying DGP.
Figure 1 shows the surf plot of return series in both markets.
Results in Figure 1 show that multivariate wavelet analysis effectively separates data of distinct characteristics and shapes.The middle figure for denoised data show distinctively different shapes and behaviors than the right figure for noise data.Behaviors of both of them are different than the left figure for the original data.Interestingly, these figures show that some of the data features in the left figure are signified in the middle figure while others are signified in the right figure.For instance, the abrupt changes in the left figure are mostly captured in the middle figure.

Multi-scale Analysis of Correlation of Portfolio
As we argue that the correlation between different markets consists of varying strength of correlations at different scales, the cross correlation wavelet analysis is used to analyze and illustrate the multi-scale structure of correlation movements.The rolling windows over which the correlation is calculated is 775.For the portfolio data series decomposed at scale 8 with Daubechies 8 wavelet family, correlation for data series at the first three levels along with the correlation for the original portfolio are shown in the scatter plot in Figure 2.  Experiment results in Figure 2 show that the correlation between WTI and Brent market is dynamically changing over the entire investment time horizon, between 0.44 and 0.6.This justifies the use of DCC-GARCH model, in which the dynamic correlation assumptions better conform to the real data.
Using wavelet analysis, the underlying structure of correlation is recovered when relating the strength of correlation to the investment time horizon.For instance, over the short term investment time horizon, the correlation is relatively low for these particular portion of markets.We also observe strong correlation between particular portion of both markets over the medium to long term time horizon.
In addition to the multi-scale correlation structure, there are also heterogeneous structure for the correlation evolution, i.e., the correlation between different markets may be influenced by underlying correlations of different strength and characteristics.Therefore, we further use the cross correlation wavelet analysis to analyze and illustrate the heterogeneous correlation movements.The original, denoised and separated noise correlation between WTI and Brent market are shown in the scatter plot in Figure 3.  Experimental results in Figure 3 show that correlation for the extracted denoised data series exhibits time varying behavior at higher range between 0.75 and 0.95, while correlation for noise data series also exhibits time varying behavior at lower range between −0.15 and 0.1.This shows that under the HMH framework, the multi-scale heterogeneous structure of correlations can be recovered using wavelet analysis to extract dependency structures of distinct nature, which drives the dynamic correlation behaviors.In this case, they are roughly categorized into denoised and noise parts to signify their distinct characteristics.Therefore they need to be modeled using different model specifications and parameters.
These experiments signify the multi-scale and heterogeneous correlation structure as important stylized facts that need to be incorporated during the modeling process.Wavelet analysis based denoising algorithm provides an interesting alternative to uncover the multi-scale heterogeneous correlation structure, by projecting the original data into different time scales and extracting different denoised and noise components at these scales.

Performance Evaluation
The performance of the proposed MWDNPVaR model is evaluated in the empirical studies in the marker WTI and Brent market, against benchmark models.In this paper, we use two models as benchmark models, i.e., the well accepted EWMA as the de facto industry standard and DCC-GARCH due to its theoretical superiority.The generalizability of the proposed model models is evaluated against benchmark models in out-of-sample forecast tests, using the Kupiec backtesting procedure for unconditional risk coverage for reliability of estimates as well as MSE and Clark-West test of predictive accuracy for the accuracy of estimates [83,84].
The frequency based approach, or the so-called unconditional coverage approach, as suggested by Kupiec, tests the statistical significance of the patterns of VaR exceedance, i.e., whether the size of the loss is greater than the estimated VaR values.The null hypothesis for the Kupiec test is that the proposed model is the correct one and corresponds to the theoretical risk coverage.If the model is acceptable, the number of tail losses x should follow binomial distribution.Thus, at the given total number of observations n and confidence level p, the likelihood ratio statistics is calculated as in (22): The Kupiec likelihood ratio test statistics conforms to the χ 2 (1) distribution.
The Kupiec backtesting procedure is intuitively and computationally appealing.However, it relies entirely on frequency of losses information and ignores temporal pattern of exceedances and the size of losses.Thus, it suffers from lower level of discriminatory power for small sample sizes.Therefore backtesting procedures addressing other loss exceedances patterns are developed such as the one proposed by Christoffersen to test for the temporal behavior of losses [85].However, when the sample size is sufficiently large, the Kupiec backtesting procedures would offer an acceptable level of discriminatory power between good and bad models.Therefore, it is adopted in this paper to backtest and evaluate different models.
The MSE is the simple statistics measuring the deviation of forecasts from actual observations, as T −1 T i=1 ε 2 i , where ε i = y i − y i .Diebold Mariano test of equal predictive accuracy is conventionally used to test the statistical significance of the differences of empirical losses between competing models and benchmark models, based on Mean Squared Prediction Error (MSPE).However, it is found that the originally proposed test statistics is upward biased heavily if models tested are nested.Thus the Clark-West test of In Table 3 N cl , (cl = 95%, 97.5%, 99%) refers to the number of exceedances at different confidence levels, P cl , (cl = 95%, 97.5%, 99%) refers to p value of Kupiec backtesting procedure at different confidence levels.Experiment results from Table 2 show the different in-sample performance for the proposed MWDNPVaR, using different model parameters.The three numbers in the parentheses correspond to exceedances at three different confidence levels, i.e., 99%, 97.5% and 95%.The attempted model parameters are as follows: 4 different wavelet families including Daubechies 2 (DB2), Symlet 2 (SYM2), Coiflet 2 (COIF2) and DMEY 2. Hard and soft thresholding algorithm.Decomposition levels from 1 to 9. The p value of backtesting procedure is (0.0805, 0.0366, 0.0166) for the EWMA model and (0.6436, 0.4653, 0.7501) for the MGARCH model.Using exceedances and their associated Kupiec backtesting statistics as performance measure as in Tables 2 and 3, we determine the optimal model specification demonstrating the highest level of reliability, as indicated by the higher p value for the out-of-sample test, i.e., Symlet 2 using hard thresholding algorithm at decomposition level 3.The shrinkage algorithm is minimax threshold.The thresholding is conducted in a level dependent manner.The p value of backtesting procedure for the chosen model is (0.9473, 0.9162, 0.8801).
The reliability performance measure of the proposed algorithm against alternative benchmark models are listed in Table 4.  4 show that the proposed MWDNPVaR algorithm improves the reliability significantly compared with the traditional EWMA and DCC-GARCH model.The exceedances of the MWDNPVaR algorithm decreases substantially across all confidence levels.This result is statistically significant as confirmed by the significant p value.We also observed significantly improved p value at both 97.5% and 95% confidence levels from the Kupiec test statistics.
The performance improvement is attributed to the introduction of multivariate wavelet denoising algorithm as the data separation tool.More specifically, this is mainly attributed to the use of multivariate wavelet denoising algorithm to separate data and noises, taking into account their distinct behavior patterns.It is also attributed to the individual modeling of the separated components in the multi-scale time domain, taking into account the heterogeneous data structure.
We further conduct experiments to investigate the predictive accuracy of the proposed algorithm against benchmark alternatives.Results from experiments are listed in Table 5. Experiment results in Table 5 show that the proposed MWDNPVaR achieves inferior predictive accuracy than the benchmark EWMA and DCC-GARCH based models, as measured by larger MSE.However, the performance gap is not statistically significant, as shown by the failure to reject the null hypothesis for Clark-West test of predictive accuracy.This implies that the predictive accuracy of the proposed MWDNPVaR model is at the comparable level as the traditional MGARCH based PVaR models in asset allocations and utilization.The main purpose of the downside risk estimate as those presented in this manuscript is to improve the reliability to allow sufficient capital allocation to cover the potential significant capital loss due to adverse market price movement.The proposed algorithm in this manuscript can help to estimate risks more reliably under the turbulent period with the prevalence of transient or extreme events, at the cost of predictive accuracy, if investors are risk averse.
There are some important implications drawn from the empirical investigations in the paper.Firstly, although previous researches show that there are increasingly diversified data features, such as extreme events, that can not be explained by MGARCH models based on normal distribution, our results show that it could be attributed to the heterogeneous structure of the underlying data components, potentially much simpler than previously assumed extreme distribution.Thus based on the parsimony principle, it is inappropriate to disregard MGARCH based models in favor of others, such as the models based on extreme value theories.By deliberately separating and modeling the data and noise, the performance of DCC-GARCH model can be improved further to the competent level.Secondly, the optimal performance is achieved when the individual components are extracted using Symlet 2 wavelet family and modeled with different parameters.These observations support the argument that the risk in the crude oil prices are complicated processes with a mixture of underlying DGPs.Its modeling and analysis is a tricky issue which requires the detailed look into the underlying structure in the time scale domain.Thus appropriate recovery of the underlying structure is critical to the further performance improvement and more thorough understanding of the DGPs during the modeling process.
Results and conclusions are generalizable and are supposed to be valid across different data sets and time windows, i.e., the proposed algorithm is expected to lead to improved reliability since the disruptive noise parts are separated in the multiscale domain.

Conclusions
In this paper, we introduce the multivariate wavelet analysis to analyze the multi-scale behaviors of the correlation among different markets.Then we propose a novel multivariate wavelet denoising based approach for estimating PVaR.The superior performance of the proposed algorithm is supported by performance evaluation in the empirical studies in the crude oil markets.This is attributed to the separation of data and noises taking into account their multi-scale dependency structure and the construction of the hybrid methodology incorporating these information.
The contributions of the proposed methodology is threefold.Firstly, the multivariate wavelet analysis is introduced to analyze data and noises components in the portfolio, especially their distinct behavioral and correlation patterns, recognizing the heterogeneous investment strategies over different time horizons and across different markets.Secondly, the risk for the separated data and noises components of the portfolio are modeled using DCC-GARCH model with different parameters to emphasize their different behavioral and correlation characteristics.The incorporation of heterogeneous investment strategies for portfolio across different markets in the time scale domain leads to improved PVaR estimates with higher reliability.Thirdly, the proposed multivariate wavelet denoising GARCH methodology is not sensitive to the underlying distributional assumptions and thus is more robust and effective.Results from this research may be limited in that the level of accuracy for the boundary or the edge set by the denoising algorithm could critically affect the performance.These can be possible directions for future research.

Figure 1 .
Figure 1.The surf plot of the original, denoised and noise data.

Figure 2 .
Figure 2. Dynamic correlation between WTI and Brent markets at different scales.

Figure 3 .
Figure 3. Dynamic correlation between WTI and Brent markets.

Table 1 .
Descriptive statistics and statistical tests.

Table 4 .
Reliability comparison of different models.

Table 5 .
Predictive accuracy comparison of different models.MSE CW p−value , EW M A CW p−value , DCC−GARCH