Effects of the Swiss Franc / Euro Exchange Rate Floor on the Calibration of Probability Forecasts

Probability forecasts of the Swiss franc/euro (CHF/EUR) exchange rate are generated before, surrounding and after the placement of a floor on the CHF/EUR by the Swiss National Bank (SNB). The goal is to determine whether the exchange rate floor has a positive, negative or insignificant effect on the calibration of the probability forecasts from three time-series models: a vector autoregression (VAR) model, a VAR model augmented with the LiNGAM causal learning algorithm, and a univariate autoregressive model built on the independent components (ICs) of an independent component analysis (ICA). Score metric rankings of forecasts and plots of calibration functions are used in an attempt to identify the preferred time-series model based on forecast performance. The study not only finds evidence that the floor on the CHF/EUR has a negative impact on the forecasting performance of all three time-series models but also that the policy change by the SNB altered the causal structure underlying the six major currencies.


Introduction
On 6 September 2011, the Swiss National Bank (SNB) began intervening in the Swiss franc/euro (CHF/EUR) exchange rate market to prohibit the franc from appreciating beyond 1.20 francs per euro, and it continued this intervention throughout 2012 [1,2]. The objective of this study is to assess the impact of this currency intervention on the probability forecasts of the CHF/EUR from three time-series models: a vector autoregression (VAR) model, a VAR model augmented with the LiNGAM causal learning algorithm, and a univariate autoregressive model built on the independent components (ICs) of an independent component analysis (ICA). One-step-ahead forecasts of the CHF/EUR probability distribution are generated from each time-series model and are based on a series of intraday data for six exchange rates (all versus the Swiss franc). The forecasted probability distributions are tested for calibration and ranked with two different scoring techniques in periods of time before, surrounding, after and long after the beginning of the CHF/EUR exchange rate intervention.
In contrast to other literature on exchange rate forecasting that examines point forecasts of exchange rates, this study follows the example set by [3] and evaluates forecasted probability distributions. A brief summary of the most relevant literature concerning the exchange rate forecasting performance of multivariate time-series models is as follows. Reference [4] determines that the forecasting accuracy of restricted VAR models is better than that of unrestricted VAR models for forecasting the US dollar/yen, US dollar/Canadian dollar and US dollar/Deutsche Mark monthly exchange rates. Reference [5] uses VAR, Bayesian VAR and vector error correction (VEC) models to forecast the Australian dollar/US dollar monthly exchange rate and concludes that the VEC exhibits superior forecasting performance. Reference [6] uses a VAR, restricted VAR, Bayesian VAR, VEC and Bayesian VEC to forecast five Central and Eastern European monthly exchange rates and concludes that none of the models outperform the others for three-month forecasts and that the Bayesian models tend to perform better than the others for five-month forecasts. Reference [7] forecasts 33 monthly exchange rates against the US dollar using a large Bayesian VAR model; the results indicate that the Bayesian VAR model forecasts better than a random walk model for most of the currencies.
Forecasting 2019, 1, 3-25; doi:10.3390/forecast1010002; www.mdpi.com/journal/forecasting
There are many other techniques used to forecast exchange rates in addition to VAR and VEC models. For instance, Reference [8] surveys the literature on exchange rate forecasting and reports that factor-based models and time-varying parameter models outperform a variety of other models, but the results are sensitive to the chosen sample periods and time horizons. Machine learning algorithms are also popular for forecasting foreign exchange. Reference [9] uses artificial neural network, k-nearest neighbor, decision tree, and naïve Bayesian classifier learning algorithms to predict the USD/GBP daily exchange rate. All algorithms had a similar performance and there was a high degree of correlation between their predictions. Reference [10] compared the performance of several machine learning algorithms including multi-layer perceptron, support vector regression, and gamma classifier to the performance of more traditional time-series models including autoregressive, autoregressive moving-average, and autoregressive integrated moving-average models. Results were mixed and depended upon which exchange rate (MXN/USD, JPY/USD, or USD/GBP) was being forecasted. Other studies such as [11] and [12] have focused on forecasting exchange rates using various artificial neural network models.

Probabilistic Forecasting
Let $x_t = (x_{1t}, \ldots, x_{mt})$ be the observed values of an $m \times 1$ vector time series $X_t$ at time period $t$. Suppose that at any time $n$, the forecaster knows the values $x_t$, $t = 1, \ldots, n$, and must issue a set of probability distributions $P_{n+1}$ for the next observation $X_{n+1}$. A prequential forecasting system (PFS) is a rule which associates a choice of $P_{n+1}$ with each value of $n$ and with any possible set of outcomes $x_t$, $t = 1, \ldots, n$ [13]. A PFS is so named because it is the combination of probability forecasting and sequential prediction; this concept is also known as "probabilistic forecasting" or "density forecasting".
Reference [13] suggests that the adequacy of a PFS as a probabilistic explanation of the data should depend only on the sequence of forecasts that the PFS in fact made; this is called the prequential principle. In practice, the prequential principle is implemented by using the calibration criterion to judge whether or not a PFS issues adequate probabilities. For a PFS to be well calibrated according to the calibration criterion, the PFS must assign a probability to each event that matches that event's ex post relative frequency.
Formal testing of calibration relies on the probability integral transform as shown in [13] and summarized as follows. For a continuous random variable $X_{i,t+1}$ (i.e., the one-period forecast for time series $i$), let $F_{i,t+1}$ be the continuous distribution function of $P_{i,t+1}$ and define $U_{i,t+1} = F_{i,t+1}(X_{i,t+1})$. Under $P_{i,t+1}$ the $U_{i,t+1}$ are independent uniform $U[0, 1]$ random variables, so that $P_{i,t+1}$ is considered to be well calibrated if the observed sequence of fractiles $u_{i,t+1} = F_{i,t+1}(x_{i,t+1})$ "looks like" a random sample from $U[0, 1]$. In other words, the PFS is well calibrated if the observed sequence of fractiles has a cumulative distribution function that satisfies
$$G(U_{i,t+1}) = U_{i,t+1}.$$
The cumulative distribution function $G(U_{i,t+1})$ for $U_{i,t+1}$ is estimated by arranging the observed sequence $u_{i,t+1} = F_{i,t+1}(x_{i,t+1})$, $t = 1, \ldots, N$, in order of ascending value $u_{i,t+1}(1), \ldots, u_{i,t+1}(N)$ and calculating $\hat{G}[u_{i,t+1}(j)] = j/N$, $j = 1, \ldots, N$.
Calibration performance can be shown graphically as a plot of the PFS's observed fractiles (the $u_{i,t+1}$) on the x-axis against the estimated cumulative distribution function $\hat{G}(U_{i,t+1})$ on the y-axis. This calibration plot will be approximately a 45-degree line for a well-calibrated PFS.
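The fractile and calibration-function computations described above can be sketched as follows. This is a minimal illustration rather than the study's MATLAB code: the normal predictive distributions and the function names `normal_cdf`, `pit_fractiles` and `calibration_function` are assumptions made only for the example.

```python
import math

def normal_cdf(x, mean, sd):
    """CDF of a normal predictive distribution (a stand-in for F_{i,t+1})."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def pit_fractiles(outcomes, means, sds):
    """Observed fractiles u_{t+1} = F_{t+1}(x_{t+1}), one per forecast."""
    return [normal_cdf(x, m, s) for x, m, s in zip(outcomes, means, sds)]

def calibration_function(fractiles):
    """Points (u_(j), j/N) of the estimated calibration function."""
    ordered = sorted(fractiles)
    n = len(ordered)
    return [(u, (j + 1) / n) for j, u in enumerate(ordered)]

# Hypothetical outcomes and forecasts (not data from the study).
outcomes = [0.0, 1.0, -1.0, 0.5]
means = [0.0, 0.0, 0.0, 0.0]
sds = [1.0, 1.0, 1.0, 1.0]
u = pit_fractiles(outcomes, means, sds)
points = calibration_function(u)
```

Plotting `points` against the 45-degree line gives the calibration plot; points far from the line indicate poor calibration.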
In practice, a chi-squared goodness-of-fit test can be performed to test a PFS for calibration. This test uses the sequence of observed fractiles (the $u_{i,t+1}$) from the sequence of probability forecasts $P_{i,t+1}$. Under the null hypothesis that the forecasts are well calibrated, the distribution of a sequence of $N$ observed fractiles is a uniform distribution on the interval $[0, 1]$, whereas the alternative hypothesis is that the distribution of observed fractiles is not uniform. If the interval $[0, 1]$ is divided into $J$ nonoverlapping subintervals of lengths $L_j$ (where $0 \le L_j \le 1$), the goodness-of-fit statistic is calculated as
$$\chi^2 = \sum_{j=1}^{J} \frac{(a_j - N L_j)^2}{N L_j},$$
where $a_j$ is the actual number of observed fractiles in interval $j$ and $L_j$ is the length of interval $j$ [3].
The goodness-of-fit statistic is compared to the chi-squared distribution with $J - 1$ degrees of freedom. This test and all other chi-squared goodness-of-fit tests share a common form, which is a sum of terms containing the square of a difference between an observed count and an expected count divided by the expected count:
$$\chi^2 = \sum_{j} \frac{(\text{observed}_j - \text{expected}_j)^2}{\text{expected}_j}.$$
For more information on the goodness-of-fit test see [14].
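As an illustration of the test statistic, the sketch below bins observed fractiles into equal-width subintervals and sums the squared deviations of observed from expected counts, each divided by the expected count. Equal-width bins and the function name `chi_squared_calibration_stat` are assumptions made for the example; the study does not state its binning.

```python
def chi_squared_calibration_stat(fractiles, num_bins):
    """Return (statistic, degrees of freedom) for equal-width bins on [0, 1]."""
    n = len(fractiles)
    counts = [0] * num_bins
    for u in fractiles:
        # A fractile equal to 1.0 is placed in the last bin.
        j = min(int(u * num_bins), num_bins - 1)
        counts[j] += 1
    expected = n / num_bins  # N * L_j with L_j = 1 / J
    stat = sum((a - expected) ** 2 / expected for a in counts)
    return stat, num_bins - 1

# Hypothetical fractiles: evenly spread versus piled into one bin.
even = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
piled = [0.1] * 10
stat_even, dof = chi_squared_calibration_stat(even, 5)
stat_piled, _ = chi_squared_calibration_stat(piled, 5)
```

Evenly spread fractiles give a statistic of zero, while fractiles concentrated in one bin give a large value that would lead to rejection when compared with the chi-squared distribution with J − 1 degrees of freedom.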

Scoring Forecasts
In addition to calibration plots and calibration tests, prequential forecasting systems can be evaluated by metrics such as the mean-squared error (MSE) criterion or the probability score (Brier 1950) [15]. The MSE criterion is most often used to evaluate point forecasts, but it can also be used to evaluate predictive distributions [3]. The MSE is calculated for probability forecasts by using the expected value of the forecast distribution. Let $P_{i,n+1}$, $n = 1, \ldots, K$, be a sequence of probability forecasts for the $i$th element $X_{i,n+1}$ of the random time-series vector $X_{n+1}$ and let $M_{i,n+1}$ be the expected value of the distribution $P_{i,n+1}$. The MSE of the forecasts for $X_{i,n+1}$ is calculated as follows:
$$\mathrm{MSE} = \frac{1}{K} \sum_{n=1}^{K} (x_{i,n+1} - M_{i,n+1})^2,$$
where $x_{i,n+1}$ is the observed value of $X_{i,n+1}$. The sequence of forecasts with the smallest MSE is preferred; a PFS P is chosen over an alternative PFS Q if P has the smaller MSE.
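A minimal sketch of the MSE criterion for probability forecasts follows. It assumes each forecast distribution is represented by a list of samples, so that the expected value is the sample mean; the function names are hypothetical.

```python
def forecast_mean(samples):
    """Expected value M of a forecast distribution represented by samples."""
    return sum(samples) / len(samples)

def mse_of_probability_forecasts(sample_sets, outcomes):
    """Average squared gap between each outcome and its forecast mean."""
    k = len(outcomes)
    return sum((x - forecast_mean(s)) ** 2
               for s, x in zip(sample_sets, outcomes)) / k

# Two hypothetical forecasts with means 1.0 and 2.0, and their outcomes.
sample_sets = [[0.0, 2.0], [1.0, 3.0]]
outcomes = [1.0, 4.0]
mse = mse_of_probability_forecasts(sample_sets, outcomes)
```

Only the mean of each forecast distribution enters the calculation, so two systems with very different spreads can receive the same MSE.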
In contrast to the MSE, the probability score evaluates the entire forecasted probability distribution (Brier 1950) [15]. On any occasion $n + 1$, suppose that there are $R$ possible outcomes for $X_{i,n+1}$ with probabilities
$$p^{j}_{n+1}, \quad j = 1, \ldots, R.$$
The probability score is defined as
$$\mathrm{PS} = \frac{1}{K} \sum_{n=1}^{K} \sum_{j=1}^{R} (p^{j}_{n+1} - E^{j}_{n+1})^2,$$
where $E^{j}_{n+1}$ takes the value 1 if outcome $j$ occurred and 0 otherwise. The usage of the probability score is similar to that of the MSE; the sequence of forecasts with the smallest probability score is preferred. A PFS P is chosen over an alternative PFS Q if P has the smaller probability score.
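The probability score can be sketched as below, assuming the continuous outcome has already been discretized into R categories and that each forecast supplies one probability per category. How the study defines its categories is not stated here, so the inputs are purely illustrative.

```python
def probability_score(prob_forecasts, outcome_indices):
    """Brier-type probability score averaged over K forecast occasions.

    prob_forecasts[n][j] is the forecast probability of category j on
    occasion n; outcome_indices[n] is the category that occurred.
    """
    k = len(outcome_indices)
    total = 0.0
    for probs, occurred in zip(prob_forecasts, outcome_indices):
        for j, p in enumerate(probs):
            e = 1.0 if j == occurred else 0.0
            total += (p - e) ** 2
    return total / k

# Hypothetical two-category forecasts on two occasions.
ps = probability_score([[0.5, 0.5], [1.0, 0.0]], [0, 0])
ps_worst = probability_score([[0.0, 1.0]], [0])
```

The score is 0 for a forecaster who always places probability 1 on the category that occurs and 2 for one who always places probability 1 on a category that does not occur.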

Independent Component Analysis
In basic independent component analysis, there are $n$ observed variables $x_1, \ldots, x_n$ that are linear combinations of underlying, mutually statistically independent source variables $s_1, \ldots, s_n$:
$$x_i = a_{i1} s_1 + a_{i2} s_2 + \cdots + a_{in} s_n \quad \text{for all } i = 1, \ldots, n,$$
which in vector-matrix form is written as
$$x = As,$$
where $A$ is the unknown mixing coefficient matrix and $s$ is a vector of unobserved independent components. The observed variables $x$ are used to estimate both $A$ and $s$. Both $x$ and $s$ can be assumed to have zero mean; if this is not true, then the preprocessing step will center the original observed variables $x_o$ if they are not already centered. The independent components will then also have zero mean since
$$\mathrm{E}\{s\} = A^{-1} \mathrm{E}\{x\} = 0.$$
Basic ICA model estimation relies on the following assumptions [16]:
1. The independent components are assumed to be statistically independent, but this does not need to be exactly true in application.
2. The mixing matrix $A$ is assumed to be square and invertible for the sake of convenience and simplicity.
3. The independent components must have non-Gaussian distributions.
Many ICA models differ from the basic ICA model and have their own assumptions. For additional details see [16].
The independent components s are not only uncorrelated, but they are also as statistically independent as possible.Because achieving this requires more information than a correlation matrix can provide, the estimation of independent components uses higher-order moments or other information such as the autocovariance structure for time-series variables in addition to correlation information.
The observed random variables $x$ can be linearly transformed into uncorrelated variables that have unit variances via a process called whitening. The whitened vector $z$ is computed as
$$z = Vx \qquad (11)$$
where the decorrelating matrix $V$ is
$$V = D^{-1/2} E^{T}. \qquad (12)$$
In the above equation, $E = (e_1 \ldots e_n)$ is the matrix whose columns are the unit-norm eigenvectors of the covariance matrix $C_x = \mathrm{E}\{x x^{T}\}$ and $D = \mathrm{diag}(d_1 \ldots d_n)$ is the diagonal matrix of the eigenvalues of $C_x$. Basic ICA estimation requires the higher-order moments of non-Gaussian distributions because there are an infinite number of matrices $V$ that can create decorrelated components.
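A short check, included here as a worked step and not taken from the source, shows why the eigenvalue-based choice $V = D^{-1/2} E^{T}$ whitens the data and why whitening alone cannot identify the independent components. Writing the eigenvalue decomposition as $C_x = E D E^{T}$ with $E^{T} E = I$:

$$\mathrm{E}\{z z^{T}\} = V C_x V^{T} = D^{-1/2} E^{T} (E D E^{T}) E D^{-1/2} = D^{-1/2} D D^{-1/2} = I.$$

For any orthogonal matrix $U$, the vector $Uz$ is also white, because

$$\mathrm{E}\{(Uz)(Uz)^{T}\} = U I U^{T} = I.$$

Every matrix of the form $UV$ is therefore a valid decorrelating matrix, which is the sense in which infinitely many such matrices exist and additional information is needed to pick out the one that separates the sources.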

ICA Time Series
If the independent components are time series, as opposed to independent random variables in the basic ICA model, then the ICA model takes the following form [16]:
$$x(t) = As(t), \quad t = 1, \ldots, T,$$
where $t$ is the time index. Since time-series variables have more structure than independent random variables, the time-series autocovariances may be used for estimation instead of the higher-order information that is required in the basic ICA model.
The AMUSE algorithm provides one method to estimate the time-series ICA model [16]. This algorithm requires the time-lagged covariance matrix in place of the higher-order moments used in the basic ICA model. The time-lagged covariance matrix is computed as
$$C^{x}_{\tau} = \mathrm{E}\{x(t)\, x(t - \tau)^{T}\},$$
where $\tau$ is a lag constant, $\tau = 1, 2, 3, \ldots$. This matrix contains the autocovariances of each signal and the covariances between signals.
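A sample estimate of the time-lagged covariance matrix can be sketched as follows. The function name `lagged_covariance` and the toy signals are assumptions for illustration; the signals are taken to be already centered.

```python
def lagged_covariance(series, lag):
    """Sample estimate of E{x(t) x(t - lag)^T} for zero-mean signals.

    series is a list of n signals, each a list of T observations.
    Entry [i][j] averages x_i(t) * x_j(t - lag) over the usable t.
    """
    n = len(series)
    t_len = len(series[0])
    count = t_len - lag
    return [[sum(series[i][t] * series[j][t - lag]
                 for t in range(lag, t_len)) / count
             for j in range(n)]
            for i in range(n)]

# Two hypothetical zero-mean signals of length four.
x1 = [1.0, -1.0, 1.0, -1.0]
x2 = [1.0, 1.0, -1.0, -1.0]
c1 = lagged_covariance([x1, x2], 1)
```

The diagonal entries are the lag-1 autocovariances of each signal, and the off-diagonal entries are the lagged covariances between signals.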
The algorithm is based on the fact that the instantaneous and lagged covariances of $s(t)$ are zero due to independence. Hence, the time-lagged covariance matrix is used to find a matrix $B$ so that all of the instantaneous and lagged covariances of
$$y(t) = Bx(t)$$
are equal to zero.
The AMUSE algorithm assumes that all of the ICs have autocovariances different from zero and different from each other.This assumption replaces the assumption of the basic ICA model that the independent components must have non-Gaussian distributions.
The AMUSE algorithm uses whitened, zero-mean data $z(t)$ as input and generates the separating matrix $W$ as output so that
$$s(t) = Wz(t).$$
The time-lagged covariance matrix is modified to be symmetric by the following computation
$$\bar{C}^{z}_{\tau} = \frac{1}{2} \left[ C^{z}_{\tau} + (C^{z}_{\tau})^{T} \right] \qquad (18)$$
so that an eigenvalue decomposition on this new symmetric matrix is well defined. The steps of the AMUSE algorithm are as follows [16]:
1. Center and whiten the observed data $x(t)$ to obtain $z(t)$.
2. Compute the eigenvalue decomposition of the symmetric, time-lagged covariance matrix (Equation (18)) for some time lag $\tau$.
3. The rows of the estimated separating matrix $\hat{W}$ are given by the eigenvectors.
4. The estimated separating matrix for the unwhitened data $x$ is $B = \hat{W}V$, in which $V$ is defined in Equation (12).
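The reason an eigenvalue decomposition recovers the separating matrix can be outlined as follows. This is a sketch under the usual assumption that the independent components are scaled to unit variance, so that the separating matrix $W$ for whitened data is orthogonal and $z(t) = W^{T} s(t)$. Substituting into the symmetrized lagged covariance gives

$$\bar{C}^{z}_{\tau} = \frac{1}{2} \left[ C^{z}_{\tau} + (C^{z}_{\tau})^{T} \right] = W^{T} \bar{C}^{s}_{\tau} W, \qquad \bar{C}^{s}_{\tau} = \mathrm{diag}\left( \mathrm{E}\{s_i(t)\, s_i(t - \tau)\} \right),$$

where $\bar{C}^{s}_{\tau}$ is diagonal because the lagged covariances between different independent components are zero. The right-hand side is an eigenvalue decomposition of $\bar{C}^{z}_{\tau}$: the rows of $W$ are its eigenvectors and the lagged autocovariances of the components are its eigenvalues. The eigenvectors are uniquely determined only when the eigenvalues are distinct, which is why the algorithm requires the components to have autocovariances that differ from each other at the chosen lag.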
Time-series models are typically built using observed returns, which are represented in vector form by the notation
$$R(t) = (R_1(t), \ldots, R_N(t))^{T}, \qquad (19)$$
where $R_i(t)$ is the return on a particular asset $i \in 1, \ldots, N$ at time $t \in 1, \ldots, T$. In the following discussion, the vector of observed time-series variables is the vector of observed returns, i.e., $x(t) = R(t)$. A prequential forecasting system can be created with the independent components by building on the forecasting method described in [17]. The following procedure is used to create a prequential forecasting system for a set of observed returns:
1. Compute the independent components using the estimated separating matrix:
$$\hat{s}(t) = \hat{B} R(t). \qquad (20)$$
2. Model each independent component with an autoregressive (AR) model
$$s_i(t) = c + \sum_{\tau=1}^{k} \phi_{\tau}\, s_i(t - \tau) + \varepsilon_i(t), \qquad (21)$$
where $c$ is a constant, $k$ is the number of time-delays (lags) of the autoregression, $\phi_{\tau}$ are coefficients, and $\varepsilon_i(t)$ is the innovation process.
3. Compute the estimates of the innovation process as follows
$$\hat{\varepsilon}_i(t) = \hat{s}_i(t) - \hat{c} - \sum_{\tau=1}^{k} \hat{\phi}_{\tau}\, \hat{s}_i(t - \tau)$$
and estimate the probability distributions of the innovations with a method such as kernel density estimation. For an overview of kernel density estimation see [18].
4. Obtain samples from the estimated probability distributions of the innovations with a sampling technique such as Latin hypercube sampling. A stratified sampling technique such as Latin hypercube sampling is generally more accurate when there are low-probability outcomes, which is likely to be the case in this application [19].
5. Use the samples of the innovations in conjunction with historical data and parameter estimates to compute the estimated probability distribution for the one-step-ahead independent components using Equation (21).
6. Finally, transform the samples of the estimated probability distributions of the independent components into estimated probability distributions of the original variables:
$$\hat{R}(t) = \hat{B}^{-1} \hat{s}(t). \qquad (23)$$
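The procedure above can be sketched for AR(1) components as follows. This is a simplified illustration, not the study's implementation: it reuses the fitted residuals directly as innovation draws in place of kernel density estimation with Latin hypercube sampling, and the function names and the mixing matrix (the inverse of the separating matrix) are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of s(t) = c + phi * s(t-1) + eps(t)."""
    x = series[:-1]
    y = series[1:]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    phi = sxy / sxx
    c = y_bar - phi * x_bar
    residuals = [b - c - phi * a for a, b in zip(x, y)]
    return c, phi, residuals

def one_step_ic_samples(c, phi, last_value, innovation_draws):
    """Samples of the next value of one independent component."""
    return [c + phi * last_value + e for e in innovation_draws]

def mix_to_returns(mixing, ic_samples):
    """Map component draws to return draws; ic_samples[j][k] is draw k of
    component j, and mixing is the inverse of the separating matrix."""
    num_draws = len(ic_samples[0])
    return [[sum(row[j] * ic_samples[j][k] for j in range(len(row)))
             for row in mixing]
            for k in range(num_draws)]

# Hypothetical component history that follows an exact AR(1) with phi = 0.5.
c, phi, resid = fit_ar1([1.0, 0.5, 0.25, 0.125])
draws = one_step_ic_samples(c, phi, 0.125, resid)
returns = mix_to_returns([[1.0, 0.0], [0.5, 1.0]],
                         [[1.0, 2.0], [10.0, 20.0]])
```

Because the components are treated as independent, draws for different components can be paired in any order before they are mixed back into returns.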

LiNGAM Algorithm
The LiNGAM algorithm assumes that the observed variables can be arranged in a causal order so that the data generating process can be represented by a directed acyclic graph (DAG), that the value assigned to each variable is a linear function of values assigned to variables positioned earlier in the causal order, that there are no latent common causes, and that the disturbance terms are mutually independent with non-Gaussian distributions and non-zero variances [20].The non-Gaussian assumption is important because this allows LiNGAM to estimate the full causal model with no undetermined parameters.
LiNGAM assumes that the observed variables are linear functions of the disturbance variables. When the mean is subtracted from each variable, this is expressed as
$$x = Bx + e. \qquad (24)$$
Solving for $x$, this becomes
$$x = Ae, \qquad (25)$$
where $A = (I - B)^{-1}$. Equation (24) in addition to the assumption that the disturbance terms are independent and have non-Gaussian distributions is the independent component analysis model.
The ICA model has two indeterminacies that must be resolved before a graphical model can be constructed: neither the order nor the scaling of the independent components is defined. LiNGAM resolves both of these issues by permuting and normalizing the ICA output (i.e., the mixing matrix) to obtain a matrix $B$ containing the DAG connection strengths. The graphical representation of this matrix is the causal DAG model. Because LiNGAM uses the non-Gaussian information contained in the disturbance terms, its output is just one DAG instead of the class of equivalent DAGs found by most causal learning algorithms. As noted earlier, this output includes parameter estimates for the linear model. The LiNGAM procedure is implemented both in MATLAB (version 7.7) provided by [20] and in the TETRAD IV software package (version 4.3.10) provided by [21]. In the application below, the MATLAB code is used to produce coefficient estimates, and TETRAD IV is used to produce DAG illustrations.
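A two-variable illustration may help fix ideas; the system and the coefficient $b$ are hypothetical and are not taken from the study. Suppose $x_1 = e_1$ and $x_2 = b x_1 + e_2$, so that $x_1$ causes $x_2$. Then

$$B = \begin{pmatrix} 0 & 0 \\ b & 0 \end{pmatrix}, \qquad A = (I - B)^{-1} = \begin{pmatrix} 1 & 0 \\ b & 1 \end{pmatrix}, \qquad W = A^{-1} = I - B = \begin{pmatrix} 1 & 0 \\ -b & 1 \end{pmatrix}.$$

With the variables listed in causal order, $B$ is strictly lower triangular, which is the matrix form of acyclicity. ICA recovers the rows of $W$ only in an arbitrary order and with arbitrary scale. In outline, the permutation step selects the row ordering that leaves no zeros on the diagonal of $W$, the normalization step divides each row by its diagonal entry so that the diagonal equals one, and the connection strengths then follow as $B = I - W$.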

VAR Models
A vector autoregression (VAR) built using a time series of return observations (Equation (19)) is written as
$$R(t) = \sum_{\tau=1}^{k} M_{\tau} R(t - \tau) + n(t), \qquad (26)$$
where $k$ is the number of time-delays (lags) of the autoregression, $M_{\tau}$ are $n \times n$ matrices of coefficients, and $n(t)$ is the innovation process.
To find an estimate $\hat{n}(t)$ of the innovation process, estimate the vector autoregressive model using any least squares method and compute the estimate of the innovation process as
$$\hat{n}(t) = R(t) - \sum_{\tau=1}^{k} \hat{M}_{\tau} R(t - \tau).$$
In the application below, the VAR model is used as a one-step-ahead prequential forecasting system by using a multivariate normal distribution as the distribution of the innovations $n(t)$. Estimates of the expected value vector and covariance matrix of $n(t)$ are used as parameters of the multivariate normal distribution. The multivariate normal distribution of the innovations is used in Equation (26) with historical data and parameter estimates to create a probability distribution for the one-step-ahead return vector $R(t)$.
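A minimal two-variable sketch of the VAR(1) forecasting step is given below. It assumes the coefficient matrix and the innovation mean and covariance have already been estimated, and it uses plain Monte Carlo draws rather than a stratified scheme; all names and numerical values are hypothetical.

```python
import math
import random

def cholesky_2x2(cov):
    """Lower-triangular factor (l11, l21, l22) of a 2 x 2 covariance matrix."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    return l11, l21, l22

def var1_forecast_samples(m1, last_return, mean, cov, num_samples, rng):
    """Draws of R(t) = M1 R(t-1) + n(t) with bivariate normal innovations."""
    l11, l21, l22 = cholesky_2x2(cov)
    base = [m1[i][0] * last_return[0] + m1[i][1] * last_return[1]
            for i in range(2)]
    samples = []
    for _ in range(num_samples):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        n1 = mean[0] + l11 * z1
        n2 = mean[1] + l21 * z1 + l22 * z2
        samples.append((base[0] + n1, base[1] + n2))
    return samples

# Hypothetical parameter values for illustration only.
m1 = [[0.5, 0.0], [0.0, 0.5]]
cov = [[0.01, 0.005], [0.005, 0.01]]
samples = var1_forecast_samples(m1, [2.0, 4.0], [0.0, 0.0], cov,
                                5000, random.Random(42))
mean_first = sum(s[0] for s in samples) / len(samples)
mean_second = sum(s[1] for s in samples) / len(samples)
```

The empirical distribution of the draws for one element of the return vector serves as its one-step-ahead predictive distribution.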

Dynamic Directed Graph Discovery (VAR-LiNGAM)
LiNGAM can be combined with the VAR model in a specific way so that the VAR model becomes fully identified as described in [22]; in the following text, this combined model is called VAR-LiNGAM. The VAR-LiNGAM model is a combination of an autoregressive model with time-delays and a structural equation model, which does not consider the time-series structure in data. The autoregressive portion of VAR-LiNGAM is
$$R(t) = \sum_{\tau=1}^{k} B_{\tau} R(t - \tau) + e(t), \qquad (28)$$
where $k$ is the number of time-delays (lags) of the autoregression, $B_{\tau}$ are $n \times n$ matrices of coefficients, and $e(t)$ is the innovation process. The structural equation portion of VAR-LiNGAM is
$$R = BR + e, \qquad (29)$$
where $e$ is a vector of disturbances and the diagonal of $B$ is defined to be zero. The complete VAR-LiNGAM model is the combination of Equations (28) and (29):
$$R(t) = \sum_{\tau=0}^{k} B_{\tau} R(t - \tau) + e(t),$$
where $k$ is the number of time-delays (lags) of the autoregression, $B_{\tau}$ are the $n \times n$ matrices containing the causal effects between returns $R(t - \tau)$ with time lag $\tau = 0, \ldots, k$, and $e(t)$ are random disturbances. The $B_{\tau}$ matrices for $\tau > 0$ correspond to effects from the past to the present, while $B_0$ corresponds to instantaneous effects. The VAR-LiNGAM model is based on three assumptions:
1. The disturbances $e(t)$ are mutually independent and temporally uncorrelated, that is, independent both of each other and over time.
2. The disturbances $e(t)$ are non-Gaussian.
3. The matrix $B_0$ corresponds to an acyclic graph.
The model is estimated in two stages. First, estimate a traditional vector autoregressive model and compute the residuals of the model as described above. Then perform a LiNGAM analysis on the estimate of the innovation process to obtain an estimate of the matrix $B_0$, which is the solution to the instantaneous causal model $n(t) = B_0 n(t) + e(t)$.
Finally, use
$$\hat{B}_{\tau} = (I - \hat{B}_0) \hat{M}_{\tau}, \quad \tau = 1, \ldots, k,$$
to compute the estimates of the lagged causal effect matrices, where $\hat{M}_{\tau}$ are estimated coefficient matrices of the VAR model in Equation (26).
The VAR-LiNGAM model becomes a prequential forecasting system for the one-step-ahead return vector $R(t)$ with the following procedure. Compute an estimate of the independent components $\hat{e}(t)$ from the estimates of the innovations $\hat{n}(t)$:
$$\hat{e}(t) = (I - \hat{B}_0)\, \hat{n}(t).$$
Because there is essentially no stochastic dependence between the independent components, the probability distributions of the individual independent components can be estimated with a univariate estimation method such as kernel density estimation.
Next, obtain samples from the estimated probability distributions of the individual independent components with a sampling technique such as Latin hypercube sampling. Transform the samples of the independent components into samples of the innovations:
$$\hat{n}(t) = (I - \hat{B}_0)^{-1} \hat{e}(t).$$
Finally, samples of the innovations in conjunction with historical data and parameter estimates are used to compute the estimated probability distribution for the one-step-ahead return vector $R(t)$ using Equation (26).
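The two transformations in this procedure can be sketched as follows, assuming an estimate of the instantaneous-effects matrix and a causal order are already available. The function names and the numerical values are hypothetical; the inverse transformation is carried out by substitution along the causal order rather than by an explicit matrix inverse.

```python
def to_disturbances(b0, innovations):
    """e = (I - B0) n: remove the instantaneous effects from the innovations."""
    n = len(b0)
    return [innovations[i] - sum(b0[i][j] * innovations[j] for j in range(n))
            for i in range(n)]

def to_innovations(b0, disturbances, causal_order):
    """Solve n = B0 n + e by substitution; valid because B0 is acyclic, so
    each variable depends only on variables earlier in the causal order."""
    size = len(b0)
    values = [0.0] * size
    for i in causal_order:
        values[i] = disturbances[i] + sum(b0[i][j] * values[j]
                                          for j in range(size))
    return values

def one_step_return(m1, last_return, innovations):
    """R(t) = M1 R(t-1) + n(t) for a single innovation draw."""
    size = len(m1)
    return [sum(m1[i][j] * last_return[j] for j in range(size)) + innovations[i]
            for i in range(size)]

# Hypothetical example: variable 0 instantaneously affects variable 1.
b0 = [[0.0, 0.0], [0.5, 0.0]]
e_hat = to_disturbances(b0, [2.0, 3.0])
n_back = to_innovations(b0, e_hat, [0, 1])
forecast = one_step_return([[0.1, 0.0], [0.0, 0.1]], [1.0, 1.0], n_back)
```

In a full implementation each element of the disturbance vector would be sampled independently from its estimated distribution before being passed through `to_innovations`.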

Application
In the remainder of the paper, probability forecasts of the CHF/EUR exchange rate are generated from the three time-series models. Forecast calibration is evaluated with calibration plots and goodness-of-fit calibration tests. The mean-squared error and the probability score metrics are then used to compare the forecasting accuracy of the models. The code used for forecast generation, calibration, and scoring metrics was programmed and executed with MATLAB [23].

Description of the Data
Data is obtained from the Sierra Chart historical data service using Sierra Chart software (version 842) [24]. Both spot and futures data are available from the data service, and virtually identical model estimation and forecast evaluation results are obtained regardless of which is used. The results presented later in the paper are all reported using futures data. The rationale for presenting these results is that the futures data originates from a globally accessible exchange, whereas the Sierra Chart spot data consists of transactions between a small forex dealer and its clients.
The data consists of futures contracts that are traded on the CME Group exchange for the Australian dollar (AUD), Canadian dollar (CAD), euro (EUR), Great Britain pound sterling (GBP), Japanese yen (JPY) and the Swiss franc (CHF).These currencies are chosen because they had the largest market turnover rates in 2010 according to the Triennial Central Bank Survey [25].
Sierra Chart software is used to join each currency's futures contracts into a single continuous time series for the corresponding currency; for instance, all futures contracts for the AUD (June 2010, ..., July 2012) are joined in sequence to form a single continuous time series for the AUD. The original data has one-minute periodicity and is aggregated across time into fifteen-minute intervals so that the resulting data used in this analysis has fifteen-minute periodicity. A fifteen-minute periodicity is used because it is large enough to give the currencies plenty of time to respond to each other and small enough to provide the LiNGAM algorithm with a sufficient number of observations.
The exchange rates for the six currencies are converted to direct quotations where the domestic currency is the CHF so that the data used for the analysis consists of observations of the AUD, CAD, EUR, GBP, JPY and USD quoted as CHF/X where X is one of the stated currencies.
Missing data is replaced by the most recent observation in each currency series.Log returns are then computed by taking the natural logarithm and first-differencing the exchange rates (in that order).All log returns in all time periods are stationary based on Dickey-Fuller tests.
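The data preparation steps can be sketched as follows. The sketch assumes that each futures price is quoted in US dollars per unit of the foreign currency, so that a CHF/X quote is the ratio of two dollar prices; that quoting convention, the function names and the prices are assumptions for illustration only.

```python
import math

def to_chf_quotes(usd_per_unit):
    """Convert USD prices per currency unit into CHF prices per unit.

    usd_per_unit maps a currency code to its USD price and must include
    'CHF'. The result also contains the CHF price of one US dollar.
    """
    usd_per_chf = usd_per_unit["CHF"]
    quotes = {cur: price / usd_per_chf
              for cur, price in usd_per_unit.items() if cur != "CHF"}
    quotes["USD"] = 1.0 / usd_per_chf
    return quotes

def forward_fill(prices):
    """Replace each missing value (None) with the most recent observation.
    Assumes the first observation is present."""
    filled, last = [], None
    for p in prices:
        if p is not None:
            last = p
        filled.append(last)
    return filled

def log_returns(prices):
    """Take natural logs, then first-difference (in that order)."""
    logs = [math.log(p) for p in prices]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical prices, not data from the study.
quotes = to_chf_quotes({"CHF": 1.25, "EUR": 1.5})
filled = forward_fill([1.20, None, 1.21, 1.22])
rets = log_returns(filled)
```

A forward-filled gap produces a zero log return for that interval, which is worth keeping in mind when many observations are missing.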

Brief History of the Swiss Franc
During the second and third quarters of 2011, the SNB became worried that the appreciation of the franc against the euro was hurting the Swiss economy and increasing the risk of deflation. In August, the SNB drove interest rates to nearly zero and flooded the market with liquidity in an attempt to mitigate the franc's appreciation, but neither of these actions was completely effective. Finally, the franc's appreciation was halted in September when the SNB placed a floor on the CHF/EUR exchange rate. The sequence of SNB actions was as follows [1]:
• 3 August 2011: the SNB lowered the upper limit of its target range for the three-month Libor, narrowing the range to 0-0.25 percent (from 0-0.75 percent).
• 10 August 2011: the SNB announced additional measures to increase liquidity and reduce the appreciation of the franc. These included pumping more liquidity into the Swiss money market and conducting foreign exchange swap transactions (a policy last used in late 2008).
• 11 August 2011: an SNB official said that a temporary peg to the euro was possible.
• 6 September 2011: the SNB announced that it was establishing a floor on the CHF/EUR exchange rate (ceiling on the EUR/CHF exchange rate). The franc would not be allowed to appreciate beyond 1.20 francs per euro.

Model Estimation
To analyze forecasts surrounding the establishment of the floor on the CHF/EUR exchange rate, the futures contract time-series data is segmented into four two-month data sets. These four forecast data sets have corresponding estimation data sets on which estimates of the econometric models are made. Note that it is the forecast data sets (not the estimation data sets) that are arranged around the 11 August 2011 intervention announcement, while the matching estimation data sets simply contain data in the prior six months. The names and descriptions of these four forecast data sets are as follows. In the before data set, the CHF/EUR exchange rate is unencumbered. The surrounding data set begins on 11 August 2011 when an SNB official announced that a temporary peg was possible; the SNB formally established a floor on the CHF/EUR exchange rate near the middle of this data set on 6 September 2011. The after data set begins after the floor has been in effect for just more than a month. The long after data set begins six months after the exchange rate floor has been in place. The exact dates of the forecast data sets and the dates of their accompanying estimation data sets are shown in Table 1. Expected values of the currency log returns in each forecast data set are shown in Table A1, and correlation matrices of the currency log returns in the estimation and forecast data sets are shown in Tables A2 and A3. The estimation results for each of the models on all the estimation data sets are reported in Tables A4-A7.
Model estimation is performed using SAS software, Version 9.2 [26]. The lag lengths for the estimated VAR models are chosen by using the Hannan-Quinn information criterion and Schwarz's Bayesian criterion [27]. For VAR models in all estimation data sets, both the Hannan-Quinn information criterion and Schwarz's Bayesian criterion are best (most negative) for lag 1. The VAR model estimates of the autoregressive matrices $M_1$ for the estimation data sets are shown in Table A4.
Each VAR-LiNGAM model is built on an estimated VAR model by applying the LiNGAM structural learning algorithm to the VAR model's estimated innovation processes.As evidence that the VAR-LiNGAM non-Gaussian assumption holds on every estimation data set, a Kolmogorov-Smirnov test performed on each currency's corresponding independent factor confirms that the null hypothesis of normality is rejected with p-value less than 0.01 for each factor.The VAR-LiNGAM model estimates of the autoregressive matrices M 1 correspond to those of the VAR model and are shown in Table A4.The VAR-LiNGAM model estimates of the causal effect matrices B 0 are shown in Table A5.
Independent component analysis is performed on the currency time series and the independent components are modeled with univariate autoregressive processes.The separating matrices B found by the AMUSE algorithm are shown in Table A6.Independent components are computed using the separating matrices as described in Equation (20).A Kolmogorov-Smirnov test is performed on each independent component to verify the ICA model's non-Gaussian assumption; the test's null hypothesis of normality is rejected with p-value less than 0.01 for each independent component.
The lag lengths for the estimated AR models are chosen by using Schwarz's Bayesian criterion.For the AR models in all estimation data sets, Schwarz's Bayesian criterion is best (most negative) for lag 1.Thus, the independent components are modeled with AR(1) processes whose parameter estimates are shown in Table A7.
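Lag selection by information criterion can be sketched as below for a univariate AR model. The formulas are common textbook forms and statistical packages differ in scaling and in how parameters are counted, so the values need not match those reported by any particular software; the residual variances are hypothetical.

```python
import math

def schwarz_criterion(residual_variance, num_params, num_obs):
    """A common form of Schwarz's Bayesian criterion for an AR model."""
    return (math.log(residual_variance)
            + num_params * math.log(num_obs) / num_obs)

def hannan_quinn_criterion(residual_variance, num_params, num_obs):
    """A common form of the Hannan-Quinn information criterion."""
    return (math.log(residual_variance)
            + 2.0 * num_params * math.log(math.log(num_obs)) / num_obs)

def select_lag(residual_variances, num_obs, criterion):
    """Return the lag whose criterion value is smallest (most negative).

    residual_variances maps each candidate lag to the residual variance
    of the model fitted with that lag; the lag is used as the parameter
    count, which ignores the constant term.
    """
    return min(residual_variances,
               key=lambda lag: criterion(residual_variances[lag], lag, num_obs))

# Hypothetical residual variances for lags 1 to 3 with 100 observations.
variances = {1: 1.00, 2: 0.99, 3: 0.985}
best_sbc = select_lag(variances, 100, schwarz_criterion)
best_hq = select_lag(variances, 100, hannan_quinn_criterion)
```

In this example the small reduction in residual variance from adding lags does not offset the penalty terms, so both criteria select lag 1.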

Forecast Generation
A multivariate normal distribution is used to model the one-step-ahead probability distribution of the VAR model innovation process.Latin hypercube samples from the multivariate normal distribution in conjunction with the VAR model parameter estimates and historical data are used to compute one-step-ahead probability distributions for the exchange rate returns.
An estimate of the independent factor process of the VAR-LiNGAM model is obtained from its estimated innovation process.Kernel density estimation with a normal probability window is used to estimate the probability distributions of the VAR-LiNGAM independent factor processes.Latin hypercube samples from the independent factor process distributions are transformed into one-step-ahead distributions of the VAR-LiNGAM innovation processes.The innovation process distribution samples plus the VAR-LiNGAM model parameter estimates and historical data are used to compute one-step-ahead probability distributions for the exchange rate returns.
Kernel density estimation with a normal probability window is used to estimate the probability distribution of each AR innovation process. Latin hypercube samples from the innovation process distributions plus the AR model estimates and historical data are used to compute one-step-ahead probability distributions for the independent components. The forecasted probability distributions of the independent components are transformed into forecasted probability distributions of the exchange rate returns as described in Equation (23).
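Kernel density estimation with a normal window followed by one-dimensional Latin hypercube sampling can be sketched as follows. The bandwidth rule, the bisection-based quantile function and all names are assumptions chosen for the illustration and are not details reported by the study.

```python
import math
import random

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rule_of_thumb_bandwidth(data):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5) (an assumed choice)."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in data) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def kde_cdf(x, data, h):
    """CDF of a kernel density estimate with normal kernels of width h."""
    return sum(normal_cdf((x - d) / h) for d in data) / len(data)

def kde_quantile(u, data, h, tol=1e-10):
    """Invert the KDE CDF by bisection."""
    lo, hi = min(data) - 10.0 * h, max(data) + 10.0 * h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, data, h) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def latin_hypercube_draws(data, h, num_draws, rng):
    """One draw from each of num_draws equal-probability strata."""
    strata = [(k + rng.random()) / num_draws for k in range(num_draws)]
    rng.shuffle(strata)
    return [kde_quantile(u, data, h) for u in strata]

# Hypothetical residuals standing in for estimated innovations.
residuals = [-0.8, -0.1, 0.0, 0.2, 0.7]
h = rule_of_thumb_bandwidth(residuals)
draws = latin_hypercube_draws(residuals, h, 10, random.Random(0))
```

Because exactly one draw falls in each probability stratum, the tails of the estimated distribution are represented even when the number of draws is small.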
Sample one-step-ahead cumulative predictive distributions in each of the forecast data sets for the VAR-LiNGAM model are shown in Figure 1.These sample predictive cdfs are similar to those generated by the VAR and AR models.

Forecast Evaluation
The only forecasts considered here are those for the CHF/EUR exchange rate; the forecasts of other currencies are not evaluated.For the computation of calibration functions, the fractile of each outcome is determined by comparing the outcome to the estimated cumulative predictive distribution.These fractiles are used in conjunction with the estimated cumulative predictive distributions to compute the calibration functions.The calibration functions are both plotted and used to compute goodness-of-fit test statistics.
Calibration plots of the CHF/EUR for the before, surrounding, after and long after forecast data sets are in Figures 2-5. The calibration plots for the AR, VAR and VAR-LiNGAM models in a particular forecast data set in addition to a 45-degree line for reference are shown in each figure. Underconfidence in probability assessments is indicated where the calibration function maps above the 45-degree line, while overconfidence in assessments is indicated where the calibration function maps below the 45-degree line.

For the before forecast data set, each model exhibits underconfidence on the lower end of the calibration function and overconfidence on the upper end. For the surrounding forecast data set, the AR and VAR models exhibit overconfidence on the lower end of the calibration function and underconfidence on the upper end; the extreme ends of both of these calibration functions show the opposite behavior. The calibration function for the VAR-LiNGAM model on the surrounding data set displays the opposite behavior of the AR and VAR models, with underconfidence on the lower end and overconfidence on the upper end. For the after and long after data sets, the calibration functions for all models exhibit a large degree of overconfidence on the lower end and a large degree of underconfidence on the upper end.
Overall, the calibration plots show that all models are better calibrated (i.e., map closer to the 45-degree line) in the before and surrounding data sets than in the after and long after data sets. Forecasts are less calibrated after the placement of the floor on the CHF/EUR exchange rate; it appears that the Swiss National Bank's market intervention had a negative effect on the calibration of the time-series models in the longer run.
Chi-squared goodness-of-fit tests are performed to test each time-series model for calibration during each forecast data set. The null hypothesis that the forecasts are well calibrated is rejected with a p-value near zero in every data set for every time-series model; no time-series model forecasts are well calibrated in any of the time periods under consideration. Some of the calibration functions appear to map closely to the 45-degree reference line, such as in Figure 2a,b. Nevertheless, none of the calibration functions shown in any of Figures 2 and 4-6 reflect forecasts that are well calibrated according to the goodness-of-fit test.
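A goodness-of-fit test of this kind can be sketched as below. Under the null hypothesis of calibration the fractiles are uniformly distributed on [0, 1], so equal-width bins should hold roughly equal counts. The number of bins and the simulated fractiles are assumptions made for illustration; the binning used in this study is not restated here.

```python
import numpy as np
from scipy.stats import chisquare

def calibration_chi2(fractiles, n_bins=20):
    """Chi-squared test of the fractiles against a uniform distribution on [0, 1]."""
    counts, _ = np.histogram(fractiles, bins=n_bins, range=(0.0, 1.0))
    result = chisquare(counts)  # expected counts default to equal across bins
    return result.statistic, result.pvalue

rng = np.random.default_rng(0)
uniform_fractiles = rng.uniform(size=5000)        # simulated calibrated forecaster
u_shaped_fractiles = rng.beta(0.7, 0.7, size=5000)  # fractiles pile up near 0 and 1

stat_good, p_good = calibration_chi2(uniform_fractiles)
stat_bad, p_bad = calibration_chi2(u_shaped_fractiles)
```

With sample sizes of this order the test has considerable power, which is one reason a calibration function can look close to the reference line and still be rejected.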
In some of Figures 2 and 4-6, the calibration problems appear to be in the tails of the distributions, such as in Figure 2a,b. Generating forecasts with distributions estimated via kernel density estimation with a normal probability window might be the source of this bad tail behavior. In the calibration plots that show bad tail behavior, however, the miscalibration of each tail is in the opposite direction; for example, in Figure 2b, the calibration function shows underconfidence on the low end and overconfidence on the upper end. If the normal probability window were to blame for this poor tail performance, it would likely produce tails that were too heavy or too light at both ends of the distribution. For instance, if kernel density estimation with a normal probability window produced a distribution with tails that were too light to reflect the distribution of returns, then the corresponding calibration function would show underconfidence at both ends of the plot. Additionally, since other figures show that the problem with calibration is more in the central part of the distribution than in the tails, such as Figure 4a,b, it is unlikely that the normal probability window is the culprit for bad calibration.
In addition to the calibration tests, the mean-squared error (MSE) and the probability score metrics are used to rank the probability forecasting systems. The mean-squared errors of each model's forecasts are reported in Table 2, and the probability scores of each model's forecasts are reported in Table 3. The VAR and VAR-LiNGAM models both have the same MSE on each data set because they are both driven by the innovations of the VAR model (see Equation (26)). The MSE results indicate that no model consistently outperforms the others. The VAR and VAR-LiNGAM models perform the best in the before and long after data sets, while the AR model performs the best in the surrounding and after data sets. This may indicate that all models have roughly the same forecasting performance or that the VAR and VAR-LiNGAM models perform better in periods isolated from structural change.
In contrast, the probability score rankings show that the VAR model outperforms the other models in all but the long after data set, in which the VAR-LiNGAM's performance is slightly better. Because the simple VAR model outperforms the other models that are built using independent components, the probability score results indicate that there is no gain in forecasting performance when using independent components. Additionally, the probability score ranks the AR forecasts higher than the VAR-LiNGAM forecasts in all periods but the last; this may indicate that in some cases the multivariate VAR-LiNGAM model provides no advantage over the univariate AR model.
The VAR and VAR-LiNGAM models generate better forecasts in the long after period according to the MSE and the probability score. This is some indication that the VAR-LiNGAM model performs better than the AR model after market intervention has been in effect for some period of time.
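The ranking procedure can be illustrated with a small sketch. The MSE is computed on point forecasts in the usual way. The exact form of the probability score is defined earlier in the paper and is not reproduced here; as a stand-in, the sketch uses the Brier score for a binary event (the return falling at or below zero), which is an assumption and may differ from the score reported in Table 3. All numbers and model names are invented.

```python
import numpy as np

def mean_squared_error(point_forecasts, outcomes):
    """Average squared difference between point forecasts and outcomes."""
    diff = np.asarray(point_forecasts, dtype=float) - np.asarray(outcomes, dtype=float)
    return float(np.mean(diff ** 2))

def brier_score(event_probs, event_occurred):
    """Average squared difference between forecast probabilities and 0/1 outcomes."""
    diff = np.asarray(event_probs, dtype=float) - np.asarray(event_occurred, dtype=float)
    return float(np.mean(diff ** 2))

outcomes = np.array([0.2, -0.1, 0.4, -0.3])  # invented returns
point_forecasts = {
    "model_a": np.array([0.1, 0.0, 0.3, -0.2]),
    "model_b": np.zeros(4),
}
mse = {name: mean_squared_error(f, outcomes) for name, f in point_forecasts.items()}
mse_ranking = sorted(mse, key=mse.get)  # lower is better

# Event: return at or below zero; forecast probabilities are invented.
event_occurred = outcomes <= 0.0
score_a = brier_score([0.3, 0.6, 0.2, 0.8], event_occurred)
```

Both metrics are negatively oriented in this sketch: a lower value ranks a model higher.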

Change in the Causal Structure
The results from the LiNGAM algorithm show that there is evidence that the causal relationships among the exchange rates changed after the intervention by the Swiss National Bank. Table A5 reports the causal effect matrices for the different estimation data sets. These matrices show the causal effects from currencies listed in the columns to the currencies listed in the rows. For example, the first row of Table A5a shows that the AUD exchange rate is positively affected by the CAD, EUR, GBP and USD and negatively affected by the JPY. The causal effects contained in these matrices can be represented graphically by directed acyclic graphs. The causal structure of the currencies before the SNB intervention is shown in Figure 7a and the structure after the intervention is shown in Figure 7b.
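The row/column convention of the causal effect matrices can be made concrete with a short sketch that converts a matrix into a list of directed edges. The matrix entries below are invented for illustration and are not the estimates in Table A5; only the signs in the AUD row follow the example given above.

```python
import numpy as np

currencies = ["AUD", "CAD", "EUR", "GBP", "JPY", "USD"]

def causal_edges(effect_matrix, labels, tol=1e-12):
    """Map a causal effect matrix to {(cause, effect): weight}.

    Rows are effects and columns are causes, so a nonzero entry in
    row i, column j is read as labels[j] -> labels[i].
    """
    edges = {}
    for i, effect in enumerate(labels):
        for j, cause in enumerate(labels):
            if i != j and abs(effect_matrix[i, j]) > tol:
                edges[(cause, effect)] = float(effect_matrix[i, j])
    return edges

# Invented values: AUD row is positive for CAD, EUR, GBP, USD and negative for JPY.
B0 = np.zeros((6, 6))
B0[0, 1:] = [0.30, 0.20, 0.10, -0.15, 0.25]
B0[5, 2] = -0.40  # hypothetical negative EUR -> USD effect

edges = causal_edges(B0, currencies)
```

Each key of `edges` corresponds to one arrow of a directed acyclic graph, and the sign of the value gives the sign of the causal effect.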

Figure 7 shows that several causal relationships reversed direction following the SNB intervention, and that the EUR → USD relationship changed sign from negative to positive. These graphs show that the policy change by the SNB altered the causal structure underlying the six major currencies. This result adds supporting evidence to the Lucas critique [28], wherein Lucas hypothesized that a policy change could change the structure of an econometric model.
It is difficult to say why this causal change occurred. One possible explanation is that the Swiss franc is a safe haven currency and a funding currency for currency carry trades [29], so the changes could be due to a change in the risk of the Swiss franc that affects its usefulness in either of these roles. A typical carry trade in 2011-2012 would have invested in the Australian dollar (a high-yielding currency) and been funded by the Swiss franc (a low-yielding currency) [29]. Thus, the changing causal relationships with the Australian dollar could be due to a change in the risk characteristics of the carry trade when the SNB imposed its floor on the CHF/EUR.
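The two kinds of structural change discussed in this section, reversed pathways and sign changes, can be identified mechanically once each causal graph is written as a set of signed, directed edges. The sketch below uses invented edges, not the estimated graphs, and the function name is hypothetical.

```python
def compare_structures(before, after):
    """Return (reversed_edges, sign_changes) between two {(cause, effect): weight} maps."""
    reversed_edges = [
        (cause, effect)
        for (cause, effect) in before
        if (effect, cause) in after and (cause, effect) not in after
    ]
    sign_changes = [
        edge
        for edge, weight in before.items()
        if edge in after and weight * after[edge] < 0
    ]
    return reversed_edges, sign_changes

# Invented edges for illustration only.
before = {("EUR", "USD"): -0.4, ("AUD", "CAD"): 0.2, ("GBP", "EUR"): 0.3}
after = {("EUR", "USD"): 0.5, ("CAD", "AUD"): 0.2, ("GBP", "EUR"): 0.1}

reversed_edges, sign_changes = compare_structures(before, after)
```

In this invented example the AUD → CAD pathway is reported as reversed and the EUR → USD effect as having changed sign, while GBP → EUR is unchanged in both direction and sign.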

Discussion
This study assesses the impact of the Swiss National Bank's manipulation of the CHF/EUR exchange rate on the probability forecasts from a VAR model, a VAR model augmented with the LiNGAM causal learning algorithm, and a univariate AR model built on the independent components of an independent component analysis. Forecasts are divided among data sets that represent periods of time before, surrounding, after and long after the beginning of the CHF/EUR exchange rate manipulation.
Calibration plots are shown for the forecasted probability distributions of CHF/EUR returns on all data sets. None of the forecasted probability distributions appear to be calibrated based on the calibration plots, and calibration tests confirm this. The calibration plots show that all models are better calibrated in the periods before and surrounding the beginning of the exchange rate manipulation than in the two periods after the floor on the CHF/EUR was established. This implies that the SNB's intervention in the CHF/EUR market had a negative impact on the forecasting performance of the time-series models.
The mean-squared error (MSE) and the probability score metrics are used to rank the probability forecasting systems. When comparing models within each data set, the MSE finds that the VAR and VAR-LiNGAM models generate better forecasts in the before and long after data sets, while the AR model generates better forecasts in the surrounding and after data sets. These results may indicate that all models have roughly the same forecasting performance or that the VAR and VAR-LiNGAM models perform better in periods isolated from structural change.
The probability score finds that the VAR model outperforms the other models in all data sets except the long after data set, in which the VAR-LiNGAM's performance is slightly better. The relatively good performance of the VAR model, which does not take independent components into account, may indicate that there is no improvement in forecasting performance when independent components are used to generate forecasts. Additionally, the probability score ranks the AR forecasts higher than the VAR-LiNGAM forecasts in all periods but the last; this may indicate that in many cases the univariate independent component AR model provides forecasts as good as or better than those of the multivariate VAR-LiNGAM model.
In addition to the forecasting results, this study finds evidence that the policy change by the SNB altered the causal structure underlying the six major currencies. Six causal pathways reversed direction after the policy change and one causal relationship changed from negative to positive.
The findings of this study raise some interesting questions. In particular, why was the causal structure of the foreign exchange market affected by the SNB policy change? Does central bank intervention in a currency market always have a negative impact on the forecasting performance of time-series models? Does the VAR model often generate forecasts that are as good as those from models that use independent components? Under what circumstances does the univariate independent component AR model generate better forecasts than the multivariate VAR-LiNGAM model? These are questions for future studies.

Forecasting 2018, 1
Figure 1. Sample Cumulative Predictive Distributions. The plots show the sample one-step-ahead cumulative predictive distributions generated by the VAR-LiNGAM model in the before (a), surrounding (b), after (c) and long after (d) forecast data sets.

Figure 2. CHF/EUR Calibration Functions in the Before Forecast Data Set. The plots show calibration functions for the CHF/EUR exchange rate that are generated by forecasts from the AR (a), VAR (b) and VAR-LiNGAM (c) models in the before forecast data set (11 June 2011-10 August 2011). A model is well calibrated if it maps onto the 45-degree reference line.

Figure 4. CHF/EUR Calibration Functions in the Surrounding Forecast Data Set. The plots show calibration functions for the CHF/EUR exchange rate that are generated by forecasts from the AR (a), VAR (b) and VAR-LiNGAM (c) models in the surrounding data set (11 August 2011-10 October 2011). A model is well calibrated if it maps onto the 45-degree reference line.

Figure 5. CHF/EUR Calibration Functions in the After Forecast Data Set. The plots show calibration functions for the CHF/EUR exchange rate that are generated by forecasts from the AR (a), VAR (b) and VAR-LiNGAM (c) models in the after data set (11 October 2011-10 December 2011). A model is well calibrated if it maps onto the 45-degree reference line.

Figure 6. CHF/EUR Calibration Functions in the Long After Forecast Data Set. The plots show calibration functions for the CHF/EUR exchange rate that are generated by forecasts from the AR (a), VAR (b) and VAR-LiNGAM (c) models in the long after data set (7 March 2012-6 May 2012). A model is well calibrated if it maps onto the 45-degree reference line.

Figure 7. Causal effects represented as directed acyclic graphs in the before (a) and the long after (b) estimation data sets. These correspond to the before and long after B_0 matrices in Table A5.

Table 1. The table shows the data set starting and ending dates.

Table 2. The table shows the mean-squared errors of the CHF/EUR forecasts from the AR, VAR and VAR-LiNGAM models on each forecast data set.

Table 3. The table shows the probability scores of the CHF/EUR forecasts from the AR, VAR and VAR-LiNGAM models on each forecast data set.

Table A1. These are the expected values of the currency log returns R(t) in the estimation data sets (a) and the forecast data sets (b) for the Australian dollar (AUD), Canadian dollar (CAD), euro (EUR), Great Britain pound sterling (GBP), Japanese yen (JPY) and United States dollar (USD).

Table A2. These are the correlation matrices of the currency log returns R(t) in the before (a), surrounding (b), after (c) and long after (d) estimation data sets for the Australian dollar (AUD), Canadian dollar (CAD), euro (EUR), Great Britain pound sterling (GBP), Japanese yen (JPY) and United States dollar (USD). All correlation coefficients are significant at the 1% level.

Table A3. These are the correlation matrices of the currency log returns R(t) in the before (a), surrounding (b), after (c) and long after (d) forecast data sets for the Australian dollar (AUD), Canadian dollar (CAD), euro (EUR), Great Britain pound sterling (GBP), Japanese yen (JPY) and United States dollar (USD). All correlation coefficients are significant at the 1% level.

Table A4. These are the VAR and VAR-LiNGAM model estimates of the autoregressive matrices M_1 in the before (a), surrounding (b), after (c) and long after (d) estimation data sets for the Australian dollar (AUD), Canadian dollar (CAD), euro (EUR), Great Britain pound sterling (GBP), Japanese yen (JPY) and United States dollar (USD). An M_1 matrix contains the estimates from a standard vector autoregressive model and reflects the autoregressive effects from the lag 1 period on the lag 0 period.

Table A6. These are the independent component analysis estimates of the separating matrices B in the before (a), surrounding (b), after (c) and long after (d) estimation data sets. A separating matrix facilitates the computation of the independent components from the original series of returns.

Table A7. These are the AR model parameter estimates.