3.1. The Rationale behind MS and MS-GARCH Models, the Input Data Processing Method, and the Trading Strategy Simulation’s Parameters
MS models estimate the location, scale, and smoothed regime probabilities of a time series by assuming that its realizations come from a multimodal Gaussian or t-student pdf as follows:
In the previous expressions, the mixing weights are the fixed probabilities or mixtures of each regime pdf. That is, each weight is the fixed probability or proportion with which a given realization comes from the corresponding regime.
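As a minimal sketch of the Gaussian case, the mixture density can be evaluated as a weighted sum of regime-specific densities. The function names and the two-regime parameter values are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian density with location mu and scale sigma, evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, mus, sigmas):
    """Weighted sum of the regime-specific Gaussian densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))

# Hypothetical two-regime parameters (illustrative values only).
weights = [0.8, 0.2]    # mixing proportions, must sum to one
mus = [0.001, -0.004]   # regime locations (normal, distress)
sigmas = [0.03, 0.08]   # regime scales (normal, distress)

density_at_zero = mixture_pdf(0.0, weights, mus, sigmas)
```

With a single regime and unit weight, `mixture_pdf` reduces to the ordinary Gaussian density.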
The main rationale of MS models is that the regime prevailing at each date and the location and scale parameters are unobservable to the analyst, investor, or trader. Therefore, these must be estimated as if they come from a hidden Markov chain with a finite number of states. As a hidden Markov chain, the stochastic process can move from a given state to another one. This transition can be modeled with a transition probability matrix:
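To illustrate what a transition probability matrix implies, the sketch below uses a hypothetical two-regime matrix (the values are assumptions, not the estimates of this paper). Row i holds the probabilities of moving from regime i to each regime, so every row sums to one.

```python
# Hypothetical 2x2 transition matrix (illustrative values only).
# P[i][j] is the probability of moving from regime i to regime j.
P = [[0.95, 0.05],
     [0.20, 0.80]]

def stationary_two_state(P):
    """Long-run regime probabilities of a two-state Markov chain."""
    p12, p21 = P[0][1], P[1][0]
    return [p21 / (p12 + p21), p12 / (p12 + p21)]

def expected_duration(P, i):
    """Expected number of consecutive periods spent in regime i."""
    return 1.0 / (1.0 - P[i][i])

pi = stationary_two_state(P)
weeks_in_distress = expected_duration(P, 1)
```

Under these assumed values, the chain spends about 80% of the time in the first regime, and a distress spell lasts about five periods on average.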
To give an example, we used the weekly historical price of the one-month cocoa future. In Figure 1 we present the price time series (blue line), along with the distress periods or regime (grey areas). As we will review next, we estimated these with a time-fixed variance, Gaussian pdf MS model. As mentioned previously, the distress periods shown in grey are unobservable to the analyst, which means that their behavior must be inferred directly from the time series’ realizations.
In order to model the potential regime switch, there are two possible solutions. The first is the threshold autoregressive (TAR) model [28]. In this model, and given a user-specified threshold value, the regime of the stochastic process can be inferred with the next indicator function:
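A threshold rule of this kind can be sketched in a few lines. The direction of the comparison, the use of the lagged observation, and the threshold value are assumptions made for illustration only.

```python
def tar_regimes(series, threshold):
    """Label each date 2 (distress) if the previous observation fell
    below the threshold, and 1 (normal) otherwise."""
    regimes = [1]  # no lagged value exists for the first date; assumed normal
    for lagged in series[:-1]:
        regimes.append(2 if lagged < threshold else 1)
    return regimes

# Hypothetical weekly returns and a hypothetical threshold of -5%.
weekly_returns = [0.01, -0.06, 0.02, -0.08, 0.00]
regimes = tar_regimes(weekly_returns, threshold=-0.05)
```

The regime labels depend entirely on the value passed as `threshold`, which the analyst must choose in advance.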
The main drawback of this first proposal is that the analyst must set a clear threshold value, and this value cannot be known for certain securities. A good example is the potential fall of a commodity future price to negative values. If the market of a given future (such as cocoa, sugar, corn, or oil) finds that there is not enough storage for the commodity, the future price could turn negative. This could happen because holders of long positions could face storage costs higher than what they would “pay” (through a negative price) to someone else to take over the position. In that case, the future price could fall by more than 100% in a single day or week. Because this scenario would be seen only in extreme economic or social circumstances, a discretionary threshold value would not be useful, as it would not fit the prevailing conditions or the size of the price shortfall.
Another drawback of the TAR model in Equation (5) is the parameter estimation method. In practical terms, the TAR model is a regression with dummy (binary) and slope dummy regressors. As more regimes or states of nature are incorporated, the estimation process loses degrees of freedom.
Given these issues, and as we previously mentioned, Hamilton proposed the use of MS models. In these, the regimes follow a Markovian process with a transition probability matrix such as the one in Equation (3). This Markovian process is hidden or unobserved to the analyst and must be inferred from the time series. In order to infer the parameter set of MS models, Bayesian estimation techniques are used. One of these is the quasi-maximum likelihood (QML) estimation described by Hamilton [3]. Its rationale is similar to that of the Kalman filter. In the first stage of the estimation process, the analyst has a prior estimate of the transition probability matrix and the parameter set. In the second stage, the analyst filters the probability of being in each state or regime by using the prior parameter set in the regime-specific Gaussian or t-student probability density functions, respectively:
In the previous expressions, the parameters are the prior or starting location (mean), scale (standard deviation), and t-student pdf degrees of freedom used in each iteration of the estimation process. Once the regime-specific probabilities are filtered from the data with Equation (6) or Equation (7), the analyst calculates the corresponding log-likelihood function (LLF), departing from Equation (2) or Equation (3):
Given the filtered regime-specific probabilities, the location and scale parameters are updated with these calculations, respectively:
With the new parameter set obtained in the current iteration or step, the analyst returns to the first stage of the estimation algorithm, as this set is the new prior. Then, she updates the filtered probabilities, Equation (6) or Equation (7), with the new parameter set, calculates the LLF with Equation (8), and updates the parameter set, with Equations (9) to (11), using the new filtered probabilities. This numerical method continues until the difference between the LLF values of the current iteration and the previous one is lower than a threshold value.
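The iterative scheme just described can be sketched as follows for the Gaussian case. This is a simplified illustration under stated assumptions, not the estimation code of this paper: the transition matrix is held fixed, the moments are updated with filtered probabilities as in the text, the data and starting values are hypothetical, and a small floor on the scale is added for numerical safety.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def hamilton_filter(returns, P, mus, sigmas, start):
    """Return the filtered regime probabilities and the log-likelihood."""
    k = len(mus)
    prev, filtered, loglik = list(start), [], 0.0
    for r in returns:
        # Predict: propagate the last filtered probabilities through P.
        pred = [sum(prev[i] * P[i][j] for i in range(k)) for j in range(k)]
        # Update: weight each prediction by the regime density of r.
        joint = [pred[j] * normal_pdf(r, mus[j], sigmas[j]) for j in range(k)]
        total = sum(joint)
        loglik += math.log(total)
        prev = [v / total for v in joint]
        filtered.append(prev)
    return filtered, loglik

def update_moments(returns, probs):
    """Probability-weighted location and scale of each regime."""
    mus, sigmas = [], []
    for j in range(len(probs[0])):
        w = [p[j] for p in probs]
        sw = sum(w)
        mu = sum(wi * r for wi, r in zip(w, returns)) / sw
        var = sum(wi * (r - mu) ** 2 for wi, r in zip(w, returns)) / sw
        mus.append(mu)
        sigmas.append(max(math.sqrt(var), 1e-6))  # floor is an added safeguard
    return mus, sigmas

# Hypothetical data and starting values (illustrative only).
returns = [0.01, -0.02, 0.015, -0.09, 0.11, -0.08, 0.005, 0.012, -0.007, 0.003]
P = [[0.9, 0.1], [0.2, 0.8]]
mus, sigmas = [0.0, 0.0], [0.02, 0.08]
prev_ll = None
for step in range(20):
    probs, ll = hamilton_filter(returns, P, mus, sigmas, [0.5, 0.5])
    if prev_ll is not None and abs(ll - prev_ll) < 1e-8:
        break  # LLF change below the tolerance
    mus, sigmas = update_moments(returns, probs)
    prev_ll = ll
```

A complete E-M implementation would also re-estimate the transition probabilities and would typically use smoothed rather than filtered probabilities in the update step.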
Once the estimation process stops, the filtered probabilities are smoothed with the method suggested by Kim [58].
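The backward smoothing pass can be sketched as below. The recursion is the standard one attributed to Kim; the filtered probabilities and the transition matrix are hypothetical inputs, and all transition probabilities are assumed to be strictly positive.

```python
def kim_smoother(filtered, P):
    """Backward pass that turns filtered probabilities into smoothed ones."""
    T, k = len(filtered), len(P)
    smoothed = [None] * T
    smoothed[-1] = list(filtered[-1])  # last date: smoothed equals filtered
    for t in range(T - 2, -1, -1):
        # One-step-ahead prediction made with information up to date t.
        pred = [sum(filtered[t][i] * P[i][j] for i in range(k)) for j in range(k)]
        smoothed[t] = [
            filtered[t][i]
            * sum(P[i][j] * smoothed[t + 1][j] / pred[j] for j in range(k))
            for i in range(k)
        ]
    return smoothed

# Hypothetical filtered probabilities (normal, distress) for three dates.
filtered = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
P = [[0.9, 0.1], [0.2, 0.8]]
smoothed = kim_smoother(filtered, P)
```

In this example, the distress probability of the second date is revised upward once the third observation, which points to distress, is taken into account.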
As noted, this QML method suggested by Hamilton [3] is a special case of the expectation–maximization (E-M) algorithm of Dempster et al. [59]. As we will mention next, we did not use the QML method for our estimations. We used a Markov chain Monte Carlo (MCMC) method known as the Metropolis–Hastings [60] sampler. This is a simulation-based estimation method that pursues the same goal as the one just described but offers a higher degree of feasibility than the QML one.
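To convey the idea of the sampler, the sketch below implements a generic random-walk Metropolis–Hastings chain for a single location parameter with a known scale and a flat prior. It is a toy illustration under these assumptions, not the sampler configuration used in our back tests; the data, scale, and step size are hypothetical.

```python
import math
import random

def log_likelihood(mu, data, sigma):
    """Gaussian log-likelihood in mu, up to an additive constant."""
    return sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)

def metropolis_hastings(data, sigma, n_draws, step, seed=0):
    """Random-walk Metropolis-Hastings draws for the location mu."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_likelihood(mu, data, sigma)
    draws = []
    for _ in range(n_draws):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_likelihood(proposal, data, sigma)
        # Accept with probability min(1, likelihood ratio); flat prior assumed.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        draws.append(mu)
    return draws

# Hypothetical weekly returns and an assumed known scale.
data = [0.01, -0.02, 0.015, -0.09, 0.11, -0.08, 0.005, 0.012]
draws = metropolis_hastings(data, sigma=0.06, n_draws=20000, step=0.02)
kept = draws[2000:]  # discard burn-in draws
posterior_mean = sum(kept) / len(kept)
```

Under a flat prior, the posterior mean of the location should lie close to the sample mean of the data.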
With the QML or Metropolis–Hastings estimation method described, the analyst could arrive at a result similar to the one in Figure 2. In this example, we present an estimation for the cocoa future price. In the upper left chart, we present the historical values of the continuously compounded percentage return. The red lines correspond to the returns generated during the distress (second) regime. The blue ones are the returns generated in the normal (first) one. These were determined with the next indicator function, which is the criterion suggested in most of the related literature on MS models [1,2,3,17,18,19,61]:
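A classification rule of this type can be sketched as follows. The 0.5 cutoff is an assumption (a common convention in the MS literature), and the probabilities are hypothetical values.

```python
def classify_regimes(distress_probs, cutoff=0.5):
    """Label a date 2 (distress) when the smoothed distress probability
    exceeds the cutoff, and 1 (normal) otherwise."""
    return [2 if p > cutoff else 1 for p in distress_probs]

# Hypothetical smoothed distress probabilities for five weeks.
labels = classify_regimes([0.05, 0.10, 0.75, 0.90, 0.30])
```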
In the upper right chart of Figure 2, we show the Gaussian kernels of the returns generated in each regime. As noted, the blue area is the narrowest one and corresponds to the Gaussian kernel of the returns generated in the normal regime (the blue lines in the upper left chart). The red area is the kernel of the returns of the distress time periods. Given this, the parameters of the two regimes differ notably, especially the scale or standard deviation.
In the lower left chart, we present the second regime’s smoothed probability, which was estimated with the MS model. As noted, the dates on which this probability signals the distress regime correspond to the dates on which the returns are marked in red in the upper chart. Finally, we also show a table with the estimated transition probability matrix. As noted in this case, the normal regime is highly persistent, close to absorbing. That is, the probability of staying in this regime is notably higher and the time spent in it is also long.
Given the estimation method described previously, the parameter set of a Gaussian or t-student MS model is useful for trading decisions. More specifically, the useful quantity is the smoothed probability of being in the distress regime (red line in the lower left chart), as we will discuss next.
Two important assumptions in the original MS models are that the transition probabilities and the scale parameters are time-fixed. A natural extension of the MS models explained in Equations (2) to (11) is to estimate the variance through a generalized autoregressive conditional heteroskedasticity (GARCH) standard deviation [62,63]. This is a scale parameter estimation model with the next functional form:
This stochastic process means that the actual standard deviation is dynamic or time-varying in each regime. This leads to the time-varying standard deviation model known as the MS-GARCH [29,30,31]. Given this, and the dynamic standard deviation of the GARCH model inside the Markov-switching structure, the parameter set includes a time-varying scale parameter for each regime:
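A one-lag GARCH variance recursion, run separately for each regime, can be sketched as below. The standard GARCH(1,1) form is assumed, and the coefficient names (`omega`, `alpha`, `beta`), their values, and the residuals are hypothetical; setting `beta` to zero gives the ARCH case.

```python
def garch_variance(residuals, omega, alpha, beta, initial_variance):
    """Conditional variances from the recursion
    variance[t] = omega + alpha * residuals[t-1]**2 + beta * variance[t-1]."""
    variances = [initial_variance]
    for eps in residuals[:-1]:
        variances.append(omega + alpha * eps ** 2 + beta * variances[-1])
    return variances

# Hypothetical residuals and regime-specific coefficients (illustrative only).
residuals = [0.01, -0.03, 0.02, -0.06]
regime_params = {
    1: dict(omega=0.00002, alpha=0.05, beta=0.90, initial_variance=0.0009),
    2: dict(omega=0.00020, alpha=0.15, beta=0.80, initial_variance=0.0064),
}
regime_scales = {
    s: [v ** 0.5 for v in garch_variance(residuals, **p)]
    for s, p in regime_params.items()
}
```

Each regime thus carries its own path of time-varying standard deviations computed from the same residuals.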
Given the specific features of the MS-GARCH model in Equations (13) and (14), the analyst could estimate the variance by using only the lags of the residuals. This means that she is using only the autoregressive conditional heteroskedasticity (ARCH) term. If she estimates the entire model in Equation (13), she adds the GARCH term, that is, the lagged conditional variance, which summarizes the effect of further lagged residual terms. Given the parameter set of an MS or MS-GARCH model, a futures’ position trader or a portfolio manager could be interested in the smoothed probabilities of being in each regime. These probabilities could be used to forecast the probability of being in each regime in the next period as follows:
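A common way to obtain such a one-step-ahead forecast is to multiply the latest regime probabilities by the transition matrix. The sketch below assumes this form; the probabilities and the matrix are hypothetical values.

```python
def forecast_regime_probs(current_probs, P):
    """One-step-ahead regime probabilities: current probabilities times P."""
    k = len(P)
    return [sum(current_probs[i] * P[i][j] for i in range(k)) for j in range(k)]

# Hypothetical latest smoothed probabilities (normal, distress) and matrix.
latest = [0.3, 0.7]
P = [[0.95, 0.05],
     [0.20, 0.80]]
next_week = forecast_regime_probs(latest, P)
```

With these assumed inputs, the forecasted distress probability for the next week is 0.575.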
With these forecasted probabilities, the trader or portfolio manager could follow a trading algorithm similar to the one of Brooks and Persand [17] or the one tested by De la Torre-Torres, Galeana-Figueroa, and Álvarez-García [20,22] or De la Torre-Torres et al. [23]:
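- (1) To estimate the parameter set of an MS or MS-GARCH model (with Gaussian or t-student pdf).
- (2) To forecast the smoothed probability of being in the distress regime in the next period with Equation (15).
- (3) To follow the next trading algorithm: if the forecasted probability points to the distress regime, to invest in the risk-free asset; otherwise, to invest in the risky asset (the commodity future).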
As expected, the estimated smoothed probability values are different if the trader or portfolio manager uses either a Gaussian or t-student pdf. The values also differ if she prefers to estimate time-fixed (MS), ARCH (MS-ARCH), or GARCH (MS-GARCH) standard deviations.
Given this feature, we were interested in testing the performance that a futures’ position trader or a portfolio manager would have obtained had she run the previous trading algorithm each week in the three soft commodities of interest.
In the next section, we describe how we processed the input data and how we ran the back tests of the previous trading algorithm.
3.2. Data Processing
The idea behind our review is that the use of a Gaussian or t-student pdf and a time-fixed, ARCH, or GARCH variance could lead to different smoothed probability estimations. Given this difference, the previous trading algorithm is expected to lead to different timing in trading decisions.
Our review assumed that the trader or portfolio manager traded actively in the weekly periods from 7 January 2000 to 3 April 2020 (the last available date at the time of writing the present paper). In order to test the benefits of this trading rule in soft agricultural commodities, we performed our simulations with the coffee, cocoa, and sugar future prices. These three futures are the one-month to expiration ones and were used as the risky assets in the portfolio.
Also, as the risk-free asset, we used a theoretical zero tracking-error fund that invests in the three-month U.S. Treasury bill. The Refinitiv Identifier Codes (RIC), security name, and the futures’ exchange in which they are traded are summarized in Table 1.
The historical weekly data of the commodity futures along with the historical three-month U.S. Treasury bills’ rate were retrieved from Refinitiv Eikon-Xenith [64].
With this historical data, we simulated, week by week, the previous trading algorithm in six scenarios. Three of these had a Gaussian pdf with time-fixed (MS), ARCH (MS-ARCH), or GARCH (MS-GARCH) standard deviations. The other three used a t-student pdf.
In order to estimate the MS, MS-ARCH, or MS-GARCH models, we did not use a QML estimation method. Instead, we preferred to use a Markov chain Monte Carlo (MCMC) estimation process. This was through the Metropolis–Hastings [60] sampling method, a method that is also available in the MSGARCH package [65] for the R programming language. R and this package were the core software in our back tests. We used the MCMC method in order to reduce the risk of non-convergence in the inference of the parameters, a risk that was expected with the QML method previously described.
As an extra methodological note, we estimated the MS-ARCH or MS-GARCH with one lag in the ARCH and GARCH terms of the time-varying variance. Also, for the estimation of our models, we used the entire time series from 20 September 1991 to each simulated date (from 7 January 2000 to 3 April 2020). This led to a total of 1057 back test dates. The returns were calculated with the continuously compounded method, given the future price at each date:
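Continuously compounded returns are the logarithm of the ratio of consecutive prices. A minimal sketch with hypothetical prices follows; whether the series is additionally scaled to percentages is left open here.

```python
import math

def log_returns(prices):
    """Continuously compounded returns: ln(P_t / P_{t-1}) for each date."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Hypothetical weekly closing prices of a future contract.
prices = [2400.0, 2460.0, 2390.0, 2415.0]
returns = log_returns(prices)
```

The returns sum to the logarithm of the last price over the first one, and the calculation requires strictly positive prices.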
Because trading commissions could be small for future position traders or institutional portfolio managers, we set aside the impact of trading costs. We did this in light of the similar performance results observed in active trading with MS models in some stock markets [20] with trading fees of 0.35% and 0%.
In order to perform our back tests, we compared the simulated performance of a USD 100,000 starting balance account against the performance of a buy-and-hold (BH) strategy. We did this by executing the next pseudocode of our trading Algorithm 1 previously described:
Algorithm 1. The MS-GARCH based trading algorithm’s pseudocode.
For date 1 to 1057:
- 1. To determine the current balance in the portfolio (cash balance + market value of holdings).
- 2. To execute the Markov-switching model analysis in Equation (6) with either GARCH, ARCH, or constant standard deviation (either with Gaussian or t-student probability density function). This was by using the future’s return historical data from 20 September 1991 to the simulated date.
- 3. To calculate, by using Equation (15), the forecasted smoothed probability of being in a distress or bad-performing regime at the next date.
- 4. If the forecasted probability signals the distress regime, then:
  - a. To invest in the risk-free asset.
  Else:
  - b. To invest in the risky asset (the commodity future contract simulated).
- 5. To price the value of the portfolio with a mark-to-market procedure (with closing market prices at the simulated date).
End
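The loop of Algorithm 1 can be sketched as below, taking the forecasted distress probabilities as given inputs instead of re-estimating the model at each date. The function name, the 0.5 cutoff, the omission of trading costs, and all input values are assumptions made for illustration.

```python
import math

def backtest(risky_returns, riskfree_returns, distress_probs,
             start_balance=100_000.0, cutoff=0.5):
    """Weekly switching between the risky and the risk-free asset.
    Returns the mark-to-market balance at the end of each week."""
    balance, path = start_balance, []
    for r_risky, r_free, p in zip(risky_returns, riskfree_returns, distress_probs):
        # p is the distress probability forecasted before the week starts.
        weekly = r_free if p > cutoff else r_risky
        balance *= math.exp(weekly)  # returns are continuously compounded
        path.append(balance)
    return path

# Hypothetical weekly log returns and forecasted distress probabilities.
risky = [0.02, -0.05, 0.01, 0.03]
riskfree = [0.0003, 0.0003, 0.0003, 0.0003]
probs = [0.10, 0.80, 0.60, 0.20]

strategy = backtest(risky, riskfree, probs)
buy_and_hold = backtest(risky, riskfree, [0.0] * len(risky))
```

Passing distress probabilities of zero keeps the account fully invested in the risky asset, which reproduces the buy-and-hold benchmark.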
Before reviewing our simulation results, we tested the appropriateness of MS and MS-GARCH models in all the simulated dates. For this purpose, we estimated the deviance information criterion, or DIC, for each model. We did this in a recursive manner with data from 20 September 1991 to each date in the back test range (7 January 2000 to 3 April 2020). The mean DIC values of these estimations are summarized in Table 2.
As noted, we compared the six MS or MS-GARCH models of interest with the single-regime DIC estimation, that is, against the case in which no MS model was used. Given these results, the MS-GARCH model with t-student pdf was the most suitable for the cocoa and the sugar future returns, and the time-fixed variance, t-student MS model was the most suitable for the coffee ones. Also, and in general terms, the use of two-regime models is preferable to characterize the time series of interest during the back test periods.
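For reference, the DIC can be computed from MCMC output as the deviance at the posterior mean plus twice the effective number of parameters. The sketch below assumes this standard definition; the log-likelihood values are hypothetical.

```python
def dic(loglik_draws, loglik_at_posterior_mean):
    """Deviance information criterion from MCMC log-likelihood values."""
    mean_deviance = -2.0 * sum(loglik_draws) / len(loglik_draws)
    deviance_at_mean = -2.0 * loglik_at_posterior_mean
    effective_params = mean_deviance - deviance_at_mean
    return deviance_at_mean + 2.0 * effective_params

# Hypothetical log-likelihoods: one value per posterior draw, plus the value
# evaluated at the posterior mean of the parameters.
dic_value = dic([-10.0, -12.0], -10.5)
```

Among competing models fitted to the same data, the one with the lowest DIC is preferred.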
Having described our input data processing method and our back test’s pseudocode, we continue with the discussion of our main results.