4.1. In-Sample Estimation of Forecasting Models
The in-sample estimation of the forecasting models covers the sample period from 1964 to 2020. Using data for this period, each of the four empirical models specified in Section 3.2 is estimated with forecasting horizons h of 3, 6, and 12 months.
Table 3 reports the probit regression results for forecasting recessions over the next three months, using the four empirical models specified in Section 3.2. Comparing the results for Models 1 and 2, we see that all of the additional financial variables,
,
, and
, have significant forecasting power for recessions at the 1% significance level. Additionally, the Akaike information criterion (AIC) of Akaike (1974), which estimates prediction error while penalizing the number of parameters, indicates that Model 2 has a lower in-sample prediction error than Model 1. The comparison of the AICs from Models 1 and 3 also reveals that the inclusion of the temporal cubic terms improves the in-sample fit for recession forecasting. Among the four models considered, Model 4 has the best in-sample fit in terms of AIC, which shows that both the additional financial variables and the temporal cubic terms help to improve recession predictability. Moreover, the likelihood ratio test statistic for comparing Models 1 and 4 is 269.70 with a p-value of less than 0.01%; thus, the null hypothesis of no difference between the two models is strongly rejected in favor of Model 4, which implies that adding the extra financial variables together with the temporal cubic terms leads to a statistically significant improvement in model fit.
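As a rough check on the likelihood ratio result, the following minimal sketch computes an AIC and the upper-tail chi-square probability of the reported statistic. The degrees of freedom (6, i.e., three additional financial variables plus three temporal cubic terms) are an assumption made here for illustration, and the function names are hypothetical; the closed-form tail probability used below holds only for an even number of degrees of freedom.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion, 2k - 2 ln L (lower means better fit)."""
    return 2 * n_params - 2 * log_likelihood


def lr_statistic(loglik_restricted, loglik_full):
    """Likelihood ratio statistic for nested models."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{i<df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Reported statistic from the text; df = 6 is an assumption, not a reported value.
p_value = chi2_sf_even_df(269.70, 6)
print(p_value < 1e-4)
```

Under this assumed degrees-of-freedom count, the tail probability is far below 0.01%, in line with the rejection reported above.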
In addition, Table 4 reports the Wald test results on the joint significance of the coefficients estimated from Model 4. The test statistics and their p-values indicate that each set of coefficients is jointly significant even after controlling for the other explanatory variables. This implies that the additional financial variables and the temporal cubic terms each help to improve in-sample recession predictability.
Figure 2 shows the recession probabilities predicted from each model with a forecasting horizon h of 3 months. This plot illustrates that Model 4 makes better predictions of recessions than the other three models. Model 1 performs well at predicting the recessions in the early 1980s, but not at forecasting the other recessions. Model 2 enhances the predictability of recessions, especially the recessions in the mid 1970s, late 2000s, and early 2020s, while it still does not predict the recessions in the early 1990s and early 2000s well. Moreover, Model 2 generates high probabilities of recessions for the late 1980s, which are not actual recession periods. This false prediction of a recession is due to the high value of
in the late 1980s as noted in
Figure 1. Model 3 shows decent forecasting performance by capturing most of the recession periods, although it underestimates the duration of the recession in the 2008–2009 period. Model 4, which augments Model 3 with the additional financial variables, predicts the 2008–2009 recession better than Model 3, which shows that those financial variables are especially useful in improving the predictability of the recession associated with the financial crisis. Additionally, the recession probabilities predicted from Model 4 for the late 1980s are substantially lower than those from Model 2. Accounting for the temporal dependence in the recession indicator by adding temporal cubic terms to Model 2 thus decreases the predicted recession probabilities for the late 1980s, which in turn improves the predictive performance.
The effects of the additional financial variables and temporal cubic terms on recession predictability, which are illustrated in Figure 2, can be summarized as follows. (i) The additional financial variables (
,
, and
) help to better predict the recessions in the mid 1970s, late 2000s, and early 2020s, although they falsely predict a recession in the late 1980s. They are especially useful in predicting the duration of the 2008–2009 recession. (ii) The temporal cubic terms improve the predictability of most recessions, even though they understate the duration of the recession associated with the 2007 global financial crisis. In particular, they predict the recessions in the early 1990s and early 2000s better than the financial variables. (iii) The additional financial variables and temporal cubic terms complement each other so that they can jointly forecast most of the recessions fairly precisely. The temporal cubic terms help to correct the false detection of a recession from the additional financial variables, while the additional financial variables help to improve the predictability of the recession in the late 2000s, which is not properly captured by the temporal cubic terms.
To further investigate recession prediction at longer forecasting horizons, Table 5 presents the estimation results of the four empirical models specified in Section 3.2 with forecasting horizons h of 6 and 12 months. The results show the same patterns as the estimation results with a forecasting horizon h of 3 months. The in-sample fit of Model 4 is better than that of the other three models, confirming the usefulness of the additional financial variables and temporal cubic terms for recession prediction. One thing to note regarding the estimated coefficients for Model 4 is the significance of the coefficient on the default yield spread, which is statistically significant with h = 12, but not with h = 3 or 6. This suggests that the default yield spread helps to predict recessions in the longer term rather than imminent recessions.
Given that Model 4 has the best in-sample fit among the four empirical models considered, it is interesting to see how each explanatory variable in Model 4 affects the predicted probability of recessions. To this end, Figure 3 plots the predicted probabilities of recessions over a range of values for each explanatory variable, holding all the other explanatory variables at their mean values. All these predicted probabilities are based on Model 4 and horizon h = 3. The figure shows the following. First, the marginal effects of
and
show similar patterns: once the value of each variable goes below a certain threshold, the probability of a recession starts to increase as the value of the variable decreases. In particular, a more negative term spread leads to a higher probability of recessions. Second,
has its own threshold above which the recession probability increases with the value of the variable. The higher the stock market volatility becomes, the higher the recession probability gets. Third, the probability of a recession gradually increases with the value of
, while it decreases with the value of
. When the default yield spread is larger, the probability of a recession is also higher. Finally, the plot for the marginal effect of
shows that the recession probability steeply decreases with the value of
as long as
is less than about 2 years. However, the probability of a recession starts to slowly increase once
exceeds around 6 years, which implies that an expansion lasting more than about 6 years begins to raise the probability of a recession.
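The profile plots described above can be sketched as follows: one regressor is varied over a grid while the others are fixed at their means, and the probit link maps the linear index to a probability. All coefficient values, variable names, and function names below are hypothetical placeholders rather than the estimates in Table 3; for the duration variable, the level, square, and cube are varied together, since fixing the higher powers at their means would not trace out the cubic.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, the probit link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_profile(intercept, coefs, means, name, grid):
    """Predicted probability as `name` varies, other regressors at their means."""
    base = intercept + sum(coefs[k] * means[k] for k in coefs if k != name)
    return [norm_cdf(base + coefs[name] * v) for v in grid]


def duration_profile(base_index, cubic_coefs, grid):
    """Predicted probability as the duration d varies through d, d^2, d^3 jointly."""
    b1, b2, b3 = cubic_coefs
    return [norm_cdf(base_index + b1 * d + b2 * d ** 2 + b3 * d ** 3) for d in grid]


# Hypothetical coefficients and means, for illustration only.
coefs = {"x1": -0.8, "x2": 0.5}
means = {"x1": 1.0, "x2": 0.2}
profile = probability_profile(-1.0, coefs, means, "x1", [-2, -1, 0, 1, 2])
print([round(p, 3) for p in profile])
```

With a negative coefficient, the profile rises as the variable falls, and the sigmoid shape of the probit link produces the threshold-like pattern noted in the text.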
4.2. Out-of-Sample Forecasting Performance
In this section, I examine the usefulness of the empirical models specified in Section 3.2 for the out-of-sample forecasting of recessions. The in-sample estimation in Section 4.1 assumes that all information for the whole sample period is available at the time when the forecast is made. However, for those who want to use estimated models for forecasting future recessions, this assumption is not realistic. Out-of-sample forecasts are therefore computed using only the information available at the time of forecasting. Starting with the recession prediction made in December 1979, the predicted recession probability in each month is calculated using only the data available up to the time at which the prediction is made. The root mean squared error (RMSE) for each forecasting horizon h is then computed as follows:
where
is the total number of out-of-sample predictions made, and
is the predicted probability of a recession over the period between t+1 and t+h, which is estimated using only the data available up to time t. In addition to the RMSE, the average prediction error (APE) for each forecasting horizon h is also calculated with a probability threshold of 0.5 as follows:
where
denotes an indicator function that equals one when the condition in parentheses is true. The APE defined above corresponds to the average rate of false predictions of future recessions when a recession probability greater than the threshold of 0.5 is regarded as an indication of a recession.
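For reference, the two measures described above can be written in the following form. The notation is assumed here for illustration and may differ from the paper's own symbols: $N$ is the number of out-of-sample predictions, $\hat{p}_{t,h}$ is the probability predicted at time $t$ of a recession between $t+1$ and $t+h$, $R_{t,h}$ is the corresponding realized 0/1 recession indicator, and $\mathbb{1}(\cdot)$ is the indicator function.

```latex
\mathrm{RMSE}_h = \sqrt{\frac{1}{N}\sum_{t}\left(R_{t,h} - \hat{p}_{t,h}\right)^{2}},
\qquad
\mathrm{APE}_h = \frac{1}{N}\sum_{t}\left|\,\mathbb{1}\!\left(\hat{p}_{t,h} > 0.5\right) - R_{t,h}\right|
```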
Table 6 reports the out-of-sample RMSE and APE of each model for forecasting horizons h of 3, 6, and 12 months. For all three horizons considered, Model 4 shows the best out-of-sample forecasting performance in terms of both RMSE and APE. With a forecasting horizon h of 3 months, Model 1 has an average prediction error of 12.20%, while Model 4 decreases the prediction error by 4.43 percentage points to 7.77%. With a forecasting horizon h of 12 months, Model 4 reduces the average prediction error by 7.86 percentage points compared to that of Model 1. In addition, the RMSE and APE from Model 4 are smaller than those from Models 2 and 3, which indicates that both the additional financial variables and the temporal cubic terms help to improve the out-of-sample recession prediction.
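The expanding-window procedure and the two error measures can be sketched as follows. This is an illustration under stated assumptions rather than the exact implementation behind Table 6: `fit` and `predict` are hypothetical callables standing in for the probit estimation, `y[t]` is taken to be the 0/1 indicator of a recession between t+1 and t+h, and the training sample at origin t is restricted to rows up to t-h, on the assumption that later outcomes are not yet observed at time t.

```python
import math


def rmse(probs, outcomes):
    """Root mean squared error between predicted probabilities and 0/1 outcomes."""
    n = len(probs)
    return math.sqrt(sum((r - p) ** 2 for p, r in zip(probs, outcomes)) / n)


def ape(probs, outcomes, threshold=0.5):
    """Share of forecasts whose thresholded call differs from the 0/1 outcome."""
    calls = [1 if p > threshold else 0 for p in probs]
    return sum(abs(c - r) for c, r in zip(calls, outcomes)) / len(probs)


def expanding_window_forecasts(X, y, h, first_origin, fit, predict):
    """Re-estimate at each origin t on rows 0..t-h, then forecast row t."""
    if first_origin - h < 0:
        raise ValueError("no training data available at the first origin")
    probs, outcomes = [], []
    for t in range(first_origin, len(y)):
        train = range(0, t - h + 1)  # outcomes for rows after t-h are unknown at t
        model = fit([X[s] for s in train], [y[s] for s in train])
        probs.append(predict(model, X[t]))
        outcomes.append(y[t])
    return probs, outcomes


# Toy stand-in for the probit model: forecast the historical recession frequency.
def fit_base_rate(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_base_rate(model, x):
    return model


y = [0, 0, 1, 1, 0, 0, 0, 1]
X = [[0.0] for _ in y]
probs, outcomes = expanding_window_forecasts(X, y, 1, 3, fit_base_rate, predict_base_rate)
print(rmse(probs, outcomes), ape(probs, outcomes))
```

Replacing the two toy callables with a probit fit and its predicted probability would give forecasts of the kind evaluated in Table 6.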
Another way of evaluating the out-of-sample forecasting performance is to use a receiver operating characteristic (ROC) curve, which shows the relation between the false positive rate (FPR) and the true positive rate (TPR). The FPR is the ratio of the number of false positive predictions to the number of actual negative cases, while the TPR is the ratio of the number of true positive predictions to the number of actual positive cases. Unlike the APE in Equation (9), which assumes a fixed recession probability threshold of 0.5, the ROC curve is generated by assuming many different values of the recession probability threshold.
Figure 4 shows the ROC curves of the out-of-sample forecasting from the four empirical models specified in Section 3.2 with a forecasting horizon h of 3 months. Each dot in the plot corresponds to a pair of FPR and TPR for a certain value of the recession probability threshold. The out-of-sample forecasting performance of each model can be measured by the area under the curve (AUC) of its ROC curve. A higher AUC implies better forecasting performance. The AUCs for Models 1–4 are 0.6247, 0.8624, 0.9179, and 0.9248, respectively, which also confirms that Model 4 has the best out-of-sample forecasting performance among the four models considered.
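The construction of the ROC curve and its AUC can be sketched as follows. The threshold grid (0.00 to 1.00 in steps of 0.01), the trapezoidal rule for the area, and the function names are assumptions made for illustration; the grid and integration method behind the reported AUCs are not stated in the text.

```python
def roc_points(probs, outcomes, thresholds=None):
    """Return (FPR, TPR) pairs, one per recession probability threshold."""
    if thresholds is None:
        thresholds = [i / 100 for i in range(101)]
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both recession and non-recession cases are required")
    points = []
    for c in thresholds:
        tp = sum(1 for p, r in zip(probs, outcomes) if p > c and r == 1)
        fp = sum(1 for p, r in zip(probs, outcomes) if p > c and r == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example: forecasts that separate the two classes perfectly.
perfect = auc(roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
print(perfect)
```

An uninformative forecast traces the diagonal and yields an AUC of 0.5, which gives a baseline for reading the values reported above.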