Next Article in Journal
Mathematical Modeling and Optimal Control of the Hand Foot Mouth Disease Affected by Regional Residency in Thailand
Next Article in Special Issue
On Reliability Estimation of Lomax Distribution under Adaptive Type-I Progressive Hybrid Censoring Scheme
Previous Article in Journal
Set Stability and Set Stabilization of Boolean Control Networks Avoiding Undesirable Set
Previous Article in Special Issue
A New Quantile Regression Model and Its Diagnostic Analytics for a Weibull Distributed Response with Applications
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting

Department of Statistics, Iowa State University, Ames, IA 50011, USA
*
Author to whom correspondence should be addressed.
Mathematics 2021, 9(22), 2865; https://doi.org/10.3390/math9222865
Submission received: 14 October 2021 / Revised: 5 November 2021 / Accepted: 6 November 2021 / Published: 11 November 2021
(This article belongs to the Special Issue Statistical Simulation and Computation II)

Abstract

:
Real-time nowcasting is a process to assess current-quarter GDP from timely released economic and financial series before the figure is disseminated in order to catch the overall macroeconomic conditions in real time. In economic data nowcasting, dynamic factor models (DFMs) are widely used due to their abilities to bridge information with different frequencies and to achieve dimension reduction. However, most of the research using DFMs assumes a fixed known number of factors contributing to GDP nowcasting. In this paper, we propose a Bayesian approach with the horseshoe shrinkage prior to determine the number of factors that have nowcasting power in GDP and to accurately estimate model parameters and latent factors simultaneously. The horseshoe prior is a powerful shrinkage prior in that it can shrink unimportant signals to 0 while keeping important ones remaining large and practically unshrunk. The validity of the method is demonstrated through simulation studies and an empirical study of nowcasting U.S. quarterly GDP growth rates using monthly data series in the U.S. market.

1. Introduction

Real-time nowcasting is a process to assess current-quarter GDP from timely released economic and financial series before the figure is disseminated in order to catch the overall macroeconomic conditions in real time. This is of interest because most data are released with a lag and are released subsequently. In theory, any release, no matter at what frequency, may affect current-quarter estimates and their precision potentially. Both forecasting and nowcasting are important tasks for central banks for policy decision-making; for example, monetary policies need to be made in real time and are based on assessments of current and future economic conditions. Additionally, estimated current-quarter GDP figures are often used as relevant inputs for model-based longer-term forecasting exercises in banks.
Real-time nowcasting faces some difficulties. The first one is how to bridge monthly data series with the quarterly GDP. Bafigi et al. [1], Rünstler and Sédillot [2], and Kitchen and Monaco [3] studied the idea of bridge equations which use small models to “bridge” the information contained in one or a few key monthly data with the quarterly growth rate of GDP. However, they involve judgmental nowcasts and only deal with a few monthly data series. The second difficulty is how to deal with a large number of monthly data series. For macroeconomic forecasting, factor models (FMs) are widely used at central banks and other institutions to achieve dimension reduction. Many authors, such as Boivin and Ng [4], Forni et al. [5], and D’Agostino and Giannone [6], have shown that these models are successful in this regard. However, they did not use FMs specifically for the problem of real-time nowcasting. There are other existing approaches to tackle the high dimensional issue. One example is Eraslan and Schröder [7] who dealt with this over-parameterized issue in GDP nowcasting by implementing the dynamic model averaging method (Raftery et al. [8]). However, they did not talk about how to handle the unbalanced structure of the data caused by different release dates with different lags in each month. Moreover, they assumed a fixed number of factors, while in our paper the focus is to determine the number of factors contributing to GDP nowcasting. The third challenge is that a large number of monthly data series are released with different lags, causing unbalanced data at the end of the sample. Some authors, including Croushore and Stark [9], Koenig et al. [10] and Orphanides [11], discussed about this issue, but they are not focusing on the statistical estimation.
Giannone et al. [12] provided a frequentist inference framework for the parametric dynamic factor models (DFMs). In their framework, they took advantage of different data releases throughout the month and updated the nowcast based on each new data release. The authors combined the idea of connecting monthly series with the nowcast of quarterly GDP and the idea of using data with different releases within a single statistical framework. Their model combines principal component analysis (PCA) with modified Kalman Filter (KF) to deal with the unbalanced feature of the data. Hereafter, we call the method proposed in Giannone et al. the GRS approach.
In this paper, we borrow the idea of DFMs from Giannone et al. [12] and propose a Bayesian Monte Carlo Markov Chain (MCMC) approach to deal with the real-time nowcasting problem. For DFMs, one important aspect is to determine the number of factors. In the GRS approach, the number of factors is assumed to be fixed, which is determined by looking at the cumulative proportions of variances explained by the first few principle components from PCA, and the same set of factors is assumed to have prediction power on GDP. Bai and Ng [13] showed that the number of factors could be estimated consistently in a large panel of data setting. In this paper, we impose a cap on the number of factors in the DFM structure but allow an unknown number of factors to contribute to GDP prediction. We propose to apply the horseshoe shrinkage (Carvalho et al. [14,15]) on the coefficients of factors in the prediction equation. One big advantage of the horseshoe shrinkage over other traditional shrinkages is that it can shrink unimportant signals to 0 while keeping important ones large and practically unshrunk (more details will be discussed in Section 2). After estimation, any coefficient that is shrunk to 0 indicates its corresponding factor has no prediction power. As a result, the number of coefficients that remain large after strong shrinkage is a good estimate of an unknown number of factors with prediction power on GDP. Our Bayesian MCMC approach can also provide a more natural way to deal with the unbalanced data structure due to real-time data releasing and estimate all parameters, including the number of contributing factors and latent dynamic factors in a single framework. We refer to this Bayesian approach as the BAY approach. Through simulation studies, we evaluate the abilities of our BAY approach in estimating an unknown number of contributing factors and in producing reliable nowcasts in real time. The validity of the BAY approach is also examined by applying it to nowcast U.S. quarterly GDP growth rates.
The rest of this paper is organized as follows. Section 2 sets up the model structure, introduces the horseshoe shrinkage into our model, and stylizes the data structure. In Section 3, we introduce the Bayesian MCMC estimation method with nowcasting equations. In Section 4, we conduct simulation studies. In Section 5, an empirical study of nowcasting U.S. GDP growth rates is presented. Section 6 concludes the paper. A list of abbreviations used in this paper is provided in Abbreviations.

2. Model Set-Ups, Horseshoe Shrinkage, and Data Structure

In Section 2.1, the model set-ups used in the BAY approach are illustrated. In Section 2.2, we introduce the horseshoe shrinkage idea and how to implement it in our BAY approach to estimate number of contributing factors. In Section 2.3, we formalize the unbalanced data structure.

2.1. Dynamic Factor Models

In this section, we introduce the Dynamic Factor Model structure and how it can reduce dimension and bridge our monthly released series with quarterly released GDP.
Since the number of monthly series is vast, modeling GDP on all available series can involve too many parameters; hence, the model would perform poorly in forecasting because of large uncertainty in the parameters’ estimation. The fundamental idea of Giannone et al. [12] is to use DFMs to exploit the collinearity of the series by summarizing all the available information into a few common latent factors. Due to collinearity, a linear combination of the common factors is able to capture the dynamic interaction among the series and to provide a model that only requires a limited number of parameters and thus works well in forecasting. In this paper, our DFM is specified in the following ways.
First, assume that the monthly series are linear functions of a few unobserved common factors F t ,
x t = μ + Θ F t + ε t ,
where x t = ( x 1 , t , , x n , t ) is the n × 1 monthly series vector at month t, for t = 1 , 2 , , T , F t = ( f 1 t , , f r t ) is the r × 1 common factor vector at month t, r is the number of latent factors ( r < < n ) which is usually assumed to be known and fixed, Θ is the n × r factor loading, μ is the n × 1 mean vector, and the n × 1 error vector ε t N ( 0 , Ω n × n ) . Note that the difference between Equation (1) and regular multiple regression is that the factors F t are unobserved latent variables, while the predictors in multiple regression are observed. The latent factors F t serve an important role of bridging information from monthly series to quarterly GDP.
Then, we further specify the dynamic of the common factors as a vector auto-regression:
F t = A F t 1 + u t ,
where A = d i a g ( a 1 , a 2 , , a r ) and u t N ( 0 , Σ r × r ) with Σ = d i a g ( σ 1 2 , σ 2 2 , , σ r 2 ) . It is known that factor dynamic models can suffer from non-identifiable issues. Following Stock and Watson [16], we construct two sets of restrictions: | a j | < 1 and σ i 2 ( 1 a i 2 ) < σ j 2 ( 1 a j 2 ) for 1 i < j r . These restrictions together with the prior distribution of Θ (specified later) satisfy the identification assumptions in Stock and Watson [16], and can identify factors up to a change of sign.
Finally, we assume that the nowcast of the GDP at quarter k is a linear function of the common factors at each month in the current quarter and the GDP from the previous quarter:
y k = β 0 + β 1 F 3 k + β 2 F 3 k 1 + β 3 F 3 k 2 + β 4 y k 1 + ν k ,
where β 0 and β 4 are scalars, β i = ( β i 1 , , β i r ) , i = 1 , 2 , 3 , are r × 1 vectors and ν k N ( 0 , η 2 ) for k = 1 , , K . Here, the number T (the end of monthly series) and the number K (the end of quarterly GDP series) satisfy 3 K + 1 T 3 K + 3 . The DFMs specified this way can successfully bridge quarterly released GDP with monthly financial or economic series and achieve dimension reduction.

2.2. Horseshoe Shrinkage

In this section, the horseshoe shrinkage idea is introduced, and we discuss how it can be implemented in our model framework to estimate the number of contributing factors.
In Carvalho et al. [14], the horseshoe prior was first introduced as a shrinkage prior. Follett and Yu [17] shown that the horseshoe prior competes favorably with shrinkage schemes commonly used in Bayesian multivariate regression models. To illustrate the idea, let us first consider a simple mean model z j i n d N ( μ j , σ 2 ) for j = 1 , , R . We assume μ = ( μ 1 , , μ R ) is sparse and some μ j might be equal to 0. We can assign the horseshoe prior to μ j for j = 1 , , R by letting μ j | λ j , τ i n d N ( 0 , λ j 2 τ 2 ) with λ j i i d H a l f C a u c h y ( 0 , 1 ) . Here, τ is referred to as the global shrinkage prior and λ j is referred to as the local shrinkage prior. Figure 1 plots the densities for the horseshoe (setting τ = 1 for simplicity), Laplacian, and Student-t priors, respectively. As shown in Figure 1, compared to Laplacian or Student-t priors, the horseshoe has flat tails which allow strong signals that remain un-shrunk. It also has an “infinitely” tall spike at the center which can provide severe shrinkage for elements near zero. This feature makes the horseshoe prior a very useful shrinkage prior.
Furthermore, it was showed in Carvalho et al. [14] that, for τ = σ 2 = 1 ,
E ( μ j | z 1 , , z R , λ j ) = ( 1 κ j ) z j + κ j 0 = ( 1 κ j ) z j ,
where κ j = 1 1 + λ j 2 . When κ j = 0 , E ( μ j | z 1 , , z R , λ j ) = z j , it indicates that the signal from data dominates; when κ j = 1 , E ( μ j | z 1 , , z R , λ j ) = 0 , it means that μ j is shrunk to 0. Thus, κ j is referred to as the Shrinkage Profile which measures the shrinkage level:
κ j = 0 little shrinkage ( Important ) 1 extreme shrinkage ( Not important ) .
The Half Cauchy prior on λ j implies a B e t a ( 0.5 , 0.5 ) prior on the Shrinkage Profile κ j . Figure 2 shows the implied prior on κ j for the horseshoe prior, the student’s t prior, and the Laplacian prior. As shown in the figure, unlike the Laplacian prior and t prior, the density of κ j implied by the horseshoe prior is unbounded at both 0 and 1 with a small mass in between (a horseshoe shape). Being unbounded at 0 allows effects to grow large (little shrinkage) while being unbounded at 1 can shrink effects until they are fully removed from the equation.
We apply this horseshoe shrinkage idea to the prediction Equation (3) as follows:
y k = β 0 + β 1 S F 3 k + β 2 S F 3 k 1 + β 3 S F 3 k 2 + β 4 y k 1 + ν k ,
where S = d i a g ( τ λ 1 , , τ λ R ) , and R is the largest possible number of latent factors. The cap R is predetermined, satisfying r < R < < n . Now, β i = ( β i 1 , , β i R ) , i = 1 , 2 , 3 , are R × 1 vectors, F t = ( f 1 t , , f R t ) are R × 1 common factors at month t, and dimensions for A , Σ , and Θ are changed accordingly.
In this specification, τ is the global shrinkage prior and λ j ( j = 1 , , R ) are local shrinkage priors. We set λ j H a l f C a u c h y ( 0 , ν λ j ) , where 0 < ν λ < 1 . In this way, we assume that the importance of factors decreases when j goes from 1 to R. If we put priors β i j i i d N ( 0 , 1 ) , i = 1 , 2 , 3 , j = 1 , , R and define β ˜ i j : = τ λ j β i j , then β ˜ i j | τ , λ j i n d N ( 0 , τ 2 λ j 2 ) , and Equation (5) changes to be
y k = β 0 + j = 1 R β ˜ 1 j f 3 k , j + j = 1 R β ˜ 2 j f 3 k 1 , j + j = 1 R β ˜ 3 j f 3 k 2 , j + β 4 y k 1 + ν k .
It can be seen that β ˜ . j = ( β ˜ 1 j , β ˜ 2 j , β ˜ 3 j ) is the coefficient connecting factor j to GDP in the prediction equation. By such a specification, we successfully impose the horseshoe shrinkage prior on the coefficients β ˜ . j ’s. The magnitudes of the estimated profiles, κ j = 1 1 + λ j (for j = 1 , , R ), give us some information about which β ˜ . j should be shrunk to 0 and which should not. If κ j is close to 1, indicating extremely strong shrinkage on β ˜ . j , factor j then has no prediction power on GDP. As a consequence, the number of remaining un-shrunk β ˜ . j determines the number of contributing factors.

2.3. The Unbalanced Structure of the Data

In this section, we provide descriptions of the unbalanced data structure and notations used to represent the unbalanced structure.
In real time, macroeconomic series are released with diverse lags. At a particular release date, some series have observations up through the current month, whereas for others, the most recent observations maybe come from previous months. Dealing with these kinds of unbalanced data is vital for nowcasting.
Let x T = ( x 1 , T , , x n , T ) be the n × 1 vector denoting n monthly data series at month T (the end of the sample), and y K be quarterly GDP at quarter K. Assume there are Q different release dates at each month. Each release date is denoted as ( q , T ) , representing the qth release date in month T, where q = 1 , , Q . New series are released on each release date. Since some series for the current month may be released in the future, let T q denote the latest month in which the data are balanced. For t = T q + 1 , , T , the releasing set v q , t collects indexes of all x i , t s that have been released at or before the release date ( q , T ) , and this set of available series is denoted as x i v q , t . Without loss of generality, for each month, we assume the release dates for all series are fixed.
Table 1 gives a simple example of the data set available for nowcasting. In this example, there are n = 6 monthly series, x t = ( x 1 , t , , x 6 , t ) , released at three (i.e., Q = 3 ) releasing dates. For month T, cells with gray color represent series that are available before T. Suppose ( x 5 , T 2 , x 6 , T 2 ) are released at the first releasing date ( 1 , T ) , ( x 3 , T 1 , x 4 , T 1 ) are released at the second releasing date ( 2 , T ) , and ( x 1 , T , x 2 , T ) are released at the third releasing date ( 3 , T ) . Thus, for the third releasing date ( 3 , T ) , T q = T 2 , v 3 , T 1 = { 1 , 2 , 3 , 4 } , x i v 3 , T 1 = ( x 1 , T 1 , x 2 , T 1 , x 3 , T 1 , x 4 , T 1 ) , and v 3 , T = { 1 , 2 } , x i v 3 , T = ( x 1 , T , x 2 , T ) . If we want to nowcast GDP in the current quarter y K + 1 at the first release date in the second month of the quarter, so here T = 3 K + 2 . The series available to use at the release date ( 1 , T ) are highlighted in gray and orange color, i.e., { x 1 , , x T 2 , x i v 1 , T 1 } . The series available to use at the release date ( 2 , T ) are highlighted in gray, orange, and green, i.e., { x 1 , , x T 2 , x i v 2 , T 1 } .
The goal is to nowcast y K + 1 with all available information including x t series and y k series at each releasing date ( q , T ) in month T. Here, T = 3 K + 1 , 3 K + 2 , 3 K + 3 , indicating the first month, second month, and third month nowcast. At every new release date ( q , T ) , model parameters are updated with new information added from the new released series, and nowcast of y K + 1 is re-produced. How to deal with this unbalanced data in our BAY approach will be discussed in details in Section 3.

3. Estimation Method and Nowcasting

In Section 3.1, we introduce the Bayesian MCMC algorithm to estimate model parameters and latent factors and to determine the number of contributing factors. In Section 3.2, nowcasting formulas are provided.

3.1. Estimating Dynamic Factor Models Using Bayesian MCMC

In this section, we first introduce our method of implementing the unbalanced data into our model framework naturally. Then, we finish our model specification by assigning priors in Bayesian Framework. Finally, the MCMC procedure is discussed in detail.
As discussed in Section 2, macroeconomic series are released with diverse lags in real time. Thus, a difficulty in real-time nowcasting is to deal with unbalanced data. In this section, we develop a computational Bayesian MCMC approach that can tackle this issue naturally.
To deal with the missing data in x i v q , t at the end of the sample, we introduce the n q × n indicator matrix 1 v q , t by deleting the ith row from the identity matrix 1 n × n if i v q , t . For the example discussed in Section 2, at the third releasing date ( 3 , T ) in month T, v 3 , T 1 = { 1 , 2 , 3 , 4 } . Therefore, removing the fifth and sixth row of 1 6 × 6 gives us
1 v 3 , T 1 = 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 .
Similarly, for the index set v 3 , T = { 1 , 2 } , deleting the last four rows of 1 6 × 6 leads to
1 v 3 , T = 1 0 0 0 0 0 0 1 0 0 0 0 .
Then, we can simply rewrite x i v q , t as x i v q , t = 1 v q , t x t .
To better derive the posterior distributions, we express the dynamic of x t in Equation (1) as:
x t = μ + [ I n × n F t ] θ + ε t ,
where θ = v e c ( Θ ) = ( θ 1 , , θ n ) , θ i is a 1 × R vector representing the ith row of Θ , i = 1 , 2 , , n , and the symbol ⊗ denotes the Kronecker product. Thus, for the qth releasing date in month T, the conditional density for x t is
x t | F t , Θ , Ω N ( μ + [ I n × n F t ] θ , Ω ) for t = 1 , , T q 1 v q , t x t | F t , Θ , Ω N ( 1 v q , t ( μ + [ I n × n F t ] θ ) , 1 v q , t Ω 1 v q , t ) for t = T q + 1 , , T ,
the conditional density of F t is
F t | F t 1 , A , Σ N ( A F t 1 , Σ ) , f o r t = 2 , , T ,
and the conditional density of y k is
y k | F 3 k , F 3 k 1 , F 3 k 2 , y k 1 , S , η 2 N ( β 0 + β 1 S F 3 k + β 2 S F 3 k 1 + β 3 S F 3 k 2 + β 4 y k 1 , η 2 ) ,
for k = 2 , , K . In this way, the unbalanced structure of the data is built into our model framework through this indicator matrix 1 v q , t .
Let Φ = ( μ , Θ , Ω , A , Σ , β , τ , S , η 2 ) denote all parameters to be estimated. Suppose we are at releasing date q in month T of quarter K + 1 , our task is to use observations Y = { y 1 , , y K } and X q , T = { x 1 , , x T q , x i v q , T q + 1 , , x i v q , T } to estimate parameters Φ and latent factors { F 1 , , F T } , then conduct the nowcast for y K + 1 .
The joint posterior distribution p ( Φ , F | Y , X q , T ) can be written as a product of individual conditionals,
p ( Φ , F | Y , X q , T ) p ( Y , X q , T , Φ , F ) t = 1 T q p ( x t | F t , θ , Ω ) t = T q + 1 T p ( 1 v q , t x t | F t , θ , Ω ) × t = 2 T p ( F t | F t 1 , A , Σ ) × k = 2 K p ( y k | F 3 k , F 3 k 1 , F 3 k 2 , y k 1 , S , η 2 ) × π ( Φ ) ,
where p ( x t | F t , θ , Ω ) , p ( 1 v q , t x t | F t , θ , Ω ) , p ( F t | F t 1 , A , Σ ) , and p ( y k | F 3 k , F 3 k 1 , F 3 k 2 , y k 1 , η 2 ) can be derived according to Equations (7)–(9), respectively. π ( Φ ) is the prior distribution for the parameter set Φ .
We finish the model specification by assigning prior distributions in Bayesian framework. We set prior for μ as μ n × 1 N ( 0 , I ) . The prior for Θ is defined as Θ n × R M a t r i x N o r m a l ( 0 n × R , I n × n , I R × R ) . This prior on Θ , along with two restrictions we set in Section 2 ( | a j | < 1 and σ i 2 ( 1 a i 2 ) < σ j 2 ( 1 a j 2 ) for 1 i < j R ), satisfy the identification assumptions in Stock and Watson [16]. The prior for Ω is defined as Ω I n v e r s e W i s h a r t ( 1 n I n × n , ν θ ) , where ν θ is a scalar and pre-specified to be n + 2 so that the expectation of Ω is n 1 I n × n . The prior for A is the standard normal truncated at [ 1 , 1 ] , that is: for j = 1 , , R
π ( a j ) = ϕ ( a j ) Φ ( 1 ) Φ ( 1 ) for 1 < a j 1 0 otherwise ,
where ϕ ( · ) and Φ ( · ) are PDF and CDF for standard normal distribution. Then, π ( A ) = j = 1 R π ( a j ) . The priors for the diagonal elements of Σ are defined as σ j 2 iid I n v e r s e G a m m a ( α s , β s ) for j = 1 , , R , where α s and β s are scalars and pre-specified to be 2 and R + 2 , accordingly. Then, π ( Σ ) = j = 1 R π ( σ j ) . The prior for β = ( β 0 , β 1 , β 2 , β 3 , β 4 ) is β N ( 0 ( 3 R + 2 ) × 1 , I ( 3 R + 2 ) × ( 3 R + 2 ) ) . The prior for λ j is set to be λ j H a l f C a u c h y ( 0 , ν λ j ) for j = 1 , , R ( τ is set to be 1). As discussed in Section 2.2, these prior specifications of β and λ j imply a horseshoe shrinkage on the coefficients β ˜ . j ’s. The prior for η 2 is η 2 I n v e r s e G a m m a ( α h , β h ) , where α h and β h are scalars and pre-specified to be 4 and 0.01 , accordingly, to provide a reasonable mean and variance of η 2 .
All priors are assumed to be independent. Based on the derived complete conditional posterior distributions for each parameter and latent variable, we obtain posterior samples using Metropolis–Hastings within Gibbs sampling since some conditional posterior distributions do not have closed forms. In estimation, we use the means of posterior samples as estimates for parameters and latent factors. Complete conditional posterior distributions for all model parameters and latent factors are provided in Appendix A.

3.2. Nowcasting Formulas

In this section, nowcasting formulas are provided. Suppose we are at ( q , T ) , the qth ( q = 1 , , Q ) releasing date in month T. As discussed in Section 3.1, the available information are { y 1 , , y K } and { x 1 , , x T q , x i v q , T q + 1 , , x i v q , T } , here T can be the first ( T = 3 K + 1 ), second ( T = 3 K + 2 ), or third ( T = 3 K + 3 ) month of the quarter K + 1 . Our goal is to nowcast GDP y K + 1 . Let A ( g ) , β 0 ( g ) , β i ( g ) ( i = 1 , 2 , 3 ), β 4 ( g ) , A ( g ) , S ( g ) = d i a g ( τ λ 1 ( g ) , , τ λ R ( g ) ) , and F t ( g ) ( t = 1 , , T ) be the gth posterior draws for parameters and latent factors after the burn-in period, where g = 1 , , G . We nowcast y K + 1 using the following formulas.
  • When T = 3 K + 1 , the nowcast of y K + 1 using BAY is given by:
    y ^ K + 1 = 1 G g = 1 G [ β 0 ( g ) + ( β 1 ( g ) ) S ( g ) ( A ( g ) ) 2 F T ( g ) + ( β 2 ( g ) ) S ( g ) A ( g ) F T ( g ) + ( β 3 ( g ) ) S ( g ) F T ( g ) + ( β 4 ( g ) ) y K ] ,
  • When T = 3 K + 2 , the nowcast of y K + 1 using BAY is given by:
    y ^ K + 1 = 1 G g = 1 G [ β 0 ( g ) + ( β 1 ( g ) ) S ( g ) A ( g ) F T ( g ) + ( β 2 ( g ) ) S ( g ) F T ( g ) + ( β 3 ( g ) ) S ( g ) F T 1 ( g ) + ( β 4 ( g ) ) y K ] ,
  • When T = 3 K + 3 , the nowcast of y K + 1 using BAY is given by:
    y ^ K + 1 = 1 G g = 1 G [ β 0 ( g ) + ( β 1 ( g ) ) S ( g ) F T ( g ) + ( β 2 ( g ) ) S ( g ) F T 1 ( g ) + ( β 3 ( g ) ) S ( g ) F T 2 ( g ) + ( β 4 ( g ) ) y K ] .
Note that for some releasing dates, if v q , T = , meaning that no monthly series are available at releasing date ( q , T ) , then posterior samples F T ( g ) cannot be generated. As a solution, we use F ˜ T ( g ) = A ( g ) F T 1 ( g ) to replace F T ( g ) in nowcasting equations. All of the parameter and factor estimations are updated in every single release within a month. Then, y ^ K + 1 is re-produced for each release date.

4. Simulation Study

In this section, we will investigate three aspects of the Bayesian approach through numerical simulations. In Section 4.1, we evaluate whether it can successfully determine the true number of latent factors that can contribute to GDP nowcasting, i.e., the number of contributing factors. In Section 4.2, we study the accuracy of estimated latent factors F t . In Section 4.3, we examine out-of-sample nowcasting performances of the BAY approach.
In the simulation study, we simulate data following the model in Equations (1), (2) and (4) with T = 180 (months), K = 60 (quarters), r = 2 (true number of contributing latent factors), n = 60 (monthly series), and Q = 3 release dates in each month. The releasing pattern follows Table 2 with 20 new monthly series released in each release date, that is: at ( 1 , T ) , release ( x 41 , T 2 , , x 60 , T 2 ) , at ( 2 , T ) , release ( x 21 , T 1 , , x 40 , T 1 ) and at ( 3 , T ) , release ( x 1 , T , , x 20 , T ) .
Our method requires a predetermined cap R as the largest possible number of factors. Theoretically, R can be as large as the number of monthly series n. However, in practice, we use a smaller number to avoid extreme computational burden. In this simulation study, we choose R = 6 because a preliminary PCA analysis shows that the first six principle components can explain at least 95 % of total variation for all six simulations. For all simulations, some of the parameter settings used in generating data are common: we set A = d i a g ( 0.9 , 0.8 , 0.75 , 0.7 , 0.65 , 0.6 ) , Σ = d i a g ( 5.5 , 3 , 1 , 0.5 , 0.25 , 0.1 ) ; β 0 = 0.5 , β i = ( 1 , 1 , 1 , 1 , 1 , 1 ) ( i = 1 , 2 , 3 ), β 4 = 0.15 , and η 2 = 1 . Ω is simulated from I n v e r s e W i s h a r t ( 60 1 I 60 × 60 , 60 ) , each element of μ equals to 10, and Θ is simulated from M a t r i x N o r m a l ( 0 , I 60 × 60 , I 6 × 6 ) .
Specifications of λ j ( j = 1 , , R ) for each of the simulation are shown in Table 3. For all six simulations, we assume only the first two factors majorly contribute to our nowcasting equations. Simulation 1 and Simulation 2 represent the group with high signals for the first two factors. Simulation 3 and Simulation 4 represent the group of moderate signals, while Simulation 5 and Simulation 6 are in the group of weak signals. Within each group, one of the simulations is configured with true sparsity, that is λ j = 0 for j = 3 , , 6 , while for another simulation, we assign non-sparsity with small noise ( λ j N ( 0 , 0 . 1 2 ) for j = 3 , , 6 ) as a comparison. In this way, we can investigate how our method performs when changing magnitudes of the true signals from strong to weak, and when the non-true signals are contaminated with small noise or not.
For each simulation, we conduct one-step ahead nowcasts for the last 20 quarters using a moving window with a length of 10 years (40 quarters). For each quarter, nowcasting is made in each release date within each month. Thus, there are 20 ( q u a r t e r s ) × 3 ( m o n t h s ) × 3 ( r e l e a s e d a t e s ) = 180 nowcasts in each simulation. In our MCMC procedure, we discard the first 10,000 iterations as burn-in and run 1000 more for posterior summaries.

4.1. Estimating the Number of Contributing Factors

In this section, we validate our Bayesian Approach’s ability in determining true number of contributing factors through six sets of simulation studies. The ability of the algorithm to determine the true number of contributing factors is investigated as follows. First, we check whether our approach can perform as expected when true signals of the first two factors are high, moderate, and low, and the non-true signals are exactly equal to 0. Secondly, we check if its performance will be undermined if we add some noise to the non-true signals.
For each simulation, every estimate of the shrinkage profiles κ ^ j ( j = 1 , , R ) is calculated using the average of 1000 posterior draws after the burn-in period, that is κ ^ j = g = 1 G κ j ( g ) = g = 1 G 1 1 + λ j ( g ) for G = 1000 . Figure 3 shows box-plots of estimated shrinkage profiles κ ^ j ( j = 1 , , R ) based on 180 nowcast estimates in each simulation. In Simulation 1 and Simulation 2, κ ^ 1 and κ ^ 2 are near 0 while κ ^ 3 to κ ^ 6 are generally close to 1, indicating that the algorithm can successfully detect high signals for the first two contributing factors and shrink the other four to zero. In Simulation 3 and Simulation 4, when we decrease signals of the first two factors from high to moderate, our algorithm can still detect signals of the first two and shrinkage signals of the last four to 0. However, if we only apply low signals for the first two factors, as shown in Simulation 5 and Simulation 6, the algorithm can only detect one contributing factor while shrinking all others to 0. When comparing results between right column (Simulation 2, 4, and 6 for true sparsity) and left column (Simulation 1, 3, and 5 for small noise), our algorithm can extremely shrink all four non-true factors (i.e., having κ ^ j 1 ) in all three scenarios with different strengths of true signals, disregarding whether the non-true factors are contaminated with noise or not. The findings in Figure 3 validate our algorithm’s ability to detect the true number of contributing factors with moderate to high signals.
Figure 4 shows a scatterplot of posterior means κ ^ i j ’s ( κ ^ i j = κ ^ j ) versus β ˜ ^ i j ’s from 180 nowcast estimates. There are two general patterns observed across all six simulations. The first is that the estimated profile κ ^ i j ’s get closer to zero (little shrinkage) when the values of β ˜ ^ i j ’s increase horizontally to very large numbers, while κ ^ i j ’s approach to one (strong shrinkage) when β ˜ ^ i j ’s become very small. The second pattern is that the dots are separated into two clear segmentations in each picture. The vertical distance between two groups is the largest for the strong signal cases (Simulation 1 and 2), then becomes smaller for the moderate signals (Simulation 3 and 4). However, for the last row (the weak signal cases), the distance almost diminishes. This is consistent with the findings in Figure 3.

4.2. Estimation of Latent Factors

We then investigate whether the BAY method can accurately estimate the latent factors F t . In our approach, latent variables F t are also estimated with posterior means, i.e., F ^ t = 1 G g = 1 G F t ( g ) , t = 1 , , T , and G = 1000 is the number of MCMC iterations after the burn-in period. Figure 5 plots the estimated first two latent factors from BAY approach, together with the true latent factors, in the first 100 months (in-sample period) of the data for six simulations. The absolute values are compared since the factors are identified up to a change of sign (Section 2.1). Figure 5 shows that, generally, the estimation from the BAY approach is close to the true factors, especially for the first four simulations in which the true number of contributing latent factors is successfully detected.

4.3. Out-of-Sample Nowcasting Performances

In this section, we prove that our Bayesian Apporach can provide exceptional out-of-sample nowcasting performances compared to the Random Walk.
Out-of-sample nowcasting performances are assessed based on 20 one-step-ahead nowcasting. For each simulation, whenever there are new series released in a month, the model parameters and latent factors will be updated. Therefore, there are 180 nowcasts in total.
Figure 6 presents the nowcasting performances for all six simulations. In each panel (representing each simulation), the first, second, and third row represent nowcasting trends over 20 quarters in the first, second, and third month, respectively. In each subplot of each panel, the black curve represents the true GDP, while colored curves with different symbols represent nowcasts from different releases. Figure 6 shows that BAY approach can capture trends and changes in simulated GDP really well. For all six simulations, within the same month, there is no obvious difference in nowcasting performance between release 1 and release 2. However, nowcasting curves for release 3 are slightly closer to true curves than that of the other two releases. Moreover, we can see obvious improvements from nowcasts in the first month to nowcasts in the third month.
In order to better understand nowcasting results, we use mean absolute nowcasting error (MANE) to measure nowcasting accuracy. Let y ^ K + 1 q , T be the nowcast at qth release date of month T, where q = 1 , 2 , 3 and T = 3 K + 1 , 3 K + 2 , 3 K + 3 . Then, MANE ( q , T ) = 1 20 k + 1 = 41 60 | y ^ k + 1 q , T y k + 1 | . We compare the nowcasting performances of BAY with that of the random walk (RW) approach, which uses the previous quarter GDP to predict the current quarter GDP, by calculating MANE reduction relative to RW (i.e., MANE ( q , T ) MANE rw MANE rw ). Table 4 provides MANE reductions (in percentage) for BAY approach compared with that of the RW. For instance, 30 % indicates that BAY can reduce 30 % of MANE of the RW. Table 4 shows that, moving from the first month to the third month, there are significant reductions in terms of MANE ratios. Within each month, there is no obvious difference in MANE ratios between release 1 and release 2, while release 3 can provide larger MANE reduction than the other two. The possible reason is that, in the releasing pattern of this simulation study, series of the current month are only released in the third release of each month.
In summary, this simulation study suggests that our BAY approach can successfully detect true number of contributing factors with moderate to high signals. It also has the ability to estimate latent dynamic factors accurately and produce reliable nowcasting results.

5. Empirical Study

In this section, we examine empirical performance of the BAY method using U.S. quarterly GDP growth rates. The Federal Reserve Bank of New York built a platform that has been nowcasting U.S. GDP growth rates since April 2016. The methodology behind the platform is based on the GRS method, and details can be found in Bok et al. [18]. We borrow their data from Github (https://github.com/FRBNY-TimeSeriesAnalysis/Nowcasting) in 20 April 2021. This data set contains 26 monthly series released by both government agencies and private institutions. Based on economic insights, those monthly series are assigned to nine different categories, such as labor, international trade, manufacturing, surveys, and others. All series are updated in real time; thus, the release dates for each one vary from month to month. Based on the approximate release dates for each individual series, we roughly group these series into three release dates: before 10th, from 10th to 20th, and after the 20th of a month. Table 5 provides the release pattern of the real data. The same transformations as in Bok et al. [18] are applied to monthly series to achieve stationarity. Detailed information of transformations and release patterns are available in Table 6 and Table 7.
We choose the data span from 1993Q1 to 2016Q4, which gives us data series with 288 months (96 quarters). In-sample data is chosen to be in the period from 1993Q1 to 2002Q4, while the nowcasting horizon covers 2003Q1 to 2016Q4. The GDP growth rate used in this empirical study is the annualized quarter over quarter percentage change, which is defined as:
Y k = { ( 1 + GDP k GDP k 1 GDP k 1 ) 4 1 } × 100 ,
where GDP k is the real GDP of quarter k. Figure 7 plots the GDP growth rate with nowcasting horizon on the right side of the dashed blue line. In Figure 7, we see a severe drop at around 2009Q1 which is due to the financial crisis around 2007–2008.
We apply our BAY approach to this real U.S. GDP data. In this empirical study, we assign the same prior settings as in the simulation study; the largest possible number of latent factors R is also assumed to be six as the first six principle components from PCA explain 99.9 % of the variation observed in monthly series, and we still use G = 1000 iterations after 10,000 burn-in period in the MCMC sampling. Estimations of shrinkage profiles κ ^ j are used to determine the number of contributing factors. Out of all 56 ( q u a r t e r s ) × 3 ( m o n t h s ) × 3 ( r e l e a s e s ) = 504 estimates of shrinkage profiles, we find there are two main scenarios occurring. Figure 8 plots two examples for each of them, respectively. The left panel is the boxplot for posterior draws of shrinkage profile κ ^ j ’s when nowcasting 2000Q4 in the first release of the first month. This plot shows that the first factor is clearly detected to be different from the other five factors, although its κ ^ value is not small. The right panel is the boxplot for κ ^ j ’s when nowcasting 2000Q2 in the first release of the first month. This plot, however, shows that no factor contributes to the GDP nowcasting.
Table 8 shows proportions of all nowcasts in which one factor is detected. Generally speaking, more than 50 % of cases have one factor detected to have contribution for GDP nowcasting. This is consistent with Bok et al. [18], where they assume one single common factor in their DFMs setting.
Figure 9 plots out-of-sample GDP nowcasts over last 56 quarters for each release in each month. Three rows represents three nowcasting months. In this plot, we compare our BAY approach with the autoregressive model of order 1 (AR(1)) y k = β 0 + β 4 y k 1 + ν k . The AR(1) is equivalent to the case where no factor is detected. Figure 9 shows that our BAY approach can successfully capture the economic downfall due to the financial crisis at 2009Q1 with one lag delay. However, the AR(1) failed to capture it.
For the empirical study, we calculate MANE to measure nowcasting errors. Here, apart from AR(1), we also compare our BAY approach with another Bayesian model with no shrinkage priors, and we refer this model as NS. The NS model is proposed as the following: keep all other settings the same and remove S from Equation (5). More specifically, in the NS model, we impose Normal priors on the β ˜ . j ’s instead of horseshoe priors. Table 9 provides MANE reductions of BAY approach, relative to RW, AR(1), and NS. The first sub-table reports the percentage of reduction in MANEs relative to RW accross three nowcasting months, while the middle sub-table reports the percentage of reduction in MANEs relative to AR(1), and the last sub-table reports the percentage of reduction in MANEs relative to NS. Table 9 shows that our BAY approach can produce smaller nowcasting errors than the RW approach. On average, the percentages of reduction relative to RW do not have an obvious difference from first month to third month. This indicates that, for real data, having more monthly series does not necessarily lead to better nowcasting performances. One potential reason is that the quality of the data might not be perfect. Adding more series means adding more noise and thus may not guarantee more accurate nowcasts of GDP. The MANE reduction relative to AR(1) are lower than those relative to RW. However, our BAY approach can still have approximately 11 % reduction in nowcasting errors when being compared with the AR(1). This indicates that even if no factor is detected in nearly 50 % of the cases, the ones with one factor detected indeed contribute and enhance our nowcasting performance. The MANE reduction relative to NS for the first two nowcasting months are comparable and are lower than those relative to AR(1), the MANE reduction for the third month is higher than those relative to AR(1). This result indicates that using the horseshoe shrinkage idea to shrink unimportant factors to 0 can further increase our model’s nowcasting performance.
In summary, this empirical analysis demonstrates the empirical relevance of the BAY approach in nowcasting U.S. GDP. It suggests that at most one factor is sufficient for our BAY approach to provide a good performance.

6. Conclusions

Real-time nowcasting has become important in making policy decisions and long-term forecasting. In this paper, we adopt the DFM model framework and introduce a Bayesian MCMC approach for real-time nowcasting. Unlike other nowcasting methods based on DFMs, our Bayesian approach allows an unknown number of contributing factors and utilizes the horseshoe shrinkage to determine the number of contributing factors. Through the simulation study, we have shown that our Bayesian approach can identify the number of contributing factors correctly with high to moderate signals and estimate latent factors accurately. Both simulation study and empirical study validate our Bayesian approach’s ability to provide reliable real-time nowcasting results.
In this paper, we are not able to add the horseshoe shrinkage to Equation (1) due to the non-identifiable issue in DFMs. If we add S as Θ S F t in Equation (1), there will be two layers of non-identifiable issues, which is difficult to solve. Furthermore, in this paper, the horseshoe shrinkage is only used in detecting the number of contributing factors. In our DFMs framework, we assume the GDP of quarter k depends on factors with up to two lags, i.e., factors of 3 k , 3 k 1 , and 3 k 2 (see Equation (3)). We might argue that this lag should not be fixed and can also be determined using the horseshoe shrinkage. These are two possible research directions that we want to explore in future investigation.

Author Contributions

Formal analysis, J.L.; Methodology, J.L. and C.L.Y.; Supervision, C.L.Y.; Writing —original draft, J.L. and C.L.Y.All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ARAutoregressive Model
DFMsDynamic Factor Models
GRSModel proposed in Giannone et al. (2008)
KFKalman Filter
MANEMean Absolute Nowcasting Error
MCMCMarkov Chain Monte Carlo
PCAPrinciple Component Analysis
RWRandom Walk

Appendix A. Posterior Distributions

In this Appendix, complete conditional distributions for each parameter and latent factor are provided. An MCMC algorithm is applied to draw posterior samples. For most parameters, conditional posterior distributions have closed forms, which allows for Gibbs sampling method. However, Ω and λ j ( j = 1 , , R ) do not have closed-form posterior distributions, for which we use an independent Metropolis–Hastings within Gibbs sampler to generate posterior samples.

Appendix A.1. Posterior Samples for Mean of Monthly Series μ

The conditional posterior for the mean of monthly series, μ , is a multivariate normal distribution:
μ | X , F , Θ , Ω N ( U μ , Σ μ ) ,
where
Σ μ 1 = T q Ω 1 + t = T q + 1 T ( 1 v q , t ( 1 v q , t Ω 1 v q , t ) 1 1 v q , t ) + I n × n , U μ = Σ μ t = 1 T q Ω 1 ( x t Θ F t ) + t = T q + 1 T ( 1 v q , t ( 1 v q , t Ω 1 v q , t ) 1 1 v q , t ) ( x t Θ F t ) .

Appendix A.2. Posterior Samples for Factor Loading Matrix Θ

Let θ = v e c ( Θ ) = ( θ 1 , , θ n ) . The conditional posterior for the θ is a multivariate normal distribution:
θ | X , F , μ , Ω N ( U θ , Σ θ ) ,
let C t = I n × n F t , then
Σ θ 1 = t = 1 T q C t Ω 1 C t + t = T q + 1 T C t 1 v q , t ( 1 v q , t Ω 1 v q , t ) 1 1 v q , t C t + I n R × n R , U θ = Σ θ t = 1 T q C t Ω 1 ( x t μ ) + t = T q + 1 T C t 1 v q , t ( 1 v q , t Ω 1 v q , t ) 1 1 v q , t ( x t μ ) .

Appendix A.3. Posterior Samples for Covariance in Monthly Series Ω

The conditional posterior for Ω is
p ( Ω | · ) t = 1 T q p ( x t | μ , F t , θ , Ω ) × t = T q + 1 T p ( 1 v q , t x t | μ , F t , θ , Ω ) × P ( Ω ) | Ω | ( T q 1 ) / 2 e 1 2 t r t = 1 T q ( x t μ C t θ ) ( x t μ C t θ ) Ω 1 × | Ω | ( ν θ + n + 1 ) / 2 e 1 2 t r 1 n Ω 1 I n × n × t = T q + 1 T | 1 v q , t Ω 1 v q , t | 1 2 e 1 2 t r 1 v q , t ( x t μ C t θ ) 1 v q , t ( x t μ C t θ ) ( 1 v q , t Ω 1 v q , t ) 1 ,
which is not in a closed form. We need to use Metropolis–Hastings within Gibbs sampling method to draw Ω . The first two parts combined together yield an inverse Wishart distribution W 1 ( Φ Ω , ν Ω ) , with Φ Ω = 1 n I n × n + t = 1 T q ( x t μ C t θ ) ( x t μ C t θ ) and ν Ω = T q + ν θ . Therefore, we purpose Ω from W 1 ( Φ Ω , ν Ω ) and use the last piece in the posterior distribution, denoted as q ( Ω ) , to construct the acceptance–rejection rate. That is, the proposal Ω is accepted with probability
P ( Ω | Ω 0 ) = m i n 1 , q ( Ω ) q ( Ω 0 ) ,
where Ω 0 denotes the current state of Ω .

Appendix A.4. Posterior Samples for AR(1) Coefficients aj

For j = 1 , , R , the conditional posterior of each A R ( 1 ) coefficient a j is:
a j | F , Σ N ( μ a j , σ a j 2 ) I ( | a j | < 1 ) ,
where μ a j = t = 2 T f j , t f j , t 1 σ j 2 + t = 2 T f j , t 1 2 and σ a j 2 = σ j 2 σ j 2 + t = 2 T f j , t 1 2 .

Appendix A.5. Posterior Samples for Covariance Matrix in the Factor Equation Σ

For j = 1 , , R , the conditional posterior of each diagonal element of Σ , σ j 2 , is an inverse gamma distribution:
σ j 2 | F , A i G ( α j , β j ) ,
where α j = α s + ( T 1 ) / 2 and β j = β s + t = 2 T ( f j , t a j f j , t 1 ) 2 / 2 .

Appendix A.6. Posterior Samples for Coefficients in GDP Equation β

The conditional posterior of coefficients connecting factors with GDP, β , is a multivariate normal distribution:
β | F , y 2 , , y K , η 2 N ( U β , Σ β ) ,
where F ˜ 3 k = [ 1 , ( S F 3 k ) , ( S F 3 k 1 ) , ( S F 3 k 2 ) , y k 1 ] , and
Σ β 1 = k = 2 K F ˜ 3 k F ˜ 3 k / η 2 + I ( 3 R + 2 ) × ( 3 R + 2 ) , U β = Σ β k = 2 K F ˜ 3 k y k / η 2 ,
where S = d i a g ( λ 1 , , λ R ) .

Appendix A.7. Posterior Samples for Variance in GDP Equation η2

The conditional posterior of the variance in GDP equation, η 2 , is an inverse gamma distribution:
η 2 | F , y 2 , , y K , β , S i G ( α η , β η ) ,
where α η = α h + ( K + 1 ) / 2 , and β η = β h + k = 2 K ( y k β F ˜ 3 k ) 2 / 2 .
Here, F ˜ 3 k = [ 1 , ( S F 3 k ) , ( S F 3 k 1 ) , ( S F 3 k 2 ) , y k 1 ] , and S = d i a g ( λ 1 , , λ R ) .

Appendix A.8. Posterior Samples for Each Element in S Matrix, λj

For simplicity, we assume τ = 1 . The posterior for λ j is
p ( λ j | · ) l = 1 3 p ( β l | · ) × P ( λ j ) 1 λ j 3 e x p { 1 2 l = 1 3 β l j 2 λ j 2 } × 1 1 + λ j 2 ν λ 2 j ,
where β l j is the jth element in β l for l = 1 , 2 , 3 . This posterior does not have a closed form, and the Metropolis–Hastings within Gibbs sampling method is required to draw λ j . If we define λ ˜ j : = λ j 2 , the first part of the posterior yields a gamma distribution for λ ˜ j , i.e., λ ˜ j G a m m a ( 2.5 , 2 l = 1 3 β l j 2 ) . Therefore, we purpose λ ˜ j from G a m m a ( 2.5 , 2 l = 1 3 β l j 2 ) , obtain λ j = λ ˜ j 1 2 , and use the last piece in the conditional posterior distribution q ( λ j ) = 1 1 + λ j 2 ν λ 2 j to construct the acceptance–rejection rate. That is, the proposal λ j is accepted with probability
P ( λ j | λ j 0 ) = m i n 1 , q ( λ j ) q ( λ j 0 ) ,
where λ j 0 denotes the current state of λ j .

Appendix A.9. Sampling the Latent Factors Ft

The posterior distribution for F t has different forms depending on t. Suppose K is the largest integer such that 3 K T q , T q , as defined in Section 2. For 3 < t < 3 K , we have the most general form defined as follows:
First, we write t as t = 3 ( k 1 ) + i for k = 2 , , K , where i = 1 , 2 , 3 represents that we are in the first, second, and third month of quarter k. Then, at t = 3 ( k 1 ) + i , F t enters the joint likelihood through x t , F t + 1 , F t 1 and y k by
x t F t + 1 F t + 1 f y k ( i ) = Θ A A 1 β i S F t + ϵ t u t + 1 A 1 u t ν k Y ˜ = X ˜ + ϵ ˜ ,
where f y k ( i ) is a function of i defined as
f y k ( i ) = y k β 0 β 2 S F t 1 β 3 S F t 2 β 4 y k 1 if i = 1 y k β 0 β 1 S F t + 1 β 3 S F t 1 β 4 y k 1 if i = 2 y k β 0 β 1 S F t + 2 β 2 S F t + 1 β 4 y k 1 if i = 3 .
Therefore,
Y ˜ = x t F t + 1 F t + 1 f y k ( i ) , X ˜ = Θ A A 1 β i S , ϵ ˜ = ϵ t u t + 1 A 1 u t ν k ,
and
v a r ( ϵ ˜ ) = Σ ˜ = Ω 0 0 0 0 Σ 0 0 0 0 ( A Σ 1 A ) 1 0 0 0 0 η 2 .
By weighted regression, for 0 < t < 3 K , k = 2 , , K and t = 3 ( k 1 ) + i , draw
F t | Y ˜ , X ˜ , Σ ˜ M V N ( ( X ˜ Σ ˜ 1 X ˜ ) 1 X ˜ Σ ˜ 1 Y ˜ , ( X ˜ Σ ˜ 1 X ˜ ) 1 ) .
For other t, the posterior distribution for F t is of the same form with some modifications on Y ˜ , X ˜ , and Σ ˜ due to different availability. For example, if t = 1 , since F 0 and y 0 are not available, corresponding entries to F t 1 and f y k ( i ) are deleted. For T q < t T , monthly series are unbalanced, change entries corresponding to 1 v q , t x t in Y ˜ , X ˜ , and Σ ˜ .

References

  1. Baffgi, A.; Golinelli, R.; Parigi, G. Bridge models to forecast the euro area GDP. Int. J. Forecast. 2004, 20, 447–460. [Google Scholar] [CrossRef]
  2. Rünstler, G.; Sédillot, F. Short-Term Estimates of Euro Area Real GDP by Means of Monthly Data; Working Paper Series 276; European Central Bank: Frankfurt, Germany, 2003. [Google Scholar]
  3. Kitchen, J.; Monaco, R. Real-Time Forecasting in Practice: The U.S. Treasury Staff’s Real-Time GDP Forecast System. Bus. Econ. 2003, 38, 10–19. [Google Scholar]
  4. Boivin, K.; Ng, S. Understanding and Comparing Factor-Based Forecasts. Int. J. Cent. Bank. 2005, 1, 117–151. [Google Scholar]
  5. Forni, M.; Hallin, M.; Lippi, M.; Reichlin, L. The Generalized Dynamic Factor Model: One-Sided Estimation and Forecasting. J. Am. Stat. Assoc. 2005, 100, 830–840. [Google Scholar] [CrossRef] [Green Version]
  6. D’Agostino, A.; Giannone, D. Comparing Alternative Predictors Based on Large-Panel Factor Models; Working Paper Series 680; European Central Bank: Frankfurt, Germany, 2006. [Google Scholar]
  7. Eraslan, S.; Schröder, M. Nowcasting GDP with a Large Factor Model Space. Deutsche Bundesbank Discussion Paper No. 41/2019. 2019, Volume 41. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3507664 (accessed on 20 December 2020).
  8. Raftery, A.E.; Kárný, M.; Ettler, P. Online Prediction Under Model Uncertainty via Dynamic Model Averaging: Application to a Cold Rolling Mill. Technometrics 2010, 52, 52–66. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  9. Croushore, D.; Stark, T. A real-time data set for macroeconomists. J. Econom. 2001, 105, 111–130. [Google Scholar] [CrossRef] [Green Version]
  10. Koenig, E.F.; Dolmas, S.; Piger, J. The Use and Abuse of Real-Time Data in Economic Forecasting. Rev. Econ. Stat. 2003, 85, 618–628. [Google Scholar] [CrossRef]
  11. Orphanides, A. Monetary policy rules and the Great Inflation. J. Am. Econ. Assoc. 2002, 92, 115–120. [Google Scholar]
  12. Giannone, D.; Reichlin, L.; Small, D. Nowcasting: The real-time informational content of macroeconomic data. J. Monet. Econ. 2008, 55, 665–676. [Google Scholar] [CrossRef]
  13. Bai, J.; Ng, S. Determining the Number of Factors in Approximate Factor Models. Econometrica 2002, 70, 191–221. [Google Scholar] [CrossRef] [Green Version]
  14. Carvalho, C.M.; Polson, N.G.; Scott, J.G. Handling sparsity via the horseshoe. In Proceedings of the 12th International Conference on Artificial Intelligence and Statistics, Clearwater Beach, FL, USA, 16–18 April 2009; Volume 5. [Google Scholar]
  15. Carvalho, C.M.; Polson, N.G.; Scott, J.G. The horseshoe estimator for sparse signals. Biometrika 2010, 97, 465–480. [Google Scholar] [CrossRef] [Green Version]
  16. Stock, J.H.; Watson, M.W. Forecasting Using Principal Components From a Large Number of Predictors. J. Am. Stat. Assoc. 2002, 97, 1167–1179. [Google Scholar] [CrossRef] [Green Version]
  17. Follett, L.; Yu, C. Achieving parsimony in Bayesian vector autoregressions with the horseshoe prior. Econom. Stat. 2019, 11, 130–144. [Google Scholar] [CrossRef]
  18. Bok, B.; Caratelli, D.; Giannone, D.; Sbordone, A.M.; Tambalotti, A. Macroeconomic Nowcasting and Forecasting with Big Data. Annu. Rev. Econ. 2018, 10, 615–643. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Densities of the implied prior on μ j based on the horseshoe prior, the t prior, and the Laplacian prior.
Figure 1. Densities of the implied prior on μ j based on the horseshoe prior, the t prior, and the Laplacian prior.
Mathematics 09 02865 g001
Figure 2. Densities of the shrinkage profile κ j based on the horseshoe prior, the t prior, and the Laplacian prior. κ j = 0 means no shrinkage and κ j = 1 means total shrinkage.
Figure 2. Densities of the shrinkage profile κ j based on the horseshoe prior, the t prior, and the Laplacian prior. κ j = 0 means no shrinkage and κ j = 1 means total shrinkage.
Mathematics 09 02865 g002
Figure 3. Box-plots of shrinkage profile estimations κ ^ j for j = 1 , , R from 180 nowcast estimates. Here, R = 6 . Each subplot represents the result for each simulation.
Figure 3. Box-plots of shrinkage profile estimations κ ^ j for j = 1 , , R from 180 nowcast estimates. Here, R = 6 . Each subplot represents the result for each simulation.
Mathematics 09 02865 g003
Figure 4. Scatter splots of shrinkage profile estimations κ ^ i j (y-axis) versus β ˜ ^ i j (x-axis) from 180 nowcast estimates. Each subplot represents the result for each simulation.
Figure 4. Scatter splots of shrinkage profile estimations κ ^ i j (y-axis) versus β ˜ ^ i j (x-axis) from 180 nowcast estimates. Each subplot represents the result for each simulation.
Mathematics 09 02865 g004
Figure 5. In-sample fit of the latent factors for 6 simulations. Absolute value is used for both true factors and in-sample fits. Yellow lines represent in-sample fitted value and gray lines represent true value. In each subplot, the upper panel represents the comparison for the first factor, and the lower panel shows the comparison for the second factor.
Figure 5. In-sample fit of the latent factors for 6 simulations. Absolute value is used for both true factors and in-sample fits. Yellow lines represent in-sample fitted value and gray lines represent true value. In each subplot, the upper panel represents the comparison for the first factor, and the lower panel shows the comparison for the second factor.
Mathematics 09 02865 g005
Figure 6. Nowcasting performance for six simulations. Three rows in each subplot represent the first, second, and third month’s nowcasts for the last 20 quarters, respectively. Black curve represents true simulated GDP, and colored curves with different shapes represent nowcasts from different releases.
Figure 6. Nowcasting performance for six simulations. Three rows in each subplot represent the first, second, and third month’s nowcasts for the last 20 quarters, respectively. Black curve represents true simulated GDP, and colored curves with different shapes represent nowcasts from different releases.
Mathematics 09 02865 g006
Figure 7. Real U.S. GDP growth rate from 1993Q1 to 2016Q4. Period after 2003Q1 (after blue dashed line) is the nowcasting horizon.
Figure 7. Real U.S. GDP growth rate from 1993Q1 to 2016Q4. Period after 2003Q1 (after blue dashed line) is the nowcasting horizon.
Mathematics 09 02865 g007
Figure 8. Two examples of boxplots for posterior draws of κ j . Left panel represents the one for nowcasting GDP of 2000Q4 in the first release of the first month. Right panel represents the one for nowcasting GDP of 2000Q2 in the first release of the first month.
Figure 8. Two examples of boxplots for posterior draws of κ j . Left panel represents the one for nowcasting GDP of 2000Q4 in the first release of the first month. Right panel represents the one for nowcasting GDP of 2000Q2 in the first release of the first month.
Mathematics 09 02865 g008
Figure 9. Nowcasting over 2003Q1 to 2016Q4. Three rows represent three nowcasting months, respectively. Black curve represents the real GDP, while curves of different colors and shapes represent nowcasting results from 3 different release dates and AR(1).
Figure 9. Nowcasting over 2003Q1 to 2016Q4. Three rows represent three nowcasting months, respectively. Black curve represents the real GDP, while curves of different colors and shapes represent nowcasting results from 3 different release dates and AR(1).
Mathematics 09 02865 g009
Table 1. An example of the overall releasing pattern. Gray colored cells represent available series before T. Orange, green, and blue represent the 1st, 2nd, and 3rd release within T, respectively.
Table 1. An example of the overall releasing pattern. Gray colored cells represent available series before T. Orange, green, and blue represent the 1st, 2nd, and 3rd release within T, respectively.
k1KK + 1
t123T − 4T − 3T − 2T − 1TT + 1
x i , t x 1 , 1 x 1 , 2 x 1 , 3 x 1 , T 4 x 1 , T 3 x 1 , T 2 x 1 , T 1 x 1 , T NA
x 2 , 1 x 2 , 2 x 2 , 3 x 2 , T 4 x 2 , T 3 x 2 , T 2 x 2 , T 1 x 2 , T NA
x 3 , 1 x 3 , 2 x 3 , 3 x 3 , T 4 x 3 , T 3 x 3 , T 2 x 3 , T 1 NANA
x 4 , 1 x 4 , 2 x 4 , 3 x 4 , T 4 x 4 , T 3 x 4 , T 2 x 4 , T 1 NANA
x 5 , 1 x 5 , 2 x 5 , 3 x 5 , T 4 x 5 , T 3 x 5 , T 2 NANANA
x 6 , 1 x 6 , 2 x 6 , 3 x 6 , T 4 x 6 , T 3 x 6 , T 2 NANANA
y k y 1 y K y K + 1
Table 2. Data releasing structure for simulation study when nowcasting quarter K + 1 ’s GDP in month T. “RL” represents release. Orange color represents release 1, green represents release 2, and blue represents release 3.
Table 2. Data releasing structure for simulation study when nowcasting quarter K + 1 ’s GDP in month T. “RL” represents release. Orange color represents release 1, green represents release 2, and blue represents release 3.
MonthT-3T-2T-1T
Series 1–20KnownKnownKnownRL3
Series 21–40KnownKnownRL2
Series 41–60KnownRL1
Table 3. Settings for λ j . For each simulation, the first 2 factors are contributing factors, and other 4 factors contribute little or 0 to GDP prediction.
Table 3. Settings for λ j . For each simulation, the first 2 factors are contributing factors, and other 4 factors contribute little or 0 to GDP prediction.
Simulation λ 1 λ 2 λ 3 , λ 4 , λ 5 , λ 6
1 N ( 5 , 0.1 2 ) N ( 5 , 0.1 2 ) N ( 0 , 0.1 2 )
2 N ( 5 , 0.1 2 ) N ( 5 , 0.1 2 ) 0
3 N ( 1 , 0.1 2 ) N ( 1 , 0.1 2 ) N ( 0 , 0.1 2 )
4 N ( 1 , 0.1 2 ) N ( 1 , 0.1 2 ) 0
5 N ( 0.1 , 0.1 2 ) N ( 0.1 , 0.1 2 ) N ( 0 , 0.1 2 )
6 N ( 0.1 , 0.1 2 ) N ( 0.1 , 0.1 2 ) 0
Table 4. This table reports percentages of reduction in MANE relative to RW, which are calculated as ( MANE Bay MANE RW ) / MANE RW × 100 % , for six simulation studies.
Table 4. This table reports percentages of reduction in MANE relative to RW, which are calculated as ( MANE Bay MANE RW ) / MANE RW × 100 % , for six simulation studies.
Simulation12
Release1st Month2nd Month3rd Month1st Month2nd Month3rd Month
1st 15.4 % 35.9 % 55.5 % 35.2 % 50.6 % 81.4 %
2nd 16.4 % 34.6 % 56.5 % 34.6 % 51.1 % 80.4 %
3rd 32.8 % 54.6 % 96.6 % 53.3 % 82.3 % 97.3 %
Average 21.5 % 41.7 % 69.5 % 41.0 % 61.3 % 86.4 %
Simulation34
Release1st Month2nd Month3rd Month1st Month2nd Month3rd Month
1st 24.6 % 52.5 % 68.7 % 43.5 % 53.5 % 73.1 %
2nd 24.5 % 54.7 % 69.3 % 44.5 % 55.1 % 73.3 %
3rd 51.5 % 68.2 % 89.0 % 57.9 % 75.1 % 90.7 %
Average 33.5 % 58.5 % 75.7 % 48.6 % 61.2 % 79.0 %
Simulation56
Release1st Month2nd Month3rd Month1st Month2nd Month3rd Month
1st 33.5 % 29.4 % 50.5 % 25.5 % 43.9 % 66.0 %
2nd 34.9 % 28.3 % 49.4 % 25.3 % 43.1 % 66.2 %
3rd 27.7 % 52.3 % 62.7 % 44.5 % 66.0 % 69.9 %
Average 32.0 % 36.7 % 54.2 % 31.8 % 51.0 % 67.4 %
Table 5. Data releasing structure in the empirical study when nowcasting quarter K + 1 ’s GDP in month T. RL stands for release, with release 1 colored in orange, released 2 colored in green, and release 3 colored in blue. The number in parentheses represents number of series for that particular release.
Table 5. Data releasing structure in the empirical study when nowcasting quarter K + 1 ’s GDP in month T. RL stands for release, with release 1 colored in orange, released 2 colored in green, and release 3 colored in blue. The number in parentheses represents number of series for that particular release.
MonthT-3T-2T-1T
Set 1 (2)KnownKnownKnownRL2 (2)
Set 2 (2)KnownKnownRL1 (2)
Set 3 (10)KnownKnownRL2 (10)
Set 4 (7)KnownKnownRL3 (7)
Set 5 (5)KnownRL1 (5)
Set 6 (3)KnownRL2 (3)
Table 6. Data transformation types: x i t represents raw data, and x i t represents the transformed data.
Table 6. Data transformation types: x i t represents raw data, and x i t represents the transformed data.
TypeTransformationDescription
1 x i t = x i t No transformation
2 x i t = x i t x i , t 1 Level change
3 x i t = x i t x i , t 1 x i , t 1 Month-to-month change
Table 7. Release groups, transformation types, and lag information for monthly series used in the empirical study.
Table 7. Release groups, transformation types, and lag information for monthly series used in the empirical study.
ReleaseBlockNameTransformationLag
1stHousing and constructionTTLCONS32
International tradeBOPTEXP32
BOPTIMP32
ManufacturingBUSINV32
LaborPAYEMS21
JTSJOL22
UNRATE21
2ndInternational tradeIR31
IQ31
Retail and consumptionRSAFS31
SurveyGACDISA066MSFRBNY10
GACDFSA066MSFRBNY10
ManufacturingINDPRO31
TCU21
OtherCPIAUCSL31
CPILFESL31
PPIFIS31
Housing and constructionHOUST31
PERMIT21
3rdManufacturingDGORDER31
WHLSLRIMSA31
Housing and constructionHSNIF31
IncomeDSPIC9631
Retail and consumptionPCEC9631
OtherPCEPI31
PCEPILIFE31
Table 8. This table reports percentages of nowcasts in which one factor is detected.
Table 8. This table reports percentages of nowcasts in which one factor is detected.
MethodBAY
Release1st Month2nd Month3rd Month
1st 50 % 46.4 % 62.5 %
2nd 58.9 % 50 % 58.9 %
3rd 46.4 % 62.5 % 55.3 %
Average 51.7 % 52.9 % 58.9 %
Table 9. This table reports percentages of reduction in MANE’s relative to RW, AR(1), and NS, i.e., ( MANE Bay MANE ) / MANE × 100 % , here * can be RW, AR(1), or NS.
Table 9. This table reports percentages of reduction in MANE’s relative to RW, AR(1), and NS, i.e., ( MANE Bay MANE ) / MANE × 100 % , here * can be RW, AR(1), or NS.
Compare withRW
Release1st Month2nd Month3rd Month
1st 20.6 % 27.5 % 32.0 %
2nd 28.4 % 21.9 % 19.2 %
3rd 25.0 % 23.9 % 24.7 %
Average 24.7 % 24.4 % 25.3 %
Compare withAR(1)
Release1st Month2nd Month3rd Month
1st 14.0 % 14.2 % 19.5 %
2nd 15.3 % 7.65 % 4.38 %
3rd 11.2 % 8.97 % 10.9 %
Average 13.5 % 10.3 % 11.6 %
Compare withNS
Release1st Month2nd Month3rd Month
1st 1.90 % 0.70 % 19.8 %
2nd 8.51 % 7.32 % 10.1 %
3rd 9.21 % 12.4 % 9.79 %
Average 6.54 % 6.34 % 13.23 %
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Luo, J.; Yu, C.L. Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting. Mathematics 2021, 9, 2865. https://doi.org/10.3390/math9222865

AMA Style

Luo J, Yu CL. Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting. Mathematics. 2021; 9(22):2865. https://doi.org/10.3390/math9222865

Chicago/Turabian Style

Luo, Jiayi, and Cindy Long Yu. 2021. "Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting" Mathematics 9, no. 22: 2865. https://doi.org/10.3390/math9222865

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop