Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting

Luo, Jiayi; Yu, Cindy Long

doi:10.3390/math9222865

Open AccessArticle

Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting

by

Jiayi Luo

and

Cindy Long Yu

^*

Department of Statistics, Iowa State University, Ames, IA 50011, USA

^*

Author to whom correspondence should be addressed.

Mathematics 2021, 9(22), 2865; https://doi.org/10.3390/math9222865

Submission received: 14 October 2021 / Revised: 5 November 2021 / Accepted: 6 November 2021 / Published: 11 November 2021

(This article belongs to the Special Issue Statistical Simulation and Computation II)

Download

Browse Figures

Versions Notes

Abstract

:

Real-time nowcasting is a process to assess current-quarter GDP from timely released economic and financial series before the figure is disseminated in order to catch the overall macroeconomic conditions in real time. In economic data nowcasting, dynamic factor models (DFMs) are widely used due to their abilities to bridge information with different frequencies and to achieve dimension reduction. However, most of the research using DFMs assumes a fixed known number of factors contributing to GDP nowcasting. In this paper, we propose a Bayesian approach with the horseshoe shrinkage prior to determine the number of factors that have nowcasting power in GDP and to accurately estimate model parameters and latent factors simultaneously. The horseshoe prior is a powerful shrinkage prior in that it can shrink unimportant signals to 0 while keeping important ones remaining large and practically unshrunk. The validity of the method is demonstrated through simulation studies and an empirical study of nowcasting U.S. quarterly GDP growth rates using monthly data series in the U.S. market.

Keywords:

Bayesian analysis; dynamic factor models; horseshoe shrinkage; nowcasting

1. Introduction

Real-time nowcasting is a process to assess current-quarter GDP from timely released economic and financial series before the figure is disseminated in order to catch the overall macroeconomic conditions in real time. This is of interest because most data are released with a lag and are released subsequently. In theory, any release, no matter at what frequency, may affect current-quarter estimates and their precision potentially. Both forecasting and nowcasting are important tasks for central banks for policy decision-making; for example, monetary policies need to be made in real time and are based on assessments of current and future economic conditions. Additionally, estimated current-quarter GDP figures are often used as relevant inputs for model-based longer-term forecasting exercises in banks.

Real-time nowcasting faces some difficulties. The first one is how to bridge monthly data series with the quarterly GDP. Bafigi et al. [1], Rünstler and Sédillot [2], and Kitchen and Monaco [3] studied the idea of bridge equations which use small models to “bridge” the information contained in one or a few key monthly data with the quarterly growth rate of GDP. However, they involve judgmental nowcasts and only deal with a few monthly data series. The second difficulty is how to deal with a large number of monthly data series. For macroeconomic forecasting, factor models (FMs) are widely used at central banks and other institutions to achieve dimension reduction. Many authors, such as Boivin and Ng [4], Forni et al. [5], and D’Agostino and Giannone [6], have shown that these models are successful in this regard. However, they did not use FMs specifically for the problem of real-time nowcasting. There are other existing approaches to tackle the high dimensional issue. One example is Eraslan and Schröder [7] who dealt with this over-parameterized issue in GDP nowcasting by implementing the dynamic model averaging method (Raftery et al. [8]). However, they did not talk about how to handle the unbalanced structure of the data caused by different release dates with different lags in each month. Moreover, they assumed a fixed number of factors, while in our paper the focus is to determine the number of factors contributing to GDP nowcasting. The third challenge is that a large number of monthly data series are released with different lags, causing unbalanced data at the end of the sample. Some authors, including Croushore and Stark [9], Koenig et al. [10] and Orphanides [11], discussed about this issue, but they are not focusing on the statistical estimation.

Giannone et al. [12] provided a frequentist inference framework for the parametric dynamic factor models (DFMs). In their framework, they took advantage of different data releases throughout the month and updated the nowcast based on each new data release. The authors combined the idea of connecting monthly series with the nowcast of quarterly GDP and the idea of using data with different releases within a single statistical framework. Their model combines principal component analysis (PCA) with modified Kalman Filter (KF) to deal with the unbalanced feature of the data. Hereafter, we call the method proposed in Giannone et al. the GRS approach.

In this paper, we borrow the idea of DFMs from Giannone et al. [12] and propose a Bayesian Monte Carlo Markov Chain (MCMC) approach to deal with the real-time nowcasting problem. For DFMs, one important aspect is to determine the number of factors. In the GRS approach, the number of factors is assumed to be fixed, which is determined by looking at the cumulative proportions of variances explained by the first few principle components from PCA, and the same set of factors is assumed to have prediction power on GDP. Bai and Ng [13] showed that the number of factors could be estimated consistently in a large panel of data setting. In this paper, we impose a cap on the number of factors in the DFM structure but allow an unknown number of factors to contribute to GDP prediction. We propose to apply the horseshoe shrinkage (Carvalho et al. [14,15]) on the coefficients of factors in the prediction equation. One big advantage of the horseshoe shrinkage over other traditional shrinkages is that it can shrink unimportant signals to 0 while keeping important ones large and practically unshrunk (more details will be discussed in Section 2). After estimation, any coefficient that is shrunk to 0 indicates its corresponding factor has no prediction power. As a result, the number of coefficients that remain large after strong shrinkage is a good estimate of an unknown number of factors with prediction power on GDP. Our Bayesian MCMC approach can also provide a more natural way to deal with the unbalanced data structure due to real-time data releasing and estimate all parameters, including the number of contributing factors and latent dynamic factors in a single framework. We refer to this Bayesian approach as the BAY approach. Through simulation studies, we evaluate the abilities of our BAY approach in estimating an unknown number of contributing factors and in producing reliable nowcasts in real time. The validity of the BAY approach is also examined by applying it to nowcast U.S. quarterly GDP growth rates.

The rest of this paper is organized as follows. Section 2 sets up the model structure, introduces the horseshoe shrinkage into our model, and stylizes the data structure. In Section 3, we introduce the Bayesian MCMC estimation method with nowcasting equations. In Section 4, we conduct simulation studies. In Section 5, an empirical study of nowcasting U.S. GDP growth rates is presented. Section 6 concludes the paper. A list of abbreviations used in this paper is provided in Abbreviations.

2. Model Set-Ups, Horseshoe Shrinkage, and Data Structure

In Section 2.1, the model set-ups used in the BAY approach are illustrated. In Section 2.2, we introduce the horseshoe shrinkage idea and how to implement it in our BAY approach to estimate number of contributing factors. In Section 2.3, we formalize the unbalanced data structure.

2.1. Dynamic Factor Models

In this section, we introduce the Dynamic Factor Model structure and how it can reduce dimension and bridge our monthly released series with quarterly released GDP.

Since the number of monthly series is vast, modeling GDP on all available series can involve too many parameters; hence, the model would perform poorly in forecasting because of large uncertainty in the parameters’ estimation. The fundamental idea of Giannone et al. [12] is to use DFMs to exploit the collinearity of the series by summarizing all the available information into a few common latent factors. Due to collinearity, a linear combination of the common factors is able to capture the dynamic interaction among the series and to provide a model that only requires a limited number of parameters and thus works well in forecasting. In this paper, our DFM is specified in the following ways.

First, assume that the monthly series are linear functions of a few unobserved common factors

F_{t}

,

x_{t} = μ + Θ F_{t} + ε_{t},

(1)

where

x_{t} = {(x_{1, t}, \dots, x_{n, t})}^{'}

is the

n \times 1

monthly series vector at month t, for

t = 1, 2, \dots, T

,

F_{t} = {(f_{1 t}, \dots, f_{r t})}^{'}

is the

r \times 1

common factor vector at month t, r is the number of latent factors (

r < < n

) which is usually assumed to be known and fixed,

Θ

is the

n \times r

factor loading,

μ

is the

n \times 1

mean vector, and the

n \times 1

error vector

ε_{t} \sim N (0, Ω_{n \times n})

. Note that the difference between Equation (1) and regular multiple regression is that the factors

F_{t}

are unobserved latent variables, while the predictors in multiple regression are observed. The latent factors

F_{t}

serve an important role of bridging information from monthly series to quarterly GDP.

Then, we further specify the dynamic of the common factors as a vector auto-regression:

F_{t} = A F_{t - 1} + u_{t},

(2)

where

A = d i a g (a_{1}, a_{2}, \dots, a_{r})

and

u_{t} \sim N (0, Σ_{r \times r})

with

Σ = d i a g (σ_{1}^{2}, σ_{2}^{2}, \dots, σ_{r}^{2})

. It is known that factor dynamic models can suffer from non-identifiable issues. Following Stock and Watson [16], we construct two sets of restrictions:

| a_{j} | < 1

and

\frac{σ_{i}^{2}}{(1 - a_{i}^{2})} < \frac{σ_{j}^{2}}{(1 - a_{j}^{2})}

for

1 \leq i < j \leq r

. These restrictions together with the prior distribution of

Θ

(specified later) satisfy the identification assumptions in Stock and Watson [16], and can identify factors up to a change of sign.

Finally, we assume that the nowcast of the GDP at quarter k is a linear function of the common factors at each month in the current quarter and the GDP from the previous quarter:

y_{k} = β_{0} + β_{1}^{'} F_{3 k} + β_{2}^{'} F_{3 k - 1} + β_{3}^{'} F_{3 k - 2} + β_{4} y_{k - 1} + ν_{k},

(3)

where

β_{0}

and

β_{4}

are scalars,

β_{i} = {(β_{i 1}, \dots, β_{i r})}^{'}

,

i = 1, 2, 3

, are

r \times 1

vectors and

ν_{k} \sim N (0, η^{2})

for

k = 1, \dots, K

. Here, the number T (the end of monthly series) and the number K (the end of quarterly GDP series) satisfy

3 K + 1 \leq T \leq 3 K + 3

. The DFMs specified this way can successfully bridge quarterly released GDP with monthly financial or economic series and achieve dimension reduction.

2.2. Horseshoe Shrinkage

In this section, the horseshoe shrinkage idea is introduced, and we discuss how it can be implemented in our model framework to estimate the number of contributing factors.

In Carvalho et al. [14], the horseshoe prior was first introduced as a shrinkage prior. Follett and Yu [17] shown that the horseshoe prior competes favorably with shrinkage schemes commonly used in Bayesian multivariate regression models. To illustrate the idea, let us first consider a simple mean model

z_{j} \overset{i n d}{\sim} N (μ_{j}, σ^{2})

for

j = 1, \dots, R

. We assume

μ = (μ_{1}, \dots, μ_{R})

is sparse and some

μ_{j}

might be equal to 0. We can assign the horseshoe prior to

μ_{j}

for

j = 1, \dots, R

by letting

μ_{j} | λ_{j}, τ \overset{i n d}{\sim} N (0, λ_{j}^{2} τ^{2})

with

λ_{j} \overset{i i d}{\sim} H a l f C a u c h y (0, 1)

. Here,

τ

is referred to as the global shrinkage prior and

λ_{j}

is referred to as the local shrinkage prior. Figure 1 plots the densities for the horseshoe (setting

τ = 1

for simplicity), Laplacian, and Student-t priors, respectively. As shown in Figure 1, compared to Laplacian or Student-t priors, the horseshoe has flat tails which allow strong signals that remain un-shrunk. It also has an “infinitely” tall spike at the center which can provide severe shrinkage for elements near zero. This feature makes the horseshoe prior a very useful shrinkage prior.

Furthermore, it was showed in Carvalho et al. [14] that, for

τ = σ^{2} = 1

,

E (μ_{j} | z_{1}, \dots, z_{R}, λ_{j}) = (1 - κ_{j}) z_{j} + κ_{j} 0 = (1 - κ_{j}) z_{j},

where

κ_{j} = \frac{1}{1 + λ_{j}^{2}}

. When

κ_{j} = 0

,

E (μ_{j} | z_{1}, \dots, z_{R}, λ_{j}) = z_{j}

, it indicates that the signal from data dominates; when

κ_{j} = 1

,

E (μ_{j} | z_{1}, \dots, z_{R}, λ_{j}) = 0

, it means that

μ_{j}

is shrunk to 0. Thus,

κ_{j}

is referred to as the Shrinkage Profile which measures the shrinkage level:

κ_{j} = \{\begin{matrix} 0 & little shrinkage (Important) \\ 1 & extreme shrinkage (Not important) . \end{matrix}

(4)

The Half Cauchy prior on

λ_{j}

implies a

B e t a (0.5, 0.5)

prior on the Shrinkage Profile

κ_{j}

. Figure 2 shows the implied prior on

κ_{j}

for the horseshoe prior, the student’s t prior, and the Laplacian prior. As shown in the figure, unlike the Laplacian prior and t prior, the density of

κ_{j}

implied by the horseshoe prior is unbounded at both 0 and 1 with a small mass in between (a horseshoe shape). Being unbounded at 0 allows effects to grow large (little shrinkage) while being unbounded at 1 can shrink effects until they are fully removed from the equation.

We apply this horseshoe shrinkage idea to the prediction Equation (3) as follows:

y_{k} = β_{0} + β_{1}^{'} S F_{3 k} + β_{2}^{'} S F_{3 k - 1} + β_{3}^{'} S F_{3 k - 2} + β_{4} y_{k - 1} + ν_{k},

(5)

where

S = d i a g (τ λ_{1}, \dots, τ λ_{R})

, and R is the largest possible number of latent factors. The cap R is predetermined, satisfying

r < R < < n

. Now,

β_{i} = {(β_{i 1}, \dots, β_{i R})}^{'}

,

i = 1, 2, 3

, are

R \times 1

vectors,

F_{t} = {(f_{1 t}, \dots, f_{R t})}^{'}

are

R \times 1

common factors at month t, and dimensions for

A

,

Σ

, and

Θ

are changed accordingly.

In this specification,

τ

is the global shrinkage prior and

λ_{j}

(

j = 1, \dots, R

) are local shrinkage priors. We set

λ_{j} \sim H a l f C a u c h y (0, ν_{λ}^{j})

, where

0 < ν_{λ} < 1

. In this way, we assume that the importance of factors decreases when j goes from 1 to R. If we put priors

β_{i j} \overset{i i d}{\sim} N (0, 1)

,

i = 1, 2, 3

,

j = 1, \dots, R

and define

{\tilde{β}}_{i j} : = τ λ_{j} β_{i j}

, then

{\tilde{β}}_{i j} | τ, λ_{j} \overset{i n d}{\sim} N (0, τ^{2} λ_{j}^{2})

, and Equation (5) changes to be

y_{k} = β_{0} + \sum_{j = 1}^{R} {\tilde{β}}_{1 j} f_{3 k, j} + \sum_{j = 1}^{R} {\tilde{β}}_{2 j} f_{3 k - 1, j} + \sum_{j = 1}^{R} {\tilde{β}}_{3 j} f_{3 k - 2, j} + β_{4} y_{k - 1} + ν_{k} .

(6)

It can be seen that

{\tilde{β}}_{. j} = {({\tilde{β}}_{1 j}, {\tilde{β}}_{2 j}, {\tilde{β}}_{3 j})}^{'}

is the coefficient connecting factor j to GDP in the prediction equation. By such a specification, we successfully impose the horseshoe shrinkage prior on the coefficients

{\tilde{β}}_{. j}

’s. The magnitudes of the estimated profiles,

κ_{j} = \frac{1}{1 + λ_{j}}

(for

j = 1, \dots, R

), give us some information about which

{\tilde{β}}_{. j}

should be shrunk to 0 and which should not. If

κ_{j}

is close to 1, indicating extremely strong shrinkage on

{\tilde{β}}_{. j}

, factor j then has no prediction power on GDP. As a consequence, the number of remaining un-shrunk

{\tilde{β}}_{. j}

determines the number of contributing factors.

2.3. The Unbalanced Structure of the Data

In this section, we provide descriptions of the unbalanced data structure and notations used to represent the unbalanced structure.

In real time, macroeconomic series are released with diverse lags. At a particular release date, some series have observations up through the current month, whereas for others, the most recent observations maybe come from previous months. Dealing with these kinds of unbalanced data is vital for nowcasting.

Let

x_{T} = (x_{1, T}, \dots, x_{n, T})

be the

n \times 1

vector denoting n monthly data series at month T (the end of the sample), and

y_{K}

be quarterly GDP at quarter K. Assume there are Q different release dates at each month. Each release date is denoted as

(q, T)

, representing the qth release date in month T, where

q = 1, \dots, Q

. New series are released on each release date. Since some series for the current month may be released in the future, let

T_{q}

denote the latest month in which the data are balanced. For

t = T_{q} + 1, \dots, T

, the releasing set

v_{q, t}

collects indexes of all

x_{i, t}^{'}

s that have been released at or before the release date

(q, T)

, and this set of available series is denoted as

x_{i \in v_{q, t}}

. Without loss of generality, for each month, we assume the release dates for all series are fixed.

Table 1 gives a simple example of the data set available for nowcasting. In this example, there are

n = 6

monthly series,

x_{t} = (x_{1, t}, \dots, x_{6, t})

, released at three (i.e.,

Q = 3

) releasing dates. For month T, cells with gray color represent series that are available before T. Suppose

(x_{5, T - 2}, x_{6, T - 2})

are released at the first releasing date

(1, T)

,

(x_{3, T - 1}, x_{4, T - 1})

are released at the second releasing date

(2, T)

, and

(x_{1, T}, x_{2, T})

are released at the third releasing date

(3, T)

. Thus, for the third releasing date

(3, T)

,

T_{q} = T - 2

,

v_{3, T - 1} = {1, 2, 3, 4}

,

x_{i \in v_{3, T - 1}} = (x_{1, T - 1}, x_{2, T - 1}, x_{3, T - 1}, x_{4, T - 1})

, and

v_{3, T} = {1, 2}

,

x_{i \in v_{3, T}} = (x_{1, T}, x_{2, T})

. If we want to nowcast GDP in the current quarter

y_{K + 1}

at the first release date in the second month of the quarter, so here

T = 3 K + 2

. The series available to use at the release date

(1, T)

are highlighted in gray and orange color, i.e.,

{x_{1}, \dots, x_{T - 2}, x_{i \in v_{1, T - 1}}}

. The series available to use at the release date

(2, T)

are highlighted in gray, orange, and green, i.e.,

{x_{1}, \dots, x_{T - 2}, x_{i \in v_{2, T - 1}}}

.

The goal is to nowcast

y_{K + 1}

with all available information including

x_{t}

series and

y_{k}

series at each releasing date

(q, T)

in month T. Here,

T = 3 K + 1, 3 K + 2, 3 K + 3

, indicating the first month, second month, and third month nowcast. At every new release date

(q, T)

, model parameters are updated with new information added from the new released series, and nowcast of

y_{K + 1}

is re-produced. How to deal with this unbalanced data in our BAY approach will be discussed in details in Section 3.

3. Estimation Method and Nowcasting

In Section 3.1, we introduce the Bayesian MCMC algorithm to estimate model parameters and latent factors and to determine the number of contributing factors. In Section 3.2, nowcasting formulas are provided.

3.1. Estimating Dynamic Factor Models Using Bayesian MCMC

In this section, we first introduce our method of implementing the unbalanced data into our model framework naturally. Then, we finish our model specification by assigning priors in Bayesian Framework. Finally, the MCMC procedure is discussed in detail.

As discussed in Section 2, macroeconomic series are released with diverse lags in real time. Thus, a difficulty in real-time nowcasting is to deal with unbalanced data. In this section, we develop a computational Bayesian MCMC approach that can tackle this issue naturally.

To deal with the missing data in

x_{i \in v_{q, t}}

at the end of the sample, we introduce the

n_{q} \times n

indicator matrix

1_{v_{q, t}}

by deleting the ith row from the identity matrix

1_{n \times n}

if

i \notin v_{q, t}

. For the example discussed in Section 2, at the third releasing date

(3, T)

in month T,

v_{3, T - 1} = {1, 2, 3, 4}

. Therefore, removing the fifth and sixth row of

1_{6 \times 6}

gives us

1_{v_{3, T - 1}} = (\begin{matrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{matrix}) .

Similarly, for the index set

v_{3, T} = {1, 2}

, deleting the last four rows of

1_{6 \times 6}

leads to

1_{v_{3, T}} = (\begin{matrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \end{matrix}) .

Then, we can simply rewrite

x_{i \in v_{q, t}}

as

x_{i \in v_{q, t}} = 1_{v_{q, t}} x_{t} .

To better derive the posterior distributions, we express the dynamic of

x_{t}

in Equation (1) as:

x_{t} = μ + [I_{n \times n} \otimes F_{t}^{'}] * θ + ε_{t},

where

θ = v e c (Θ) = {(θ_{1}, \dots, θ_{n})}^{'}

,

θ_{i}

is a

1 \times R

vector representing the ith row of

Θ

,

i = 1, 2, \dots, n

, and the symbol ⊗ denotes the Kronecker product. Thus, for the qth releasing date in month T, the conditional density for

x_{t}

is

\{\begin{matrix} x_{t} | F_{t}, Θ, Ω \sim N (μ + [I_{n \times n} \otimes F_{t}^{'}] * θ, Ω) & for t = 1, \dots, T_{q} \\ 1_{v_{q, t}} x_{t} | F_{t}, Θ, Ω \sim N (1_{v_{q, t}} (μ + [I_{n \times n} \otimes F_{t}^{'}] * θ), 1_{v_{q, t}} Ω 1_{v_{q, t}}^{'}) & for t = T_{q} + 1, \dots, T, \end{matrix}

(7)

the conditional density of

F_{t}

is

F_{t} | F_{t - 1}, A, Σ \sim N (A F_{t - 1}, Σ), f o r t = 2, \dots, T,

(8)

and the conditional density of

y_{k}

is

y_{k} | F_{3 k}, F_{3 k - 1}, F_{3 k - 2}, y_{k - 1}, S, η^{2} \sim N (β_{0} + β_{1}^{'} S F_{3 k} + β_{2}^{'} S F_{3 k - 1} + β_{3}^{'} S F_{3 k - 2} + β_{4} y_{k - 1}, η^{2}),

(9)

for

k = 2, \dots, K

. In this way, the unbalanced structure of the data is built into our model framework through this indicator matrix

1_{v_{q, t}}

.

Let

Φ = (μ, Θ, Ω, A, Σ, β, τ, S, η^{2})

denote all parameters to be estimated. Suppose we are at releasing date q in month T of quarter

K + 1

, our task is to use observations

Y = {y_{1}, \dots, y_{K}}

and

X_{q, T} = {x_{1}, \dots, x_{T_{q}}, x_{i \in v_{q, T_{q} + 1}}, \dots, x_{i \in v_{q, T}}}

to estimate parameters

Φ

and latent factors

{F_{1}, \dots, F_{T}}

, then conduct the nowcast for

y_{K + 1}

.

The joint posterior distribution

p (Φ, F | Y, X_{q, T})

can be written as a product of individual conditionals,

\begin{matrix} p (Φ, F | Y, X_{q, T}) \propto p (Y, X_{q, T}, Φ, F) \\ \propto \prod_{t = 1}^{T_{q}} p (x_{t} | F_{t}, θ, Ω) \prod_{t = T_{q} + 1}^{T} p (1_{v_{q, t}} x_{t} | F_{t}, θ, Ω) \\ \times \prod_{t = 2}^{T} p (F_{t} | F_{t - 1}, A, Σ) \\ \times \prod_{k = 2}^{K} p (y_{k} | F_{3 k}, F_{3 k - 1}, F_{3 k - 2}, y_{k - 1}, S, η^{2}) \times π (Φ), \end{matrix}

(10)

where

p (x_{t} | F_{t}, θ, Ω)

,

p (1_{v_{q, t}} x_{t} | F_{t}, θ, Ω)

,

p (F_{t} | F_{t - 1}, A, Σ)

, and

p (y_{k} | F_{3 k}, F_{3 k - 1}, F_{3 k - 2}, y_{k - 1}, η^{2})

can be derived according to Equations (7)–(9), respectively.

π (Φ)

is the prior distribution for the parameter set

Φ

.

We finish the model specification by assigning prior distributions in Bayesian framework. We set prior for

μ

as

μ_{n \times 1} \sim N (0, I)

. The prior for

Θ

is defined as

Θ_{n \times R} \sim M a t r i x N o r m a l (0_{n \times R}, I_{n \times n}, I_{R \times R})

. This prior on

Θ

, along with two restrictions we set in Section 2 (

| a_{j} | < 1

and

\frac{σ_{i}^{2}}{(1 - a_{i}^{2})} < \frac{σ_{j}^{2}}{(1 - a_{j}^{2})}

for

1 \leq i < j \leq R

), satisfy the identification assumptions in Stock and Watson [16]. The prior for

Ω

is defined as

Ω \sim I n v e r s e W i s h a r t (\frac{1}{n} I_{n \times n}, ν_{θ})

, where

ν_{θ}

is a scalar and pre-specified to be

n + 2

so that the expectation of

Ω

is

n^{- 1} I_{n \times n}

. The prior for

A

is the standard normal truncated at

[- 1, 1]

, that is: for

j = 1, \dots, R

π (a_{j}) = \{\begin{matrix} \frac{ϕ (a_{j})}{Φ (1) - Φ (- 1)} & for - 1 < a_{j} \leq 1 \\ 0 & otherwise, \end{matrix}

(11)

where

ϕ (\cdot)

and

Φ (\cdot)

are PDF and CDF for standard normal distribution. Then,

π (A) = \prod_{j = 1}^{R} π (a_{j})

. The priors for the diagonal elements of

Σ

are defined as

σ_{j}^{2} \overset{iid}{\sim} I n v e r s e G a m m a (α_{s}, β_{s})

for

j = 1, \dots, R

, where

α_{s}

and

β_{s}

are scalars and pre-specified to be 2 and

R + 2

, accordingly. Then,

π (Σ) = \prod_{j = 1}^{R} π (σ_{j})

. The prior for

β = {(β_{0}, β_{1}^{'}, β_{2}^{'}, β_{3}^{'}, β_{4})}^{'}

is

β \sim N (0_{(3 R + 2) \times 1},

I_{(3 R + 2) \times (3 R + 2)})

. The prior for

λ_{j}

is set to be

λ_{j} \sim H a l f C a u c h y (0, ν_{λ}^{j})

for

j = 1, \dots, R

(

τ

is set to be 1). As discussed in Section 2.2, these prior specifications of

β

and

λ_{j}

imply a horseshoe shrinkage on the coefficients

{\tilde{β}}_{. j}

’s. The prior for

η^{2}

is

η^{2} \sim I n v e r s e G a m m a (α_{h}, β_{h})

, where

α_{h}

and

β_{h}

are scalars and pre-specified to be 4 and

0.01

, accordingly, to provide a reasonable mean and variance of

η^{2}

.

All priors are assumed to be independent. Based on the derived complete conditional posterior distributions for each parameter and latent variable, we obtain posterior samples using Metropolis–Hastings within Gibbs sampling since some conditional posterior distributions do not have closed forms. In estimation, we use the means of posterior samples as estimates for parameters and latent factors. Complete conditional posterior distributions for all model parameters and latent factors are provided in Appendix A.

3.2. Nowcasting Formulas

In this section, nowcasting formulas are provided. Suppose we are at

(q, T)

, the qth (

q = 1, \dots, Q

) releasing date in month T. As discussed in Section 3.1, the available information are

{y_{1}, \dots, y_{K}}

and

{x_{1}, \dots, x_{T_{q}}, x_{i \in v_{q, T_{q} + 1}}, \dots, x_{i \in v_{q, T}}}

, here T can be the first (

T = 3 K + 1

), second (

T = 3 K + 2

), or third (

T = 3 K + 3

) month of the quarter

K + 1

. Our goal is to nowcast GDP

y_{K + 1}

. Let

A^{(g)}

,

β_{0}^{(g)}

,

β_{i}^{(g)}

(

i = 1, 2, 3

),

β_{4}^{(g)}

,

A^{(g)}

,

S^{(g)} = d i a g (τ λ_{1}^{(g)}, \dots, τ λ_{R}^{(g)})

, and

F_{t}^{(g)}

(

t = 1, \dots, T

) be the gth posterior draws for parameters and latent factors after the burn-in period, where

g = 1, \dots, G

. We nowcast

y_{K + 1}

using the following formulas.

When $T = 3 K + 1$ , the nowcast of $y_{K + 1}$ using BAY is given by:

$\begin{matrix} {\hat{y}}_{K + 1} = \frac{1}{G} \sum_{g = 1}^{G} [β_{0}^{(g)} + {(β_{1}^{(g)})}^{'} S^{(g)} {(A^{(g)})}^{2} F_{T}^{(g)} + {(β_{2}^{(g)})}^{'} S^{(g)} A^{(g)} F_{T}^{(g)} + \\ {(β_{3}^{(g)})}^{'} S^{(g)} F_{T}^{(g)} + (β_{4}^{(g)}) y_{K}], \end{matrix}$

(12)
When $T = 3 K + 2$ , the nowcast of $y_{K + 1}$ using BAY is given by:

$\begin{matrix} {\hat{y}}_{K + 1} = \frac{1}{G} \sum_{g = 1}^{G} [β_{0}^{(g)} + {(β_{1}^{(g)})}^{'} S^{(g)} A^{(g)} F_{T}^{(g)} + {(β_{2}^{(g)})}^{'} S^{(g)} F_{T}^{(g)} + \\ {(β_{3}^{(g)})}^{'} S^{(g)} F_{T - 1}^{(g)} + (β_{4}^{(g)}) y_{K}], \end{matrix}$

(13)
When $T = 3 K + 3$ , the nowcast of $y_{K + 1}$ using BAY is given by:

$\begin{matrix} {\hat{y}}_{K + 1} = \frac{1}{G} \sum_{g = 1}^{G} [β_{0}^{(g)} + {(β_{1}^{(g)})}^{'} S^{(g)} F_{T}^{(g)} + {(β_{2}^{(g)})}^{'} S^{(g)} F_{T - 1}^{(g)} + \\ {(β_{3}^{(g)})}^{'} S^{(g)} F_{T - 2}^{(g)} + (β_{4}^{(g)}) y_{K}] . \end{matrix}$

(14)

Note that for some releasing dates, if

v_{q, T} = \emptyset

, meaning that no monthly series are available at releasing date

(q, T)

, then posterior samples

F_{T}^{(g)}

cannot be generated. As a solution, we use

{\tilde{F}}_{T}^{(g)} = A^{(g)} F_{T - 1}^{(g)}

to replace

F_{T}^{(g)}

in nowcasting equations. All of the parameter and factor estimations are updated in every single release within a month. Then,

{\hat{y}}_{K + 1}

is re-produced for each release date.

4. Simulation Study

In this section, we will investigate three aspects of the Bayesian approach through numerical simulations. In Section 4.1, we evaluate whether it can successfully determine the true number of latent factors that can contribute to GDP nowcasting, i.e., the number of contributing factors. In Section 4.2, we study the accuracy of estimated latent factors

F_{t}

. In Section 4.3, we examine out-of-sample nowcasting performances of the BAY approach.

In the simulation study, we simulate data following the model in Equations (1), (2) and (4) with

T = 180

(months),

K = 60

(quarters),

r = 2

(true number of contributing latent factors),

n = 60

(monthly series), and

Q = 3

release dates in each month. The releasing pattern follows Table 2 with 20 new monthly series released in each release date, that is: at

(1, T)

, release

(x_{41, T - 2}, \dots, x_{60, T - 2})

, at

(2, T)

, release

(x_{21, T - 1}, \dots, x_{40, T - 1})

and at

(3, T)

, release

(x_{1, T}, \dots, x_{20, T})

.

Our method requires a predetermined cap R as the largest possible number of factors. Theoretically, R can be as large as the number of monthly series n. However, in practice, we use a smaller number to avoid extreme computational burden. In this simulation study, we choose

R = 6

because a preliminary PCA analysis shows that the first six principle components can explain at least

95 %

of total variation for all six simulations. For all simulations, some of the parameter settings used in generating data are common: we set

A = d i a g (0.9, - 0.8, 0.75, 0.7, - 0.65, 0.6)

,

Σ = d i a g (5.5, 3, 1, 0.5, 0.25, 0.1)

;

β_{0} = 0.5

,

β_{i}^{'} = (1, 1, 1, 1, 1, 1)

(

i = 1, 2, 3

),

β_{4} = 0.15

, and

η^{2} = 1

.

Ω

is simulated from

I n v e r s e W i s h a r t (60^{- 1} I_{60 \times 60}, 60)

, each element of

μ

equals to 10, and

Θ

is simulated from

M a t r i x N o r m a l (0, I_{60 \times 60}, I_{6 \times 6})

.

Specifications of

λ_{j}

(

j = 1, \dots, R

) for each of the simulation are shown in Table 3. For all six simulations, we assume only the first two factors majorly contribute to our nowcasting equations. Simulation 1 and Simulation 2 represent the group with high signals for the first two factors. Simulation 3 and Simulation 4 represent the group of moderate signals, while Simulation 5 and Simulation 6 are in the group of weak signals. Within each group, one of the simulations is configured with true sparsity, that is

λ_{j} = 0

for

j = 3, \dots, 6

, while for another simulation, we assign non-sparsity with small noise (

λ_{j} \sim N (0, 0 . 1^{2})

for

j = 3, \dots, 6

) as a comparison. In this way, we can investigate how our method performs when changing magnitudes of the true signals from strong to weak, and when the non-true signals are contaminated with small noise or not.

For each simulation, we conduct one-step ahead nowcasts for the last 20 quarters using a moving window with a length of 10 years (40 quarters). For each quarter, nowcasting is made in each release date within each month. Thus, there are

20 (q u a r t e r s) \times 3 (m o n t h s) \times 3 (r e l e a s e d a t e s) = 180

nowcasts in each simulation. In our MCMC procedure, we discard the first 10,000 iterations as burn-in and run 1000 more for posterior summaries.

4.1. Estimating the Number of Contributing Factors

In this section, we validate our Bayesian Approach’s ability in determining true number of contributing factors through six sets of simulation studies. The ability of the algorithm to determine the true number of contributing factors is investigated as follows. First, we check whether our approach can perform as expected when true signals of the first two factors are high, moderate, and low, and the non-true signals are exactly equal to 0. Secondly, we check if its performance will be undermined if we add some noise to the non-true signals.

For each simulation, every estimate of the shrinkage profiles

{\hat{κ}}_{j}

(

j = 1, \dots, R

) is calculated using the average of 1000 posterior draws after the burn-in period, that is

{\hat{κ}}_{j} = \sum_{g = 1}^{G} κ_{j}^{(g)} = \sum_{g = 1}^{G} \frac{1}{1 + λ_{j}^{(g)}}

for

G = 1000

. Figure 3 shows box-plots of estimated shrinkage profiles

{\hat{κ}}_{j}

(

j = 1, \dots, R

) based on 180 nowcast estimates in each simulation. In Simulation 1 and Simulation 2,

{\hat{κ}}_{1}

and

{\hat{κ}}_{2}

are near 0 while

{\hat{κ}}_{3}

to

{\hat{κ}}_{6}

are generally close to 1, indicating that the algorithm can successfully detect high signals for the first two contributing factors and shrink the other four to zero. In Simulation 3 and Simulation 4, when we decrease signals of the first two factors from high to moderate, our algorithm can still detect signals of the first two and shrinkage signals of the last four to 0. However, if we only apply low signals for the first two factors, as shown in Simulation 5 and Simulation 6, the algorithm can only detect one contributing factor while shrinking all others to 0. When comparing results between right column (Simulation 2, 4, and 6 for true sparsity) and left column (Simulation 1, 3, and 5 for small noise), our algorithm can extremely shrink all four non-true factors (i.e., having

{\hat{κ}}_{j} \approx 1

) in all three scenarios with different strengths of true signals, disregarding whether the non-true factors are contaminated with noise or not. The findings in Figure 3 validate our algorithm’s ability to detect the true number of contributing factors with moderate to high signals.

Figure 4 shows a scatterplot of posterior means

{\hat{κ}}_{i j}

’s (

{\hat{κ}}_{i j} = {\hat{κ}}_{j})

versus

{\hat{\tilde{β}}}_{i j}

’s from 180 nowcast estimates. There are two general patterns observed across all six simulations. The first is that the estimated profile

{\hat{κ}}_{i j}

’s get closer to zero (little shrinkage) when the values of

{\hat{\tilde{β}}}_{i j}

’s increase horizontally to very large numbers, while

{\hat{κ}}_{i j}

’s approach to one (strong shrinkage) when

{\hat{\tilde{β}}}_{i j}

’s become very small. The second pattern is that the dots are separated into two clear segmentations in each picture. The vertical distance between two groups is the largest for the strong signal cases (Simulation 1 and 2), then becomes smaller for the moderate signals (Simulation 3 and 4). However, for the last row (the weak signal cases), the distance almost diminishes. This is consistent with the findings in Figure 3.

4.2. Estimation of Latent Factors

We then investigate whether the BAY method can accurately estimate the latent factors

F_{t}

. In our approach, latent variables

F_{t}

are also estimated with posterior means, i.e.,

{\hat{F}}_{t} = \frac{1}{G} \sum_{g = 1}^{G} F_{t}^{(g)}

,

t = 1, \dots, T

, and

G = 1000

is the number of MCMC iterations after the burn-in period. Figure 5 plots the estimated first two latent factors from BAY approach, together with the true latent factors, in the first 100 months (in-sample period) of the data for six simulations. The absolute values are compared since the factors are identified up to a change of sign (Section 2.1). Figure 5 shows that, generally, the estimation from the BAY approach is close to the true factors, especially for the first four simulations in which the true number of contributing latent factors is successfully detected.

4.3. Out-of-Sample Nowcasting Performances

In this section, we prove that our Bayesian Apporach can provide exceptional out-of-sample nowcasting performances compared to the Random Walk.

Out-of-sample nowcasting performances are assessed based on 20 one-step-ahead nowcasting. For each simulation, whenever there are new series released in a month, the model parameters and latent factors will be updated. Therefore, there are 180 nowcasts in total.

Figure 6 presents the nowcasting performances for all six simulations. In each panel (representing each simulation), the first, second, and third row represent nowcasting trends over 20 quarters in the first, second, and third month, respectively. In each subplot of each panel, the black curve represents the true GDP, while colored curves with different symbols represent nowcasts from different releases. Figure 6 shows that BAY approach can capture trends and changes in simulated GDP really well. For all six simulations, within the same month, there is no obvious difference in nowcasting performance between release 1 and release 2. However, nowcasting curves for release 3 are slightly closer to true curves than that of the other two releases. Moreover, we can see obvious improvements from nowcasts in the first month to nowcasts in the third month.

In order to better understand nowcasting results, we use mean absolute nowcasting error (MANE) to measure nowcasting accuracy. Let

{\hat{y}}_{K + 1}^{q, T}

be the nowcast at qth release date of month T, where

q = 1, 2, 3

and

T = 3 K + 1, 3 K + 2, 3 K + 3

. Then,

MANE (q, T) = \frac{1}{20} \sum_{k + 1 = 41}^{60} | {\hat{y}}_{k + 1}^{q, T} - y_{k + 1} |

. We compare the nowcasting performances of BAY with that of the random walk (RW) approach, which uses the previous quarter GDP to predict the current quarter GDP, by calculating MANE reduction relative to RW (i.e.,

\frac{MANE (q, T) - {MANE}^{rw}}{{MANE}^{rw}}

). Table 4 provides MANE reductions (in percentage) for BAY approach compared with that of the RW. For instance,

- 30 %

indicates that BAY can reduce

30 %

of MANE of the RW. Table 4 shows that, moving from the first month to the third month, there are significant reductions in terms of MANE ratios. Within each month, there is no obvious difference in MANE ratios between release 1 and release 2, while release 3 can provide larger MANE reduction than the other two. The possible reason is that, in the releasing pattern of this simulation study, series of the current month are only released in the third release of each month.

In summary, this simulation study suggests that our BAY approach can successfully detect true number of contributing factors with moderate to high signals. It also has the ability to estimate latent dynamic factors accurately and produce reliable nowcasting results.

5. Empirical Study

In this section, we examine empirical performance of the BAY method using U.S. quarterly GDP growth rates. The Federal Reserve Bank of New York built a platform that has been nowcasting U.S. GDP growth rates since April 2016. The methodology behind the platform is based on the GRS method, and details can be found in Bok et al. [18]. We borrow their data from Github (https://github.com/FRBNY-TimeSeriesAnalysis/Nowcasting) in 20 April 2021. This data set contains 26 monthly series released by both government agencies and private institutions. Based on economic insights, those monthly series are assigned to nine different categories, such as labor, international trade, manufacturing, surveys, and others. All series are updated in real time; thus, the release dates for each one vary from month to month. Based on the approximate release dates for each individual series, we roughly group these series into three release dates: before 10th, from 10th to 20th, and after the 20th of a month. Table 5 provides the release pattern of the real data. The same transformations as in Bok et al. [18] are applied to monthly series to achieve stationarity. Detailed information of transformations and release patterns are available in Table 6 and Table 7.

We choose the data span from 1993Q1 to 2016Q4, which gives us data series with 288 months (96 quarters). In-sample data is chosen to be in the period from 1993Q1 to 2002Q4, while the nowcasting horizon covers 2003Q1 to 2016Q4. The GDP growth rate used in this empirical study is the annualized quarter over quarter percentage change, which is defined as:

Y_{k} = {{(1 + \frac{{GDP}_{k} - {GDP}_{k - 1}}{{GDP}_{k - 1}})}^{4} - 1} \times 100,

where

{GDP}_{k}

is the real GDP of quarter k. Figure 7 plots the GDP growth rate with nowcasting horizon on the right side of the dashed blue line. In Figure 7, we see a severe drop at around 2009Q1 which is due to the financial crisis around 2007–2008.

We apply our BAY approach to this real U.S. GDP data. In this empirical study, we assign the same prior settings as in the simulation study; the largest possible number of latent factors R is also assumed to be six as the first six principle components from PCA explain

99.9 %

of the variation observed in monthly series, and we still use

G = 1000

iterations after 10,000 burn-in period in the MCMC sampling. Estimations of shrinkage profiles

{\hat{κ}}_{j}

are used to determine the number of contributing factors. Out of all

56 (q u a r t e r s) \times 3 (m o n t h s) \times 3 (r e l e a s e s) = 504

estimates of shrinkage profiles, we find there are two main scenarios occurring. Figure 8 plots two examples for each of them, respectively. The left panel is the boxplot for posterior draws of shrinkage profile

{\hat{κ}}_{j}

’s when nowcasting 2000Q4 in the first release of the first month. This plot shows that the first factor is clearly detected to be different from the other five factors, although its

\hat{κ}

value is not small. The right panel is the boxplot for

{\hat{κ}}_{j}

’s when nowcasting 2000Q2 in the first release of the first month. This plot, however, shows that no factor contributes to the GDP nowcasting.

Table 8 shows proportions of all nowcasts in which one factor is detected. Generally speaking, more than

50 %

of cases have one factor detected to have contribution for GDP nowcasting. This is consistent with Bok et al. [18], where they assume one single common factor in their DFMs setting.

Figure 9 plots out-of-sample GDP nowcasts over last 56 quarters for each release in each month. Three rows represents three nowcasting months. In this plot, we compare our BAY approach with the autoregressive model of order 1 (AR(1))

y_{k} = β_{0} + β_{4} y_{k - 1} + ν_{k}

. The AR(1) is equivalent to the case where no factor is detected. Figure 9 shows that our BAY approach can successfully capture the economic downfall due to the financial crisis at 2009Q1 with one lag delay. However, the AR(1) failed to capture it.

For the empirical study, we calculate MANE to measure nowcasting errors. Here, apart from AR(1), we also compare our BAY approach with another Bayesian model with no shrinkage priors, and we refer this model as NS. The NS model is proposed as the following: keep all other settings the same and remove

S

from Equation (5). More specifically, in the NS model, we impose Normal priors on the

{\tilde{β}}_{. j}

’s instead of horseshoe priors. Table 9 provides MANE reductions of BAY approach, relative to RW, AR(1), and NS. The first sub-table reports the percentage of reduction in MANEs relative to RW accross three nowcasting months, while the middle sub-table reports the percentage of reduction in MANEs relative to AR(1), and the last sub-table reports the percentage of reduction in MANEs relative to NS. Table 9 shows that our BAY approach can produce smaller nowcasting errors than the RW approach. On average, the percentages of reduction relative to RW do not have an obvious difference from first month to third month. This indicates that, for real data, having more monthly series does not necessarily lead to better nowcasting performances. One potential reason is that the quality of the data might not be perfect. Adding more series means adding more noise and thus may not guarantee more accurate nowcasts of GDP. The MANE reduction relative to AR(1) are lower than those relative to RW. However, our BAY approach can still have approximately

11 %

reduction in nowcasting errors when being compared with the AR(1). This indicates that even if no factor is detected in nearly

50 %

of the cases, the ones with one factor detected indeed contribute and enhance our nowcasting performance. The MANE reduction relative to NS for the first two nowcasting months are comparable and are lower than those relative to AR(1), the MANE reduction for the third month is higher than those relative to AR(1). This result indicates that using the horseshoe shrinkage idea to shrink unimportant factors to 0 can further increase our model’s nowcasting performance.

In summary, this empirical analysis demonstrates the empirical relevance of the BAY approach in nowcasting U.S. GDP. It suggests that at most one factor is sufficient for our BAY approach to provide a good performance.

6. Conclusions

Real-time nowcasting has become important in making policy decisions and long-term forecasting. In this paper, we adopt the DFM model framework and introduce a Bayesian MCMC approach for real-time nowcasting. Unlike other nowcasting methods based on DFMs, our Bayesian approach allows an unknown number of contributing factors and utilizes the horseshoe shrinkage to determine the number of contributing factors. Through the simulation study, we have shown that our Bayesian approach can identify the number of contributing factors correctly with high to moderate signals and estimate latent factors accurately. Both simulation study and empirical study validate our Bayesian approach’s ability to provide reliable real-time nowcasting results.

In this paper, we are not able to add the horseshoe shrinkage to Equation (1) due to the non-identifiable issue in DFMs. If we add

S

as

Θ S F_{t}

in Equation (1), there will be two layers of non-identifiable issues, which is difficult to solve. Furthermore, in this paper, the horseshoe shrinkage is only used in detecting the number of contributing factors. In our DFMs framework, we assume the GDP of quarter k depends on factors with up to two lags, i.e., factors of

3 k

,

3 k - 1

, and

3 k - 2

(see Equation (3)). We might argue that this lag should not be fixed and can also be determined using the horseshoe shrinkage. These are two possible research directions that we want to explore in future investigation.

Author Contributions

Formal analysis, J.L.; Methodology, J.L. and C.L.Y.; Supervision, C.L.Y.; Writing —original draft, J.L. and C.L.Y.All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AR	Autoregressive Model
DFMs	Dynamic Factor Models
GRS	Model proposed in Giannone et al. (2008)
KF	Kalman Filter
MANE	Mean Absolute Nowcasting Error
MCMC	Markov Chain Monte Carlo
PCA	Principle Component Analysis
RW	Random Walk

Appendix A. Posterior Distributions

In this Appendix, complete conditional distributions for each parameter and latent factor are provided. An MCMC algorithm is applied to draw posterior samples. For most parameters, conditional posterior distributions have closed forms, which allows for Gibbs sampling method. However,

Ω

and

λ_{j}

(

j = 1, \dots, R

) do not have closed-form posterior distributions, for which we use an independent Metropolis–Hastings within Gibbs sampler to generate posterior samples.

Appendix A.1. Posterior Samples for Mean of Monthly Series μ

The conditional posterior for the mean of monthly series,

μ

, is a multivariate normal distribution:

μ | X, F, Θ, Ω \sim N (U_{μ}, Σ_{μ}),

(A1)

where

\begin{matrix} Σ_{μ}^{- 1} = T_{q} Ω^{- 1} + \sum_{t = T_{q} + 1}^{T} (1_{v_{q, t}}^{'} {(1_{v_{q, t}} Ω 1_{v_{q, t}}^{'})}^{- 1} 1_{v_{q, t}}) + I_{n \times n}, \\ U_{μ} = Σ_{μ} [\sum_{t = 1}^{T_{q}} Ω^{- 1} (x_{t} - Θ F_{t}) + \sum_{t = T_{q} + 1}^{T} (1_{v_{q, t}}^{'} {(1_{v_{q, t}} Ω 1_{v_{q, t}}^{'})}^{- 1} 1_{v_{q, t}}) (x_{t} - Θ F_{t})] . \end{matrix}

Appendix A.2. Posterior Samples for Factor Loading Matrix Θ

Let

θ = v e c (Θ) = {(θ_{1}, \dots, θ_{n})}^{'}

. The conditional posterior for the

θ

is a multivariate normal distribution:

θ | X, F, μ, Ω \sim N (U_{θ}, Σ_{θ}),

(A2)

let

C_{t} = I_{n \times n} \otimes F_{t}

, then

\begin{matrix} Σ_{θ}^{- 1} = \sum_{t = 1}^{T_{q}} C_{t}^{'} Ω^{- 1} C_{t} + \sum_{t = T_{q} + 1}^{T} C_{t}^{'} 1_{v_{q, t}}^{'} {(1_{v_{q, t}} Ω 1_{v_{q, t}}^{'})}^{- 1} 1_{v_{q, t}} C_{t} + I_{n R \times n R}, \\ U_{θ} = Σ_{θ} [\sum_{t = 1}^{T_{q}} C_{t}^{'} Ω^{- 1} (x_{t} - μ) + \sum_{t = T_{q} + 1}^{T} C_{t}^{'} 1_{v_{q, t}}^{'} {(1_{v_{q, t}} Ω 1_{v_{q, t}}^{'})}^{- 1} 1_{v_{q, t}} (x_{t} - μ)] . \end{matrix}

Appendix A.3. Posterior Samples for Covariance in Monthly Series Ω

The conditional posterior for

Ω

is

\begin{matrix} p (Ω | \cdot) \propto \prod_{t = 1}^{T_{q}} p (x_{t} | μ, F_{t}, θ, Ω) \times \prod_{t = T_{q} + 1}^{T} p (1_{v_{q, t}} x_{t} | μ, F_{t}, θ, Ω) \times P (Ω) \\ \propto {| Ω |}^{- (T_{q} - 1) / 2} e^{- \frac{1}{2} t r [\sum_{t = 1}^{T_{q}} (x_{t} - μ - C_{t} θ) {(x_{t} - μ - C_{t} θ)}^{'} Ω^{- 1}]} \\ \times {| Ω |}^{- (ν_{θ} + n + 1) / 2} e^{- \frac{1}{2} t r [\frac{1}{n} Ω^{- 1} I_{n \times n}]} \\ \times \prod_{t = T_{q} + 1}^{T} {| 1_{v_{q, t}} Ω 1_{v_{q, t}}^{'} |}^{- \frac{1}{2}} e^{- \frac{1}{2} t r [1_{v_{q, t}} (x_{t} - μ - C_{t} θ) 1_{v_{q, t}}^{'} {(x_{t} - μ - C_{t} θ)}^{'} {(1_{v_{q, t}} Ω 1_{v_{q, t}}^{'})}^{- 1}]}, \end{matrix}

(A3)

which is not in a closed form. We need to use Metropolis–Hastings within Gibbs sampling method to draw

Ω

. The first two parts combined together yield an inverse Wishart distribution

W^{- 1} (Φ_{Ω}, ν_{Ω})

, with

Φ_{Ω} = \frac{1}{n} I_{n \times n} + \sum_{t = 1}^{T_{q}} (x_{t} - μ - C_{t} θ) {(x_{t} - μ - C_{t} θ)}^{'}

and

ν_{Ω} = T_{q} + ν_{θ}

. Therefore, we purpose

Ω^{*}

from

W^{- 1} (Φ_{Ω}, ν_{Ω})

and use the last piece in the posterior distribution, denoted as

q (Ω)

, to construct the acceptance–rejection rate. That is, the proposal

Ω^{*}

is accepted with probability

P (Ω^{*} | Ω^{0}) = m i n \{1, \frac{q (Ω^{*})}{q (Ω^{0})}\},

where

Ω^{0}

denotes the current state of

Ω

.

Appendix A.4. Posterior Samples for AR(1) Coefficients a_j

For

j = 1, \dots, R

, the conditional posterior of each

A R (1)

coefficient

a_{j}

is:

a_{j} | F, Σ \sim N (μ_{a_{j}}, σ_{a_{j}^{2}}) I (| a_{j} | < 1),

(A4)

where

μ_{a_{j}} = \frac{\sum_{t = 2}^{T} f_{j, t} f_{j, t - 1}}{σ_{j}^{2} + \sum_{t = 2}^{T} f_{j, t - 1}^{2}}

and

σ_{a_{j}^{2}} = \frac{σ_{j}^{2}}{σ_{j}^{2} + \sum_{t = 2}^{T} f_{j, t - 1}^{2}}

.

Appendix A.5. Posterior Samples for Covariance Matrix in the Factor Equation Σ

For

j = 1, \dots, R

, the conditional posterior of each diagonal element of

Σ

,

σ_{j}^{2}

, is an inverse gamma distribution:

σ_{j}^{2} | F, A \sim i G (α_{j}, β_{j}),

(A5)

where

α_{j} = α_{s} + (T - 1) / 2

and

β_{j} = β_{s} + \sum_{t = 2}^{T} {(f_{j, t} - a_{j} f_{j, t - 1})}^{2} / 2

.

Appendix A.6. Posterior Samples for Coefficients in GDP Equation β

The conditional posterior of coefficients connecting factors with GDP,

β

, is a multivariate normal distribution:

β | F, y_{2}, \dots, y_{K}, η^{2} \sim N (U_{β}, Σ_{β}),

(A6)

where

{\tilde{F}}_{3 k} = {[1, {(S F_{3 k})}^{'}, {(S F_{3 k - 1})}^{'}, {(S F_{3 k - 2})}^{'}, y_{k - 1}]}^{'}

, and

\begin{matrix} Σ_{β}^{- 1} = \sum_{k = 2}^{K} {\tilde{F}}_{3 k} {\tilde{F}}_{3 k}^{'} / η^{2} + I_{(3 R + 2) \times (3 R + 2)}, \\ U_{β} = Σ_{β} \sum_{k = 2}^{K} {\tilde{F}}_{3 k} y_{k} / η^{2}, \end{matrix}

where

S = d i a g (λ_{1}, \dots, λ_{R})

.

Appendix A.7. Posterior Samples for Variance in GDP Equation η²

The conditional posterior of the variance in GDP equation,

η^{2}

, is an inverse gamma distribution:

η^{2} | F, y_{2}, \dots, y_{K}, β, S \sim i G (α_{η}, β_{η}),

(A7)

where

α_{η} = α_{h} + (K + 1) / 2

, and

β_{η} = β_{h} + \sum_{k = 2}^{K} {(y_{k} - β^{'} {\tilde{F}}_{3 k})}^{2} / 2

.

Here,

{\tilde{F}}_{3 k} = {[1, {(S F_{3 k})}^{'}, {(S F_{3 k - 1})}^{'}, {(S F_{3 k - 2})}^{'}, y_{k - 1}]}^{'}

, and

S = d i a g (λ_{1}, \dots, λ_{R})

.

Appendix A.8. Posterior Samples for Each Element in S Matrix, λ_j

For simplicity, we assume

τ = 1

. The posterior for

λ_{j}

is

\begin{matrix} p (λ_{j} | \cdot) \propto \prod_{l = 1}^{3} p (β_{l} | \cdot) \times P (λ_{j}) \\ \propto \frac{1}{λ_{j}^{3}} e x p {- \frac{1}{2} \sum_{l = 1}^{3} β_{l j}^{2} λ_{j}^{- 2}} \times \frac{1}{1 + \frac{λ_{j}^{2}}{ν_{λ}^{2 j}}}, \end{matrix}

(A8)

where

β_{l j}

is the jth element in

β_{l}

for

l = 1, 2, 3

. This posterior does not have a closed form, and the Metropolis–Hastings within Gibbs sampling method is required to draw

λ_{j}

. If we define

{\tilde{λ}}_{j} : = λ_{j}^{- 2}

, the first part of the posterior yields a gamma distribution for

{\tilde{λ}}_{j}

, i.e.,

{\tilde{λ}}_{j} \sim G a m m a (2.5, \frac{2}{\sum_{l = 1}^{3} β_{l j}^{2}})

. Therefore, we purpose

{\tilde{λ}}_{j}^{*}

from

G a m m a (2.5, \frac{2}{\sum_{l = 1}^{3} β_{l j}^{2}})

, obtain

λ_{j}^{*} = {\tilde{λ}}_{j}^{* - \frac{1}{2}}

, and use the last piece in the conditional posterior distribution

q (λ_{j}) = \frac{1}{1 + \frac{λ_{j}^{2}}{ν_{λ}^{2 j}}}

to construct the acceptance–rejection rate. That is, the proposal

λ_{j}^{*}

is accepted with probability

P (λ_{j}^{*} | λ_{j}^{0}) = m i n \{1, \frac{q (λ_{j}^{*})}{q (λ_{j}^{0})}\},

where

λ_{j}^{0}

denotes the current state of

λ_{j}

.

Appendix A.9. Sampling the Latent Factors F_t

The posterior distribution for

F_{t}

has different forms depending on t. Suppose

K^{*}

is the largest integer such that

3 K^{*} \leq T_{q}

,

T_{q}

, as defined in Section 2. For

3 < t < 3 K^{*}

, we have the most general form defined as follows:

First, we write t as

t = 3 (k - 1) + i

for

k = 2, \dots, K^{*}

, where

i = 1, 2, 3

represents that we are in the first, second, and third month of quarter k. Then, at

t = 3 (k - 1) + i

,

F_{t}

enters the joint likelihood through

x_{t}

,

F_{t + 1}

,

F_{t - 1}

and

y_{k}

by

(\begin{matrix} x_{t} \\ F_{t + 1} \\ F_{t + 1} \\ f_{y_{k}} (i) \end{matrix}) = (\begin{matrix} Θ \\ A \\ A^{- 1} \\ β_{i}^{'} S \end{matrix}) F_{t} + (\begin{matrix} ϵ_{t} \\ u_{t + 1} \\ - A^{- 1} u_{t} \\ ν_{k} \end{matrix}) \equiv \tilde{Y} = \tilde{X} + \tilde{ϵ},

where

f_{y_{k}} (i)

is a function of i defined as

f_{y_{k}} (i) = \{\begin{matrix} y_{k} - β_{0} - β_{2}^{'} S F_{t - 1} - β_{3}^{'} S F_{t - 2} - β_{4} y_{k - 1} & if i = 1 \\ y_{k} - β_{0} - β_{1}^{'} S F_{t + 1} - β_{3}^{'} S F_{t - 1} - β_{4} y_{k - 1} & if i = 2 \\ y_{k} - β_{0} - β_{1}^{'} S F_{t + 2} - β_{2}^{'} S F_{t + 1} - β_{4} y_{k - 1} & if i = 3 . \end{matrix}

Therefore,

\tilde{Y} = (\begin{matrix} x_{t} \\ F_{t + 1} \\ F_{t + 1} \\ f_{y_{k}} (i) \end{matrix}), \tilde{X} = (\begin{matrix} Θ \\ A \\ A^{- 1} \\ β_{i}^{'} S \end{matrix}), \tilde{ϵ} = (\begin{matrix} ϵ_{t} \\ u_{t + 1} \\ - A^{- 1} u_{t} \\ ν_{k} \end{matrix}),

and

v a r (\tilde{ϵ}) = \tilde{Σ} = (\begin{matrix} Ω & 0 & 0 & 0 \\ 0 & Σ & 0 & 0 \\ 0 & 0 & {(A Σ^{- 1} A)}^{- 1} & 0 \\ 0 & 0 & 0 & η^{2} \end{matrix}) .

By weighted regression, for

0 < t < 3 K^{*}

,

k = 2, \dots, K^{*}

and

t = 3 (k - 1) + i

, draw

\begin{matrix} F_{t} | \tilde{Y}, \tilde{X}, \tilde{Σ} \sim M V N ({({\tilde{X}}^{'} {\tilde{Σ}}^{- 1} \tilde{X})}^{- 1} {\tilde{X}}^{'} {\tilde{Σ}}^{- 1} \tilde{Y}, {({\tilde{X}}^{'} {\tilde{Σ}}^{- 1} \tilde{X})}^{- 1}) . \end{matrix}

(A9)

For other t, the posterior distribution for

F_{t}

is of the same form with some modifications on

\tilde{Y}

,

\tilde{X}

, and

\tilde{Σ}

due to different availability. For example, if

t = 1

, since

F_{0}

and

y_{0}

are not available, corresponding entries to

F_{t - 1}

and

f_{y_{k}} (i)

are deleted. For

T_{q} < t \leq T

, monthly series are unbalanced, change entries corresponding to

1_{v_{q, t}} x_{t}

in

\tilde{Y}

,

\tilde{X}

, and

\tilde{Σ}

.

References

Baffgi, A.; Golinelli, R.; Parigi, G. Bridge models to forecast the euro area GDP. Int. J. Forecast. 2004, 20, 447–460. [Google Scholar] [CrossRef]
Rünstler, G.; Sédillot, F. Short-Term Estimates of Euro Area Real GDP by Means of Monthly Data; Working Paper Series 276; European Central Bank: Frankfurt, Germany, 2003. [Google Scholar]
Kitchen, J.; Monaco, R. Real-Time Forecasting in Practice: The U.S. Treasury Staff’s Real-Time GDP Forecast System. Bus. Econ. 2003, 38, 10–19. [Google Scholar]
Boivin, K.; Ng, S. Understanding and Comparing Factor-Based Forecasts. Int. J. Cent. Bank. 2005, 1, 117–151. [Google Scholar]
Forni, M.; Hallin, M.; Lippi, M.; Reichlin, L. The Generalized Dynamic Factor Model: One-Sided Estimation and Forecasting. J. Am. Stat. Assoc. 2005, 100, 830–840. [Google Scholar] [CrossRef] [Green Version]
D’Agostino, A.; Giannone, D. Comparing Alternative Predictors Based on Large-Panel Factor Models; Working Paper Series 680; European Central Bank: Frankfurt, Germany, 2006. [Google Scholar]
Eraslan, S.; Schröder, M. Nowcasting GDP with a Large Factor Model Space. Deutsche Bundesbank Discussion Paper No. 41/2019. 2019, Volume 41. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3507664 (accessed on 20 December 2020).
Raftery, A.E.; Kárný, M.; Ettler, P. Online Prediction Under Model Uncertainty via Dynamic Model Averaging: Application to a Cold Rolling Mill. Technometrics 2010, 52, 52–66. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Croushore, D.; Stark, T. A real-time data set for macroeconomists. J. Econom. 2001, 105, 111–130. [Google Scholar] [CrossRef] [Green Version]
Koenig, E.F.; Dolmas, S.; Piger, J. The Use and Abuse of Real-Time Data in Economic Forecasting. Rev. Econ. Stat. 2003, 85, 618–628. [Google Scholar] [CrossRef]
Orphanides, A. Monetary policy rules and the Great Inflation. J. Am. Econ. Assoc. 2002, 92, 115–120. [Google Scholar]
Giannone, D.; Reichlin, L.; Small, D. Nowcasting: The real-time informational content of macroeconomic data. J. Monet. Econ. 2008, 55, 665–676. [Google Scholar] [CrossRef]
Bai, J.; Ng, S. Determining the Number of Factors in Approximate Factor Models. Econometrica 2002, 70, 191–221. [Google Scholar] [CrossRef] [Green Version]
Carvalho, C.M.; Polson, N.G.; Scott, J.G. Handling sparsity via the horseshoe. In Proceedings of the 12th International Conference on Artificial Intelligence and Statistics, Clearwater Beach, FL, USA, 16–18 April 2009; Volume 5. [Google Scholar]
Carvalho, C.M.; Polson, N.G.; Scott, J.G. The horseshoe estimator for sparse signals. Biometrika 2010, 97, 465–480. [Google Scholar] [CrossRef] [Green Version]
Stock, J.H.; Watson, M.W. Forecasting Using Principal Components From a Large Number of Predictors. J. Am. Stat. Assoc. 2002, 97, 1167–1179. [Google Scholar] [CrossRef] [Green Version]
Follett, L.; Yu, C. Achieving parsimony in Bayesian vector autoregressions with the horseshoe prior. Econom. Stat. 2019, 11, 130–144. [Google Scholar] [CrossRef]
Bok, B.; Caratelli, D.; Giannone, D.; Sbordone, A.M.; Tambalotti, A. Macroeconomic Nowcasting and Forecasting with Big Data. Annu. Rev. Econ. 2018, 10, 615–643. [Google Scholar] [CrossRef] [Green Version]

Figure 1. Densities of the implied prior on

μ_{j}

based on the horseshoe prior, the t prior, and the Laplacian prior.

Figure 1. Densities of the implied prior on

μ_{j}

based on the horseshoe prior, the t prior, and the Laplacian prior.

Figure 2. Densities of the shrinkage profile

κ_{j}

based on the horseshoe prior, the t prior, and the Laplacian prior.

κ_{j} = 0

means no shrinkage and

κ_{j} = 1

means total shrinkage.

Figure 2. Densities of the shrinkage profile

κ_{j}

based on the horseshoe prior, the t prior, and the Laplacian prior.

κ_{j} = 0

means no shrinkage and

κ_{j} = 1

means total shrinkage.

Figure 3. Box-plots of shrinkage profile estimations

{\hat{κ}}_{j}

for

j = 1, \dots, R

from 180 nowcast estimates. Here,

R = 6

. Each subplot represents the result for each simulation.

Figure 3. Box-plots of shrinkage profile estimations

{\hat{κ}}_{j}

for

j = 1, \dots, R

from 180 nowcast estimates. Here,

R = 6

. Each subplot represents the result for each simulation.

Figure 4. Scatter splots of shrinkage profile estimations

{\hat{κ}}_{i j}

(y-axis) versus

{\hat{\tilde{β}}}_{i j}

(x-axis) from 180 nowcast estimates. Each subplot represents the result for each simulation.

Figure 4. Scatter splots of shrinkage profile estimations

{\hat{κ}}_{i j}

(y-axis) versus

{\hat{\tilde{β}}}_{i j}

(x-axis) from 180 nowcast estimates. Each subplot represents the result for each simulation.

Figure 5. In-sample fit of the latent factors for 6 simulations. Absolute value is used for both true factors and in-sample fits. Yellow lines represent in-sample fitted value and gray lines represent true value. In each subplot, the upper panel represents the comparison for the first factor, and the lower panel shows the comparison for the second factor.

Figure 6. Nowcasting performance for six simulations. Three rows in each subplot represent the first, second, and third month’s nowcasts for the last 20 quarters, respectively. Black curve represents true simulated GDP, and colored curves with different shapes represent nowcasts from different releases.

Figure 7. Real U.S. GDP growth rate from 1993Q1 to 2016Q4. Period after 2003Q1 (after blue dashed line) is the nowcasting horizon.

Figure 8. Two examples of boxplots for posterior draws of

κ_{j}

. Left panel represents the one for nowcasting GDP of 2000Q4 in the first release of the first month. Right panel represents the one for nowcasting GDP of 2000Q2 in the first release of the first month.

Figure 8. Two examples of boxplots for posterior draws of

κ_{j}

. Left panel represents the one for nowcasting GDP of 2000Q4 in the first release of the first month. Right panel represents the one for nowcasting GDP of 2000Q2 in the first release of the first month.

Figure 9. Nowcasting over 2003Q1 to 2016Q4. Three rows represent three nowcasting months, respectively. Black curve represents the real GDP, while curves of different colors and shapes represent nowcasting results from 3 different release dates and AR(1).

Table 1. An example of the overall releasing pattern. Gray colored cells represent available series before T. Orange, green, and blue represent the 1st, 2nd, and 3rd release within T, respectively.

k	1			…	K			K + 1
t	1	2	3	…	T − 4	T − 3	T − 2	T − 1	T	T + 1
$x_{i, t}$	$x_{1, 1}$	$x_{1, 2}$	$x_{1, 3}$	…	$x_{1, T - 4}$	$x_{1, T - 3}$	$x_{1, T - 2}$	$x_{1, T - 1}$	$x_{1, T}$	NA
	$x_{2, 1}$	$x_{2, 2}$	$x_{2, 3}$	…	$x_{2, T - 4}$	$x_{2, T - 3}$	$x_{2, T - 2}$	$x_{2, T - 1}$	$x_{2, T}$	NA
	$x_{3, 1}$	$x_{3, 2}$	$x_{3, 3}$	…	$x_{3, T - 4}$	$x_{3, T - 3}$	$x_{3, T - 2}$	$x_{3, T - 1}$	NA	NA
	$x_{4, 1}$	$x_{4, 2}$	$x_{4, 3}$	…	$x_{4, T - 4}$	$x_{4, T - 3}$	$x_{4, T - 2}$	$x_{4, T - 1}$	NA	NA
	$x_{5, 1}$	$x_{5, 2}$	$x_{5, 3}$	…	$x_{5, T - 4}$	$x_{5, T - 3}$	$x_{5, T - 2}$	NA	NA	NA
	$x_{6, 1}$	$x_{6, 2}$	$x_{6, 3}$	…	$x_{6, T - 4}$	$x_{6, T - 3}$	$x_{6, T - 2}$	NA	NA	NA
$y_{k}$	$y_{1}$			…	$y_{K}$			$y_{K + 1}$

Table 2. Data releasing structure for simulation study when nowcasting quarter

K + 1

’s GDP in month T. “RL” represents release. Orange color represents release 1, green represents release 2, and blue represents release 3.

Table 2. Data releasing structure for simulation study when nowcasting quarter

K + 1

’s GDP in month T. “RL” represents release. Orange color represents release 1, green represents release 2, and blue represents release 3.

Month	T-3	T-2	T-1	T
Series 1–20	Known	Known	Known	RL3
Series 21–40	Known	Known	RL2
Series 41–60	Known	RL1

Table 3. Settings for

λ_{j}

. For each simulation, the first 2 factors are contributing factors, and other 4 factors contribute little or 0 to GDP prediction.

Table 3. Settings for

λ_{j}

. For each simulation, the first 2 factors are contributing factors, and other 4 factors contribute little or 0 to GDP prediction.

Simulation	$λ_{1}$	$λ_{2}$	$λ_{3}, λ_{4}, λ_{5}, λ_{6}$
1	$N (5, {0.1}^{2})$	$N (- 5, {0.1}^{2})$	$N (0, {0.1}^{2})$
2	$N (5, {0.1}^{2})$	$N (- 5, {0.1}^{2})$	0
3	$N (1, {0.1}^{2})$	$N (- 1, {0.1}^{2})$	$N (0, {0.1}^{2})$
4	$N (1, {0.1}^{2})$	$N (- 1, {0.1}^{2})$	0
5	$N (0.1, {0.1}^{2})$	$N (- 0.1, {0.1}^{2})$	$N (0, {0.1}^{2})$
6	$N (0.1, {0.1}^{2})$	$N (- 0.1, {0.1}^{2})$	0

Table 4. This table reports percentages of reduction in MANE relative to RW, which are calculated as

({MANE}^{Bay} - {MANE}^{RW}) / {MANE}^{RW} \times 100 %

, for six simulation studies.

Table 4. This table reports percentages of reduction in MANE relative to RW, which are calculated as

({MANE}^{Bay} - {MANE}^{RW}) / {MANE}^{RW} \times 100 %

, for six simulation studies.

Simulation	1			2
Release	1st Month	2nd Month	3rd Month	1st Month	2nd Month	3rd Month
1st	$- 15.4 %$	$- 35.9 %$	$- 55.5 %$	$- 35.2 %$	$- 50.6 %$	$- 81.4 %$
2nd	$- 16.4 %$	$- 34.6 %$	$- 56.5 %$	$- 34.6 %$	$- 51.1 %$	$- 80.4 %$
3rd	$- 32.8 %$	$- 54.6 %$	$- 96.6 %$	$- 53.3 %$	$- 82.3 %$	$- 97.3 %$
Average	$- 21.5 %$	$- 41.7 %$	$- 69.5 %$	$- 41.0 %$	$- 61.3 %$	$- 86.4 %$
Simulation	3			4
Release	1st Month	2nd Month	3rd Month	1st Month	2nd Month	3rd Month
1st	$- 24.6 %$	$- 52.5 %$	$- 68.7 %$	$- 43.5 %$	$- 53.5 %$	$- 73.1 %$
2nd	$- 24.5 %$	$- 54.7 %$	$- 69.3 %$	$- 44.5 %$	$- 55.1 %$	$- 73.3 %$
3rd	$- 51.5 %$	$- 68.2 %$	$- 89.0 %$	$- 57.9 %$	$- 75.1 %$	$- 90.7 %$
Average	$- 33.5 %$	$- 58.5 %$	$- 75.7 %$	$- 48.6 %$	$- 61.2 %$	$- 79.0 %$
Simulation	5			6
Release	1st Month	2nd Month	3rd Month	1st Month	2nd Month	3rd Month
1st	$- 33.5 %$	$- 29.4 %$	$- 50.5 %$	$- 25.5 %$	$- 43.9 %$	$- 66.0 %$
2nd	$- 34.9 %$	$- 28.3 %$	$- 49.4 %$	$- 25.3 %$	$- 43.1 %$	$- 66.2 %$
3rd	$- 27.7 %$	$- 52.3 %$	$- 62.7 %$	$- 44.5 %$	$- 66.0 %$	$- 69.9 %$
Average	$- 32.0 %$	$- 36.7 %$	$- 54.2 %$	$- 31.8 %$	$- 51.0 %$	$- 67.4 %$

Table 5. Data releasing structure in the empirical study when nowcasting quarter

K + 1

’s GDP in month T. RL stands for release, with release 1 colored in orange, released 2 colored in green, and release 3 colored in blue. The number in parentheses represents number of series for that particular release.

Table 5. Data releasing structure in the empirical study when nowcasting quarter

K + 1

’s GDP in month T. RL stands for release, with release 1 colored in orange, released 2 colored in green, and release 3 colored in blue. The number in parentheses represents number of series for that particular release.

Month	T-3	T-2	T-1	T
Set 1 (2)	Known	Known	Known	RL2 (2)
Set 2 (2)	Known	Known	RL1 (2)
Set 3 (10)	Known	Known	RL2 (10)
Set 4 (7)	Known	Known	RL3 (7)
Set 5 (5)	Known	RL1 (5)
Set 6 (3)	Known	RL2 (3)

Table 6. Data transformation types:

x_{i t}^{*}

represents raw data, and

x_{i t}

represents the transformed data.

Table 6. Data transformation types:

x_{i t}^{*}

represents raw data, and

x_{i t}

represents the transformed data.

Type	Transformation	Description
1	$x_{i t} = x_{i t}^{*}$	No transformation
2	$x_{i t} = x_{i t}^{} - x_{i, t - 1}^{}$	Level change
3	$x_{i t} = \frac{x_{i t}^{} - x_{i, t - 1}^{}}{x_{i, t - 1}^{*}}$	Month-to-month change

Table 7. Release groups, transformation types, and lag information for monthly series used in the empirical study.

Release	Block	Name	Transformation	Lag
1st	Housing and construction	TTLCONS	3	2
	International trade	BOPTEXP	3	2
	International trade	BOPTIMP	3	2
	Manufacturing	BUSINV	3	2
	Labor	PAYEMS	2	1
		JTSJOL	2	2
		UNRATE	2	1
2nd	International trade	IR	3	1
	International trade	IQ	3	1
	Retail and consumption	RSAFS	3	1
	Survey	GACDISA066MSFRBNY	1	0
	Survey	GACDFSA066MSFRBNY	1	0
	Manufacturing	INDPRO	3	1
	Manufacturing	TCU	2	1
	Other	CPIAUCSL	3	1
		CPILFESL	3	1
		PPIFIS	3	1
	Housing and construction	HOUST	3	1
	Housing and construction	PERMIT	2	1
3rd	Manufacturing	DGORDER	3	1
	Manufacturing	WHLSLRIMSA	3	1
	Housing and construction	HSNIF	3	1
	Income	DSPIC96	3	1
	Retail and consumption	PCEC96	3	1
	Other	PCEPI	3	1
	Other	PCEPILIFE	3	1

Table 8. This table reports percentages of nowcasts in which one factor is detected.

Method	BAY
Release	1st Month	2nd Month	3rd Month
1st	$50 %$	$46.4 %$	$62.5 %$
2nd	$58.9 %$	$50 %$	$58.9 %$
3rd	$46.4 %$	$62.5 %$	$55.3 %$
Average	$51.7 %$	$52.9 %$	$58.9 %$

Table 9. This table reports percentages of reduction in MANE’s relative to RW, AR(1), and NS, i.e.,

({MANE}^{Bay} - {MANE}^{*}) / {MANE}^{*} \times 100 %

, here * can be RW, AR(1), or NS.

Table 9. This table reports percentages of reduction in MANE’s relative to RW, AR(1), and NS, i.e.,

({MANE}^{Bay} - {MANE}^{*}) / {MANE}^{*} \times 100 %

, here * can be RW, AR(1), or NS.

Compare with	RW
Release	1st Month	2nd Month	3rd Month
1st	$- 20.6 %$	$- 27.5 %$	$- 32.0 %$
2nd	$- 28.4 %$	$- 21.9 %$	$- 19.2 %$
3rd	$- 25.0 %$	$- 23.9 %$	$- 24.7 %$
Average	$- 24.7 %$	$- 24.4 %$	$- 25.3 %$
Compare with	AR(1)
Release	1st Month	2nd Month	3rd Month
1st	$- 14.0 %$	$- 14.2 %$	$- 19.5 %$
2nd	$- 15.3 %$	$- 7.65 %$	$- 4.38 %$
3rd	$- 11.2 %$	$- 8.97 %$	$- 10.9 %$
Average	$- 13.5 %$	$- 10.3 %$	$- 11.6 %$
Compare with	NS
Release	1st Month	2nd Month	3rd Month
1st	$- 1.90 %$	$0.70 %$	$- 19.8 %$
2nd	$- 8.51 %$	$- 7.32 %$	$- 10.1 %$
3rd	$- 9.21 %$	$- 12.4 %$	$- 9.79 %$
Average	$- 6.54 %$	$- 6.34 %$	$- 13.23 %$

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Luo, J.; Yu, C.L. Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting. Mathematics 2021, 9, 2865. https://doi.org/10.3390/math9222865

AMA Style

Luo J, Yu CL. Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting. Mathematics. 2021; 9(22):2865. https://doi.org/10.3390/math9222865

Chicago/Turabian Style

Luo, Jiayi, and Cindy Long Yu. 2021. "Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting" Mathematics 9, no. 22: 2865. https://doi.org/10.3390/math9222865

APA Style

Luo, J., & Yu, C. L. (2021). Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting. Mathematics, 9(22), 2865. https://doi.org/10.3390/math9222865

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Determining Number of Factors in Dynamic Factor Models Contributing to GDP Nowcasting

Abstract

1. Introduction

2. Model Set-Ups, Horseshoe Shrinkage, and Data Structure

2.1. Dynamic Factor Models

2.2. Horseshoe Shrinkage

2.3. The Unbalanced Structure of the Data

3. Estimation Method and Nowcasting

3.1. Estimating Dynamic Factor Models Using Bayesian MCMC

3.2. Nowcasting Formulas

4. Simulation Study

4.1. Estimating the Number of Contributing Factors

4.2. Estimation of Latent Factors

4.3. Out-of-Sample Nowcasting Performances

5. Empirical Study

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Abbreviations

Appendix A. Posterior Distributions

Appendix A.1. Posterior Samples for Mean of Monthly Series μ

Appendix A.2. Posterior Samples for Factor Loading Matrix Θ

Appendix A.3. Posterior Samples for Covariance in Monthly Series Ω

Appendix A.4. Posterior Samples for AR(1) Coefficients aj

Appendix A.5. Posterior Samples for Covariance Matrix in the Factor Equation Σ

Appendix A.6. Posterior Samples for Coefficients in GDP Equation β

Appendix A.7. Posterior Samples for Variance in GDP Equation η2

Appendix A.8. Posterior Samples for Each Element in S Matrix, λj

Appendix A.9. Sampling the Latent Factors Ft

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Appendix A.4. Posterior Samples for AR(1) Coefficients a_j

Appendix A.7. Posterior Samples for Variance in GDP Equation η²

Appendix A.8. Posterior Samples for Each Element in S Matrix, λ_j

Appendix A.9. Sampling the Latent Factors F_t