Estimation of FAVAR Models for Incomplete Data with a Kalman Filter for Factors with Observable Components

Abstract: This article extends the Factor-Augmented Vector Autoregression Model (FAVAR) to mixed-frequency and incomplete panel data. Within the scope of a fully parametric two-step approach, the alternating application of two expectation-maximization algorithms jointly estimates model parameters and missing data. In contrast to the existing literature, we do not require the observable factor components to be part of the panel data. For this purpose, we modify the Kalman Filter for factors consisting of latent and observed components, which significantly improves the reconstruction of the latent factors according to the performed simulation study. To identify the model parameters uniquely, the loadings matrix is constrained. In our empirical application, the presented framework analyzes US data for measuring the effects of monetary policy on the real economy and financial markets. Here, the consequences for the quarterly Gross Domestic Product (GDP) growth rates are of particular importance.


Introduction
The role of money in the case of monetary policy and its impact on the real economy have been thoroughly discussed in the literature. For instance, see Levhari and Patinkin (1968), Grandmont and Younes (1972) as well as Carr and Darby (1981). In this regard, Mankiw (2010, 2014) distinguishes three hypotheses: First, the classical dichotomy assumes the neutrality of money, that is, money does not affect the real economy, see for example, Ball and Romer (1990). In this theory, only prices and wages matter. A second group of economists claims that monetary policy may affect the real economy through falling interest rates and rising investments, see for example, Serletis and Koustas (1998). Finally, the current economic theory assumes the neutrality of money in the long run, but it admits the possibility that monetary policy may absorb economic fluctuations in the short run, see for example, Minsky (1993). Hence, monetary policy implications are crucial for central banks, which explains why there is abundant literature about measuring the effects of monetary policy.
Vector Autoregression Models (VARs) have become the standard approach for identifying and measuring the effects of monetary policy innovations on macroeconomic variables since Bernanke and Blinder (1992) and Sims (1992). A main advantage of this method is that it clearly discloses the effects of shocks. Unfortunately, VARs are restricted to a limited number of time series, which may result in a trade-off for empirical applications. On the one hand, a comprehensive model must take into account the full information spectrum used by central banks and external sources. On the other hand, VARs with too many variables cannot be uniquely estimated from small data samples. Then, a pre-analysis is required to extract the most relevant, sparse data from the full information spectrum. However, if the resulting sparse panel data does not sufficiently reflect the original data, policy shocks are measured with errors and misleading results are obtained. A second drawback is that their Impulse Response Functions (IRFs) merely consider the few included variables covering a small subset of the universe central banks care about. Here, IRFs map how a variable of interest reacts to exogenous shocks over time. The choice of specific time series representing an economic concept like "real activity" is arbitrary to some degree and thus denotes a third disadvantage of the VAR approach. Bernanke et al. (2005) introduced Factor-Augmented Vector Autoregression Models (FAVARs), which combine the VAR approach with factor analysis. The main idea behind FAVARs is to extract the information inherent in large panel data by a few factors and some observable variables. Because of this, a FAVAR consists of two equations: The transition equation displays the joint dynamics of the observed and latent factors as a VAR process, while the measurement equation shows the relation between both factors and some additional panel data.
For estimating FAVARs, several procedures can be pursued. For instance, Bernanke et al. (2005) suggested a non-parametric two-step approach using Principal Component Analysis (PCA) and Ordinary Least Squares Regression (OLS). Additionally, they derived a single-step Markov Chain Monte Carlo method. Bork (2009) as well as Bańbura and Modugno (2014) applied Expectation-Maximization Algorithms (EMs) instead. Sometimes, the estimation of FAVARs relies on complete panel data, whose updating frequency is either monthly (Bernanke et al. 2005; Bork 2009; Wu and Xia 2014) or quarterly (Ellis et al. 2014). In the case of macroeconomic data, the Unemployment Rate and Consumer Price Index are published monthly, but the Gross Domestic Product (GDP) is reported quarterly. All three indices rank among the relevant guides for monetary policy, although they are not available at the same frequency. Therefore, the question of how to best profit from such data arises. A simple solution takes the least frequently updated time horizon, for example, the quarterly one. However, this approach ignores all monthly information.
By contrast, we incorporate well-known results regarding temporal aggregation and missing observations to obtain balanced panel data (Stock and Watson 1999, 2002b; Mariano and Murasawa 2003, 2010; Bańbura et al. 2011, 2013). Thereby, we introduce for each observed time series an artificial, complete analog and define a proper relation between both. Depending on the relation type, we distinguish between stock, flow and change in flow variables. In the past, among others, Schumacher and Breitung (2008), Stock and Watson (2002b) and Bańbura and Modugno (2014) tackled data irregularities in the area of factor models, while Bańbura and Modugno (2014), Boivin et al. (2010), Bork (2015) and Marcellino and Sivec (2016) did the same for FAVARs.
In the presence of data incompleteness, Kalman filtering methods and EMs enable Maximum-Likelihood Estimation (MLE). With regard to this, the seminal work of Dempster et al. (1977) showed how to integrate missing data out of the likelihood function. Shumway and Stoffer (1982) deployed EMs for time series with missing observations. At the same time, Rubin and Thayer (1982) and Watson and Engle (1983) estimated factor models using EMs. Theoretical aspects of EMs, in particular some convergence properties, were discussed in Wu (1983). Finally, Bańbura and Modugno (2014) developed an EM for estimating dynamic approximate factor models with arbitrary patterns of missing data. Bańbura and Modugno (2014) as well as Bork (2015) admit time-dependent selection matrices to exclude missing data from their MLE. Their state-space representations already take into account which data type (stock, flow or change in flow variable) each variable belongs to and so, they have a single EM instead of two. However, they must adjust the whole state-space representation as soon as, for example, new time series are added or old ones are removed. By contrast, our two-step approach requires changes of single equations for balanced data instead of the overall model formulation, which bears less risk and so denotes another advantage of our procedure. The non-parametric method in Boivin et al. (2010) coincides with ours, if our second EM is replaced by the two-step principal component approach of Bernanke et al. (2005). In general, this second EM coincides with Bork (2009), Bork et al. (2010) and Bańbura and Modugno (2014). The first EM was introduced in Stock and Watson (1999, 2002b) and was reused in Schumacher and Breitung (2008).
In this paper, we extend the FAVAR of Bernanke et al. (2005) to ragged panel data and make the following three contributions to the existing literature: First, two EMs estimate the model parameters and reconstruct missing observations in the form of an iterative scheme. The first EM controls the relation between the observed and artificial time series, when it constructs balanced data. Based on this, the second EM performs the actual MLE. Our second contribution is that the observable factors of the FAVAR need not be part of the panel data as in Bork (2009) and Marcellino and Sivec (2016). Therefore, the loadings matrix can be constrained without re-sorting the panel data. This is convenient for model selection since existing estimation methods require a special variable order in the panel data. Nevertheless, for comparison of our empirical results we perform the same data pre-processing as Bork (2009), including the distinction between slow- and fast-moving variables as proposed in Bernanke et al. (2005). Finally, our last contribution is the adaptation of the classical Kalman Filter (KF) for the observable factor components. In this regard, we derive KF equations for a refined state-space representation and show the superiority of our modified KF estimation in a simulation study.
In the empirical study, we investigate the effects of the United States (US) monetary policy on its real economy. Thereby, we use data similar to Bernanke et al. (2005). In addition, we have quarterly indices, for example, GDP, discontinued data, for example, Deutsche Mark-US Dollar Foreign Exchange (FX) and later starting variables, for example, Euro-US Dollar FX. The updating frequency is monthly. The time period ranges from January 1959 until October 2015 covering several crises. We evaluate the impact of the monetary policy decisions using Impulse Response Functions (IRFs) and Forecast Error Variance Decompositions (FEVDs). The confidence intervals of the IRFs arise from a non-parametric bootstrap method.
The remainder of this paper is structured as follows: In Section 2, we discuss the definition of FAVARs and derive an alternative estimation method for incomplete panel data. Thereby, we derive estimates for missing observations. In Section 3, we compare the estimation quality of the suggested estimation method with already existing ones. In Section 4, we measure the impact of the US monetary policy on the real economy based on mixed-frequency US panel data. In Section 5, we summarize our findings and outline directions for future research. The appendices provide detailed algorithms, results of the Monte Carlo (MC) simulations, data descriptions and illustrations of the empirical study.

Mathematical Background
We start with the definition of FAVARs and show that parameter ambiguity may affect the covariance matrices of idiosyncratic shocks. At this stage, we include identification conditions from Bai et al. (2015). In a next step, we modify the KF from Bork (2009) to take into account that factors are partially observable. Incomplete time series are reconstructed using the EM of Stock and Watson (1999, 2002b).

Parameter Ambiguity and Identification Restrictions
Usually, VARs accommodate a limited number of time series.² In this regard, FAVARs are more indulgent and support the modeling of high-dimensional data. Similar to Dynamic Factor Models (DFMs), FAVARs comprise a transition equation and a measurement equation. But there is an important difference between both: The transition equation of DFMs describes the dynamics of latent factors F_t ∈ R^K at time t, whereas the one of FAVARs maps the joint dynamics of latent factors F_t ∈ R^K and observable variables Y_t ∈ R^M. This is why the joint factors C_t = [F_t', Y_t']' ∈ R^{K+M} are partially observable.

² Of course, there are exceptions from this statement such as Bańbura et al. (2010).
In the scope of monetary policy analysis with FAVARs, Y_t often covers measures controlled by central banks such as the US Effective Federal Funds Rate (FEDFUNDS). By contrast, VARs require Y_t to collect all data due to C_t = Y_t. Thus, VARs must balance the coverage of relevant information against the data dimension. In FAVARs, important information that is not yet part of Y_t is condensed in the latent factors F_t. With this in mind, the transition equation of a FAVAR is given by the following dynamics:

[F_t', Y_t']' = Φ(L) [F_{t−1}', Y_{t−1}']' + v_t,   (1)

where Φ(L) is a conformable lag polynomial of finite order p ≥ 1 with Φ(L) = Φ_1 + Φ_2 L^1 + · · · + Φ_p L^{p−1} and Φ_i denoting a (K + M) × (K + M)-dimensional matrix of autoregressive coefficients for i = 1, . . . , p. The error vector v_t is supposed to be Gaussian, identically and independently distributed (iid) with zero mean and covariance matrix Σ_v. For simplicity, let each univariate time series in Y_t be standardized with zero mean and standard deviation of one. Furthermore, we assume the VAR process in (1) to be covariance-stationary (Hamilton 1994, Proposition 10.1, p. 259).
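As a toy illustration of the transition dynamics, the following sketch simulates the joint factors C_t = [F_t', Y_t']' for the simplest case p = 1; all dimensions and coefficients are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
K, M, T = 2, 1, 300                     # latent factors, observed factors, sample length
d = K + M

Phi1 = np.diag(rng.uniform(0.25, 0.75, size=d))   # diagonal -> covariance-stationary
Sigma_v = 0.1 * np.eye(d)

C = np.zeros((T, d))                    # joint factors C_t = [F_t', Y_t']'
for t in range(1, T):
    v = rng.multivariate_normal(np.zeros(d), Sigma_v)
    C[t] = Phi1 @ C[t - 1] + v          # transition equation with p = 1

F, Y = C[:, :K], C[:, K:]               # latent part F_t, observable part Y_t
```

Since all eigenvalues of Phi1 lie strictly inside the unit circle, the simulated process is covariance-stationary as required above.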
Equation (1) is a VAR(p) in the variables Y_t, if all terms of Φ(L) covering the impact of F_t on Y_t are zero (Bernanke et al. 2005). Otherwise, Bernanke et al. (2005) call (1) the transition equation of a FAVAR. Moreover, they note: First, the FAVAR in (1) nests a VAR, supporting comparisons with general VAR results and the assessment of the marginal contribution of the factors F_t. Second, if the true system is a FAVAR, ignoring the factors F_t and sticking to the simple VAR in Y_t will cause biased estimation results and so, the interpretation of IRFs and FEVDs may be faulty.
Next, the hidden factors F_t are obtained from the FAVAR measurement equation. For this purpose, the vector X_t ∈ R^N gathers all panel data at time t, where N is "large" (in particular, N may be greater than the sample length T) and K + M ≪ N holds. As for Y_t, let each time series in X_t be standardized. Then, the measurement equation relates the panel data X_t and the partially observed factors C_t as follows:

X_t = Λ^f F_t + Λ^y Y_t + e_t,   (2)

where Λ^f and Λ^y denote loadings matrices of dimension N × K and N × M, respectively. The idiosyncratic error e_t is Gaussian iid with zero mean and covariance matrix Σ_e. Note, we attach a greater importance to cross-sectional instead of serial error correlation in this article. In this manner, we enter a direction different from the work of Bańbura and Modugno (2014). (In the scope of a MC simulation study in Section 3, we show scenarios where our estimation approach is superior.) Because of (2), the vector C_t drives the dynamics of X_t. This is why Bernanke et al. (2005) regard all X_t as "noisy measures of the underlying unobserved factors F_t". In total, FAVARs are defined by (1) and (2). The model (1) and (2) is econometrically unidentified, therefore, its parameters cannot be uniquely estimated. For any non-singular matrix R of dimension (M + K) × (M + K), the measurement equation obeys:

X_t = [Λ^f Λ^y] R R^{−1} C_t + e_t,   (3)
with R^{−1} as the inverse of the matrix R. The observability of Y_t imposes constraints on the shape of R and so removes M(K + M) degrees of freedom (Bai et al. 2015). Consequently, the invertible matrix R consists of the following submatrices: under linear loadings constraints (the maximization step of the EM). Here, ln(·) denotes the natural logarithm and tr(·) is the matrix trace. The conditional moments of the factor C̃_t are computed using the Kalman Filter and Kalman Smoother (KS). By iterating the expectation and maximization steps until convergence of the expected log-likelihood E_Θ[L(Θ|X, C)|X, Y], the EM estimates the model parameters Θ. The estimation of the FAVAR (4) and (5) with loadings constraints requires knowledge of the factor dimension K and the lag order p. In empirical analyses, both must be specified. For this purpose, we choose the usual Akaike Information Criterion (AIC) and leave more advanced approaches for model selection for future research. Let 1 ≤ p̄ and 1 ≤ K̄ be upper limits of the autoregressive order and factor dimension, respectively, to be tested. Moreover, let Θ̂(p, K) be the estimated model parameters for dimensions (p, K). Then, we take the pair (p*, K*) minimizing the AIC over all 1 ≤ p ≤ p̄ and 1 ≤ K ≤ K̄. Thereby, the penalty term accommodates the special shape of Σ_ṽ in (5) and the loadings restrictions.⁴

⁴ Alternatively, the information criteria of Bai and Ng (2002, 2008) or Hallin and Liška (2007) enable model selection.
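The AIC-based choice of (p*, K*) can be sketched as a grid search. Below, `n_params` is a rough stand-in for the paper's exact penalty (which also reflects the shape of Σ_ṽ and the loadings restrictions), and `loglik` is a hypothetical callable returning the maximized log-likelihood for a given (p, K):

```python
import numpy as np

def n_params(p, K, M, N):
    # Rough parameter count: VAR coefficients, loadings and diagonal error variances.
    # The paper's exact penalty differs; this is an illustrative approximation.
    return p * (K + M) ** 2 + N * (K + M) + N

def select_order(loglik, p_max, K_max, M, N):
    """Pick (p*, K*) minimizing AIC = -2 logL + 2 * #params on a grid.
    `loglik(p, K)` is assumed to return the maximized log-likelihood."""
    best, best_aic = None, np.inf
    for p in range(1, p_max + 1):
        for K in range(1, K_max + 1):
            aic = -2.0 * loglik(p, K) + 2.0 * n_params(p, K, M, N)
            if aic < best_aic:
                best, best_aic = (p, K), aic
    return best

# Toy likelihood whose gains flatten out beyond p = 1, K = 2
toy = lambda p, K: -500.0 + 200.0 * min(K, 2) - 0.1 * p
print(select_order(toy, p_max=3, K_max=4, M=3, N=100))   # -> (1, 2)
```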

Kalman Filter and Smoother
Usually, DFMs with factor dynamics of order p ≥ 1 are converted into large-dimensional DFMs of order p = 1. For the FAVAR (4) and (5) and the state vector C̃_t = [C_t', . . . , C_{t−p+1}']' ∈ R^{p(K+M)}, we receive the stacked measurement and transition Equations (10) and (11) with iid Gaussian shocks (12). Bork (2009) as well as Marcellino and Sivec (2016) considered FAVARs as DFMs and made two adjustments. First, they added the observable variables Y_t, part of C̃_t, to the panel data X_t. Second, they chose the shape of the loadings matrix Λ̃ in (10) such that Y_t in X_t was identically mapped to Y_t in C̃_t. In other words, they treated the overall factors C̃_t as hidden and forced their last M entries to coincide with Y_t as part of X_t.
By contrast, we use an alternative state-space representation. Namely, we separate latent and observed factors from each other before the stacking takes place. For the vectors F̃_t = [F_t', . . . , F_{t−p+1}']' ∈ R^{pK} and Ỹ_t = [Y_t', . . . , Y_{t−p+1}']' ∈ R^{pM}, we reformulate the original FAVAR as the measurement Equation (13) and the transition Equation (14), where the shocks ṽ_t are iid Gaussian with zero mean and covariance matrix Σ_ṽ defined in (15). A comparison of the transition Equations (11) and (14) shows that (14) explicitly acknowledges that the factors C̃_t are partially observed. This enables a modification of the standard KF for observable factor components Y_t which, to the best of our knowledge, was not addressed in recent research. Second, we are able to linearly constrain the transition coefficients Φ̃ instead of the loadings matrix Λ̃.⁵ As usual for the KF, we assume known model parameters in (13)-(15) and define the filtration: Ω_0 = ∅, Ω_t = {X_1, . . . , X_t, Y_1, . . . , Y_t} for t > 0, collecting all observations up to time t ≥ 0. Then, Ω_T covers the overall sample {X, Y}. For the hidden factor moments, we use the usual shorthand notation for conditional means and covariance matrices. Analogously, we shorten means and covariance matrices of X_t and Y_t, respectively, conditioned on Ω_{t−1}. Algorithm A1 summarizes the adapted KF with factor estimates obtained by PCA as starting values. Note, the KS is not influenced by the observed factor components as shown in Ramsauer (2017).
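One simple way to mimic the idea of conditioning on the observed factor components — a sketch, not the paper's exact recursion — is to stack z_t = [X_t'; Y_t']' as the measurement vector and assign the Y_t rows zero measurement noise, so a standard Kalman update reproduces Y_t exactly in the filtered state:

```python
import numpy as np

def kalman_filter(X, Y, Phi, Sigma_v, Lam_f, Lam_y, Sigma_e):
    """Filter for C_t = [F_t', Y_t']' with a VAR(1) transition, where the
    Y_t block of the measurement carries zero noise (exact observation)."""
    T, N = X.shape
    K, M = Lam_f.shape[1], Y.shape[1]
    d = K + M
    # Measurement: X_t = [Lam_f Lam_y] C_t + e_t  and  Y_t = [0 I_M] C_t
    H = np.vstack([np.hstack([Lam_f, Lam_y]),
                   np.hstack([np.zeros((M, K)), np.eye(M)])])
    R = np.zeros((N + M, N + M))
    R[:N, :N] = Sigma_e                        # zero block: Y_t is observed exactly
    c, P = np.zeros(d), np.eye(d)              # filtered mean and covariance
    out = np.zeros((T, d))
    for t in range(T):
        c_pred = Phi @ c                       # prediction step
        P_pred = Phi @ P @ Phi.T + Sigma_v
        z = np.concatenate([X[t], Y[t]])       # update step with z_t = [X_t; Y_t]
        S = H @ P_pred @ H.T + R
        G = P_pred @ H.T @ np.linalg.pinv(S)   # Kalman gain (pinv for safety)
        c = c_pred + G @ (z - H @ c_pred)
        P = P_pred - G @ H @ P_pred
        out[t] = c                             # E[C_t | Omega_t]
    return out
```

In this formulation, the filtered Y_t components coincide with the observations, which is the behavior a filter acknowledging observed factor components should deliver; the paper's modified KF achieves this through a refined state-space representation instead.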

EM-Algorithm for Incomplete Panel Data
Regarding incomplete data, we pursue the method of Stock and Watson (1999, 2002b), which introduces for each observed time series an artificial, high-frequency analog and defines a proper relation between both. As in Section 2.2, let N and T denote the number of time series and the total sample length, respectively. The index 1 ≤ t ≤ T covers each point in time when new information arrives and thus captures the highest frequency. For 1 ≤ i ≤ N, the vector X^i_obs ∈ R^{T(i)} with T(i) ≤ T collects all observations of signal i and the vector X̃^i ∈ R^T serves as its artificial, high-frequency counterpart. Then, we receive:

X^i_obs = Q_i X̃^i,   (16)

with Q_i ∈ R^{T(i)×T}. For any complete time series, it holds: T(i) = T and Q_i = I_T. If a time series is less often updated or there are missing elements, we have: T(i) < T. Furthermore, the shape of the matrix Q_i specifies the nature of the relation in (16). In the literature, see for example, Bańbura et al. (2013, ECB working paper), there is a common distinction between stock, flow and change in flow variables.⁶ Sometimes, this classification is discussed as temporal aggregation. The structure of the matrix Q_i does not affect our subsequent considerations, which is why we proceed with the general version (16). Let the matrices F̃, X̄ and E = [e_1, . . . , e_T]' ∈ R^{T×N} collect all factors, standardized observations and errors in (4), respectively. The panel data in (4) is supposed to consist of standardized time series; thus, for each time series i with mean μ_X̃i and variance σ²_X̃i, we set:

X̄^i = (X̃^i − μ_X̃i 1_T) / σ_X̃i.

In Section 4, we replace both by their empirical estimates.
Here, the vector 1_T ∈ R^{T×1} consists of ones only. Using (4) and (16), we derive for 1 ≤ i ≤ N:

X^i_obs = Q_i X̃^i = Q_i (μ_X̃i 1_T + σ_X̃i (F̃ Λ̃^f_i' + Ỹ Λ̃^y_i' + E^i)),

where Λ̃^f_i, Λ̃^y_i and E^i denote the i-th rows of Λ̃^f and Λ̃^y and the i-th column of E, respectively. Following Stock and Watson (1999, 2002b), X̃^i is reconstructed by its conditional expectation (17), given the observations and the current parameter estimates. Algorithm A2 summarizes the estimation of FAVARs with incomplete data. Besides the initialization, it consists of an inner and an outer EM. The initialization calls for three steps: First, we construct an initial guess for the high-frequency panel data using the given observations. If necessary, gaps are filled by random numbers, interpolation and so forth. At this stage, the time series X̃^i(0) are not required to obey (16), since this will be automatically achieved by (17). The second step applies the two-step principal component approach of Bernanke et al. (2005) to the standardized panel data X̄(0). Finally, the third step updates the high-frequency panel data based on the estimated model parameters and observed time series.

⁵ An EM for parameter estimation subject to linear restrictions of the transition coefficients Φ̃ is stated in Ramsauer (2017).

⁶ For signal 1 ≤ i ≤ N, let the integers (n_j)_{1≤j≤T(i)} count the high-frequency periods between two successive observations. Then, o_j = ∑_{k=1}^{j} n_k captures when the j-th observation X^i_obs,j is made. For stock variables, the observations match their artificial counterparts, that is, we have: X^i_obs,j = X̃^i_{o_j}. For flow variables, the observations either represent the sum or the average of the artificial elements of the respective low-frequency period. Hence, the sum version obeys: X^i_obs,j = ∑_{k=0}^{n_j−1} X̃^i_{o_j−k}. The average formulation satisfies: X^i_obs,j = (1/n_j) ∑_{k=0}^{n_j−1} X̃^i_{o_j−k}. For change in flow variables, the change in two consecutive observations is traced back to a linear combination of the changes in the artificial time series. As before, a sum and an average version exist; for the latter, a corresponding expression in the high-frequency changes holds. By contrast, the sum version requires the equality n_j = n for all 1 ≤ j ≤ T(i) to derive a similar result. To verify this requirement, we assume n_j = n_{j−1} + 1 and obtain that the observed change ∆X^i_obs,j = X^i_obs,j − X^i_obs,j−1 contains the signal itself as its last term and so does not consist of a pure combination of high-frequency changes. By similar reasoning, the same holds for any n_j ≠ n_{j−1}.
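The stock and flow relations above translate into simple selection and aggregation matrices Q_i. A minimal sketch (the function names are ours, not from the paper):

```python
import numpy as np

def stock_Q(obs_times, T):
    """Stock variable: each observation equals the artificial series at its date."""
    Q = np.zeros((len(obs_times), T))
    for j, o in enumerate(obs_times):
        Q[j, o] = 1.0
    return Q

def flow_avg_Q(n, T):
    """Flow variable (average version): each observation averages the n
    high-frequency values of its period, e.g. quarterly averages of months."""
    J = T // n
    Q = np.zeros((J, T))
    for j in range(J):
        Q[j, j * n:(j + 1) * n] = 1.0 / n
    return Q

# Example: 12 monthly periods aggregated to 4 quarterly flow averages
Q = flow_avg_Q(3, 12)
x_tilde = np.arange(1.0, 13.0)          # artificial monthly series 1..12
print(Q @ x_tilde)                      # -> [ 2.  5.  8. 11.]
```

The sum version of a flow variable drops the factor 1/n; for change in flow variables, the corresponding rows combine differences of the artificial series instead.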
Algorithm A2 also tackles the model selection problem. The optimal lag length and factor dimension (p*, K*) may change during the estimation procedure. To avoid that changes in (p*, K*) affect its termination, changes in the expected log-likelihood E[L | X, Y] instead of the model parameters serve as termination criterion. In this context, we consider relative instead of absolute changes.

Monte Carlo Simulation
In the scope of a MC simulation study, we compare the estimation accuracy of our two-step estimation method using the modified KF from Section 2 and three alternative approaches. Besides a non-parametric ansatz based on PCA and OLS, we test two parametric estimation methods treating FAVARs as Approximate Dynamic Factor Models. For all procedures, an outer EM reconstructs complete panel data from observations and latest parameter estimates. Thus, we concentrate on the estimation quality of the modified KF but also address the issue of incomplete panel data.
The underlying data is simulated as follows: for fixed dimensions (T, N, K, M, p), we draw arbitrary orthonormal matrices, among them V_e ∈ R^{N×N}, from which the FAVAR parameters in (18) arise. Hence, the parameters in (18) specify a general FAVAR instead of its rotated simplification. If the matrices Φ_i, 1 ≤ i ≤ p, do not satisfy the covariance-stationarity of the factor process {[F_t', Y_t']'}, they are redrawn. To avoid matrices Φ_i whose eigenvalues are close to zero, their eigenvalues are drawn from the range [0.25/i, 0.75/i], where the division by i reduces the impact of lagged factors. The restriction to matrices Φ_i with positive eigenvalues and the division by i are made for simplicity only. Based on (1), we construct the factor sample [F, Y] ∈ R^{T×(K+M)}, standardize all univariate time series in [F, Y] and adjust the matrices Φ_i, 1 ≤ i ≤ p, and Σ_v accordingly. Next, we simulate the panel data X ∈ R^{T×N} based on (2) and matrices W and Σ_e of full column rank. Eventually, we standardize all univariate time series in X and adapt the matrices W and Σ_e correspondingly.
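The eigenvalue restriction on the matrices Φ_i can be implemented, for instance, with an orthonormal matrix from a QR decomposition; note that this sketch yields symmetric Φ_i (since V⁻¹ = V' for orthonormal V), and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(1)

def draw_phi(dim, i, rng):
    """Draw Phi_i = V diag(lam) V' with eigenvalues in [0.25/i, 0.75/i],
    where V is orthonormal (QR of a Gaussian matrix)."""
    lam = rng.uniform(0.25 / i, 0.75 / i, size=dim)
    V, _ = np.linalg.qr(rng.standard_normal((dim, dim)))   # orthonormal
    return V @ np.diag(lam) @ V.T                          # eigenvalues = lam

Phi1 = draw_phi(4, 1, rng)
print(np.sort(np.linalg.eigvalsh(Phi1)))   # all within [0.25, 0.75]
```

For p = 1, the spectral radius is at most 0.75 < 1, so the resulting factor process is covariance-stationary by construction; for p > 1, stationarity of the full lag polynomial still has to be checked (and the draw repeated if it fails), as described above.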
At this stage, we have complete panel data X. For ρ_m ∈ [0, 1] as target ratio of gaps, we randomly delete ρ_m T elements from each time series serving as a stock variable. For flow or change in flow variables, we aggregate the given data accordingly (for more details see Ramsauer (2017)), such that we receive a regular pattern of observation times. None of the four methods estimates hidden factors for points in time without any observation. Therefore, we reapply this procedure if the resulting incomplete panel data comprises an empty row.
In the sequel, we focus on the hidden factors F, since the variables Y are observed in full. This is why we determine for each of the four estimation methods the trace R² defined as follows:

trace R² = tr(F' F̂ (F̂' F̂)^{−1} F̂' F) / tr(F' F),

with F̂ as the estimated factors. The trace R² evaluates the quality of the estimated factors. Since its introduction by Stock and Watson (2002a), it has become a common standard in the literature, see for example, Doz et al. (2012) and Bańbura and Modugno (2014). If the hidden factors are perfectly estimated, the trace R² takes the value 1. Otherwise, it is smaller than 1.
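With F̂ as the estimated factors, the trace R² of a regression of F on F̂ can be computed in a few lines (a sketch following the common definition in the cited literature):

```python
import numpy as np

def trace_r2(F, F_hat):
    """Trace R^2 of a multivariate regression of the true factors F on the
    estimated factors F_hat; equals 1 for a perfect fit up to an invertible
    rotation, and is smaller otherwise."""
    proj = F_hat @ np.linalg.solve(F_hat.T @ F_hat, F_hat.T @ F)
    return np.trace(F.T @ proj) / np.trace(F.T @ F)

rng = np.random.default_rng(2)
F = rng.standard_normal((200, 3))
R = rng.standard_normal((3, 3))             # arbitrary invertible rotation
print(round(trace_r2(F, F @ R), 6))         # -> 1.0 (rotation-invariant)
print(trace_r2(F, rng.standard_normal((200, 3))))   # near 0 for unrelated factors
```

The rotation invariance matters here because factors are only identified up to an invertible transformation, as discussed in Section 2.1.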
For the four estimation methods, Tables A1-A4 report the average of the trace R² based on 500 MC samples. We focus on the hidden factors, since the variables Y_t are observed in full and therefore do not call for estimation. In Table A1, we estimate the simulated FAVARs with the non-parametric method of Boivin and Giannoni (2008) and Boivin et al. (2010). In Tables A2 and A3, the EM of Bork (2009) serves as inner EM for the estimation of the model parameters. In Table A2, the data reconstruction part of the outer EM in (17) relies on filtered instead of observed factors Y_t. By contrast, Table A3 directly utilizes the observed factors Y_t. Finally, Table A4 illustrates the average trace R² for our new KF approach. Except for the approach in Table A2, the outer EMs of all other estimation methods take the observed vectors Y_t into account.
All updates in Algorithm A2 stop as soon as the absolute value of the relative change in the expected log-likelihood function is below 10^{−2}. In particular, the termination criterion ξ = 10^{−2} controls the data reconstruction (outer EM). Based on the reconstructed data, the criterion η = 10^{−2} terminates the parameter estimation (inner EM). For instance, Bańbura and Modugno (2014; working paper, 2010) employ 10^{−4} as termination criterion. In our case, decreasing the termination criterion from 10^{−2} to 10^{−4} did not significantly improve the estimation quality of our method, but rather boosted its run time. For all estimation methods, we initialize the first guess of the complete panel data X̄(0) in the same way. That is, for each univariate time series, we fill its gaps with the empirical mean of its observations. Finally, we do not address the selection of K and p here.
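The relative-change termination criterion used for both EMs can be sketched as follows (the epsilon guard against a zero denominator is our addition):

```python
def converged(ll_new, ll_old, tol=1e-2):
    """Stop an EM run when the relative change in the expected log-likelihood
    falls below tol (1e-2 here, as in the simulation study)."""
    return abs(ll_new - ll_old) / (abs(ll_old) + 1e-12) < tol

# A 0.5% relative improvement stops the iteration with tol = 1e-2
print(converged(-1005.0, -1010.0))   # -> True
print(converged(-900.0, -1000.0))    # -> False (10% relative change)
```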
A comparison of Tables A1-A4 shows: First, irrespective of the estimation method, there are no obvious differences between the trace R² means of the three data types. Second, a higher percentage of data gaps, ceteris paribus, deteriorates the trace R² means. Third, longer samples, that is, larger T, improve the trace R² means. The same holds for panel data covering more variables, that is, larger N. Fourth, higher lag orders improve the trace R² means, which is rather surprising. These findings hold for all four estimation methods.
However, some differences exist: First, the estimation methods in Tables A1-A3 require a work-around to take the observability of Y t into account. For instance, the non-parametric approach repeatedly applies PCA and OLS for separating the impacts of Y t and F t on X t from each other. In this regard, the dimensions of the vectors Y t and F t matter. With a view to Tables A1-A3, the pairs (K = 1, M = 1) and (K = 3, M = 3) have smaller trace R 2 means than the pair (K = 3, M = 1). By contrast, the estimation method with our modified KF in Table A4 offers for (K = 1, M = 1, p = 1) larger trace R 2 means than for (K = 3, M = 1, p = 1).
The trace R² means in Table A4 are usually better than their counterparts in Tables A1-A3. For clarity, Tables A5-A7 display the corresponding ratios of trace R² means from Tables A1-A3 and Table A4, respectively. Thereby, ratios larger than one confirm that the estimation method based on our modified KF outperforms the respective alternative. Note that all ratios in Tables A5-A7 are larger than one, but for the previously mentioned pairs (K = 1, M = 1) and (K = 3, M = 3) they exceed one by far. This clearly highlights why it makes sense to take into account that the variables Y_t represent observed factors.

Empirical Application
The US economy ranks among the biggest and most important in the world. Moreover, after many years of declining interest rates, in December 2015 the US Federal Reserve decided to raise the Effective Federal Funds Rate (FEDFUNDS) by 25 basis points (bps). So, it was the first large central bank to leave the path of an extremely relaxed monetary policy. Due to this and, of course, for comparisons with Bernanke et al. (2005), Bork (2009, 2015) and Bork et al. (2010), we deal with the impact of the US monetary policy on its real economy in the sequel. At the beginning, we describe the underlying panel data and observable factors. Then, we briefly summarize some technicalities. Eventually, we discuss the estimated Impulse Response Functions and Forecast Error Variance Decompositions.
The underlying panel data is an update of the one in Bernanke et al. (2005), except for 24 variables, which were not available anymore. This is why we have 96 of the original 120 time series over the period from January 1959 until October 2015. Besides the 96 monthly time series, we have 15 partially incomplete time series. Among other things, we are interested in how monetary policy decisions may affect quarterly indices. For this purpose, the quarterly growth rates of GDP, Governmental Total Expenditures, Real Exports of Goods and Services as well as Real Imports of Goods and Services belong to these 15 new time series.⁷ Monetary policy actions can significantly move Foreign Exchange (FX), especially, if unexpected by markets. As the European Union trades a lot with the US, our data comprises the USD-EUR FX starting in January 1999 and USD FX against the German Mark, French Franc and Italian Lira serving as an approximation for the USD-EUR FX before January 1999. By this means, our data is ragged. Finally, 4 of the 15 new time series offer information about the Federal Reserve Banks' balance sheets, which have dramatically increased since the financial crisis in 2007/2008. In total, we have 111 macroeconomic indicators for diverse areas of the US economy from January 1959 until October 2015. For a detailed overview including sources, data preprocessing and the distinction between slow- and fast-moving variables based on Bernanke et al. (2005), see Appendix C.
The "Quantitative Easing" programs QE1-QE3 were the response of the Federal Reserve to the problems arising from the financial crisis, after stimulating the economy by lowering the Effective Federal Funds Rate reached its limits in December 2008. For instance, the Federal Reserve massively bought Treasuries and mortgage-backed securities. To obtain a comprehensive picture of the monetary policy actions, the observable factor Y t consists of Currency in Circulation (CURRCIR), St. Louis Adjusted Monetary Base (AMBSL) and Effective Federal Funds Rate (FEDFUNDS). Our estimation method for FAVARs requires the time series {Y t } to be complete. Therefore, holdings of Treasuries and mortgage-backed securities, which were only available for the years from 2002 until 2015, belong to the panel data.
In Section 3, we aimed at demonstrating the advantages of our updated KF compared to the standard approach. For comparisons of our empirical results with Bork (2009), we now perform the same data pre-processing as originally proposed by Bernanke et al. (2005). In particular, we also distinguish between slow- and fast-moving variables. As soon as the sorting of complete, slow-moving variables has been finished, we repeat this procedure for complete, fast-moving ones, before we add all ragged time series in arbitrary order. Our technical settings are: T = 682, M = 3, K̄ = 10, p̄ = 5, η = 0.01 and ξ = 0.01. Thus, the termination criteria are not too strict and the run time of Algorithm A2 remains reasonable. An AIC-based model selection (Ramsauer 2017) yields: (K*, p*) = (9, 1). In this way, we have larger factor dimensions K and M but a smaller lag order than Bork (2009). Because of this, Table 1 compares the first nine variables of our sorted panel data with their counterparts in Bork (2009). Thereby, we keep the long expressions of Bork (2009) in the second column and apply our abbreviations from Appendix C in the third column. At first glance, both subsets cover the same areas. That is, Bork (2009) has four time series of the group "Real Output and Income", three time series belonging to "(Un)employment and Hours", one time series from "Consumption" and one from "Price Indices". Similarly, our subset consists of one, four, one and three, respectively, of those time series.

⁷ We regard the four quarterly growth rates as sum versions of flow variables, while all other time series serve as stock variables. For the 107 monthly time series, there is no distinction between stock, flow and change in flow variables. Although some time series start at a later point in time, for example, the USD-EUR FX, or are discontinued, for example, the German Mark-USD FX, there are no intermediately missing observations.
The main deviation arises from the larger number of price indices we are working with instead of production data. However, we should keep in mind that some differences possibly arise from the fact that some time has passed since the work of Bork (2009). Furthermore, the panel data does not completely match. Note that the different loadings constraints are irrelevant for this pre-analysis.

Next, we focus on the shock impact on the FAVAR variables. A properly chosen MA(∞) representation of the [F_t, Y_t] dynamics implies that each factor is driven by its own innovations and those of the preceding factors; for details, see Ramsauer (2017). Thus, we obtain the corresponding innovation weights. On this basis, we derive confidence intervals for the IRFs. There are diverse methods to construct those. For example, Bernanke et al. (2005) and Boivin et al. (2010) used the bias-adjusted bootstrap approach of Kilian (1998). Similarly, Yamamoto (2012) presented bootstrap routines with bias correction. Due to its unknown asymptotic properties, Benkwitz et al. (1999) raised doubts concerning the approach of Kilian (1998) and recommended the use of standard bootstrap techniques instead. For instance, Bork et al. (2010) applied the standard bootstrap method. Alternatively, Bai et al. (2015) derived closed-form expressions for the asymptotic distributions of IRFs. Since the idiosyncratic errors of their measurement equation are uncorrelated, we cannot use the findings of Bai et al. (2015) here. For simplicity, we revert to a non-parametric bootstrap method without any bias correction.
Reestimation of latent factors and data incompleteness offer some flexibility, which is why we briefly sketch our bootstrap method: First, we estimate the FAVAR parameters with the loadings constraints taken into account and thus obtain the error residuals. To gain reliable confidence intervals, we run 10,000 bootstrap simulations. For each path, we randomly draw with replacement from the recentered errors and keep the first p estimates and observations, respectively, of the vector [F_t, Y_t]. Thereby, no model selection takes place, that is, a VAR(1) is estimated. Then, we derive the IRFs of [F_t, Y_t]_i for 1 ≤ i ≤ K + M. For the IRFs of X_t, we fix the initially estimated loadings matrix. In this manner, we ignore the uncertainty inherent in the bootstrapped panel data.
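The bootstrap just described can be sketched in a few lines. The following Python snippet is a minimal, self-contained illustration under simplifying assumptions, not our actual implementation: it fits a VAR(1) to a generic vector x_t (standing in for the joint factor vector) by ordinary least squares, identifies the shock recursively via a Cholesky decomposition, resamples the recentered residuals with replacement while keeping the first observation fixed, and skips any bias correction. All function names are ours.

```python
import numpy as np

def var1_irf(A, B, horizon):
    """IRFs of a VAR(1) x_t = A x_{t-1} + B z_t for horizons 0..horizon."""
    k = A.shape[0]
    irfs = np.empty((horizon + 1, k, B.shape[1]))
    power = np.eye(k)
    for h in range(horizon + 1):
        irfs[h] = power @ B
        power = power @ A
    return irfs

def bootstrap_irf_bands(x, horizon=48, n_boot=1000, levels=(0.68, 0.90), seed=0):
    """Non-parametric residual bootstrap (no bias correction) for VAR(1) IRFs."""
    rng = np.random.default_rng(seed)
    T, k = x.shape
    X, Y = x[:-1], x[1:]
    A = np.linalg.lstsq(X, Y, rcond=None)[0].T            # OLS point estimate
    resid = Y - X @ A.T
    resid -= resid.mean(axis=0)                           # recenter the residuals
    B = np.linalg.cholesky(np.cov(resid.T))               # recursive identification
    draws = np.empty((n_boot, horizon + 1, k, k))
    for b in range(n_boot):
        xb = np.empty_like(x)
        xb[0] = x[0]                                      # keep the first observation
        idx = rng.integers(0, T - 1, size=T - 1)          # draw with replacement
        for t in range(1, T):
            xb[t] = A @ xb[t - 1] + resid[idx[t - 1]]
        Ab = np.linalg.lstsq(xb[:-1], xb[1:], rcond=None)[0].T
        rb = xb[1:] - xb[:-1] @ Ab.T
        Bb = np.linalg.cholesky(np.cov((rb - rb.mean(0)).T))
        draws[b] = var1_irf(Ab, Bb, horizon)
    point = var1_irf(A, B, horizon)
    bands = {lv: (np.quantile(draws, (1 - lv) / 2, axis=0),
                  np.quantile(draws, (1 + lv) / 2, axis=0)) for lv in levels}
    return point, bands
```

In the article's setting, the resulting factor IRFs would afterwards be mapped to the panel data X_t with the initially estimated loadings matrix held fixed.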
Similar to Bernanke et al. (2005), Bork et al. (2010) and Bork (2015), Figure A1 illustrates the impact of the shock z on the standardized variables. Our confidence intervals cover confidence levels of 68% (light gray) and 90% (dark gray) for a time horizon of 48 months. To be more precise, Figure A1 displays the IRFs for the time series 1 ≤ i ≤ N and the factors 1 ≤ j ≤ K + M. Based on Figures A1–A4, housing starts (HOUST, HOUSTNE, HOUSTMW, HOUSTS, HOUSTW, PERMITNSA) are supposed to increase over the next 48 months. Perhaps this reflects that people are afraid of additional interest rate hikes and therefore bring such projects forward. Since the Effective Federal Funds Rate applies to the whole US, regional aspects do not matter in the case of housing starts. In the short term, fewer new orders (NAPMNOI) increase manufacturing inventories (NAPMII), which also confirms a reduction in consumption. In the long run, higher interest rates require companies to offer higher dividends (FSDXP), but boost their costs, too. For example, the same amount of debt calls for higher interest rate payments. In total, the price-earnings ratio (FSPXE) naturally decreases.
Except for EXCAUS and EXITUS, the United States Dollar (USD) becomes stronger compared to foreign currencies (EXSZUS, EXJPUS, EXUSUK, EXGEUS, EXFRUS, EXUSEU). Note that EXITUS is the FX rate between the Italian Lira, which the Euro succeeded, and the USD. Thus, it is not relevant anymore. Here, it is part of our panel data, as EXGEUS, EXFRUS and EXITUS serve as approximations for EXUSEU before the Euro was introduced on 1 January 1999. A stronger USD may come from an increased demand for USD, when investors increase their exposure to US fixed-income products. For instance, US Treasury yields (TB3MS, TB6MS, GS1, GS5, GS10, TB3SMFFM, TB6SMFFM, T1YFFM, T5YFFM, T10YFFM) and corporate bond spreads (AAA, BAA, AAAFFM, BAAFFM) follow an increase in FEDFUNDS.
The drops in M1SL, TOTRESNS, BUSLOANS and NONREVSL let the available liquidity shrink, which is exactly what the US Federal Reserve is aiming at. In addition, prices and inflation (NAPMPRI, PPIFGS, PPIITM, PPICRM, CPIAUCSL, CPIAPPSL, CPITRNSL, CUSR0000SAC, CUSR0000SAD, CUSR0000SA0L2, CUSR0000SA0L5) climb in the long term, such that the US economy eventually leaves its crisis mode and returns to normal. This assessment is supported by the rising composite leading indicator MEI and by GDP. Although there are no long-term effects on the export and import of goods and services (EXPGSC1, IMPGSC1), both decrease. The reduced export might arise from the strong USD, which makes US products more expensive abroad. By contrast, the strong USD reduces the USD prices of foreign products. Hence, the drop in USD prices is not offset by a larger amount of imported products.

Conclusions and Final Remarks
This article considers the estimation of FAVARs when the underlying panel data is incomplete. Thereby, incompleteness arises from the inclusion of mixed-frequency information and the absence of single values. Besides the panel data, a FAVAR comprises observable variables which, together with hidden factors, drive the joint factor dynamics. So far, the presented estimation method calls for complete time series of the observable factors. Therefore, an extension to incompletely observed factors is a direction for future research.
Within a maximum likelihood framework, a fully parametric two-step routine simultaneously estimates the unknown model parameters and the missing data. In a nutshell, two expectation-maximization algorithms are applied alternately until a pre-specified convergence criterion is reached. The first derives complete data from the observations and the latest parameter estimates, whereas the second re-estimates the parameters whenever the complete data changes. A Monte Carlo simulation study confirms the superior estimation quality of the suggested approach compared to already existing methods.
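To convey the flavor of this alternating scheme, the following Python sketch applies the same idea to a simple static factor model: missing panel entries are repeatedly filled from the current rank-K fit (expectation step) before the factors and loadings are re-estimated by PCA (maximization step). This is a stylized stand-in for the two EM algorithms of the paper, not their implementation; the function name and all details are illustrative.

```python
import numpy as np

def em_impute_factors(X, k, max_iter=100, tol=1e-6):
    """EM-style imputation for a static factor model X ≈ F L' (illustrative).

    Alternates between (E) filling missing entries from the current rank-k fit
    and (M) re-estimating factors and loadings by PCA, until the imputed
    values settle.
    """
    mask = np.isnan(X)
    Xc = np.where(mask, np.nanmean(X, axis=0), X)   # initialize with column means
    for _ in range(max_iter):
        mu = Xc.mean(axis=0)
        U, s, Vt = np.linalg.svd(Xc - mu, full_matrices=False)
        fit = U[:, :k] * s[:k] @ Vt[:k] + mu        # rank-k reconstruction
        new = np.where(mask, fit, X)                # only missing entries change
        if np.max(np.abs(new - Xc)) < tol:
            Xc = new
            break
        Xc = new
    factors = U[:, :k] * s[:k]
    loadings = Vt[:k].T
    return Xc, factors, loadings
```

In the article, the corresponding steps additionally respect the factor dynamics, the mixed frequencies and the loadings constraints; the snippet only mirrors the alternating structure.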
The main contributions of this paper to the existing literature are as follows: First, we extend the FAVAR of Bernanke et al. (2005) to incomplete panel data. Marcellino and Sivec (2016) did the same, but their estimation method requires the observable factor components to be part of the panel data. By contrast, we modify the Kalman filter such that it takes into account that the factors are partially observed and can therefore relax their restriction.
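The idea behind the modified filter can be illustrated with a single Kalman update in which the observed factor components Y_t enter as noiseless measurements stacked on top of the panel equation X_t = Λ [F_t, Y_t] + e_t. The following Python sketch is a stylized illustration of this mechanism under simplifying assumptions (time-invariant matrices, jointly Gaussian states), not the paper's exact filter; the function and variable names are our own.

```python
import numpy as np

def kf_step_partially_observed(a_pred, P_pred, x, y, Lam, R, sel):
    """One Kalman update where part of the state (Y_t) is observed exactly.

    The observed components enter as noiseless measurements via the selection
    matrix `sel`, stacked on top of the panel equation x_t = Lam a_t + e_t
    with measurement noise covariance R.
    """
    H = np.vstack([sel, Lam])                 # stacked measurement matrix
    z = np.concatenate([y, x])                # stacked observation vector
    Rz = np.zeros((H.shape[0], H.shape[0]))
    Rz[len(y):, len(y):] = R                  # zero noise for exact observations
    S = H @ P_pred @ H.T + Rz                 # innovation covariance
    K = np.linalg.solve(S, H @ P_pred).T      # Kalman gain P H' S^{-1}
    a_filt = a_pred + K @ (z - H @ a_pred)
    P_filt = (np.eye(len(a_pred)) - K @ H) @ P_pred
    return a_filt, P_filt
```

Conditioning on the noiseless rows forces the filtered estimate of the observed components to coincide with Y_t and sets their posterior variance to zero, which is exactly what distinguishes this update from treating Y_t as just another noisy panel series.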
Second, the presented estimation method adds flexibility to the loadings matrix. As mentioned before, in Bork (2009) the observable factors are included in the panel data. In doing so, they occupy certain positions, which calls for a specific shape of the loadings matrix, but allows Bork (2009) to apply estimation methods for dynamic approximate factor models to the estimation of the FAVAR of Bernanke et al. (2005). A main advantage of our new Kalman filter is that we only have to choose a few loadings constraints to ensure parameter uniqueness, but there is no need for a special structure of the loadings matrix.
Third, we explicitly separate the observable factors from the latent ones. Because of this, we determine all results for the general case of an arbitrary autoregressive order p ≥ 1. That is, we do not use the argument that any VAR of order p ≥ 1 can be rewritten as a VAR(1) and then treat only this simplest case. Therefore, our results can be applied directly without any adjustments.
Fourth, the inclusion of mixed-frequency data enables us to investigate the impact of the monetary policy on quarterly indicators like GDP. For instance, our empirical study considers the US economy. Based on a sample, which covers 108 macroeconomic variables and a three-dimensional vector of observable factors over a period from January 1959 until October 2015, we come to the conclusion that GDP gains from an increase in the Effective Federal Funds Rate by 0.25% in the long term.
In the recent literature, FAVARs were primarily used in the context of monetary policy. However, the extraction of relevant information from big data is already an overarching topic. Therefore, the application of FAVARs to areas beyond monetary policy (e.g., customer behavior/churn, macroeconomic forecasting, diagnosis of diseases) based on the proposed estimation method could be part of future research. In addition, our approach may be extended to serially correlated errors, such that the overall framework admits cross-sectionally and serially correlated error terms.
In the case of monetary policy, a comprehensive comparison of the presented approach with Multivariate State-space Time-varying Parameter VARs (MVSS-TVP-VARs), Dynamic Stochastic General Equilibrium Models (DSGEs), Bayesian VARs and their extensions as in Paccagnini (2014, 2015) could be performed. In this regard, some of them must first be extended to ragged panel data. Furthermore, the seemingly unrelated time series equations for MVSS-TVP-VARs in Bekiros and Paccagnini (2015) rely on the univariate version of the standard Kalman Filter and consider the observable variables Y_t independently. Finally, in their setting the vector Y_t must be part of the panel data X_t. Therefore, the most important direction of future research could be the combination of the models in Paccagnini (2014, 2015) with our proposed Kalman Filter for the joint vector [F_t, Y_t] based on the panel data X_t.
Author Contributions: M.L. and F.R. analyzed data and drafted a first estimation method. A.M. and F.R. further developed the model and associated estimation procedure. M.L. and F.R. performed the complete computational implementation. All three authors wrote the paper.

Funding:
The PhD position of Franz Ramsauer at Technical University of Munich was third-party funded by Pioneer Investments, which is now part of Amundi Asset Management. Otherwise, this research received no external funding.

Acknowledgments:
The authors want to thank the editor and the two anonymous reviewers for their very helpful suggestions, which essentially contributed to the improvement of our manuscript. The authors gratefully acknowledge Alec Chrystal for his help on monetary policy references. Franz Ramsauer gratefully acknowledges the support of Pioneer Investments, which is now part of Amundi Asset Management, during his doctoral phase.

Conflicts of Interest:
The authors declare no conflict of interest. The sponsors had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript and in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript:

Appendix C. Underlying Data
Except for a few time series, which were not available anymore, and some new, in particular incomplete, ones, this data is an updated version of the one in Bernanke et al. (2005). In this context, not available refers to time series which we could not find anymore, as opposed to discontinued ones.
For clarity reasons, we distinguish between the following categories: real output and income; employment and hours; consumption; housing starts and sales; real inventories, orders and unfilled orders; stock prices; foreign exchange rates; interest rates; money and credit quantity aggregates; price indices; average hourly earnings; miscellaneous; mixed-frequency time series; observed variables Y_t.
The total sample ranges from January 1959 to October 2015 and is updated monthly. However, it also comprises quarterly time series, marked by "q" in the column Freq., as well as shorter time series, as indicated in the column Time span. For example, see the time series MBST with its first observation in December 2002.
With footnote 6 in mind, the assumed data types in the column Type are: stock (1), sum version of flow variable (2), average version of flow variable (3), sum version of change in flow variable (4) and average version of change in flow variable (5). Note that for complete time series the data type does not matter, since all types yield an identity matrix for the matrix Q_i.
Regarding the data transformations in the scope of the preprocessing phase, the column Trans. distinguishes between: no transformation (1), first difference (2), second difference (3), logarithm (4) and first difference of logarithm (5). This classification is in accordance with Bernanke et al. (2005).
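As an illustration, the five transformation codes can be implemented as follows. This is a hypothetical helper written for this description, not part of the original preprocessing code.

```python
import numpy as np

def transform(series, code):
    """Apply the transformation codes of column Trans.:
    1 no transformation, 2 first difference, 3 second difference,
    4 logarithm, 5 first difference of logarithm."""
    x = np.asarray(series, dtype=float)
    if code == 1:
        return x
    if code == 2:
        return np.diff(x)
    if code == 3:
        return np.diff(x, n=2)
    if code == 4:
        return np.log(x)
    if code == 5:
        return np.diff(np.log(x))
    raise ValueError(f"unknown transformation code {code}")
```

For example, code 5 turns a price index into (approximate) period-over-period growth rates, which is the usual choice for the price series in the panel.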
Besides the series number, the first K variables of the sorted data provide their position number in Bork (2009) in brackets. An asterisk * next to an abbreviation marks the respective variable as slow-moving (Bernanke et al. 2005). Thereby, slow-moving variables are not supposed "to respond contemporaneously to unanticipated changes in monetary policy", whereas fast-moving variables are allowed "to respond contemporaneously to policy shocks". As most of our data comes from the research database of the Federal Reserve Bank of St. Louis, the Uniform Resource Locator (URL) "http://research.stlouisfed.org/fred2/series" is abbreviated by "fred".
The column Series description provides information on how publication delays are taken into account and highlights seasonality adjustments: Seasonally Adjusted (SA) and Not Seasonally Adjusted (NSA).

Figure A2. IRFs (black lines) of the standardized time series in Appendix C arising from an increase in FEDFUNDS by 0.25%. Light gray areas show the 68%-confidence intervals (i.e., 1-σ intervals), dark gray areas display the 90%-confidence intervals. All intervals are based on 10,000 non-parametric bootstrap simulations of the transition equation, where the estimated loadings matrix is kept fixed.

Figure A3. IRFs (black lines) of the standardized time series in Appendix C arising from an increase in FEDFUNDS by 0.25%. Light gray areas show the 68%-confidence intervals (i.e., 1-σ intervals), dark gray areas display the 90%-confidence intervals. All intervals are based on 10,000 non-parametric bootstrap simulations of the transition equation, where the estimated loadings matrix is kept fixed.

Figure A4. IRFs (black lines) of the standardized time series in Appendix C arising from an increase in FEDFUNDS by 0.25%. Light gray areas show the 68%-confidence intervals (i.e., 1-σ intervals), dark gray areas display the 90%-confidence intervals. All intervals are based on 10,000 non-parametric bootstrap simulations of the transition equation, where the estimated loadings matrix is kept fixed.