Algorithmic Modelling of Financial Conditions for Macro Predictive Purposes: Pilot Application to USA Data

Abstract: Aggregate financial conditions indices (FCIs) are constructed to fulfil two aims: (i) the FCIs should resemble non-model-based composite indices in that their composition is adequately invariant for concatenation during regular updates; (ii) the concatenated FCIs should outperform financial variables conventionally used as leading indicators in macro models. Both aims are shown to be attainable once an algorithmic modelling route is adopted to combine leading indicator modelling with the principles of partial least-squares (PLS) modelling, supervised dimensionality reduction, and backward dynamic selection. Pilot results using US data confirm the traditional wisdom that financial imbalances are more likely to induce macro impacts than routine market volatilities. They also shed light on why the popular route of principal-component based factor analysis is ill-suited for the two aims.


Introduction
The 2008 global financial crisis (GFC) has drawn macroeconomists' attention to a major weakness of extant macro models: the lack of variables which adequately represent broad financial market conditions and are proven as aggregate predictors of financial shocks to key macro variables, e.g., Gadanecz and Jayaram (2009), Barnett (2012), Ng (2011), Borio (2011, 2013), and Morley (2016). 1 In acknowledgment of this weakness, there is a visible growth of research into the construction of financial conditions indices (FCIs) that could serve as predictors in macro models. Many of these aggregate FCIs are model-based, as there is no suitable way of weighing up financial market variables and indices across different financial markets. The most popular modelling approach is to construct FCIs by means of principal-component-based factor analysis following the seminal works by Stock and Watson (1989, 2002), e.g., see Hatzius et al. (2010), Brave and Butters (2011), Moccero et al. (2014), Chauvet et al. (2015), Levanon et al. (2015), and Giglio et al. (2016).
Evaluation of the existing FCIs has yielded mixed results, e.g., see Aramonte et al. (2017). A key problem is lack of concatenation, i.e., the preceding values of the FCIs fail to remain invariant when the models from which they have been derived are updated with incoming new data, see Stock and Watson (2009) and Kotchoni et al. (2019). The lack of concatenation precludes the practical usefulness of such FCIs, since concatenation constitutes a fundamental measurement property, see Markus and Borsboom (2013, chp. 2) for more discussion on this attribute in the classical theory of measurement. Although Stock and Watson (2011) show that factor invariance is technically achievable when the indicator set is sufficiently large and individual indicators fall homogeneously into certain idealised distributions, these requirements are practically unachievable. The dynamic features of available financial indicators are so heterogeneous and time-varying that the numbers of indicators found with significant factor loadings (i.e., estimated weights) fall far below a 'sufficiently large' number in practice.
This study explores a new route of FCI construction by assimilating knowledge and methods from three research fields outside econometrics: partial least-squares (PLS) regression in management and marketing research, 2 measurement theory in psychology (psychometrics), and supervised versus unsupervised data dimensionality reduction in machine learning. 3 Research insights from these fields reveal that constructing FCIs by principal component analysis (PCA) is a misconceived route, because FCI construction amounts to composite index making. Models used for composite index making are classified as causally formative models, as opposed to causally reflective models in measurement theory. 4 It should be noted that causality here refers to the disaggregates-to-aggregate relationship, a context different from the one in which causality is commonly conceived in the econometrics literature, namely the causal direction of variables at the same aggregative level. PCA is based on the criterion of maximising a uniquely shared variance and therefore suits, at best, the task of measuring a certain latent but commonly shared cause reflected in observed indicators (hence 'reflective' models). This criterion is unsuitable for weight determination in composite index making by formative models because there is no common latent cause. In the formative case, observed indicators are selected to represent distinctly different facets of the composite index of interest. Formative modelling is hence more challenging than reflective modelling, as it requires more than a single criterion, see Markus and Borsboom (2013, Part II), Howell et al. (2013), and Howell (2014). One popular criterion is a predictive target, as led by research in PLS regression modelling. This method is referred to as supervised dimensionality reduction in machine learning, as opposed to unsupervised dimensionality reduction with which PCA is associated, see Cunningham and Ghahramani (2015).
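The contrast between unsupervised and supervised weighting can be illustrated with a small simulation. The sketch below uses made-up data and is not the authors' procedure: the three simulated indicators, the seed, and all variable names are assumptions for illustration. Two indicators share a common component that is unrelated to the target, while a third indicator drives the target. The first principal component loads on the shared variance, whereas the first PLS weight vector, which is proportional to the covariance between the indicators and the target, loads on the indicator that actually predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500

# Hypothetical indicators: x1 and x2 share a common component, x3 is independent
common = rng.standard_normal(T)
x1 = common + 0.3 * rng.standard_normal(T)
x2 = common + 0.3 * rng.standard_normal(T)
x3 = rng.standard_normal(T)
y = x3 + 0.5 * rng.standard_normal(T)   # the target depends on x3 only

# Standardise indicators and centre the target
X = np.column_stack([x1, x2, x3])
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = y - y.mean()

# Unsupervised: first principal component of the indicator correlation matrix
eigvals, eigvecs = np.linalg.eigh(X.T @ X / T)
pca_w = eigvecs[:, -1]                   # eigh sorts eigenvalues in ascending order

# Supervised: first PLS weight vector, proportional to X'y
pls_w = X.T @ y
pls_w = pls_w / np.linalg.norm(pls_w)

print("PCA weights:", np.round(pca_w, 2))
print("PLS weights:", np.round(pls_w, 2))
```

In this setting the PCA weight on the third indicator is close to zero while its PLS weight is close to one, which is the sense in which an explicit predictive target changes the aggregation.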
5 The above discussion helps explain why progress has remained slow in the econometric research of leading indicator construction over the last century, see Marcellino (2006). Single-minded pursuit of the PCA route is unlikely to achieve the goal of producing, from large financial data sets, aggregate measures that are practically useful for macroeconomic prediction. 6 Scrutinising the relationships between disaggregate financial indicators and macro aggregates is a prerequisite for any effective attempt at leading indicator construction. Such scrutiny reveals that the causality involved in aggregating financial shocks is even more complex than what the dichotomy of reflective versus formative models depicts. Dynamic interactions among indicators imply reactive or reciprocal causality, see Hayduk et al. (2007). Expounding these causal links among indicators requires a well-designed algorithm for indicator selection and classification. This task leads us to embrace concepts and statistical techniques from machine learning, e.g., Hastie et al. (2009), and Shalev-Shwartz and Ben-David (2014). 7 The algorithm proposed in this study follows the spirit of supervised dimensionality reduction. Our aim is to construct, from a relatively large set of financial indicators, unidimensional, partial, and leading indices for macro variables of predictive interest. The algorithm allows for asynchronous dynamics among financial market indicators during the dimensionality reduction process and imposes time-wise concatenation during the data-updating process. The resulting FCIs replace financial variables conventionally used in macro models to evaluate the performance of the FCI-based models against their conventional counterparts. The LSE general-to-specific dynamic model reduction approach (see Hendry 1995) is adopted for both the FCI construction step and the macro modelling step, to reduce the risk of overfitting.
Since this approach methodologically accords with backward selection via certain regularisation rules in feature selection of machine learning, it is referred to simply as backward dynamic selection hereafter. To test the algorithm, three USA macro variables are chosen as forecasting targets: inflation and the growth rates of industrial production (IP) and GDP. Long-term and short-term interest rates are used to represent the monetary/financial sector in the baseline model. Over thirty financial variables of monthly frequency are used to construct FCIs.
Econometrics 2022, 10, 22
The experiments yield the following results: (a) The problem of unstable loadings in financial indicator aggregation can be largely resolved by a targeted (supervised) dimensionality reduction approach in combination with asynchronous dynamics among indicators. (b) Indicators capturing short-run volatility are deselected in most cases while indicators capturing financial imbalances survive in diverse lag forms through the reduction process. (c) The forecasting performance of FCI-based macro models is sensitive to selected targets with the case of IP growth being a clear success, whereas inflation is found to be relatively too distant from FCI shocks. These findings demonstrate how intimately related the issues of aggregation, dynamics, specification of macro targets, and input feature design and selection are.
The paper is organised as follows. A detailed description of the algorithmic model design is given in Section 2. Section 3 describes the data used in the pilot experiments. Section 4 reports the findings and Section 5 considers potential further research.

Algorithmic Model Design
The algorithmic model design is introduced in two steps. Section 2.1 introduces the FCI-based macro model and the baseline model against which it is compared. Section 2.2 describes the algorithm for FCI construction and updating. The algorithm is designed to satisfy a multifaceted aim: constructing aggregates that have adequate financial market coverage such that they improve the forecasting capacity of macro models where such coverage is lacking. This aim can be dissected into three key requirements. First, the selection of financial indicators should be exhaustive to ensure adequate representation of different facets of the markets. Second, the resulting FCIs should possess the fundamental measurement attribute of time-wise concatenation. Third, FCIs should be a leading cause of macro variables of predictive interest.
The first two requirements classify FCIs as composite aggregates. As mentioned in the previous section, composite measurement making entails more criteria than what is needed in dimensionality reduction of the reflective or common effect case, which is widely tackled by the PCA route. An explicit predictive target comes in naturally as one criterion. This puts FCI construction into the genre of supervised data reduction. Supervised data reduction focuses attention on dynamic features of data. For instance, aggregately redundant information is likely to exist among financial indicators across different markets. Further, not all indicators impact on the forecasting targets in a dynamically synchronised way and the dynamic features or ways via which indicators impact on the targeted macro variables can be more complex than what a simple linear model of those indicators can capture. 8 The proposed algorithm is designed to tackle these issues using methods inspired by PLS regression modelling and backward dynamic selection.

Macro Model Setting and Model Training
The baseline macro model takes the form of an error-correction model (ECM), arguably the most popular type of macro-econometric model. Specifically, an ECM, referred to as model (1), is used for a particular macro variable of predictive interest, $y_t$. In (1), $\Delta$ denotes a one-period difference, $n$ is the lag length, $e_{t-1}$ is the error-correction term, $v_t$ is the model residual term, $X_t$ represents real-sector variables which co-trend with $y_t$, and $R_t$ denotes interest rates representing financial or banking sector conditions. The postulate that $R_t$ is inadequate in representing the financial markets motivates the FCI construction and the proposition of an FCI-based model, referred to as model (2), as an alternative to (1). In (2), $f^*_t$ is a latent variable representing the aggregate financial market conditions. Two types of targets for FCI construction are identified from (2). One is $\Delta y_t$, an obvious target, referred to as the short-run target. The other is embedded in $e^*_{t-1}$, which is effectively a leading indicator for $\Delta y_t$, and is referred to as the long-run target.
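For orientation, a generic single-equation ECM consistent with the symbols defined above might be written as follows. This is a sketch under stated assumptions rather than the authors' equations (1) and (2): the coefficient symbols $\alpha_i$, $\beta_i$, $\gamma_i$, $\lambda$, the lag ranges, and the way $R_t$ and $f^*_t$ enter (in differences) are illustrative choices. Only the error-correction term $y_t - \kappa_1 X_t$ is taken from the text.

```latex
% Assumed generic form of the baseline ECM; coefficient symbols are illustrative
\Delta y_t = \alpha_0 + \sum_{i=1}^{n} \alpha_i \Delta y_{t-i}
           + \sum_{i=0}^{n} \beta_i \Delta X_{t-i}
           + \sum_{i=0}^{n} \gamma_i \Delta R_{t-i}
           + \lambda e_{t-1} + v_t,
\qquad e_{t-1} = y_{t-1} - \kappa_1 X_{t-1}

% Assumed FCI-based counterpart: R_t is replaced by the latent f^*_t
\Delta y_t = \alpha^*_0 + \sum_{i=1}^{n} \alpha^*_i \Delta y_{t-i}
           + \sum_{i=0}^{n} \beta^*_i \Delta X_{t-i}
           + \sum_{i=0}^{n} \gamma^*_i \Delta f^*_{t-i}
           + \lambda^* e^*_{t-1} + v^*_t
```

Under this reading, the short-run target is the left-hand side $\Delta y_t$, while the long-run target sits inside the error-correction term.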
After FCI construction, both models (1) and (2) are reduced into data-permissible parsimonious models via backward selection over a model training period. Models (1) and (2) are re-estimated at regular updating intervals and the possibility of model respecification is examined. The examination leads to an additional model selection criterion: invariant lag structure.
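A minimal sketch of backward selection by OLS is given below, assuming a simple rule that repeatedly drops the regressor with the smallest absolute t-ratio until all remaining ones exceed a threshold. The function name `backward_select`, the threshold `t_crit`, and the simulated data are assumptions for illustration. The general-to-specific approach referred to in the text also involves diagnostic testing and the invariant-lag-structure criterion, neither of which this sketch implements.

```python
import numpy as np

def backward_select(y, X, t_crit=2.0):
    """Drop the weakest regressor until every |t-ratio| is at least t_crit.

    Assumes y and the columns of X are already demeaned (no intercept).
    Returns the retained column indices and their OLS coefficients.
    """
    keep = list(range(X.shape[1]))
    while keep:
        Xk = X[:, keep]
        beta, *_ = np.linalg.lstsq(Xk, y, rcond=None)
        resid = y - Xk @ beta
        s2 = resid @ resid / (len(y) - len(keep))        # residual variance
        se = np.sqrt(s2 * np.diag(np.linalg.inv(Xk.T @ Xk)))
        t_abs = np.abs(beta / se)
        weakest = int(t_abs.argmin())
        if t_abs[weakest] >= t_crit:
            return keep, beta
        del keep[weakest]                                # remove and re-estimate
    return keep, np.array([])

# Simulated example: only columns 0 and 2 matter
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 4))
y = 1.5 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.standard_normal(200)
keep, beta = backward_select(y, X)
print(keep, np.round(beta, 2))
```

Re-estimating after every deletion matters because t-ratios of the remaining regressors change once a correlated regressor is removed.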
The relative forecasting performance of (2) against (1) is then assessed by comparison of the mean squared forecasting errors (MSFE) and two types of forecast encompassing tests. One tests for MSFE dominance following Harvey et al. (1997, 1998) and the other tests for forecast-differential encompassing following Ericsson (1992, 1993) and Clements and Hendry (1993). It should be noted that MSFE dominance is a necessary but not sufficient condition for forecast encompassing, see Ericsson (1992). The latter test is also robust in the presence of integrated variables and hence provides a better assessment of the forecasting models compared here. The forecast encompassing tests are based on 1-step to 6-step ahead forecasts generated over the forecasting horizon, using observed data of the independent variables and forecasts of the dependent variable.
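The two comparison devices can be sketched in a few lines. The code below is an illustration only: it computes the MSFE and the slope from regressing the baseline model's forecast errors on the forecast differential, which is the regression underlying forecast-differential encompassing. The function names and the toy numbers are assumptions, and the test statistics and small-sample corrections of the cited papers are not implemented.

```python
def msfe(actual, forecast):
    """Mean squared forecasting error."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(e * e for e in errors) / len(errors)

def encompassing_slope(actual, f_base, f_alt):
    """Slope from regressing base-model errors on (f_alt - f_base), no intercept.

    A slope near zero suggests the base forecasts encompass the alternative;
    a slope near one suggests the alternative carries all the useful information.
    """
    e_base = [a - f for a, f in zip(actual, f_base)]
    diff = [fa - fb for fa, fb in zip(f_alt, f_base)]
    return sum(e * d for e, d in zip(e_base, diff)) / sum(d * d for d in diff)

# Toy numbers (hypothetical): the alternative forecasts are exact
actual = [1.0, 2.0, 3.0, 4.0]
f_base = [1.5, 2.5, 3.5, 4.5]
f_alt = [1.0, 2.0, 3.0, 4.0]
print(msfe(actual, f_base), msfe(actual, f_alt))
print(encompassing_slope(actual, f_base, f_alt))
```

In an application, the slope would be accompanied by a standard error so that the null of a zero slope can be tested.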

Algorithm for FCI Construction
Utilising the long-run and short-run targets identified in the previous sub-section, this sub-section introduces a new way to construct FCIs as feasible measures of $f^*_t$ in (2), based on the principle of PLS regression. The long-run disequilibrium correction term, $(y_t - \kappa_1 X_t)$, in (1) is used as a long-run target. 9 Denoting observed financial indicators as $i_{j,t}$ and the total number of indicators as $m$, a partial regression model, (3), is run for each target variable in the finite distributed lag form to allow for asynchronous dynamics among indicators. Only the term with the single largest weight in absolute terms, $\hat{\varphi}_{j,k^*}$, estimated by the PLS algorithm, is selected for each indicator from (3). From the selected terms and weights the long-run FCI, $f^L_t$, is constructed in (4). Although $\Delta f^L_{t-j}$ can be used as a measure of $\Delta f^*_{t-j}$ in (2), this measure is rather restrictive in that it rules out the possibility that some $i_{j,t}$ may only exert a short-run impact on $\Delta y_t$ and/or the dynamics of their short-run impact may be more complicated than what $\hat{\varphi}_{j,k^*}$ can capture. Hence, a short-run FCI, $f^S_t$, using $\Delta y_t$ as the target is constructed in a manner similar to (3) and (4), through (5) and (6).
Here, $L$ denotes the lag operator. The $\hat{\omega}_j(L)$ are obtained via backward selection from (5) by OLS, and the $m^* \le m$ input indicators with at least one significant $\hat{\omega}_{i,j}$ enter the short-run FCI. Two differences should be noted in the dynamic specifications between the long-run and the short-run FCI construction. First, (5) takes an $n$-lag leading indicator model form; (4) is simply a distributed-lag model, since the error-correction term in (2), $e^*_{t-1}$, already acts as a leading indicator. Second, while the indicator loadings in both $f^L_t$ and $f^S_t$ are asynchronous, the lag structure of indicators is more complex in $f^S_t$ than in $f^L_t$. Only one term is kept for each financial indicator in (3), whereas $\hat{\omega}_j(L)$ in (6) retains all the significant lag loadings to allow for the possibility that the short-run input of some indicators is dynamically nonlinear. Findings of such nonlinearity can be further exploited to improve input feature design of the relevant financial indicators as discussed further in Section 4.2. 10 See Appendix A for a summary of the algorithms for long-run and short-run FCI construction and Supplementary Materials for code.
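The lag-selection step of the long-run FCI can be sketched as follows. This is not the authors' code (which is in their Supplementary Materials) but a minimal illustration under explicit assumptions: the target is related to lags 0 to n of every standardised indicator, a single PLS component is used so that the weights are proportional to the covariance between each lagged indicator and the target, and the function name `long_run_fci` and the simulated data are hypothetical.

```python
import numpy as np

def long_run_fci(target, indicators, n):
    """Keep, per indicator, the lag with the largest absolute PLS weight."""
    T, m = indicators.shape
    y = target[n:] - target[n:].mean()
    # Z[t, j, k] holds standardised indicator j lagged k periods, aligned with y
    Z = np.empty((T - n, m, n + 1))
    for j in range(m):
        for k in range(n + 1):
            col = indicators[n - k:T - k, j]
            Z[:, j, k] = (col - col.mean()) / col.std()
    # First-component PLS weights are proportional to Z'y (assumed single component)
    w = np.einsum("t,tjk->jk", y, Z)
    w = w / np.linalg.norm(w)
    k_star = np.abs(w).argmax(axis=1)        # selected lag per indicator
    phi = w[np.arange(m), k_star]            # selected loading per indicator
    fci = sum(phi[j] * Z[:, j, k_star[j]] for j in range(m))
    return k_star, phi, fci

# Simulated check: indicator 0 leads the target by three months, indicator 1 is noise
rng = np.random.default_rng(0)
x = rng.standard_normal((300, 2))
target = np.zeros(300)
target[3:] = 2.0 * x[:-3, 0]
target += 0.1 * rng.standard_normal(300)
k_star, phi, fci = long_run_fci(target, x, n=6)
print(k_star, np.round(phi, 2))
```

Keeping a single lag per indicator is what allows indicators to enter asynchronously while keeping the index itself a simple weighted sum.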
With regular data updates, timewise concatenation of FCIs is a key requirement. An illustration of how concatenated FCI series are constructed during updating is given in Figure 1. The overall stability of loadings is checked through comparison of the concatenated FCI series with the un-concatenated ones, i.e., FCIs derived simply from various rounds of estimation after data updating.
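One way to read the concatenation requirement is sketched below, assuming that loadings are re-estimated at each update and that only the observations added since the previous round are valued with the new loadings, while earlier index values stay frozen. The function names, the two-round example, and the numbers are hypothetical; the exact scheme is the one depicted in Figure 1.

```python
def weighted_sum(rows, weights):
    """Index values for a block of observations under fixed loadings."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]

def concatenated_fci(indicators, rounds):
    """Build the concatenated series from successive estimation rounds.

    rounds is a list of (end_index, weights) ordered by end_index; each round
    contributes only the observations not covered by earlier rounds.
    """
    series = []
    for end, weights in rounds:
        start = len(series)
        series.extend(weighted_sum(indicators[start:end], weights))
    return series

# Hypothetical data: four observations on two indicators, two estimation rounds
indicators = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
rounds = [(2, [1.0, 1.0]), (4, [0.5, 2.0])]

concat = concatenated_fci(indicators, rounds)
unconcat = weighted_sum(indicators, rounds[-1][1])   # all history revalued with latest loadings
gap = max(abs(a - b) for a, b in zip(concat, unconcat))
print(concat, unconcat, gap)
```

The gap between the two series is a simple summary of how far the loadings have drifted across updates, which is the comparison described above.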
To create 1-step to 6-step ahead forecasts of model (2), predicted FCIs over the forecasting horizon are generated, using observed data and estimated weights. As weights are not updated, concatenation is not required for the predicted part of the FCIs.

Data
Model (1) is applied to three macro targets: annual inflation based on the consumer price index (CPI), annual growth rates of IP, and annual growth rates of GDP. Two output variables are considered due to the availability of monthly IP and the absence of monthly GDP. Monthly GDP series are interpolated from quarterly time series using the monthly weights of IP. Experiments with total retail sales as a proxy of private consumption yield similar results. For simplicity, a single real-sector explanatory variable is chosen for each of the three modelled variables in their level form: GDP for IP, the producer price index (PPI) for CPI, and the global output index from the GVAR literature 11 for GDP.
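The interpolation step can be illustrated with a short sketch. It assumes that each quarterly GDP figure is split across the three months of the quarter in proportion to monthly IP. This is only one possible reading of 'monthly weights of IP', and the function name and numbers are hypothetical.

```python
def interpolate_monthly(quarterly_gdp, monthly_ip):
    """Distribute each quarterly value over its three months using IP shares."""
    monthly = []
    for q, gdp in enumerate(quarterly_gdp):
        ip = monthly_ip[3 * q:3 * q + 3]      # the three months of quarter q
        total = sum(ip)
        monthly.extend(gdp * v / total for v in ip)
    return monthly

# Hypothetical example: one quarter of GDP and three months of IP
print(interpolate_monthly([300.0], [1.0, 2.0, 3.0]))
```

By construction the three monthly values add up to the quarterly figure, so the interpolated series stays consistent with the quarterly data.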
The data sample is in monthly frequency over the period of 1980M1-2017M12, except for the global output index which is only available up to 2012M12. The period of 1980M1-2000M12 is used for model training to ensure a decent level of composite reliability, see Terry and Kelley (2012). The period of 2001M1-2006M12 is used for model validation. The rest of the sample is used to examine how the model evolves through the turbulent period of the GFC. Data updating is set at a 12-month interval for the validation period. The maximum lag length is set at six months, n = 6 in (3) and (5). For convenience, the data is updated historical data, not vintage data. Hence, the evaluation of the predictive performance does not emulate real-life forecasting performance.
The short-run macro targets and the two interest rate variables used in model (1) are plotted in Figure 2.
The set of initial indicators used as input indicators for FCI construction is classified into two types: unprocessed financial variables and processed financial variables. Type one forms the dominant part of the set. Variables are collected from money, forex, equity, and fixed income markets, as well as from banking sector balance sheets. Highly correlated variables from the same market are excluded to ensure indicators are as representative as possible. Type two variables are systemic risk measures derived from micro financial series. These are provided by Giglio et al. (2016). 12 While the latter type is directly used as input indicators for FCI construction, the former type is transformed to capture, in a concentrated manner, dynamic features of liquidity, leverage, and linkage frictions which are indicative of various market imbalances. This is achieved by choosing indicators of what Qin and He (2012) refer to as 'long-run type'. According to their classification, long-run indicators cover ratios or differences across financial variables, such as interest rate spreads. In contrast, growth rates or changes of individual variables are referred to as 'short-run type'.
From a time-series perspective, long-run indicators exhibit distinctly lower frequency dynamics than the short-run ones, and therefore are expected to match better with the dynamic features exhibited by macro variables, which are well-known to exhibit strong inertia. Such a match in dynamic features is essential for successful modelling, as emphasised by Drehmann et al. (2012) and Borio (2014a, 2014b). Further, since long-run indicators are formed to capture features of imbalance or disequilibrium across different financial variables, these are expected to reflect the main transmission channels between the financial and real sectors, see BCBS (2011), Borio and Lowe (2002), and Gramlich et al. (2010). Previous experiments, reported in Wang (2017), have indeed shown that long-run indicators occupy a dominant part in the construction of FCIs while short-run indicators are largely screened out during indicator selection. Table 1 provides a list of input indicators of the long-run type with details about the variables underlying their construction and processed financial variables with their data source. A list of the unprocessed financial variables underlying the long-run input indicators and their data sources can be found in Appendix B, Table A1. All input indicators are standardised for comparability of loadings.
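The distinction between the two indicator types, and the standardisation applied before loadings are compared, can be sketched as follows. The helper names and the toy series are assumptions for illustration, and population standard deviations are used as one possible convention.

```python
from statistics import mean, pstdev

def spread(a, b):
    """Long-run type: difference across two financial variables, e.g. a rate spread."""
    return [x - y for x, y in zip(a, b)]

def change(a):
    """Short-run type: period-on-period change of a single variable."""
    return [a[t] - a[t - 1] for t in range(1, len(a))]

def standardise(x):
    """Zero mean and unit variance, so that loadings are comparable across indicators."""
    mu, sd = mean(x), pstdev(x)
    return [(v - mu) / sd for v in x]

# Hypothetical long-term and short-term interest rates
long_rate = [5.0, 6.0, 6.5]
short_rate = [2.0, 2.5, 4.0]
term_spread = spread(long_rate, short_rate)   # long-run type indicator
rate_change = change(short_rate)              # short-run type indicator
z = standardise(term_spread)
print(term_spread, rate_change, z)
```

The spread moves only when the two rates diverge, which is why indicators of this type tend to show lower-frequency dynamics than simple changes.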

Empirical Results
The constructed FCIs are assessed on two criteria: (a) aggregation and (b) prediction. The first criterion is mainly examined via concatenation, as discussed in Section 2.2. The second criterion is assessed via the forecasting performance of model (2) as compared to model (1). This section reports and discusses the results with respect to these criteria in three parts.

Input Indicator Selection
Experiments with various sets of input indicators in the construction of long-run and short-run FCIs yield the following findings. First, loadings of the short-run input indicators, derived from the monthly differences of the unprocessed financial variables, are either very small or insignificant in the construction of both $f^L_t$ and $f^S_t$. This result reinforces Wang's (2017) findings and confirms the conventional wisdom that routine volatilities from financial markets are mostly noise to the real sectors unless they accumulate into disequilibrium signals too large to be ignored at a macro level. 13 Forecasting comparisons between FCI-based models where the few surviving short-run indicators are kept and those without these indicators yield little difference. Short-run indicators are henceforth excluded from the FCI construction, following the call for simplicity in aggregation by Cox et al. (1992). Second, loadings of most of the processed financial variables are either insignificant or very small in $f^S_t$. In the construction of $f^L_t$, some loadings are significant, but unstable. Given the insignificance and instability of those loadings, the processed variables are excluded in subsequent experiments. The final set of input indicators used in the construction of the FCIs hence solely comprises input indicators of the long-run type constructed from unprocessed financial variables.
Figures 3 and 4 show, respectively, the long-run targets with the resulting $e^*_t$ series incorporating the long-run FCIs and the short-run FCIs, $f^S_t$ and $\Delta_{12} f^L_t$, following Equations (4) and (6). The restriction $\kappa_1 = 1$ in the long-run targets is imposed based on empirical evidence. 14 As can be seen from Figure 4, the FCI series vary distinctly from each other. While the FCIs targeted at the two output growth variables are similar, as expected given that IP forms a sizable part of GDP, the FCIs targeted at inflation are strikingly different. It is hence unsurprising that PCA-based FCIs cannot achieve the same adequacy in concatenation or possess the same predictive capacity as these targeted FCIs.

Aggregation
This sub-section evaluates aggregation of the input variables by assessing the loadings and lag structure of the different FCIs with respect to prominence of financial sectors, leading information, constancy over data updates, and dynamic form. The loadings of the chosen indicator set for the long-run FCIs are reported in Table 2. The first column reports $k^*$, the lag of the most significant indicator loading $\hat{\varphi}_{j,k^*}$ in (4). The second column reports the average of the loadings over repeated 12-month updates over the validation period between 2001M1 and 2006M12.
The lack of cross-market synchronisation is noticeable, with indicators entering at different lag lengths from 0 to 6. Meanwhile, a sizeable part of the indicators has relatively constant loadings over time (highlighted in bold), especially when targeting inflation and IP. For inflation and GDP growth, forex market indicators are mostly insignificant and loadings for the banking sector indicators are dominant with a relatively stable lag structure. For IP, indicators representing the forex and fixed income markets are dominant and imbalances in the banking sector, except for the housing market, play a minor role with mostly insignificant loadings. Judging from the lag structure, fixed income indicators contribute the most leading information.
Note: The long-run targets $y_t - x_t$ are constructed as follows: inflation with $y_t$ = CPI and $x_t$ = PPI; GDP with $y_t$ = US GDP and $x_t$ = World GDP; IP with $y_t$ = IP and $x_t$ = GDP.
The links between IP and forex markets are unsurprising because exchange rates are a pivotal factor in determining international competitiveness. The leading role of indicators from the fixed income market explains why interest rates have taken hold as the aggregate variables representing the financial sector in conventional macro models. Table 3 reports the loadings for short-run FCIs $f^S_t$ constructed from the same indicator set used to construct the long-run FCIs. The dynamic form of several indicators is nonlinear. This nonlinearity can be captured by transforming the input indicators into a differenced term plus a level term. Because GDP and IP annual growth rates have similar dynamics, indicators enter in similar dynamic forms for these two targets, with many indicators entering as a difference. Because inflation exhibits faster dynamics than the output growth variables, few indicators enter as a difference in the short-run FCI targeting inflation.
For FCIs built for the two output targets, the size of indicator loadings changes and lag instability emerges when extending the algorithm beyond the testing sample, foreshadowing the financial crisis. For most input indicators, stability returns with the 2010M1 update. In contrast, lag stability for the FCIs targeted at inflation is largely maintained over the crisis period, an indication of inflation being relatively less susceptible to financial market conditions. Table 4 summarises these changes in loadings and lag structure for the validation period 2001M1 to 2006M12 as compared to the crisis and post-crisis period 2007M1 to 2016M12. Results for annual output growth by GDP are excluded as the global output index fails to de-trend US GDP after 2000, see Figure 3.
Note: Short-run macro targets in annual growth rates. An unstable lag structure is indicated by NA in the lag column. D16|L1 means a difference between the 1st and the 6th lag with a remaining level on the 1st lag.
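The D16|L1 notation can be read through a simple regrouping of lag loadings. As an illustrative identity, with $\omega_1$ and $\omega_6$ standing for assumed loadings on the 1st and 6th lags of an indicator $i_t$ (symbols introduced here for illustration, not reported estimates):

```latex
\omega_1 i_{t-1} + \omega_6 i_{t-6}
  = -\omega_6 \left( i_{t-1} - i_{t-6} \right)
    + \left( \omega_1 + \omega_6 \right) i_{t-1}
```

When the two loadings have opposite signs but differ in size, the indicator therefore enters as a difference between lags 1 and 6 plus a remaining level term on lag 1. If the loadings were equal and opposite, the level term would vanish and the indicator would enter as a pure difference.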
If we compare concatenated against non-concatenated short-run FCIs in Figure 5, a location shift is clearly discernible for the output growth targets at the 2009M1 update. However, the shift is negligible and transient for inflation, further supporting the hypothesis that inflation is less susceptible to financial market conditions. The stability for the output growth targets before the crisis period is promising with regards to aggregation.

Prediction
This sub-section summarises the model form and predictive performance of the FCI-based model (2) as compared to the baseline model (1). Backward dynamic selection using the training sample yields the parsimonious models summarised in Table 5. The dynamics of the macro target variables are predominantly explained by their own lags and other real sector variables in both models (1) and (2), while the explanatory power of the financial variables is relatively minor, be it interest rates or FCIs. This is a common feature of aggregate macro-econometric models and explains why the financial side has been regarded as marginally important in those models. Consequently, the model framework delimits the degree of possible improvement of model (2) over model (1).
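As a rough illustration of the idea behind backward selection, the sketch below runs a generic backward elimination on OLS t-statistics. It is not the authors' backward dynamic selection procedure: the threshold `t_crit`, the helper names and the simulated data are all assumptions introduced here.

```python
import numpy as np

def ols_tstats(X, y):
    """OLS coefficients and their t-statistics (homoskedastic errors)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, beta / np.sqrt(np.diag(cov))

def backward_select(X, y, names, t_crit=2.0):
    """Drop the least significant regressor until all |t| >= t_crit."""
    keep = list(range(X.shape[1]))
    while keep:
        _, t = ols_tstats(X[:, keep], y)
        worst = int(np.argmin(np.abs(t)))
        if abs(t[worst]) >= t_crit:
            break
        del keep[worst]
    return [names[i] for i in keep]

# Simulated example: only x0 and x2 drive the target.
rng = np.random.default_rng(0)
T = 200
X = rng.standard_normal((T, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.1 * rng.standard_normal(T)
selected = backward_select(X, y, ["x0", "x1", "x2", "x3"])
```

In a dynamic setting the columns of `X` would be lags of the target, of the real-sector variables and of the financial variables, so the same elimination loop would prune lags rather than distinct variables.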

Considering the magnitudes of the own-lag parameter estimates in contrast to those of the financial variables, the inflation target is the least susceptible to the financial sector impact. Indeed, both $R_t$ and $f^L_t$ drop out of the error correction term. Moreover, IP and GDP share the same model specification as far as the real-sector variables are concerned, with $\Delta_{12} x_t$ and $\Delta_{12} y_t$ entering with one lag each in both cases. Results in Tables 2, 3 and 5 reveal that FCIs contain more leading information than interest rates. For instance, in the case of IP, no input variable, except for the EC-term, contains leading information beyond 2 lags ahead in the baseline model, while the short-run FCI in model (2) contains leading information up to 12 lags ahead.
After each 12-month update over the testing period 2001M1-2006M12, 1-step to 6-step ahead forecasts are produced. Figure 6 plots the ratios of the 1-step ahead MSFE of the FCI-based model (2) over the baseline model (1) for annual inflation (CPI), annual output growth by GDP and annual output growth by IP. In the IP and GDP case, the FCI-based model outperforms the baseline model up to the 2006M1 update. In the inflation case, the baseline model marginally outperforms the FCI-based model in most years. Results provided in Table 5 and Figure 6 suggest that the main forecasting power in (2) relative to (1) stems from $e^*_t$, which is absent from the FCI-based model for inflation, explaining the underperformance.
Figure 6. MSFE ratio of the FCI-based model (2) to the baseline model (1) over the testing period 2001M1-2006M12 for 1-step ahead forecasts over repeated 12-month updates. A value < 1 implies that (2) produces a smaller MSFE than (1).
Table 6 reports the average 1-step ahead MSFE of the baseline and the FCI-based model over the testing period 2001M1-2006M12, and over two extended testing periods including the GFC, 2001M1-2010M12 and 2001M1-2017M12. The remaining downward trend in $e_t$ for GDP undermines the results for GDP, and GDP is therefore excluded from the experiments over the extended testing periods. The difference in MSFE is marginal for all three macro targets and the forecast encompassing tests by Harvey et al. (1997) reveal no significant distinction between models (1) and (2). However, forecast encompassing tests by Ericsson (1992) suggest that the FCI-based model encompasses the baseline model for all three macro targets. Results are significant at the 1 percent level for the two output growth targets and at the 5 percent level for the inflation target over the initial testing period.
Note to Table 6: H1: the FCI-based model (2) encompasses the baseline model (1); H2: the baseline model (1) encompasses the FCI-based model (2). * indicates 5% significance level and ** indicates 1%.
Figure 7 compares the MSFE of the 1-step to 6-step ahead forecasts over the extended testing period 2001M1-2017M12 for IP and inflation. As expected, the difference in forecasting error between the baseline and the FCI-based model widens with increasing steps. On average, the FCI-based model continues to outperform the baseline model over the crisis and post-crisis period for IP. To conclude, the financial market effect on inflation is too indirect to be meaningful as a measure to enhance forecasting performance. However, for IP, where effects are more direct, preliminary results are promising.
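The MSFE ratio used in the comparison above can be sketched as follows. The function names, the block length argument and the toy forecasts are hypothetical and serve only to show the arithmetic: a ratio below 1 means the FCI-based forecasts have the smaller mean squared forecast error within that update window.

```python
import numpy as np

def msfe(actual, forecast):
    """Mean squared forecast error."""
    err = np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)
    return float(np.mean(err ** 2))

def msfe_ratio(actual, fc_fci, fc_base):
    """MSFE of the FCI-based forecasts divided by that of the baseline.

    Undefined if the baseline forecasts are exact (zero MSFE).
    """
    return msfe(actual, fc_fci) / msfe(actual, fc_base)

def ratio_by_update(actual, fc_fci, fc_base, block=12):
    """One ratio per update window of `block` observations (assumed monthly)."""
    n = len(actual)
    return [
        msfe_ratio(actual[i:i + block], fc_fci[i:i + block], fc_base[i:i + block])
        for i in range(0, n, block)
    ]

# Toy data: two 12-month windows of 1-step ahead forecasts.
actual = np.zeros(24)
fc_base = np.full(24, 0.5)                                   # MSFE 0.25 in both windows
fc_fci = np.concatenate([np.full(12, 0.25), np.full(12, 1.0)])
ratios = ratio_by_update(actual, fc_fci, fc_base)            # [0.25, 4.0]
```

In the toy data the hypothetical FCI-based forecasts win in the first window (ratio 0.25) and lose in the second (ratio 4.0), which mirrors how a plot of such ratios is read update by update.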

Concluding Discussion
This section concludes with a discussion on the methodological implications of our pilot experiments and a reflection on future research directions. A summary of the empirical findings is already given in Section 1 and is hence omitted here. Treating the construction of model-based composite aggregates as supervised dimensionality reduction has opened up new territory. The prerequisite of choosing a clearly defined set of criteria, as compatible as possible to the task at hand, has guided the algorithmic modelling design. Two vital criteria are identified: concatenation and macro prediction in weight measurement construction, a key step in composite or synthetic index making, see OECD (2008).
Experiments with regularly concatenated FCIs demonstrate the extent to which, and the ways in which, dimensionality reduction of financial market frictions is empirically feasible as far as specific macro forecast tasks are concerned. From the input side, indicators of financial market frictions constitute the main ingredients of FCIs, and the dynamic forms by which they enter the FCIs are heterogeneous and asynchronous. This finding demonstrates, once more, why the PCA approach is inappropriate. From the target side, our results show that concatenation is target sensitive. The closer a specified macro target is to financial market frictions, the more data permissible the concatenation operation and the stronger the predictive content of the resulting FCIs.
Two implications follow. First, the modelling approach can be used to identify macro variables that are susceptible to aggregate financial shocks through the estimated impact of FCIs and the loadings of their components. Second, concatenation becomes data impermissible when the targets are broad and virtually relate to the whole macroeconomy. This verifies that the quest for a universal (i.e., unsupervised) and useful measure of financial market risks for the forecasting of specific macro variables is a fantasy.
More research is needed to improve the current algorithmic model design before its potential can be fully tapped beyond academia. Reflection on the presented experimental findings yields the following observations about important aspects for the future research agenda.

1. On model conceptualisation: Essentially, the model-based FCIs are pooled and partial predictors. However, it can be challenging to reach a consensus on the interpretation of these predictors, in view of the intense debates and discussions on the nature of latent variables or composite variates in PLS regression modelling and formative measurement modelling. 15 From the stance of statistical modelling, the question of interpretability touches on the crux of under what conditions a theoretical identity has a one-to-one mapping from a statistical identity, see Markus (2016). Aside from epistemological concerns, the discussion highlights the primary importance of the algorithm design to ensure consistency between mathematical/statistical aggregation rules and desired properties of the theoretical constructs, e.g., see Munda (2012).
2. On model design: Improvements can be made from two sides. From the input side, more elaborate aggregation rules should be introduced with help from dimensionality reduction techniques in machine learning. For instance, multi-path or classification models should be experimented with to replace the man-made filtering step of redundant indicators. The possibility of interactive dynamics among indicators should also be considered to search for more effective and parsimonious ways to formulate dynamic input features. From the target side, more attention should be focused on how to exploit the target selection capacity of this modelling approach to better serve policy purposes. 16
3. On model testing: Improvements of methods of model evaluation are desired at various stages. Here, active research is worth tracking in two areas. One is concerned with the quality of composite indicators and tackled by sensitivity analysis, see Saisana et al. (2005), and Dobbie and Dail (2013). The other is on evaluation of various aspects of formative PLS path models, such as content, construct, convergent and discriminant validity, see Andreev et al. (2009), and Bentler and Huang (2014).
Notes (long-run algorithm): Matrices are bold and in capital letters and vectors are bold and in lower case. $\mathbf{y}_T$ is the vector of the long-run target $(y_t - \kappa_1 X_t)$ as specified in (3). $\mathbf{I}_T$ is a $T$ by $m \times n$ matrix of financial indicators and their respective lagged versions. The selected $\varphi_{j,k,h}$ in step 8 is $\varphi_{j,k}^{*}$ in (4). $s_{\mathbf{x}_T}$ indicates the standard deviation of vector $\mathbf{x}_T$ and $\bar{\mathbf{x}}_T$ its mean.
Notes (short-run algorithm): Matrices are bold and in capital letters and vectors are bold and in lower case. $\mathbf{y}_T$ is the vector of the short-run target $\Delta y_t$ as specified in (5). $\mathbf{I}_T$ is a $T$ by $m \times n$ matrix of financial indicators and their respective lagged versions. $s_{\mathbf{x}_T}$ indicates the standard deviation of vector $\mathbf{x}_T$ and $\bar{\mathbf{x}}_T$ its mean.
The inadequacy of capturing the financial sector impact in macro modelling has been highlighted recently from a broader angle in a special issue of Oxford Review of Economic Policy vol. 34, i.e., see Stiglitz (2018), and Vines and Wills (2018).

2 The PLS method was proposed by Wold (1966, 1975, 1980); for more background information, see Wegelin (2000), Sanchez (2013) and McIntosh et al. (2014). It has been extended into the causal interpretation of PLS path modelling in relation to measurement theory, see Vinzi et al. (2010), Howell et al. (2013) and Howell (2014). A few trial applications of PLS can be found in the econometric literature: e.g., Lin and Tsay (2005), Eickmeier and Ng (2011), Lannsjö (2014), Kelly and Pruitt (2015), Fuentes et al. (2015), Groen and Kapetanios (2016), and Kapetanios et al. (2018), but none has adopted the causal model basis of the method. Historically, the reflective model is referred to as 'mode A' and the formative model 'mode B', see Wold (1980) and also Vinzi et al. (2010). In a reflective model, weights are identified by the assumption of conditional independence, i.e., all the manifest indicators are effects of one common cause. In contrast, this assumption does not apply to the formative model. Hence, the weights of the indicators in a formative model cannot be identified by a single criterion, such as the common variance criterion of PCA. An additional criterion is needed for the identification (Markus and Borsboom 2013, p. 113).
5 Although the concept of supervised versus unsupervised data reduction is unfamiliar to economists, the idea of targeting the index construction process at dependent variables has been around since early warning systems research, e.g., Gramlich et al. (2010). The method of selecting indicators based on their ability to signal turning points in Levanon et al. (2015) effectively follows the supervised learning approach.
6 This incorrect formalization of problems falls into what is referred to as 'Type III error' by Hand (1994, p. 317
When this long-run combination of the ECM fits the description of cointegration, our formulation can be interpreted as exploiting the Granger-Engle two-step procedure.