Abstract
In this paper, we consider the estimation of a dynamic panel data model with non-stationary multi-factor error structures. We adopt the common correlated effects (CCE) estimation approach and establish the asymptotic properties of the CCE and common correlated effects mean group (CCEMG) estimators as N and T tend to infinity. The results show that both the CCE and CCEMG estimators are consistent and that the CCEMG estimator is asymptotically normally distributed. An extensive simulation study supports the theoretical findings in small samples and shows that the CCE estimators are robust to a wide variety of data generation processes. The findings suggest that CCE estimation is widely applicable to models with non-stationary factors. The proposed procedure is also illustrated by an empirical application to the U.S. cigar dataset.
Keywords:
dynamic panel models; cross-sectional dependence; non-stationary; common factors; common correlated effects
JEL Classification:
C01; C13; C23
1. Introduction
Recently, there has been increased interest in the analysis of panel data models with cross-sectionally dependent errors (also known as unobserved common factors or multi-factor error structures), motivated by empirical applications in economics such as common shocks and the global financial crisis; see Omay and Kan (2010); Bussiere et al. (2013); Eberhardt et al. (2013); and Chudik et al. (2017), among others. Dependence across units violates the traditional assumption of independent and identically distributed errors, so conventional panel estimation methods (such as fixed effects estimation) can yield inconsistent estimates and misleading inference. Therefore, much effort in the econometrics literature has been devoted to estimation for panels with cross-sectional dependence; see, for example, Pesaran (2006); Bai (2009); Zaffaroni (2009); Greenaway-McGrevy et al. (2012); Kao et al. (2012); Chudik and Pesaran (2013); Moon and Weidner (2015, 2017), among others. See also Chudik and Pesaran (2015b) for a survey of recent developments in large panel models with cross-sectional dependence.
Among these studies, a predominant approach to dealing with cross-sectionally dependent errors in panel models is the so-called common correlated effects (CCE) method proposed by Pesaran (2006). The basic idea of CCE estimation is to proxy the unobserved common factors using the cross-sectional averages of the observables in the regression. It has several advantages over alternative approaches. For instance, it can be computed by least squares applied to an auxiliary regression, and it does not require knowledge of the number of unobserved factors. The CCE method has been further developed and applied to different types of panel models. To name a few, Chudik and Pesaran (2015a) suggested the CCE approach to analyze dynamic heterogeneous panels with stationary unobserved common factors. Kapetanios et al. (2011) extended the CCE method to static panel data models with non-stationary multi-factor error structures. Westerlund et al. (2019) considered the CCE for short panels, and Zhou and Zhang (2016) extended the CCE to unbalanced panels.
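The proxy idea can be illustrated with a minimal sketch. It is not taken from the paper: the single random-walk factor, the loading distribution, the panel dimensions and the seed are all assumptions chosen for the demonstration. It checks that the cross-sectional average of an observable tracks the unobserved factor closely when N is moderately large.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 200, 100                              # assumed panel dimensions

f = np.cumsum(rng.standard_normal(T))        # one non-stationary (random-walk) factor
gamma = 1.0 + 0.5 * rng.standard_normal(N)   # heterogeneous loadings with mean 1 (assumed)
eps = rng.standard_normal((N, T))            # idiosyncratic noise

z = gamma[:, None] * f[None, :] + eps        # observable: z_it = gamma_i * f_t + eps_it
z_bar = z.mean(axis=0)                       # cross-sectional average at each t

# The idiosyncratic noise averages out across units, so z_bar is close to
# mean(gamma) * f_t and can stand in for the unobserved factor.
corr = np.corrcoef(z_bar, f)[0, 1]
print(f"corr(z_bar, f) = {corr:.3f}")
```

The averaging only works when the mean loading is non-zero, which is the scalar analogue of a rank condition on the average loadings.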
Among the aforementioned works, there is a gap in the CCE estimation for dynamic panels with non-stationary unobservable factors. To fill this gap, in this paper we consider a linear dynamic heterogeneous panel data model with non-stationary unobserved common factors when both the cross-sectional and time dimensions of the dataset grow to infinity. Under these settings, we find that the CCE estimator of the individual coefficient is consistent, and the CCE mean group (CCEMG) estimator is consistent and has a normal limit distribution. The practical implication of this finding is that for inferential purposes of the CCE estimation, one does not necessarily need to test the stationarity of the unobserved common factors in the model. The finite sample properties are examined through Monte Carlo simulations and the simulation results confirm our theoretical findings in the paper. Moreover, the proposed procedure is illustrated by an empirical application, which analyzes the U.S. cigar dataset.
The rest of the paper is organized as follows. Section 2 sets up the basic model and introduces the CCE estimation of the dynamic heterogeneous panel data model with common factors. The asymptotic properties of the CCE estimation with non-stationary unobserved common factors are provided in Section 3. Monte Carlo simulation results and an empirical application are reported in Section 4 and Section 5, respectively. Concluding remarks are made in Section 6. Proofs of the main results are provided in Appendix A.
Notation: The letter K stands for a finite positive constant. All vectors are column vectors represented by bold lower case letters, and matrices are represented by bold capital letters. For a matrix $\mathbf{A}$, let $\|\mathbf{A}\|$ denote the Frobenius norm, and let $\|\mathbf{A}\|_{1}$ and $\|\mathbf{A}\|_{\infty}$ denote the maximum absolute column and row sum matrix norms, respectively. $\mathbf{A}^{+}$ denotes the Moore–Penrose inverse of $\mathbf{A}$, and $\mathrm{rk}(\mathbf{A})$ and $\rho(\mathbf{A})$ denote the rank and the spectral radius of $\mathbf{A}$, respectively.
2. Dynamic Panel Data Model with Non-Stationary Unobserved Common Factors
2.1. The Model
We assume that the scalar dependent variable and the regressors are generated as follows:
and
for and , where and are individual fixed effects for unit i, is a vector of the regressors specific to cross-sectional unit i at time t, are the individual-specific (idiosyncratic) errors and are the individual-specific components of , and are and factor loading matrices, and the vector represents unobserved common factors. In what follows, we maintain the restriction that model (1) is stationary, such that for
Models (1)–(2) have been widely studied in the literature; see, for instance, Pesaran (2006), Chudik and Pesaran (2015a), Westerlund et al. (2019), and the references therein. We follow these studies to consider the CCE estimation for and and reexamine the validity of the CCE estimation when is non-stationary.
2.2. CCE Estimation
Following Chudik and Pesaran (2015a), let , then (1) and (2) can be compactly written as
where , , , and , with , , and
If the support of lies strictly inside the unit circle, then (3) can be rewritten as the following distributed lag form
for
Taking the cross-sectional average of (5) yields
where is a dimensional vector of the cross-section average, , and , with , and L being the lag operator. Furthermore, if is invertible (see Assumption 4 below), then we have
When the matrix has the full column rank, i.e., the rank condition
holds, we have
where . This suggests that the contemporaneous and lagged values of can be used as observable proxies for the unobserved common factors .
Substituting the observed proxies of the unobserved common factors (7) into (1) yields the following augmented regression
or
for , where , , is the number of lags used to truncate the infinite polynomial distributed lag function , and the composite error has the form of
For notational simplicity, let , , with and , and . Then the augmented regression (8) can be expressed in vector form as
where are the parameters of interest, are nuisance parameters, and
Based on the cross-sectionally augmented regression model (9) and by the formula for partitioned regression, the CCE estimator of the individual coefficients is given by
which is an ordinary least squares estimate, where is an orthogonal projection matrix, with a -dimensional identity matrix. In panel models with N large, the primary parameters of interest are the means of the individual–specific coefficients, , which can be estimated by the common correlated effects mean group (CCEMG) estimator
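The two estimators described above can be sketched on toy data as follows. This is an illustration under stated assumptions, not the authors' implementation: the single random-walk factor, the coefficient and loading distributions, the fixed lag order `p = 5`, and the helper names `annihilator`, `stack_lags` and `cce_individual` are all hypothetical. Each unit's coefficients are obtained by partitioned regression after projecting out an intercept and the current and lagged cross-sectional averages, and the CCEMG estimate is the simple average across units.

```python
import numpy as np

def annihilator(Q):
    """Orthogonal projector onto the complement of the column space of Q."""
    return np.eye(Q.shape[0]) - Q @ np.linalg.pinv(Q)

def stack_lags(Z, p):
    """Columns [Z_t, Z_{t-1}, ..., Z_{t-p}] for the rows t = p, ..., len(Z) - 1."""
    n = Z.shape[0]
    return np.hstack([Z[p - l:n - l] for l in range(p + 1)])

def cce_individual(y, X, M):
    """Partitioned-regression OLS: (X' M X)^{-1} X' M y."""
    XM = X.T @ M
    return np.linalg.solve(XM @ X, XM @ y)

# --- toy data: every numerical choice below is an assumption for the demo ---
rng = np.random.default_rng(1)
N, T, p = 100, 200, 5
f = np.cumsum(rng.standard_normal(T + 1))      # random-walk common factor
phi = rng.uniform(0.2, 0.6, N)                 # heterogeneous AR coefficients
beta = rng.uniform(0.8, 1.2, N)                # heterogeneous slopes
load_y = rng.uniform(0.5, 1.5, N)              # factor loadings in y
load_x = rng.uniform(0.5, 1.5, N)              # factor loadings in x

x = load_x[:, None] * f[None, :] + rng.standard_normal((N, T + 1))
y = np.zeros((N, T + 1))
for t in range(1, T + 1):
    y[:, t] = (phi * y[:, t - 1] + beta * x[:, t]
               + load_y * f[t] + rng.standard_normal(N))

# Current and lagged cross-sectional averages of (y, x) proxy the factor.
z_bar = np.column_stack([y.mean(axis=0), x.mean(axis=0)])      # (T + 1, 2)
Q = np.hstack([np.ones((T + 1 - p, 1)), stack_lags(z_bar, p)])
M = annihilator(Q)

pi_hat = np.array([
    cce_individual(y[i, p:], np.column_stack([y[i, p - 1:-1], x[i, p:]]), M)
    for i in range(N)
])                                  # row i = (estimate of phi_i, estimate of beta_i)
pi_mg = pi_hat.mean(axis=0)         # CCEMG: simple average over units
print("CCEMG:", pi_mg, "true means:", phi.mean(), beta.mean())
```

The Moore–Penrose inverse is used inside `annihilator` because the lagged averages of a random-walk factor are nearly collinear, and the projector remains well defined even if `Q` loses rank.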
3. Asymptotics of CCE Estimators with Non-Stationary Factors
3.1. Assumptions
When the unobserved common factors are stationary processes, Chudik and Pesaran (2015a) showed that the CCE estimator (11) of the individual coefficient is consistent, and the CCEMG estimator (12) is consistent and asymptotically normal. However, in practice, the common factors may follow a non-stationary process (see Bai and Ng 2004, 2010; Pesaran 2007; Pesaran et al. 2013, among others). In this scenario, the validity of CCE estimators and their asymptotic properties need to be re-examined.
Following Kapetanios et al. (2011), we assume the unobserved common factors follow the multivariate unit root process
To derive the asymptotic properties of the CCE type estimators (11) and (12) when follows (13), we make the following assumptions.
Assumption 1.
(Individual-specific errors). (i) The individual-specific errors follow a linear stationary process with uniformly bounded positive variance, , for some constant K, and uniformly bounded fourth-order cumulants. follows a linear stationary process with absolutely summable autocovariances (uniformly in i), with covariance matrices, , which are non-singular and satisfy , and has uniformly bounded fourth-order cumulants. (ii) are independently distributed of for all , and . For each i, is an vector of , stationary near epoch dependent processes of size on the -mixing process of size , and for , , which is a non-singular matrix and satisfies .
Assumption 2.
(Factor loadings). The factor loadings and are independently and identically distributed (i.i.d.) across i, and independent of the common factors , for all i and t, with means and , respectively, and bounded second moments. In particular,
and
where and are and symmetric nonnegative definite matrices, and , for some constant
Assumption 3.
(Heterogeneous coefficients). The slope coefficients follow the random coefficient model
where , , is the symmetric nonnegative definite matrix and the random deviations are distributed independently of and for , and t. Furthermore, the support of lies strictly inside the unit circle, and , for all i, where .
Assumption 4.
(Exogenous regressors). Regressors are either strictly exogenous and generated according to the canonical factor model (2) with , or weakly exogenous and generated according to (2) with , for , across i, and independently distributed of , and for all , and t. In the case where the regressors are weakly exogenous, we also assume:
(i) The support of lies strictly inside the unit circle, for , where with and are defined in (4).
(ii) The inverse of polynomial exists and has exponentially decaying coefficients, where .
Assumption 5.
(Rank condition). The matrix has a full column rank, such that
where
Assumption 6.
(i) As the matrices and exist for all i, and and have finite second-order moments for all where and are projection matrices, where denotes the Moore–Penrose generalized inverse of , and , defined as
where with and , is a vector of ones.
(ii) The matrix is non-singular, where with a matrix, , and .
Assumption 7.
in (13) is an vector of stationary near epoch-dependent processes of size , on an -mixing process of size , and is distributed independently of the idiosyncratic errors and for all i and t.
Several remarks can be made about these assumptions. Assumptions 1–3 are quite standard in the literature on (dynamic) panel models with cross-sectional dependence; see, for example, Pesaran (2006), Kapetanios et al. (2011), and the references therein. Assumption 4 is also made in Chudik and Pesaran (2015a) and covers exogenous regressors and the stationarity conditions for dynamic panels. Assumption 5 is a common condition for the implementation of the CCE estimation (e.g., Pesaran (2006) and Chudik and Pesaran (2015a)), which implies that there are more included regressors than unobserved factors in the model. See Juodis et al. (2021) for a detailed discussion of the validity of the rank condition and the resulting asymptotics for the CCE estimation. Assumption 6 is a common assumption for the CCE estimation and is imposed for the partitioned regression in the augmented regression for dynamic panels (e.g., Chudik and Pesaran 2015a). Assumption 7 requires that the innovations of the unit root process are stationary.
3.2. Asymptotics
Under these assumptions, we can establish the asymptotic properties of the CCE estimators (11) and (12) when is non-stationary. To begin with, note that the original model (1) can be rewritten in vector form as
or more compactly as
for , where , , , , , and .
Combining the CCE estimator (11) with (15), we have
which shows that the asymptotics of depends on the unobserved factors through .
Using the results in Lemmas A2, A5, and A6 in Appendix A, we obtain
and, thus,
when the rank condition (6) is satisfied. The above results are summarized in the following theorem, establishing the consistency of the CCE estimator of individual coefficients of interest.
Theorem 1.
Consider the panel models (1) and (2) and suppose Assumptions 1–7 hold. Then, as , such that , we have
See Appendix A for the proof.
Remark 1.
The above theorem suggests that the CCE estimator of the individual slope coefficient is consistent even if the common factors are non-stationary. When the rank condition (6) is not satisfied, the CCE estimator of the individual slope coefficients would be inconsistent due to the correlation of and . See Juodis et al. (2021) for more discussions on the validity of the CCE estimator when the rank condition does not hold.
Next, we establish the asymptotic properties of the CCEMG estimator of the mean group coefficients, . We have
When the rank condition is satisfied, by (17), we have
hence, we can obtain
and, thus,
Theorem 2.
Consider the panel models (1) and (2) and suppose Assumptions 1–7 hold. Then, as , such that , we have
If it is further assumed that , then
The asymptotic variance of can be consistently estimated nonparametrically by
Theorems 1 and 2 show that, for models with non-stationary common factors, the final results are surprisingly similar to those for the stationary case in Chudik and Pesaran (2015a), even though the intermediate results needed to derive the asymptotic properties of the common correlated effects estimators differ significantly. This is in direct contrast to the usual phenomenon in which distributional results for I(1) processes are radically different from those for I(0) processes.
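The nonparametric variance estimator referred to in Theorem 2 can be sketched as follows. The displayed formula is not reproduced here, so the code uses the form that is standard for mean group estimators, namely the sample covariance of the individual estimates around their mean, divided by N to obtain the variance of the mean group estimate; treat that form and the function name `mean_group` as assumptions.

```python
import numpy as np

def mean_group(pi_hat):
    """Mean group estimate, cross-unit covariance, and standard errors.

    pi_hat : array of shape (N, k), one row of coefficient estimates per unit.
    """
    pi_hat = np.asarray(pi_hat, dtype=float)
    N = pi_hat.shape[0]
    pi_mg = pi_hat.mean(axis=0)                 # mean group estimate
    dev = pi_hat - pi_mg                        # deviations of each unit from the mean
    omega_hat = dev.T @ dev / (N - 1)           # sample covariance across units
    se = np.sqrt(np.diag(omega_hat) / N)        # standard errors of pi_mg
    return pi_mg, omega_hat, se

# Three hypothetical units with two coefficients each.
pi_hat = np.array([[0.3, 1.0],
                   [0.5, 1.2],
                   [0.4, 0.8]])
pi_mg, omega_hat, se = mean_group(pi_hat)
print(pi_mg, se)
```

The estimator needs no knowledge of the factor structure or of the error dynamics, because all of that variation is already reflected in the dispersion of the individual estimates.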
Remark 2.
The consistency of and requires no restrictions on the relative expansion rates of N and T. However, deriving the asymptotic distribution of requires , because of the time series bias that arises from the presence of lagged values of the dependent variable; the distributional result is therefore unsuitable for panels in which T is small relative to N.
Including a lagged dependent variable as a regressor induces a time series bias of order in the estimators. When T is not large, this bias is non-negligible, so a bias correction should be considered. In the simulations below, we consider a Jackknife bias-corrected estimator (see (21) below), an approach that is used extensively in the relevant literature (e.g., Hahn and Newey 2004).
4. Monte Carlo Simulation
In this section, we investigate the finite sample properties of the CCEMG estimation for dynamic heterogeneous panels with non-stationary common factors. We consider the following data-generating processes:
and
for and . Let , , and , , with . The main purpose of this paper is to illustrate the validity of the CCEMG estimator in the case of non-stationary unobserved common factors; hence, for the unobserved common factors , we consider the following three non-stationary DGPs:
DGP 1. Two non-stationary unobserved common factors , where , for , and .
DGP 2. One non-stationary unobserved common factor and a stationary common factor , where , for , and .
DGP 3. Cointegrated unobserved common factors , where , for , and .
For the above DGPs, the starting values are , for ; the first 100 observations are discarded.
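The three factor designs can be sketched as below. The exact equations and parameter values of DGPs 1–3 are not reproduced here, so the specifics are assumptions: standard normal innovations, an AR(1) coefficient of 0.5 for the stationary component, zero starting values, and a cointegrated pair built as a random walk plus a stationary deviation. Only the qualitative structure and the 100-observation burn-in follow the description above.

```python
import numpy as np

def ar1(innov, rho):
    """AR(1) path s_t = rho * s_{t-1} + innov_t with s_0 = 0."""
    s = np.zeros_like(innov)
    for t in range(1, len(innov)):
        s[t] = rho * s[t - 1] + innov[t]
    return s

def simulate_factors(kind, T, rng, burn=100, rho=0.5):
    """Return a (T, 2) array of common factors after discarding `burn` observations."""
    e = rng.standard_normal((T + burn, 2))
    rw1 = np.cumsum(e[:, 0])                       # first factor: pure random walk
    if kind == "two_unit_roots":                   # two independent random walks
        F = np.column_stack([rw1, np.cumsum(e[:, 1])])
    elif kind == "mixed":                          # one random walk, one stationary AR(1)
        F = np.column_stack([rw1, ar1(e[:, 1], rho)])
    elif kind == "cointegrated":                   # f2 - f1 is stationary
        F = np.column_stack([rw1, rw1 + ar1(e[:, 1], rho)])
    else:
        raise ValueError(f"unknown design: {kind}")
    return F[burn:]

F_demo = simulate_factors("cointegrated", T=200, rng=np.random.default_rng(0))
print(F_demo.shape)
```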
Correspondingly, the factor loadings are generated independently across replications as
and
for and , where , , and , for , where and for .
For the idiosyncratic errors, for all i and t, and the unit-specific components are generated as independent stationary AR(1) processes:
for and with the starting values . The first 100 observations are discarded.
We consider the combination of , and . The number of replications is set to 2000. In what follows, we focus on the lagged coefficient (the cross-section mean of ), as well as (the cross-section mean of ). To save space, we only report the results for , since the results for are very similar to those for ; they are available upon request.
Two estimators are considered in the simulation. The first is the CCEMG estimator given in (12), in which the lag order is selected to satisfy as , for some ; that is, , which works well in our Monte Carlo design. The second is the Jackknife bias-corrected CCEMG estimator, which is constructed as
where is the CCEMG estimator calculated using the first two-thirds of the available time period, namely over the period , and denotes the CCEMG estimator computed using the observations over the period , where denotes the integer part of . Note that this differs from the usual split: the whole time period is divided into three parts, with the first estimator computed from the first two-thirds of the period and the second from the last two-thirds. We find that, in our settings, this division strategy performs better than the half-panel Jackknife method discussed in Chudik and Pesaran (2015a).
We used MATLAB to conduct the Monte Carlo experiments; the simulation results for DGPs 1–3 are summarized in Tables 1, 2, and 3, respectively.
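A sketch of the overlapping two-thirds split is given below. The weights in formula (21) are not reproduced here, so the combination is derived under an explicit assumption: the leading bias of the estimator is B/T. Each subsample then has length 2T/3 and bias 3B/(2T), and the combination 3 × (full-sample estimate) minus the sum of the two subsample estimates removes the B/T term. This may differ from the weighting in (21), and the function name `jackknife_two_thirds` is hypothetical.

```python
def jackknife_two_thirds(estimator, data):
    """Bias-corrected estimate from the full sample and two overlapping subsamples.

    estimator : callable mapping a sequence of time observations to an estimate.
    data      : sequence ordered in time.
    Assumes the leading bias is B / T, so that with subsamples of length 2T/3
    the weights (3, -1, -1) cancel it.
    """
    T = len(data)
    cut = T // 3                          # integer part of T / 3
    full = estimator(data)
    first = estimator(data[: T - cut])    # first two-thirds of the period
    last = estimator(data[cut:])          # last two-thirds of the period
    return 3.0 * full - (first + last)

# Toy estimator with true value 1 and an exact bias of 2 / T.
def toy_estimator(sample):
    return 1.0 + 2.0 / len(sample)

corrected = jackknife_two_thirds(toy_estimator, list(range(90)))
print(toy_estimator(list(range(90))), corrected)
```

For comparison, the half-panel jackknife uses two non-overlapping halves with weights (2, -1/2, -1/2); the overlapping split keeps longer subsamples at the price of a larger weight on the full-sample estimate.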
Table 1.
Estimation results for DGP 1.
Table 2.
Estimation results for DGP 2.
Table 3.
Estimation results for DGP 3.
From Table 1, we note that for the estimation of , the CCEMG estimator performs well in terms of bias and RMSE: the bias diminishes and the RMSE falls steadily as T increases, which is in line with the consistency of the CCEMG estimator. However, it still suffers from time series bias when T is small. The Jackknife bias-corrected CCEMG estimator is quite effective at reducing this bias: the bias is significantly smaller than that of the original CCEMG estimator when T is not large, and the RMSE decreases as either N or T increases. Similar findings can be observed for
To evaluate the robustness of the estimators, Table 2 and Table 3 report additional results for DGPs with both stationary and non-stationary factors and with cointegrated factors. As in the case of the non-stationary factors in DGP 1, the CCEMG estimator still performs well regardless of the number of common factors and the type of non-stationarity. For the autoregressive coefficient , it can be improved by the Jackknife bias correction, while the CCEMG estimator of the slope coefficient performs very well in almost all cases.
Overall, our Monte Carlo simulations show that, if the parameter of interest is the mean coefficient of the regressors, , the CCEMG estimator performs well even if N and T are not large. For the mean coefficient of the lagged dependent variable, , the CCEMG estimator is still consistent but suffers from time series bias unless T is sufficiently large; the Jackknife bias-corrected CCEMG estimator helps to mitigate this bias.
5. Empirical Study
In this section, we illustrate our method by considering the U.S. Cigar dataset, which is frequently used in the literature on panel models (e.g., Baltagi and Li 2004; Bada and Liebl 2014). The panel contains the per capita cigarette consumption of American states from 1963 to 1992 () as well as data on the income per capita and cigarette prices; the dataset can be obtained from the R package phtt.
To test for cross-sectional dependence in the panel data, following Pesaran (2015) and Bailey et al. (2016), we compute the CD statistic and the exponent of cross-sectional dependence for the variables of interest; the results are reported in Table 4. As can be seen from the table, the CD statistics are , , and for consumption, income, and price, respectively; these are highly significant and reject the null hypothesis of weak cross-sectional dependence for all three variables. The estimates of the exponent , together with their confidence bands, further confirm these results. We therefore conclude that all three variables exhibit clear cross-sectional dependence.
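For a balanced panel, the CD test of Pesaran can be computed from the pairwise correlations across units, as in the sketch below. The simulated panels, dimensions and seed are assumptions for illustration; the cigar data are not used, and the exponent of cross-sectional dependence of Bailey et al. (2016) is not implemented.

```python
import numpy as np

def cd_statistic(X):
    """CD statistic for a balanced panel X of shape (N, T).

    CD = sqrt(2T / (N(N-1))) * sum over i < j of the sample correlation
    between units i and j, computed over time.
    """
    N, T = X.shape
    R = np.corrcoef(X)                      # (N, N) correlations between rows
    upper = np.triu_indices(N, k=1)         # index pairs with i < j
    return np.sqrt(2.0 * T / (N * (N - 1))) * R[upper].sum()

rng = np.random.default_rng(2)
N, T = 30, 50
indep = rng.standard_normal((N, T))                 # no cross-sectional dependence
common = rng.standard_normal(T)                     # one common shock per period
dep = common[None, :] + rng.standard_normal((N, T)) # every unit loads on the shock

cd_indep = cd_statistic(indep)
cd_dep = cd_statistic(dep)
print(f"independent panel: {cd_indep:.2f}, common-shock panel: {cd_dep:.2f}")
```

Under the null the statistic is approximately standard normal, so values far outside the usual critical values, as in the common-shock panel, indicate cross-sectional dependence.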
Table 4.
Exponent of the cross-sectional dependence of variables.
To investigate the relationship between the per capita cigarette consumption and the income per capita as well as cigarette prices, following Baltagi and Li (2004), we consider the panel model
where , and denote the per capita cigarette consumption, the income per capita, and cigarette price for the ith state at time t, respectively, and the idiosyncratic error has the multi-factor structure
The proposed dynamic CCE approach is applied to estimate the coefficients in model (22), and the augmented equation to be estimated can be written as
where the number of lags , and . We focus on the CCEMG estimators and the results are presented in Table 5.
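The lag order used to truncate the distributed lag of the cross-sectional averages has to be fixed before estimation. A rule often used in the dynamic CCE literature is the integer part of the cube root of T; the value actually used here is not reproduced above, so the rule and the helper name `lag_order` in this sketch are assumptions. The helper avoids floating-point rounding problems for perfect cubes.

```python
def lag_order(T: int) -> int:
    """Integer part of T ** (1/3), i.e. the largest p with p ** 3 <= T (T >= 1)."""
    p = round(T ** (1.0 / 3.0))
    while p ** 3 > T:            # the rounded root overshot
        p -= 1
    while (p + 1) ** 3 <= T:     # the rounded root undershot
        p += 1
    return p

for T in (30, 50, 100, 200):
    print(T, lag_order(T))
```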
Table 5.
Estimation results (Jackknife bias-corrected CCEMG).
The following conclusions can be drawn from Table 5. First, income per capita has a positive effect on per capita cigarette consumption, while an increase in the cigarette price restrains cigarette consumption to a certain extent; both effects are significant. These results are consistent with the conclusions of Bada and Liebl (2014). Second, the lagged dependent variable is highly significant, indicating that a dynamic model is appropriate for per capita cigarette consumption.
To illustrate the heterogeneous slopes across states, we display both the CCE and the CCEMG estimates in Figure 1, which clearly shows that the coefficient estimates vary from state to state, reflecting the heterogeneity among states. Moreover, to illustrate the potential non-stationarity of the unobservable common factors in (23), we use the method proposed by Bada and Kneip (2014) to select the number of unobservable common factors and to estimate the selected factors. The results are given in Figure 2, where the top panel shows the estimated common factors and the bottom panel shows the estimated time-varying individual effects of the states. As can be seen from the figure, five common factors are selected, of which the first and second display clear trends and violate the stationarity condition.
Figure 1.
CCE and CCEMG estimations for income (Left) and price (Right), respectively (CCE estimates of individual coefficients are indicated by a cross, CCEMG estimates by the red line, and the confidence interval by the upper and lower range and dashed red line).
Figure 2.
Estimated factors (Top) and the factor structure (Bottom).
6. Conclusions
In this paper, we re-examined the CCE-type estimators for dynamic heterogeneous panel regression models with non-stationary common factors. Asymptotic properties of the CCE estimators are established when both N and T are large. It is shown that, under certain conditions, the main results of Pesaran (2006) and Chudik and Pesaran (2015a) hold for a dynamic panel with non-stationary factors. Monte Carlo simulations were conducted to investigate the finite sample properties of the CCE estimation for panels with non-stationary factors. An empirical application to the U.S. cigarette consumption dataset shows that real data may simultaneously exhibit cross-sectional dependence, dynamics, and non-stationary common factors. Based on the findings of this paper, together with the results of Pesaran (2006), Kapetanios et al. (2011), and Chudik and Pesaran (2015a), we conclude that the CCE method can be widely used to deal with panel models with cross-sectionally dependent errors, regardless of whether the model is static or dynamic and whether the unobservable common factors are stationary or non-stationary.
Author Contributions
These authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.
Funding
Cao acknowledges the financial support from the National Natural Science Foundation of China (no. 11861014) and Guangxi Natural Science Foundation (no. 2020JJA110007 and no. 2020JJA110013).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data used in the empirical application can be obtained from the R package phtt.
Acknowledgments
We are grateful for the constructive comments from the guest editor as well as the two anonymous referees.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Useful Lemmas and Theoretical Derivations of Theorems
The appendix includes proofs of the theorems and lemmas used in the derivations of the main results in the paper.
Recall that
where , and . Then, the following equalities hold,
where matrix is given by (10).
Appendix A.1. Useful Lemmas
Now, let us turn to the lemmas, which are needed for the derivation of the results in the main paper.
Lemma A1.
(a) If , , has a full-rank factorization where , , then
(b) If , i.e., is full row rank, then .
Using the properties of the Moore–Penrose inverse, Lemma A1 can be easily established from the MacDuffee theorem in Ben-Israel and Greville (2003).
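The two facts behind Lemma A1 can be checked numerically. The displayed identities of the lemma are not reproduced above, so the sketch verifies the standard statements they are assumed to correspond to: for a full-rank factorization A = BC, with B of full column rank and C of full row rank, the Moore–Penrose inverse of A equals the product of the inverses of C and B in that order; and a matrix of full row rank times its Moore–Penrose inverse is the identity. The matrix sizes and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((5, 2))        # full column rank (rank 2)
C = rng.standard_normal((2, 4))        # full row rank (rank 2)
A = B @ C                              # 5 x 4 matrix of rank 2

# An explicit cutoff keeps numerically zero singular values of A from being inverted.
A_pinv = np.linalg.pinv(A, rcond=1e-10)
product = np.linalg.pinv(C) @ np.linalg.pinv(B)

factorization_ok = np.allclose(A_pinv, product)                  # A^+ = C^+ B^+
full_row_ok = np.allclose(C @ np.linalg.pinv(C), np.eye(2))      # C C^+ = I
print(factorization_ok, full_row_ok)
```

The order of the factors matters: the identity relies on B having full column rank and C having full row rank, and it does not hold for arbitrary products.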
Lemma A2.
If the rank condition (6) in the main text is satisfied, then
where , and .
Lemma A3.
Lemma A4.
Suppose Assumptions 1–7 hold together with the restriction , such that and . Then the following hold:
Lemma A5.
Suppose Assumptions 1–7 hold and , such that . Then,
where is a positive definite matrix. Additionally, if the rank condition (6) is satisfied, then
Lemma A6.
If the rank condition (6) is satisfied, and , such that , and , it follows that
Appendix A.2. Theoretical Derivation of the Asymptotics of the CCE Estimators
Proof of Theorem 1.
Since
and using the results of Lemmas A5 and A6, we have
Noting that under our assumptions, tends to a fixed positive definite matrix. Since , then we have
where is the OLS estimator of on . Since , the first part of (A10) in Lemma A4 implies that the first term is . Next, we establish . Note that with , i.e., contains the lags of , as well as the contemporaneous and lagged values of . By Assumption 1, is serially uncorrelated and independent of , so we have ; consequently,
as and . The consistency of then follows. □
Proof of Theorem 2.
Using the consistency of , and the definition of the mean group estimator , we obtain
By the assumption of the random coefficient model, , it follows that
Combining (A19) and (A20), we have
so we only need to show that . Since by Assumption 3, we have and , which implies
Next, we establish the asymptotic distribution of . We have
Using the result (A14) in Lemma A5, when the rank condition is satisfied, we have
Together with the assumption that is bounded and the results of Lemmas A2, A5, and A6, this yields
uniformly over i, it follows that
For the third term, by Lemmas A2 and A6, we have
By the random coefficient assumption, it now follows that
and can be consistently estimated nonparametrically by
□
Appendix A.3. Proofs of Lemmas
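Notation: All vectors are column vectors represented by bold lower case letters, and matrices are represented by bold capital letters. For a matrix $\mathbf{A}$, let $\|\mathbf{A}\|$ denote the Frobenius norm, and let $\|\mathbf{A}\|_{1}$ and $\|\mathbf{A}\|_{\infty}$ denote the maximum absolute column and row sum matrix norms, respectively. $\lambda_{\min}(\mathbf{A})$ denotes the minimum eigenvalue of $\mathbf{A}$ and $\lambda_{\max}(\mathbf{A})$ denotes the maximum eigenvalue of $\mathbf{A}$. $\mathbf{A}^{+}$ denotes the Moore–Penrose inverse of $\mathbf{A}$, and $\mathrm{rk}(\mathbf{A})$ denotes the rank of $\mathbf{A}$. We also let K denote a generic finite constant, which does not depend on N or T, and whose value may vary case by case.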
Proof of Lemma A2.
Since , where
with , is invertible. If the rank condition (6) is satisfied, i.e., is a full column rank matrix, then has a full column rank; hence, has full row rank asymptotically, which implies is a full row rank matrix. Moreover, noting that when the rank condition holds, matrix is full rank, so we have
where the third equality follows from Lemma A1(a) since has the full row rank asymptotically, and the fourth equality is based on the result of Lemma A1(b). □
Proof of Lemma A3.
Denote , where . Note that , so we can write and , where , . Hence, we have
Consequently, we have
or more concisely as
□
Proof of Lemma A4.
We only need to consider , which is a matrix. Since the elements of are weakly cross-sectionally dependent, the random coefficient assumptions give and . Consider the block element of , which can be written as , for , where the cross-product terms have finite means and variances. Hence,
then we have
which establishes (A7).
We now establish (A8). As before, we consider here, and note that the lth column block of is , for , which can be partitioned as
We consider the first term and note that
which implies that
Under the assumption of the individual–specific error, we have ; hence,
when , , since by Assumption 1, , and , then it easily follows that . When , we have the result since is a serially uncorrelated covariance stationary process under Assumption 2. Combining these results yields
Now, we consider the second term , noting that and are independently distributed stationary processes with zero means, it follows that
which implies that
Consequently, we have
hence, the first part of (A8) is established. Similarly, the result for of the second part of (A8) is established.
For the first part of (A9), since and , we consider
which is a matrix. Without loss of generality, we consider the first block element, , and note that the lth row of can be written as . According to the assumption of and (independently distributed processes), it easily follows that
and
by the standard unit root asymptotic analysis result , which establishes that converges to its limit at the desired rate of . Consequently, we have
then the first part of (A9) is proven.
To establish the second part of (A9), recalling that , we have , since the norm of is assumed to be bounded.
To establish the third part of (A9), noting that , and using triangle inequality and the submultiplicative property of matrix norm , we have
by (A8), the first and second parts of (A9), and the assumption that the norms of and are bounded in probability uniformly over i.
To establish the first part of (A10), recalling , consider the vector , the element of can be written as , . Since by the assumption of and (independently distributed processes), it easily follows that
and
by the standard unit root asymptotic analysis result , which establishes that converges to its limit at the desired rate of . It follows that ; hence, the first part of (A10) is established. Moreover, the second part of (A10) can be proven similarly.
Recalling that , the third part of (A10) is established because
by (A8) and the first part of (A10).
For the first part of (A11), note that we only need to consider the , a matrix. We consider the block element of , which can be written as , for . Without loss of generality, we consider the first block, , and the element of can be written as . By the standard unit root asymptotic analysis, we have
which implies that , then we have , which establishes the first part of (A11). The second part of (A11) is established by
since the norm of is assumed bounded (and the above result).
To prove the third part of (A11), note that , by (A7), the second part of (A9), and the previous result in (A11), we have
To establish the second part of (A12), note that and , and recalling that with , we have
by (A10) and the first part of (A11), as well as the assumption that the norms of , , and are bounded in probability uniformly over i. The third part of (A12) follows directly since , using (A8) and the second part of (A12). □
Proof of Lemma A5.
To prove (A13), we note that
where is a matrix of factors, with . Denote the OLS estimator of the multiple regression (A32) as . Since , where denotes the OLS residuals, i.e., , and in the light of Assumption, , we only need to show that . In fact, we can write
because . However, since by (A10) of Lemma A4, , it follows that . Hence, , and (A13) is established.
To prove (A14), we follow the spirit of Lemma A.4 in Kapetanios et al. (2011), although more care is needed because of the lags. Specifically, note that
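The argument for (A13) relies on the standard fact that OLS residuals are orthogonal to the regressors. The sketch below illustrates this with a hypothetical regressor matrix `H` (an intercept plus two simulated random-walk factors) and a hypothetical response `y`; neither is taken from the paper, and the Moore-Penrose inverse stands in for the generalized inverse.

```python
import numpy as np

rng = np.random.default_rng(2)
T, m = 120, 2
F = np.cumsum(rng.standard_normal((T, m)), axis=0)   # simulated I(1) factors (assumption)
H = np.column_stack([np.ones(T), F])                  # hypothetical regressors: intercept + factors
y = H @ np.array([0.5, 1.0, -2.0]) + rng.standard_normal(T)

# Residual-maker matrix M = I - H (H'H)^+ H', using the Moore-Penrose inverse.
M = np.eye(T) - H @ np.linalg.pinv(H.T @ H) @ H.T

# OLS fit and residuals computed directly.
b_ols = np.linalg.lstsq(H, y, rcond=None)[0]
resid = y - H @ b_ols

print(np.allclose(M @ H, 0.0, atol=1e-6))    # M annihilates the regressors
print(np.allclose(M @ y, resid, atol=1e-6))  # M y reproduces the OLS residuals
print(np.allclose(H.T @ resid, 0.0, atol=1e-6))
```

Because the residual maker removes every component lying in the column space of the regressors, only the part of the variable that is not spanned by the factors has to be bounded.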
since , where , and . However, and since . Then
or .
For the second column block of the above equation, we have or = as , since and . Hence
and
Since is invertible by assumption, (A34) can be rewritten as
When the rank condition is satisfied, we have
Note that can be written as , where and , then
Moreover, from (A33), , from which it directly follows that
under the assumption that is invertible and the rank condition is satisfied. Then, using this result in (A37), we have
Since the norms of , and are assumed to be bounded, we need to establish the probability orders of and . For , since is a submatrix of , (A7) and (A9) imply and , which, together with (A7), yield
Substituting the above two results into (A39) establishes the result. □
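The role of the rank condition can be illustrated with a simplified static simulation. This is a sketch under stated assumptions, not the dynamic model of the paper: two random-walk factors, three observables per unit, loadings drawn around a fixed full-rank mean matrix `C_mean`, and no lagged cross-sectional averages. All names and parameter values are hypothetical.

```python
import numpy as np

def relative_residual(N, T=100, seed=3):
    """Share of the factors left unexplained by the cross-sectional averages."""
    rng = np.random.default_rng(seed)
    m, k1 = 2, 3                                        # factors, observables per unit
    F = np.cumsum(rng.standard_normal((T, m)), axis=0)  # non-stationary factors
    C_mean = np.array([[1.0, 0.5, 0.0],
                       [0.0, 1.0, 0.5]])                # mean loadings, rank m = 2
    Zbar = np.zeros((T, k1))
    for _ in range(N):
        C_i = C_mean + 0.3 * rng.standard_normal((m, k1))  # heterogeneous loadings
        Zbar += F @ C_i + rng.standard_normal((T, k1))      # unit data: factors + noise
    Zbar /= N                                               # cross-sectional averages
    # Project the factors off the space spanned by the averages.
    M = np.eye(T) - Zbar @ np.linalg.pinv(Zbar.T @ Zbar) @ Zbar.T
    return np.linalg.norm(M @ F) / np.linalg.norm(F)

small, large = relative_residual(20), relative_residual(2000)
print(small, large)
```

When the average loading matrix has full row rank, the idiosyncratic noise averages out as N grows, so the cross-sectional averages span the factor space increasingly well and the unexplained share of the factors shrinks.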
Proof of Lemma A6.
To prove (A15), we need to determine the order of probability of , by the triangle inequality of the matrix norm , which equals
Using the results of Lemma A.3 and the submultiplicative property of the matrix norm, and noting that , we focus on the individual elements on the right side of (A40).
Finally, we have
by (A9), (A11), and (A12). Substituting (A41)–(A43) into (A40), we have
as required.
To establish result (A16), similar to the proof of (A15), we have
We then examine each term of (A44), noting that .
Using some results in Lemma A4, we have
Result (A17) can be established in a similar way. We have
then we examine each of the above terms. The first term equals
by the third part of (A9)–(A11). Next, we have
by (A7), and the results in (A9)–(A12). Finally,
by (A8), the second part of (A11) and (A12). Substituting (A49)–(A51) into (A48), (A17) is proven. □
Notes
| 1 | An alternative approach to deal with cross-sectional dependence is the principal component analysis proposed by Bai (2009). |
| 2 | As in Pesaran (2006) and Kapetanios et al. (2011), observed factors, such as time effects, can also be included in model (1). For notational simplicity and illustrative purposes, we do not include such factors in model (1). |
| 3 | As Chudik and Pesaran (2015a) point out, the number of lags needs to be restricted. Letting ensures that, on the one hand, the number of lags is not too large, so that there are sufficient degrees of freedom for the consistent estimator, and on the other hand, the number of lags is not too small, so that the bias due to the truncation of infinite lag polynomials is sufficiently small. |
| 4 | We note that can be denoted as , where is a vector of ones, matrices of observations on for |
| 5 | To illustrate the validity and robustness of the CCE estimator in the case of non-stationary common factors, the data-generating process and parameter settings are similar to the settings in Chudik and Pesaran (2015a), except for unobserved common factors. |
| 6 | We also conducted additional Monte Carlo simulations for other settings, such as and ; the corresponding results are slightly worse than those of and are not reported to save space. |
References
- Bada, Oualid, and Alois Kneip. 2014. Parameter cascading for panel models with unknown number of unobserved factors: An application to the credit spread puzzle. Computational Statistics and Data Analysis 76: 95–115.
- Bada, Oualid, and Dominik Liebl. 2014. The R package phtt: Panel data analysis with heterogeneous time trends. Journal of Statistical Software 59: 1–34.
- Bai, Jushan. 2009. Panel data models with interactive fixed effects. Econometrica 77: 1229–79.
- Bai, Jushan, and Serena Ng. 2004. A panic on unit root tests and cointegration. Econometrica 72: 1127–77.
- Bai, Jushan, and Serena Ng. 2010. Panel unit root tests with cross section dependence: A further investigation. Econometric Theory 26: 1088–114.
- Bailey, Natalia, George Kapetanios, and M. Hashem Pesaran. 2016. Exponent of cross-sectional dependence: Estimation and inference. Journal of Applied Econometrics 31: 929–60.
- Baltagi, Badi H., and Dong Li. 2004. Prediction in the panel data model with spatial correlation. In Advances in Spatial Econometrics: Methodology, Tools and Applications. Edited by Luc Anselin, Raymond J. G. M. Florax and Sergio J. Rey. Berlin/Heidelberg: Springer, pp. 283–95.
- Ben-Israel, Adi, and Thomas N. E. Greville. 2003. Generalized Inverses: Theory and Applications, 2nd ed. New York: Springer.
- Bussiere, Matthieu, Alexander Chudik, and Arnaud Mehl. 2013. How have global shocks impacted the real effective exchange rates of individual euro area countries since the euro’s creation? The B.E. Journal of Macroeconomics 13: 1–48.
- Chudik, Alexander, and M. Hashem Pesaran. 2013. Econometric analysis of high dimensional VARs featuring a dominant unit. Econometric Reviews 32: 592–649.
- Chudik, Alexander, and M. Hashem Pesaran. 2015a. Common correlated effects estimation of heterogeneous dynamic panel data models with weakly exogenous regressors. Journal of Econometrics 188: 393–420.
- Chudik, Alexander, and M. Hashem Pesaran. 2015b. Large panel data models with cross-sectional dependence: A survey. In The Oxford Handbook of Panel Data. Edited by Badi H. Baltagi. Oxford: Oxford University Press, pp. 2–45.
- Chudik, Alexander, Kamiar Mohaddes, M. Hashem Pesaran, and Mehdi Raissi. 2017. Is there a debt-threshold effect on output growth? Review of Economics and Statistics 99: 135–50.
- Eberhardt, Markus, Christian Helmers, and Hubert Strauss. 2013. Do spillovers matter when estimating private returns to R&D? Review of Economics and Statistics 95: 436–48.
- Greenaway-McGrevy, Ryan, Chirok Han, and Donggyu Sul. 2012. Asymptotic distribution of factor augmented estimators for panel regression. Journal of Econometrics 169: 48–53.
- Hahn, Jinyong, and Whitney Newey. 2004. Jackknife and analytical bias reduction for nonlinear panel models. Econometrica 72: 1295–319.
- Juodis, Artūras, Hande Karabiyik, and Joakim Westerlund. 2021. On the robustness of the pooled CCE estimator. Journal of Econometrics 220: 325–48.
- Kao, Chihwa, Lorenzo Trapani, and Giovanni Urga. 2012. Asymptotics for panel models with common shocks. Econometric Reviews 31: 390–439.
- Kapetanios, George, M. Hashem Pesaran, and Takashi Yamagata. 2011. Panels with non-stationary multifactor error structures. Journal of Econometrics 160: 326–48.
- Moon, Hyungsik Roger, and Martin Weidner. 2015. Linear regression for panel with unknown number of factors as interactive fixed effects. Econometrica 83: 1543–79.
- Moon, Hyungsik Roger, and Martin Weidner. 2017. Dynamic linear panel regression models with interactive fixed effects. Econometric Theory 33: 158–95.
- Omay, Tolga, and Elif Oznur Kan. 2010. Re-examining the threshold effects in the inflation-growth nexus with cross-sectionally dependent non-linear panel: Evidence from six industrialized economies. Economic Modelling 27: 996–1005.
- Pesaran, M. Hashem. 2006. Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica 74: 967–1012.
- Pesaran, M. Hashem. 2007. A simple panel unit root test in the presence of cross section dependence. Journal of Applied Econometrics 22: 265–312.
- Pesaran, M. Hashem. 2015. Testing weak cross-sectional dependence in large panels. Econometric Reviews 34: 1089–117.
- Pesaran, M. Hashem, Ron Smith, and Takashi Yamagata. 2013. Panel unit root tests in the presence of multifactor error structure. Journal of Econometrics 175: 94–115.
- Westerlund, Joakim, Yana Petrova, and Milda Norkute. 2019. CCE in fixed-T panels. Journal of Applied Econometrics 34: 746–61.
- Zaffaroni, Paolo. 2009. Generalized least squares estimation of panel with common shocks. Unpublished Manuscript.
- Zhou, Qiankun, and Yonghui Zhang. 2016. Common correlated effects estimation of unbalanced panel data models with cross-sectional dependence. Journal of Economic Theory and Econometrics 27: 25–45.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).