Abstract
The Common Correlated Effect (CCE) estimator is widely used in panel data models to address cross-sectional dependence, particularly in nonstationary panels. However, existing estimators have limitations, especially in small-sample settings. This study refines the CCE estimator by introducing new proxy variables and testing them through a comprehensive set of simulations. The proposed method is simple yet effective, aiming to improve the handling of cross-sectional dependence. Simulation results show that the refined estimator eliminates cross-sectional dependence more effectively than the original CCE, with improved power properties under both weak- and strong-dependence scenarios. The refined estimator performs particularly well in small sample sizes. These findings offer a more robust framework for panel unit root testing, enhancing the reliability of CCE estimators and contributing to further developments in addressing cross-sectional dependence in panel data models.
Keywords: panel unit root test; cross-sectional dependency; common correlated effect estimator; CD test
MSC: C12; C13; C23
1. Introduction
Economists have increasingly focused on testing the stationarity properties of variables in panel data, e.g., [,,,,,,,,] (refs. [,,] provide a comprehensive review of the literature on unit roots and cointegration in panels). Panel data techniques for testing unit roots were developed to address the low power of conventional unit root tests by utilizing both the time and cross-sectional dimensions []. The most widely used panel unit root tests, such as those proposed by [,] (henceforth, IPS), are extensions of []’s test to a panel setting.
As pointed out by [], the assumption of cross-sectional independence might be too restrictive, especially in cross-country or cross-region regressions. Cross-sectional dependence may arise from spatial correlations, spill-over effects, omitted global variables, common unobserved shocks, or residual interdependence after accounting for common factors. This dependence invalidates classical unit root and cointegration tests in panel data models. Studies by [,,] demonstrate through simulations that panel unit root tests that ignore cross-sectional dependence perform poorly when applied to correlated panels.
Cross-sectional correlation is typically modeled using either common unobserved factors or general correlations among individual-specific innovations. Studies such as [,,,,,,] have used bootstrap methods to address cross-sectional correlation in panel unit root tests. Alternatively, ref. [] proposed a nonlinear instrumental variable approach to handle general forms of cross-sectional dependence (henceforth, CSD). Assuming CSD arises from common unobserved factors, refs. [,,,] applied the principal component method to estimate these factors and their loadings. Meanwhile, refs. [,] augmented individual ADF tests with cross-sectional averages to proxy for the common factor, rather than estimating it directly. (When the time dimension ($T$) is large relative to the cross-section dimension ($N$) of the panel, cross-sectional dependence may be accounted for using standard time series techniques applied to systems of equations, for example, the seemingly unrelated regression equations (SURE) approach.)
Each of the methods mentioned above poses practical challenges. In the context of panel data, the Common Correlated Effect (CCE) estimator proposed by [] offers an alternative for addressing cross-sectional dependence from unobserved common shocks or factors. By using cross-sectional averages of the dependent and independent variables, CCE captures these shared shocks, leading to more consistent and reliable estimates without advanced techniques. Refining CCE estimators is essential for improving accuracy, especially in small samples, and for handling complex data structures like nonlinearity or structural breaks, while also reducing bias from correlations between regressors and common factors.
While []’s methodology has been widely studied for its power and size properties, [] showed that the CCE estimator provides consistent and asymptotically normal estimates, even with a multifactor error structure. Despite this, recent studies have highlighted its limitations. For instance, [] criticized the CCE estimator’s assumption of independence between common factors and regressors, arguing that the  factor estimated by CCE may be insufficient and that asymptotic normality breaks down when . 
Our study addresses these shortcomings by proposing a refinement to the CCE estimator, introducing new proxy variables that are simpler and more effective at capturing cross-sectional dependence. Unlike previous methods, our approach reduces the correlation between regressors and proxy factors, which has been a critical drawback of the CCE estimator. Through an extensive simulation study, we demonstrate that our refined method improves performance, particularly in small sample sizes, and better handles both weak and strong dependence. This novel approach contributes to the ongoing refinement of CCE estimators and offers a more robust framework for panel unit root testing. To address these issues, we conducted the simulation study described below.
We use []’s cross-sectional dependency test (henceforth, CD test) to assess whether our new estimator effectively remedies cross-sectional dependency. The CD test proved highly convenient in this simulation study. First, we verified that the CD test accurately detected the CSD present in the data-generating process (henceforth, DGP). Next, the CD test confirmed the absence of CSD when no CSD was present. We then introduced the generated factor data, and the CD test showed that adding the factor eliminated CSD. However, when we applied the CCE estimator of [], it did not fully remove CSD. In contrast, our refined CCE estimator successfully eliminated CSD and bias.
As noted by [], the CCE estimator can mistakenly capture common structural breaks instead of factors when there is a common trend or break in the DGP, raising concerns about adding extra factors as proposed by []. Similarly, [] emphasized that structural breaks could be mistaken for common factors, leading to model misspecification. Instead of complicating the CCE estimator as in [], we refined it by using the simple average of the original data ($\bar{y}_t$) in the auxiliary regression, instead of the averages of the panel unit root test’s dependent variable and of the independent variable. This approach reduces the correlation between regressors and proxy factor variables, addressing the main drawback of the CCE estimator, as highlighted by []. While the new method performed well for small , it was less effective for larger . Thus, we proposed a two-stage estimator: first, using the CCE estimator to reduce correlation issues, and second, applying an auxiliary regressor generated in the first stage. In the empirical section, we compared the new unit root test with []’s test using data with structural breaks. While []’s test incorrectly indicated stationarity, the new test identified the unit root problem, eliminating size bias.
2. The Panel Unit Root Test and the Refinement of the CCE Estimator
2.1. Common Correlated Estimator in Panel Unit Root Testing
In this section, we introduce a practical way to refine the CCE estimator of []. That paper introduces an auxiliary regression to remedy cross-sectional dependency. In line with [], we consider the following heterogeneous panel ADF regression:
      
        
      
      
      
      
    
      
        
      
      
      
      
    
      
        
      
      
      
      
    
where $f_t$ is the unobserved common factor, assumed to be a stationary process, $\gamma_i$ are the factor loadings, and $\varepsilon_{it}$ are the individual-specific error terms. For the moment, we assume that the idiosyncratic error terms $\varepsilon_{it}$ are independently distributed across both $i$ and $t$.
The common factor $f_t$ can be proxied by the cross-sectional means of $\Delta y_{it}$ and $y_{i,t-1}$, following [,]. To see this, first re-write Equation (1) as follows:
      
        
      
      
      
      
    
Taking the cross-section average of both sides, we obtain the following equation:
      
        
      
      
      
      
    
Assuming that $\bar{\gamma}$, the cross-sectional average of the factor loadings, does not vanish, the common factor $f_t$ can be written as
        
      
        
      
      
      
      
    
which suggests that the common factor can be proxied by a linear combination of $\Delta\bar{y}_t$ and $\bar{y}_{t-1}$. Therefore, to filter out the effects of the common factor, we run the following modified test regression:
      
        
      
      
      
      
    
Then, the test of the null hypothesis of a unit root (a zero coefficient on the lagged level in Equation (1) above) is based on the $t$-ratio of the OLS estimate of the lagged-level coefficient in Equation (6).
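For reference, the argument of this subsection can be sketched in the standard notation of cross-sectionally augmented ADF regressions. The symbols $\alpha_i$, $\beta$, $a_i$, $b_i$, $c_i$, $d_i$ and the common-slope simplification are assumptions made for this sketch and need not coincide with the numbered equations.

```latex
% Illustrative sketch only; a common slope \beta is assumed so that averaging is exact.
\begin{align*}
\Delta y_{it} &= \alpha_i + \beta\, y_{i,t-1} + \gamma_i f_t + \varepsilon_{it}, \\
\Delta \bar{y}_t &= \bar{\alpha} + \beta\, \bar{y}_{t-1} + \bar{\gamma} f_t + \bar{\varepsilon}_t,
  \qquad \bar{y}_t = \frac{1}{N}\sum_{i=1}^{N} y_{it}, \\
f_t &= \bar{\gamma}^{-1}\left(\Delta \bar{y}_t - \bar{\alpha} - \beta\, \bar{y}_{t-1} - \bar{\varepsilon}_t\right),
  \qquad \bar{\gamma} \neq 0, \\
\Delta y_{it} &= a_i + b_i\, y_{i,t-1} + c_i\, \bar{y}_{t-1} + d_i\, \Delta \bar{y}_t + e_{it}.
\end{align*}
```

Under cross-sectionally independent idiosyncratic errors, $\bar{\varepsilon}_t$ is of order $N^{-1/2}$, so $\Delta\bar{y}_t$ and $\bar{y}_{t-1}$ can stand in for $f_t$ in the last line, where the unit root null corresponds to $b_i = 0$.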
In Equation (1), and hence in Equation (6), the error terms are assumed to be serially uncorrelated. To allow for serial correlation in the error terms, we run the following augmented regression:
      
        
      
      
      
      
    
        and we compute the test statistic from Equation (7) (see []). 
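A minimal sketch of how the $t$-ratio from such an augmented regression could be computed for a single unit. It assumes the regressors are a constant, the lagged level, the lagged cross-sectional mean, the differenced cross-sectional mean, and `p` lags of both differences; the function name `cadf_t_ratio` and this lag layout are assumptions for illustration, not the authors' code.

```python
import numpy as np

def cadf_t_ratio(y, unit, p=0):
    """t-ratio on the lagged level in a cross-sectionally augmented DF regression.

    y    : (N, T) array of levels
    unit : row index of the unit being tested
    p    : number of lagged differences (assumed lag layout)
    """
    y = np.asarray(y, dtype=float)
    ybar = y.mean(axis=0)                       # cross-sectional mean of levels
    dy, dybar = np.diff(y[unit]), np.diff(ybar)
    n_obs = dy.size - p
    cols = [
        np.ones(n_obs),                         # intercept
        y[unit, p:-1],                          # lagged level y_{i,t-1}
        ybar[p:-1],                             # lagged cross-sectional mean
        dybar[p:],                              # differenced cross-sectional mean
    ]
    for j in range(1, p + 1):
        cols.append(dy[p - j:dy.size - j])      # lagged own differences
        cols.append(dybar[p - j:dy.size - j])   # lagged mean differences
    X = np.column_stack(cols)
    z = dy[p:]
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    sigma2 = resid @ resid / (n_obs - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)           # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])
```

The statistic does not follow a standard $t$ distribution under the null, so it would have to be compared with simulated critical values rather than with the usual normal ones.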
Ref. [] demonstrated that the CCE estimator provides consistent and asymptotically normal estimates of the slope coefficients in panel data models, even with a multifactor error structure. In this section, we explore an additional property: whether the CCE estimator truly remedies CSD. To do so, we employ the CD test of []. First, we need to verify that this test performs well across different levels of CSD.
2.2. The Behavior or Performance of the CD Test
In this section, we first conduct a Monte Carlo study to test whether the CD test works properly with the DGP given in Equation (3). The features of this DGP are described in [,], where CSD is introduced into the panel data using a factor structure, as shown in Equation (3). First, we compute the CD test directly on the DGP to verify that the test detects CSD. In the second stage, we use the same DGP but apply the factor structure to remedy the CSD during the testing process. We expect to observe severe CSD in the first Monte Carlo setup and no CSD in the second. If these expectations hold, the CD test can be used to assess the CCE estimator proposed by [,]. Previous studies [] have shown that the CCE estimator effectively remedies CSD and produces unbiased estimates. While our analysis complements theirs, our findings suggest a new direction for developing a more efficient estimator. In the final stage, we propose a new estimator based on these insights.
The Monte Carlo design used in this study can be summarized as follows. First, we generate data using Equation (3) with factor loadings ranging from −1.0 to 3.0, as suggested by the authors of [,], who classify this range as strong CSD. We compute the CD test for each factor loading in this interval at 0.1 increments. These computations are performed for  and . For each point, we calculate the CD test 1000 times to examine the distribution and the minimum and maximum values of the test. For example, with  and , we generate 40 intervals representing the probability of detecting CSD when imposed or, conversely, detecting no CSD. For  and  values as described, this results in 360 cells (), which can be displayed in 27 tables for each  and . To simplify interpretation, we present the results visually through figures. In these figures, we also include the band from −2.0 to 2.0, which approximates the two-sided 5% critical values (±1.96) of the z-test.
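The design above can be sketched as a small data generator. This is a hedged illustration: the function names, the homogeneous loading across units, the standard normal factor and errors, and the burn-in length are all assumptions, since only the factor structure and the loading grid are stated in the text.

```python
import numpy as np

def loading_grid():
    """Factor loadings from -1.0 to 3.0 in steps of 0.1 (41 grid points)."""
    return np.round(np.linspace(-1.0, 3.0, 41), 1)

def simulate_panel(n_units, n_periods, loading, phi=1.0, burn_in=50, rng=None):
    """Simulate y_it = phi * y_{i,t-1} + u_it with u_it = loading * f_t + eps_it.

    phi=1.0 gives a unit root in every series; the common factor f_t and
    the idiosyncratic errors are i.i.d. standard normal (assumption).
    """
    rng = np.random.default_rng(rng)
    total = n_periods + burn_in
    f = rng.standard_normal(total)                  # stationary common factor
    eps = rng.standard_normal((n_units, total))     # idiosyncratic errors
    u = loading * f[None, :] + eps                  # composite error term
    y = np.zeros((n_units, total))
    for t in range(1, total):
        y[:, t] = phi * y[:, t - 1] + u[:, t]
    return y[:, burn_in:], f[burn_in:]              # drop the burn-in periods
```

With this grid, index 10 corresponds to a loading of 0.0, which matches the x-axis convention used in the figure notes.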
We consider the following panel regression model,
        
      
        
      
      
      
      
    
for $i = 1, \dots, N$ cross-sectional units and $t = 1, \dots, T$ time periods. The sample estimate of the pairwise correlation of the residuals is given by
        
      
        
      
      
      
      
    
where  is the OLS estimate of  defined as . Ref. [] suggests that the CD test statistic can be computed as
        
      
        
      
      
      
      
    
2.2.1. The Behavior of the CD Test Without Imposing the CCE Estimator
For the first Monte Carlo study, we use the DGP from Equation (3) without imposing any remedy. The results are shown in Figure 1, Figure 2 and Figure 3.
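The statistic reported in these experiments can be sketched as follows, using the usual form of the CD test: $CD = \sqrt{2T/(N(N-1))}\,\sum_{i<j}\hat{\rho}_{ij}$, with $\hat{\rho}_{ij}$ the pairwise correlation of the residuals of units $i$ and $j$. The function name `cd_statistic` is an assumption, and the residuals are taken as given (for example, from unit-by-unit OLS regressions).

```python
import numpy as np

def cd_statistic(resid):
    """CD statistic from an (N, T) array of regression residuals."""
    resid = np.asarray(resid, dtype=float)
    n_units, n_periods = resid.shape
    # Pairwise correlations computed from raw cross-products; residuals from
    # a regression with an intercept already have zero mean.
    norms = np.sqrt((resid ** 2).sum(axis=1))
    rho = (resid @ resid.T) / np.outer(norms, norms)
    upper = rho[np.triu_indices(n_units, k=1)]      # pairs with i < j
    scale = np.sqrt(2.0 * n_periods / (n_units * (n_units - 1)))
    return scale * upper.sum()
```

Under cross-sectional independence the statistic is approximately standard normal, which is why the figures compare it with a band of roughly ±2.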
      
    
Figure 1. CD test results without remedying the CSD data for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96. The blue line is the maximum and the black line the minimum CD value obtained across the simulations for each factor loading. The green and red lines mark −2.0 and 2.0, respectively.
  
      
    
Figure 2. CD test results without remedying the CSD data for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  

      
    
Figure 3. CD test results without remedying the CSD data for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
As shown in Figure 1, the CD test effectively captures CSD for . The test becomes more sensitive as the  dimension increases. This can be observed in the figures, where for , the maximum CD test value approaches 20.0, while for , it increases to around 50. A similar pattern is seen for the minimum values. As  increases, the initially u-shaped CD test results become skewed, with the distance between the tangency points of the curve shrinking. This indicates that the CD test detects CSD for smaller factor loadings as  grows. For , the range of test statistics is approximately (0.1, 0.4), indicating almost no CSD, while for , the range is wider at (−1.0, 2.4). These results suggest that it is not straightforward to categorize the range (0.0, 0.2) as weak CSD, as some effects are still present when  increases. However, our primary goal is to confirm whether the CD test works. Based on Figure 1, we conclude that it works well for small  values.
We now test the performance with a larger  value, selecting ; the corresponding results are given in Figure 2. The pattern in Figure 2 is consistent with that in Figure 1. This confirms that increasing the  dimension does not cause any issues with the CD test results. Therefore, these simulation results provide evidence of the test’s consistency in small samples. As the  dimension increases, both the maximum and minimum values of the CD test also increase. For example, with  and , the maximum CD test value approaches 20.0, but for  and , this maximum value rises to approximately 100.0. Similarly, for 0 and , the maximum value is around 50, while for  and , it increases to 450.0. A similar pattern is observed for the minimum values. Furthermore, as the  dimension increases, the distance between the tangency points of the u-shaped curve shrinks, indicating that the CD test detects CSD for smaller factor loadings.
For  and , the test produces a range of (−0.1, 0.4), indicating almost no CSD, while for  and , the range narrows to (0.0, 0.1). Once again, this confirms that it is difficult to classify the range (0.0, 0.2) as weak CSD for all values of  and , as claimed by [,]. However, for large values of  and , factor loadings in the range  can indeed be categorized as weak CSD.
Up to this point, Figure 2 and Figure 3 have dealt with  and  with . To see the large-sample features, we performed simulations for  and . Figure 4 shows the behavior of the CD test in large samples and confirms the pattern of Figure 1, Figure 2 and Figure 3. All the explanations given for the comparison of Figure 1, Figure 2 and Figure 3 remain valid. Hence, these simulation results provide evidence of the consistency of the test in detecting CSD in large samples as well.
      
    
Figure 4. CD test results without remedying the CSD data for  and .
  
2.2.2. The Behavior of the CD Test with Imposing the Factor Variable
We now investigate the consistency of the CD test by including in the testing process the factor variable that induces CSD in the DGP. If the CD test is working correctly, the inclusion of the factor variable should eliminate the CSD, and the CD test should indicate no remaining CSD. The resulting CD values should then fall between the z-test thresholds of −2.0 and 2.0, as shown in the figures. The first results are given in Figure 5 for .
      
    
Figure 5. CD test results with remedying the CSD by the original factor for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
From Figure 5, we observe that the calculated minimum and maximum values fall within the significance band of the z-test, confirming that the CD test successfully captures the CSD imposed in the DGP of Equation (3). This result meets our expectations and provides a benchmark for evaluating proxies used to remedy CSD. A valid proxy should follow this pattern for small  and  dimensions. If the test shows no CSD for any factor loading, the proxy can be considered an efficient factor variable. In Figure 5, we can see how increasing the  dimension affects the results. The test statistics range between approximately −2.0 and 7.0 for , and between −3.0 and 4.0 for . As  increases, the band of calculated test statistics narrows and shifts downward, with the maximum values decreasing more than the minimum values. This suggests that while the original factor effectively remedies CSD, it still has some limitations in handling very small probabilities. To better understand the CD test’s behavior, it is useful to analyze the results with a larger  dimension.
From Figure 6, we can observe that for  and , the calculated CD test statistics range between approximately −1.5 and 15.0, compared to a range of −2.0 to 7.0 for  and . As  increases, the band of the CD test statistics shifts upward and widens, with the maximum values increasing more than the minimum values. When  and , the band narrows to a range of −2.0 to 4.0, whereas for  and , the range is approximately −3.0 to 4.0. This narrowing of the band as both  and  increase suggests that the original factor more effectively remedies CSD in the DGP. These simulations highlight the power of the CD test in detecting CSD. The second set of simulations, building on the first, further demonstrates the effectiveness of the method in remedying CSD. We therefore conclude that both simulations confirm the power of the CD test and of the method used to remedy CSD.

      
    
Figure 6. CD test results with remedying the CSD by the original factor for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
To explore this further, we now examine the impact of increasing the  dimension in Figure 7. In Figure 7, a new phenomenon emerges. When the factor loadings are low, the original factor remedies CSD more effectively than at higher factor loadings. This behavior may also appear in approximation methods like the CCE estimator.
      
    
Figure 7. CD test results with remedying the CSD by the original factor for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
Up to this point, Figure 5, Figure 6 and Figure 7 have dealt with ,  and  with . To explore the behavior in larger samples, we conducted a simulation with  and . Figure 8 illustrates how the CD test performs when CSD is remedied using the original factor in these large-sample settings.
      
    
Figure 8. CD test results with remedying the CSD by the original factor for  and .
  
We now have solid benchmark simulations to assess whether the proxy variables are effective. The results from the first simulation help us evaluate whether the proxy can remedy part of CSD. The second set of simulations further demonstrates the efficiency of the proposed method in addressing CSD.
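The two benchmark experiments can be sketched together: for each factor loading, compute the minimum and maximum CD value across replications, once without any remedy and once with the true factor added to each unit's regression. The helper names, the Dickey-Fuller-type regression, and the unit-root DGP with a homogeneous loading are assumptions for illustration.

```python
import numpy as np

def cd_stat(resid):
    """CD statistic from an (N, T) residual array."""
    n, t = resid.shape
    norms = np.sqrt((resid ** 2).sum(axis=1))
    rho = (resid @ resid.T) / np.outer(norms, norms)
    return np.sqrt(2.0 * t / (n * (n - 1))) * rho[np.triu_indices(n, k=1)].sum()

def df_residuals(y, common=None):
    """Unit-by-unit residuals of dy_it on [1, y_{i,t-1}] plus an optional common regressor."""
    out = []
    for series in y:
        dy = np.diff(series)
        cols = [np.ones(dy.size), series[:-1]]
        if common is not None:
            cols.append(common)
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
        out.append(dy - X @ beta)
    return np.array(out)

def cd_bands(loadings, n_units, n_periods, reps, seed=0):
    """(min, max) of CD per loading, without remedy ("raw") and with the true factor ("oracle")."""
    rng = np.random.default_rng(seed)
    bands = {}
    for g in loadings:
        raw, oracle = [], []
        for _ in range(reps):
            f = rng.standard_normal(n_periods)
            u = g * f + rng.standard_normal((n_units, n_periods))
            y = np.cumsum(u, axis=1)                # unit-root null (assumption)
            raw.append(cd_stat(df_residuals(y)))
            # dy at position s equals u at s + 1, so the factor is aligned as f[1:]
            oracle.append(cd_stat(df_residuals(y, common=f[1:])))
        bands[g] = {"raw": (min(raw), max(raw)), "oracle": (min(oracle), max(oracle))}
    return bands
```

Plotting the `raw` and `oracle` bands against the loading grid would give curves of the same kind as the minimum and maximum lines in the figures, although the exact values depend on the assumed design.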
2.2.3. The Behavior of the CD Test with Imposing the CCE Estimator
In light of these arguments, we can assess the performance of the CCE estimator proposed by [,]. In Figure 9, we observe some unexpected results for what would typically be considered a good proxy. Only for  and  do we see the results expected of a proxy that remedies CSD, as indicated by simulations 1 and 2. The computed CD values range between approximately −2.4 and −1.5, demonstrating that the CCE estimator effectively remedies CSD. Additionally, for , the computed values quickly move out of the CSD rejection region, and the band between the minimum and maximum CD values narrows as the  dimension increases. By , the minimum and maximum values of the CD test are approximately 10.25 and −10.15, respectively. We can now explore the effects of increasing the  dimension.

      
    
Figure 9. CD test results with remedying the CSD by the CCE method for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
In Figure 10, we observe that only for  and  do we obtain the results expected of a good proxy that remedies CSD, as indicated by simulations 1 and 2. The computed CD values range between approximately −2.2 and −1.4, confirming that the CCE estimator effectively remedies CSD. However, as the  dimension increases, the band between the minimum and maximum CD values narrows. For  and , the minimum and maximum CD values for  are approximately  and , respectively. These results show that increasing  reduces the CD test values and narrows the band, while increasing  further narrows the range. The key finding from these simulations, in line with simulations 1 and 2, is that the CCE estimator remedies CSD most effectively when  is small. However, as  increases, the proxies  and  correct the factor’s effect in the residual term but introduce additional CSD in the opposite direction. These simulations focus solely on CSD, so we cannot yet determine how they affect bias reduction in the  parameter in Equation (3). Nonetheless, there remains scope for developing a better estimator for addressing CSD in panel unit root testing, in line with the approaches of [,]. Before investigating this further, it is useful to examine the effects of increasing both the  and  dimensions.
      
    
Figure 10. CD test results with remedying the CSD by the CCE method for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
In Figure 11, we observe that the results are similar to those in Figure 9 and Figure 10. The only difference is that for  and , no band appears, and the minimum and maximum values are fixed at 10.0. This indicates that the bands in Figure 11 are narrower than those in Figure 9 and Figure 10. To further explore the behavior in large samples, we conducted a simulation for  and . 
      
    
Figure 11. CD test results with remedying the CSD by the CCE method for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
Figure 12 illustrates how the CD test performs in large samples when CSD is remedied using the CCE estimator proposed by []. As the  dimension increases, the CD test values reach 22.0. Additionally, increasing both the $N$ and $T$ dimensions together eliminates the band between the minimum and maximum values of the CD test.
      
    
Figure 12. CD test results with remedying the CSD by CCE for  and .
  
2.3. The Refinement of the Common Correlated Estimator by Using the CD Test Results
In light of these findings, we can now look for a more efficient estimator for remedying CSD in small samples.
        
      
        
      
      
      
      
    
Proposition 1. The proxies $\Delta\bar{y}_t$ and $\bar{y}_{t-1}$ can be expressed by $\bar{y}_t$.
Proof.  
      
        
      
      
      
      
    
With some algebra, one can show that
        
      
        
      
      
      
      
    
      
        
      
      
      
      
    
        where  and  is taken:
      
        
      
      
      
      
    
Hence, we can write
        
      
        
      
      
      
      
    
      
        
      
      
      
      
    
Now, we can re-write Equation (12),
        
      
        
      
      
      
      
    
With some algebra,
        
      
        
      
      
      
      
    
Showing that
        
      
        
      
      
      
      
    
        we prove that
        
      
        
      
      
      
      
    
□
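The core of the argument can be sketched in one line. Assuming the two CCE proxies are $\Delta\bar{y}_t$ and $\bar{y}_{t-1}$ with coefficients $d_i$ and $c_i$ (this notation is an assumption for the sketch), the identity $\bar{y}_t = \bar{y}_{t-1} + \Delta\bar{y}_t$ gives:

```latex
% Illustrative identity; c_i and d_i are assumed names for the proxy coefficients.
\begin{align*}
c_i\,\bar{y}_{t-1} + d_i\,\Delta\bar{y}_t
  &= d_i\,\bar{y}_t + (c_i - d_i)\,\bar{y}_{t-1}.
\end{align*}
```

On this reading, the two-proxy term collapses to the single regressor $\bar{y}_t$ exactly when $c_i = d_i$, and approximately when $(c_i - d_i)\,\bar{y}_{t-1}$ is negligible.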
Using this variable, $\bar{y}_t$, may provide a more effective proxy for the factor than []’s CCE estimator in addressing cross-sectional dependency. In the CCE estimator, two variables proxy the factor; here, only one parameter needs to be estimated, which may increase efficiency.
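A minimal sketch of how the single-proxy idea could be compared with the two CCE proxies in terms of the residual CD statistic. The function names, the contemporaneous timing of $\bar{y}_t$ in the regression, and the Dickey-Fuller-type specification are assumptions for illustration rather than the authors' implementation.

```python
import numpy as np

def cd_stat(resid):
    """CD statistic from an (N, T) residual array."""
    n, t = resid.shape
    norms = np.sqrt((resid ** 2).sum(axis=1))
    rho = (resid @ resid.T) / np.outer(norms, norms)
    return np.sqrt(2.0 * t / (n * (n - 1))) * rho[np.triu_indices(n, k=1)].sum()

def proxy_residuals(y, proxy):
    """Residuals of dy_it on [1, y_{i,t-1}] plus proxies built from cross-sectional means.

    proxy = "none": no augmentation
    proxy = "cce" : lagged mean level and mean difference
    proxy = "ybar": contemporaneous mean level only (assumed timing)
    """
    ybar = y.mean(axis=0)
    extra = {
        "none": [],
        "cce": [ybar[:-1], np.diff(ybar)],
        "ybar": [ybar[1:]],
    }[proxy]
    out = []
    for series in y:
        dy = np.diff(series)
        X = np.column_stack([np.ones(dy.size), series[:-1], *extra])
        beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
        out.append(dy - X @ beta)
    return np.array(out)

def compare_proxies(y):
    """Residual CD statistic under each augmentation, for an (N, T) panel of levels."""
    return {name: cd_stat(proxy_residuals(y, name)) for name in ("none", "cce", "ybar")}
```

How the three values rank depends on $N$, $T$, and the factor loading, so a single draw should not be read as evidence for or against either proxy.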
Figure 13 shows that, for small  values, the newly proposed proxy closely mimics the original factor variable. Specifically, for  and , the new proxy closely replicates the original factor variable based on the CD test results shown in Figure 5. However, as the  dimension increases, both the minimum and maximum CD test values rise, and the band widens. By , the CD test results for the proxy estimator indicate that it no longer provides an effective remedy. To further examine the effect of increasing the  dimension, we performed a simulation for . The results, given in Figure 14, show that the effectiveness of the newly proposed estimator in remedying CSD diminishes as the  dimension increases. Thus, we conclude that this proxy is only effective for very small  dimensions.
      
    
Figure 13. CD test results with remedying the CSD by the $\bar{y}_t$ variable only for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
      
    
Figure 14. CD test results with remedying the CSD by the $\bar{y}_t$ variable only for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
To further examine the impact of increasing the  dimension, we conducted a simulation for . The results are presented in Figure 15. As seen in Figure 15, the CD test results for  and  show that the test values converge to 1000.0, while the new estimator’s CD test values converge to 600.0. This indicates that the new estimator remedies part of the CSD, but a significant portion of the induced CSD remains, leading to biased estimates. Additionally, the CD test values start to decrease around a factor loading of , opposite to the trend in Figure 15. Therefore, it may be useful to examine the behavior of the CD test for higher factor loadings.
      
    
Figure 15. CD test results with remedying the CSD by the $\bar{y}_t$ variable only for . Note: Point 10 on the x-axis corresponds to a factor loading of 0.0, and the y-axis shows the CD test results. The bounds −2.0 and 2.0 approximate the two-sided 5 percent critical values −1.96 and 1.96.
  
Figure 16 presents the simulation results for  and  with factor loadings ranging from  in 0.1 increments. As the factor loadings increase, the new proxy  adjusts within the range of . To better understand the behavior without any remedy applied, we simulated the same values, as shown in Figure 17 below.
      
    
Figure 16. CD test results with remedying the CSD by the $\bar{y}_t$ variable for  and  with factor loadings in [−1.0, 10.0].
  
      
    
Figure 17. CD test results without remedying the CSD for  and  with factor loadings in [−1.0, 10.0].
  
Comparing Figure 16 and Figure 17, we can observe that the new proxy remedies CSD across all factor loadings and converges to the results obtained with the original factor variable. To further examine the parameter estimates of both the original factor and the new proxy variable, we performed simulations for  and  within the factor loading range . The simulated values are shown in Figure 18 below (the simulations cover 50 units and 100 factor-loading points with 10 trials each, giving 50,000 simulated values).
      
    
Figure 18. The difference in parameter estimates of  and the original factor variable for  and  with .
  
Each 10,000-point interval corresponds to a factor loading of . After , the parameters become equal, marking the turning point in Figure 17, where the estimated parameter values for the original factor variable and the proxy converge, effectively remedying the CSD. The same pattern is observed in the CCE estimation. We subtracted the parameter value of  from the original factor’s parameter value within the same factor loading range. Figure 19 below summarizes the results of the Monte Carlo study.
      
    
Figure 19. The difference in parameter estimates of  and the original factor variable for  and  with .
  
These two simulation results shed light on an issue. From Figure 19, we can see that the difference between the parameters ranges from −0.04 to 0.06 until the factor loading reaches , after which the difference becomes nearly zero, and it finally reaches 0.0 at , as shown in Figure 18. The difference obtained using the  parameter in Figure 18 is quite large compared to the CCE estimator for the factor loading range , where the CCE estimator consistently performs better. However, after , the performance of the new proxy improves. Moreover, based on Figure 13, Figure 14 and Figure 15, we observe that the new proxy performs better in small samples with . These results suggest that there is a transformation that remedies CSD more effectively than the CCE estimator for small samples, particularly for factor loadings in the range . However, we also recognize that while  or  is necessary for remedying CSD, they are not sufficient without incorporating . Using  and  together (as in the CCE estimator) remedies CSD across all factor loadings but introduces a significantly negative CD test result, indicating that some CSD remains in the estimation process in panel unit root testing. Based on the simulation results from the original factor variable, we identified the behavior of an effective proxy with respect to the CD test. Any proxy that meets these criteria can be used as an efficient proxy for factor variables.
In Figure 20, we demonstrate that as the factor loadings increase in 0.1 increments, []’s CCE estimator, denoted as , and our proposed estimator, , become equal, meaning . Starting from a factor loading of 0.0 and increasing to 10.0, we find that at a factor loading of 1.0,  and  are approximately equal. At this point, the CD test shows very low dependence when using our proposed estimator.
      
    
Figure 20. The difference in parameter estimates of  and  variables for  and  with .
  
Based on the arguments and simulations presented, we propose a new estimator that may prove to be more efficient in small sample sizes. Our newly proposed method is outlined as follows:
Step 1. Run the following regression:
      
        
      
      
      
      
    
Step 2. Estimate  and  from Equation (20) and compute the new variable:
      
        
      
      
      
      
    
Step 3. Use this new variable as the dependent variable and compute the panel unit root test:
      
        
      
      
      
      
    
In the first step, we remove the CSD effect from the residual term  using []’s CCE method. The filtered data are closer to cross-sectionally independent, although some CSD remains, as the simulations above show. In the second step, we reintroduce some CSD into the dependent variable by adding the  variable to the filtered dependent variable. In the final step, we use  as a proxy, which is indirectly equal to . The procedure therefore combines []’s method with Proposition 1. Our simulation results are given in Figures 21, 22 and 23.
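The first step can be illustrated with a minimal sketch of a cross-sectionally augmented Dickey-Fuller regression in its textbook form, where each unit's first difference is regressed on an intercept, its own lagged level, the lagged cross-section average, and the differenced cross-section average. This is an assumed standard specification without lag augmentation; it is not claimed to match Equation (20) term by term, and the constructed variable of Steps 2 and 3 is not attempted. All function names and the simulated panel are hypothetical.

```python
import numpy as np

def cadf_unit(y_i, ybar):
    """t-ratio on y_{i,t-1} and residuals from one unit's augmented regression."""
    dy = np.diff(y_i)
    dybar = np.diff(ybar)
    # Regressors: intercept, own lagged level, lagged average, differenced average.
    X = np.column_stack([np.ones(dy.size), y_i[:-1], ybar[:-1], dybar])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (dy.size - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1]), resid

def cips(y):
    """Average unit-level t-ratio for a T x N panel, plus stacked residuals."""
    ybar = y.mean(axis=1)                     # cross-section average per period
    out = [cadf_unit(y[:, i], ybar) for i in range(y.shape[1])]
    t_bar = float(np.mean([t for t, _ in out]))
    resid = np.column_stack([r for _, r in out])
    return t_bar, resid

# Hypothetical unit root panel with a single common factor.
rng = np.random.default_rng(0)
T, N = 60, 10
f = rng.standard_normal(T)
gamma = rng.uniform(0.5, 1.5, N)              # assumed factor loadings
u = np.outer(f, gamma) + rng.standard_normal((T, N))
y = np.cumsum(u, axis=0)                      # every unit has a unit root
t_bar, resid = cips(y)
print(f"average t-ratio: {t_bar:.3f}, residual matrix shape: {resid.shape}")
```

The residual matrix returned here is the kind of input on which a cross-sectional dependence statistic can be evaluated after filtering.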

      
    
Figure 21. CD test results after remedying the CSD with the new method for . Note: In the figure, point 10 on the x-axis is the 0.0 factor loading, and the y-axis shows the CD test results. The 5 percent significance band is drawn at −2.0 and 2.0, approximating the critical values −1.96 and 1.96.
  

      
    
Figure 22. CD test results after remedying the CSD with the new method for . Note: In the figure, point 10 on the x-axis is the 0.0 factor loading, and the y-axis shows the CD test results. The 5 percent significance band is drawn at −2.0 and 2.0, approximating the critical values −1.96 and 1.96.
  
      
    
Figure 23. CD test results after remedying the CSD with the new method for . Note: In the figure, point 10 on the x-axis is the 0.0 factor loading, and the y-axis shows the CD test results. The 5 percent significance band is drawn at −2.0 and 2.0, approximating the critical values −1.96 and 1.96.
  
The newly proposed estimator behaves similarly to the CCE estimator for . All characteristics, such as the decrease in CD test values as the  dimension increases and the narrowing of the band, follow the same pattern. However, while the lower bound of the CD test aligns in both methods, the upper bound converges more slowly than in the CCE estimator. This slower convergence of the upper bound makes the new estimator perform better than the CCE estimator in terms of the CD test.
Next, we increase the  dimension again to assess its impact. The results of these simulations are shown in Figure 22. The new proxy method behaves very similarly to the original factor variable for , further supporting its effectiveness as a proxy. In particular, for small  dimensions ( to ), the method performs as well as the original factor. However, for low levels of CSD, such as , this method does not remedy CSD, which can be seen as a favorable characteristic. As with the CCE estimator, the CD test shows a negative sign, indicating that the method introduces some CSD. Additionally, as  increases, the upper and lower bounds of the computed values converge to those of the CCE estimator. As  grows, the CD test results follow the same trend as the CCE estimator. For instance, with  and 0, the lower bound of the CD test reaches 10.0, while the upper bound also moves closer to that of the CCE estimator. This convergence is helpful for deriving the asymptotic properties of the new estimator: it suggests that as  and  approach infinity, the new test shares the asymptotic properties of the CCE-based test. The following simulation results are obtained for .
From Figure 23, we can clearly observe that for  and , the newly proposed method closely approximates the original factor variable. These simulation results indicate that our proposed method performs similarly to the original factor as  increases while  is fixed. Additionally, we conducted simulations for two scenarios: one where both  and  approach infinity, and another where  is fixed and  is large.
Figure 24 shows that as  and  approach infinity, the new method converges toward the CCE method in remedying CSD. 
      
    
Figure 24. CD test results after remedying the CSD with the new method for  and .
  
However, when  is fixed and  approaches infinity, the newly proposed method serves as a better proxy for the original variable. The results for this case are presented in Figure 25.
      
    
Figure 25. CD test results after remedying the CSD with the new method for  and .
  
2.4. The Critical Values of the New Refined CCE Estimator and the Finite Sample Properties
Thus, the newly proposed estimator proves to be efficient, falling between the CCE estimator and the original factor in terms of performance. We calculated the critical values for the intercept-only case, which are presented in Table 1.
       
    
Table 1. Exact critical values of  statistics.
  
The distribution appears to be slightly flawed, but it converges to the distribution from [] when  and . The critical values for the 1%, 5%, and 10% significance levels in [] are −2.15, −2.07, and −2.02, respectively, while for our proposed method, they are −2.19, −2.09, and −2.03. This pattern is illustrated in Figure 21, Figure 22, Figure 23 and Figure 24. Based on these critical values, we compare the finite sample performances of the two estimators.
First, following the DGP outlined in [], we conduct an empirical size analysis comparing our newly proposed estimator with the estimator from []. The results are given in Table 2. We use 2000 replications to compute the empirical size of the tests at the 5% nominal level. Table 2 shows that both methodologies exhibit good size properties. Our methodology has slightly better size properties, especially for small  values.
       
    
Table 2. Size comparison under the strong cross-sectional dependency.
  
Next, Table 3 presents the empirical power of both methodologies under the strong cross-sectional dependency. The results in Table 3 indicate that when  and , the empirical power of our proposed test increasingly exceeds that of the CCE estimator. In short, the CCE estimator has higher empirical power than our test only for small  and .
       
    
Table 3. Power comparison under the strong cross-sectional dependency.
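The mechanics behind simulated critical values, empirical size, and empirical power can be sketched as follows. For brevity the sketch uses a plain single-series Dickey-Fuller t-ratio with an intercept as a stand-in statistic, not the panel statistics compared in Tables 1 to 3; the sample size, the alternative with an autoregressive coefficient of 0.9, and all function names are illustrative assumptions. Only the 2000 replications and the 5% nominal level are taken from the text.

```python
import numpy as np

def df_t_stat(y):
    """t-ratio on y_{t-1} in: diff(y)_t = a + b * y_{t-1} + e_t."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(dy.size), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (dy.size - 2)
    se_b = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se_b

def simulate_stats(phi, T, reps, rng):
    """Test statistics for AR(1) series y_t = phi * y_{t-1} + e_t."""
    stats = np.empty(reps)
    for r in range(reps):
        e = rng.standard_normal(T)
        y = np.empty(T)
        y[0] = e[0]
        for t in range(1, T):
            y[t] = phi * y[t - 1] + e[t]
        stats[r] = df_t_stat(y)
    return stats

def rejection_rate(stats, critical_value):
    """Share of replications that reject in the left tail."""
    return float(np.mean(stats < critical_value))

rng = np.random.default_rng(0)
T, reps = 100, 2000
# Critical values: lower quantiles of the statistic simulated under the null.
crit = np.quantile(simulate_stats(1.0, T, reps, rng), [0.01, 0.05, 0.10])
# Size: rejection frequency under the null, using fresh draws.
size = rejection_rate(simulate_stats(1.0, T, reps, rng), crit[1])
# Power: rejection frequency under a stationary alternative.
power = rejection_rate(simulate_stats(0.9, T, reps, rng), crit[1])
print("critical values (1%, 5%, 10%):", np.round(crit, 2))
print(f"size at 5%: {size:.3f}, power at 5%: {power:.3f}")
```

Replacing the stand-in statistic with a panel statistic and the AR(1) process with a factor-structure DGP yields the kind of size and power comparison reported in the tables.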
  
3. Empirical Study
In this empirical section, we apply our methodology to identify new proxy variables for the original factor used in the methodology. For the empirical analysis, we use unemployment rate data for four Nordic countries, sourced from the FRED (Federal Reserve Economic Data) database, covering the period from 2001Q1 to 2015Q2.
Table 4 shows that, for the unemployment rate, the CADF test incorrectly rejects the true null hypothesis.
       
    
Table 4. Four Nordic Countries’ Unemployment Rate.
  
Figure 26 shows that the unemployment data include a structural break, yet the CADF test still rejects the null hypothesis. It is well known that when structural breaks are present, methods that do not account for de-trending are less powerful. This suggests a size problem, as noted in the small-sample performance. The slightly better performance of the RCCE test indicates an advantage in this empirical analysis. As shown in [], the CCE estimator of [] produces incorrect results in homogeneous structural break-type DGPs, and this empirical study supports the findings of [].

      
    
Figure 26. Nonlinear trends for four Nordic countries.
  
4. Concluding Remarks
In this study, we examine the features of the CCE estimator in panel data models, particularly in the context of nonstationary panels. While cross-sectional dependence and its remedies have been analyzed in various studies through bias estimation, our approach utilizes the cross-sectional dependence test. This simple approach offers an advantage over other studies, as we also assess the reliability of the CD test. Based on our simulation results, we propose new proxy variables for CCE estimation. These new variables and estimators can be considered refinements of the CCE estimator in panel unit root models.
As stated in [], including additional variables in panel unit root testing to remedy CSD can increase the correlation between regressors and proxy factor variables, which is a major drawback of the CCE estimator. Based on this, we proposed a simpler estimator that uses only the average of the dependent variable without taking the first difference. The simulation study showed promising results for very small  values, but for larger , the newly proposed method performed worse than the CCE estimator. In response to []’s criticism, we developed a two-stage estimator. In the first stage, we use the CCE estimator to avoid increasing the correlation between the regressors and factor variables. In the second stage, we incorporate an auxiliary regressor generated from the first stage, following the suggestion of []. This newly proposed two-stage method works effectively for both small and large  and  values. The shortcomings of the CCE estimator, as mentioned by [], are further examined in the empirical section. We compared the unit root test results of the new estimator with those of []’s test, focusing on data with structural breaks. While []’s test indicated that the series under investigation was stationary, it failed to account for the structural breaks in the DGP. In contrast, our newly proposed test accurately identified the presence of a unit root problem. As a result, our methodology eliminated size bias.
All of these simulation studies and empirical results demonstrate that there is still room for developing new estimators to address the cross-sectional dependence problem in panel unit root tests. For future research, scholars can explore nonlinear functional forms or similar methodologies to see if they can further mitigate the CSD issue in panel unit root testing. Our new CD test simulation approach provides a simple framework that can facilitate the design of more complex simulation studies.
Author Contributions
Conceptualization, Y.A.; Methodology, T.O. and F.E.; Software, T.O. and Y.A.; Validation, T.O., Y.A., F.E. and M.E.; Formal analysis, T.O. and F.E.; Resources, M.E.; Data curation, M.E.; Writing—original draft, Y.A.; Writing—review and editing, F.E. and M.E.; Visualization, M.E.; Supervision, Y.A. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
Data available in a publicly accessible repository: FRED data set.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Quah, D. Exploiting cross section variation for unit root inference in dynamic data. Econ. Lett. 1994, 44, 9–19.
- Maddala, G.S.; Wu, S. A comparative study of unit root tests with panel data and a new simple test. Oxf. Bull. Econ. Stat. 1999, 61, 631–652.
- Hadri, K. Testing for stationarity in heterogeneous panel data. Econom. J. 2000, 3, 148–161.
- Choi, I. Unit root tests for panel data. J. Int. Money Bank. 2001, 20, 249–272.
- Im, K.S.; Pesaran, M.H.; Shin, Y. Testing for unit roots in heterogeneous panels. J. Econom. 2003, 115, 53–74.
- Smith, L.V.; Leybourne, S.; Kim, T.H.; Newbold, P. More powerful panel data unit root tests with an application to mean reversion in real exchange rates. J. Appl. Econom. 2004, 19, 147–170.
- Im, K.S.; Lee, J.; Tieslau, M. Panel LM unit-root tests with level shifts. Oxf. Bull. Econ. Stat. 2005, 67, 393–419.
- Pesaran, M.H. A simple panel unit root test in the presence of cross-section dependence. J. Appl. Econom. 2007, 22, 265–312.
- Pesaran, M.H.; Smith, L.V.; Yamagata, T. Panel unit root tests in the presence of a multifactor error structure. J. Econom. 2013, 175, 94–115.
- Banerjee, A. Panel data unit roots and cointegration: An overview. Oxf. Bull. Econ. Stat. 1999, 61, 607–629.
- Baltagi, B.H.; Kao, C. Nonstationary panels, panel cointegration and dynamic panels. In Advances in Econometrics; Baltagi, B., Ed.; JAI: New York, NY, USA, 2000; Volume 15.
- Breitung, J.; Pesaran, M.H. Unit Roots and Cointegration in Panels. In The Econometrics of Panel Data. Advanced Studies in Theoretical and Applied Econometrics; Mátyás, L., Sevestre, P., Eds.; Springer: Berlin/Heidelberg, Germany, 2008; Volume 46.
- Levin, A.; Lin, C.F.; Chu, C.S.J. Unit root tests in panel data: Asymptotic and finite-sample properties. J. Econom. 2002, 108, 1–24.
- Dickey, D.A.; Fuller, W.A. Distribution of the estimators for autoregressive time series with a unit root. J. Am. Stat. Assoc. 1979, 74, 427–431.
- Banerjee, A.; Marcellino, M.; Osbat, C. Some cautions on the use of panel methods for integrated series of macroeconomic data. Econom. J. 2004, 7, 322–340.
- Banerjee, A.; Marcellino, M.; Osbat, C. Testing for PPP: Should we use panel methods? Empir. Econ. 2005, 30, 77–91.
- Chang, Y. Bootstrap unit root tests in panels with cross-sectional dependency. J. Econom. 2004, 120, 263–293.
- Ucar, N.; Omay, T. Testing for unit root in nonlinear heterogeneous panels. Econ. Lett. 2009, 104, 5–8.
- Emirmahmutoglu, F.; Omay, T. Reexamining the PPP hypothesis: A nonlinear asymmetric heterogeneous panel unit root test. Econ. Model. 2014, 40, 184–190.
- Omay, T.; Çorakcı, A.; Emirmahmutoglu, F. Real interest rates: Nonlinearity and structural breaks. Empir. Econ. 2017, 52, 283–307.
- Çorakçı, A.; Emirmahmutoglu, F.; Omay, T. Re-examining the real interest rate parity hypothesis (RIPH) using panel unit root tests with asymmetry and cross-section dependence. Empirica 2017, 44, 91–120.
- Chang, Y. Nonlinear IV-unit root tests in panels with cross-sectional dependency. J. Econom. 2002, 110, 261–292.
- Phillips, P.C.B.; Sul, D. Dynamic panel estimation and homogeneity testing under cross section dependence. Econom. J. 2003, 6, 217–259.
- Bai, J.; Ng, S. A PANIC attack on unit roots and cointegration. Econometrica 2004, 72, 1127–1177.
- Moon, H.R.; Perron, B. Testing for a unit root in panels with dynamic factors. J. Econom. 2004, 122, 81–126.
- Bai, J.; Carrion-i-Silvestre, J.L. Structural changes, common stochastic trends, and unit roots in panel data. Rev. Econ. Stud. 2009, 76, 471–501.
- Pesaran, M.H. Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica 2006, 74, 967–1012.
- Pesaran, M.H.; Tosetti, E. Large panels with common factors and spatial correlation. J. Econom. 2011, 161, 182–202.
- Juodis, A.; Karabiyik, H.; Westerlund, J. On the robustness of the pooled CCE estimator. J. Econom. 2021, 220, 325–348.
- Pesaran, M.H. General Diagnostic Tests for Cross Section Dependence in Panels. Empir. Econ. 2021, 60, 13–50.
- Omay, T.; Hasanov, M.; Shin, Y. Testing for unit roots in dynamic panels with smooth breaks and cross-sectionally dependent errors. Comput. Econ. 2018, 52, 167–193.
- Smith, R.P.; Fuertes, A.M. Panel Time Series. 2012. Available online: https://www.researchgate.net/publication/277293522_Panel_Time-Series (accessed on 1 January 2022).
 
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).