Abstract
In this paper, the spatial dynamic panel data (SDPD) model is extended to the single-index spatial dynamic panel data (Si-SDPD) model by introducing a nonlinear connection function that captures the interaction between explanatory variables. The Si-SDPD model retains the advantages of the parametric SDPD model in handling spatial and temporal interaction effects and spatio-temporal dependencies, while relaxing the fixed functional form of the parametric SDPD model, which may lead to misspecification bias. Compared with fully non-parametric models, it reduces the dimension of the problem; compared with parametric models, it is more flexible and has greater explanatory power. Since the model contains an unknown function, we propose a profile maximum likelihood (PML) method to handle the incidental parameter problem in estimation. Treating the spatial coefficients as known, we first estimate the unknown function by local polynomial estimation, which transforms the model into a parametric form. We then estimate the resulting dynamic panel parametric model via quasi-maximum likelihood (QML) estimation. We derive the asymptotic properties of the profile maximum likelihood estimators (PMLEs) and find that, under certain regularity conditions, both the parametric and non-parametric estimators are consistent. Monte Carlo results show that the PMLEs have good finite-sample performance.
Keywords:
profile maximum likelihood; nonparametric estimation; spatial dynamic panel data; single-index panel model
MSC:
62F12; 62G20; 62H11
1. Introduction
Research into spatial panel data has long been an active topic in econometrics. Scholars have applied spatial panel data models to analyze social and economic issues such as the effective allocation of resources, the factors influencing economic growth, energy consumption, environmental pollution, and foreign direct investment. Compared with time series or cross-section data, panel data increase the degrees of freedom and reduce collinearity, thereby improving the accuracy of parameter estimation. Spatial panel data models can be divided into parametric and non-parametric models according to the assumptions made about the explanatory variables. The parametric panel data model describes the relationship between dependent and independent variables simply and clearly. Refs. [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15] and others have made outstanding contributions to parametric panel data modelling and its use in the inference of economic phenomena. However, in many practical problems the relationship between variables is complex, and it is often difficult to describe this relationship accurately using the preset form of a parametric model. To address this problem, non-parametric panel data models have been widely studied. Because the functional form of a non-parametric model can be arbitrary, the model is highly adaptable. When a non-parametric model has many explanatory variables, however, the so-called “curse of dimensionality” often occurs and reduces the reliability of the estimates. New models have therefore been proposed, and one effective approach is to build a single-index model.
The basic mathematical form of the single-index model is $y = g(x^{\top}\beta) + \varepsilon$ (Ref. [16]), where $\beta$ is the unknown parameter vector and $g(\cdot)$ is the unknown connection function. Its advantages are that it achieves dimension reduction through the connection function, effectively avoids the “curse of dimensionality”, and at the same time better reflects the relationship between variables. In recent years, research on various extended single-index models has focused on finding effective estimates of $\beta$ and $g(\cdot)$. For the single-index model of cross-section data, Ref. [17] proposed a semi-parametric maximum likelihood estimation method, while Ref. [18] proposed non-iterative methods for estimating $\beta$ and $g(\cdot)$ based on a parameter constraint assumption, which improves the computation speed. Ref. [19] proposed root-n-consistent estimators for the variance components and a local linear smoother for the connection function. Ref. [20] constructed a semi-parametric minimum average variance estimation method for a partially linear single-index panel data model with fixed effects and established the asymptotic properties of the estimators. Compared with the cross-sectional data model, there are very few studies on the single-index model for panel data, and even fewer on the single-index spatial panel data model because of its greater estimation complexity.
Following these research trends in spatial econometric models, we design a Si-SDPD model by introducing a nonlinear connection function that captures the interaction between explanatory variables. The Si-SDPD model retains the advantages of the parametric SDPD model in handling spatial and temporal interaction effects and spatio-temporal dependencies, while relaxing the fixed functional form of the parametric SDPD model, which may lead to misspecification bias. Compared with fully non-parametric models, it reduces the dimension of the problem; compared with parametric models, it is more flexible and has greater explanatory power. It is worth noting that, because of the spatial lag term and the multi-directionality of the spatial correlation, traditional estimation methods for the single-index model cannot be applied directly to the Si-SDPD model. To overcome this difficulty, we propose a new estimation method, PML, which first estimates the unknown function via local polynomial estimation under the assumption that the spatial coefficients are known, and then estimates the resulting dynamic panel parametric model by QML. We then derive the asymptotic properties of the PMLEs of the Si-SDPD model and find that, under certain regularity conditions, both the parametric and non-parametric estimators are consistent. Finally, we examine the finite-sample properties of the PMLEs through Monte Carlo experiments.
This paper is organized as follows. In Section 2, we introduce the Si-SDPD model and explain our PML estimation method. Using the law of large numbers and the central limit theorem developed for our setting in Appendix A, Section 3 establishes the consistency of the parametric and nonparametric parts of the PMLE. Section 4 presents a Monte Carlo simulation to verify that the estimators have good finite-sample performance. Section 5 concludes the paper. Some useful lemmas and the proofs of the theoretical results are provided in Appendices A and B.
2. The Model and Profile Maximum Likelihood Estimators
2.1. The Model
The model considered in this paper refers to the single-index spatial dynamic panel data (Si-SDPD) model:
where and are n × 1 column vectors; is independent and identically distributed across i and t with a zero mean and variance ; is the spatial weight matrix, which is predetermined and generates spatial dependence between cross-sectional units ; , is a univariate unknown function, ; and the first component of is positive, which is an identification condition.
Define , , and , where . At the true value, , , , where . Then, presuming is invertible and (1) can be rewritten as . The likelihood function of (1) is
where and . Thus, .
The QMLE is the extremum estimator obtained by maximizing (2). When the s are normally distributed, is the MLE; when the s are not normally distributed, is the QMLE.
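Spatial likelihoods of this kind typically contain a Jacobian term of the form ln|I − λW|, which has to be re-evaluated for every trial value of the spatial coefficient. The following is a minimal Python sketch, not the authors' R code, assuming that (2) contains such a term. It compares a direct evaluation with the well-known eigenvalue identity ln|I − λW| = Σ ln(1 − λω_i), where ω_i are the eigenvalues of W. The ring-shaped weight matrix and all function names are hypothetical.

```python
import numpy as np

def ring_weights(n):
    """Hypothetical row-normalized weights: each unit has two ring neighbours."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx + 1) % n] = 0.5
    W[idx, (idx - 1) % n] = 0.5
    return W

def logdet_direct(W, lam):
    """ln|I - lam*W| computed from the matrix itself."""
    sign, logabsdet = np.linalg.slogdet(np.eye(W.shape[0]) - lam * W)
    if sign <= 0:
        raise ValueError("I - lam*W is singular or has a non-positive determinant")
    return float(logabsdet)

def logdet_from_eigenvalues(omega, lam):
    """Same quantity from precomputed eigenvalues of W: sum_i ln(1 - lam*omega_i)."""
    return float(np.sum(np.log(1.0 - lam * omega)).real)

W = ring_weights(50)
omega = np.linalg.eigvals(W)  # computed once, reused for every trial lam
for lam in (-0.5, 0.3, 0.8):
    print(lam, logdet_direct(W, lam), logdet_from_eigenvalues(omega, lam))
```

Precomputing the eigenvalues is a common design choice when the likelihood is maximized iteratively, since each new trial value then costs only a vector operation.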
2.2. The Profile Maximum Likelihood Estimation
For the likelihood function shown in (2), the parameter estimation method is not feasible because is unknown. In order to obtain a feasible estimate, we propose adopting the PML method. First, we consider the parameter as known, and then (1) becomes a general spatial nonparametric model. The initial estimate of can be obtained using the kernel estimation. Obviously, is a function of the parameter . By replacing in (2) with , we obtain the likelihood function with parameter . Then, by maximizing the likelihood function, we obtain the estimator of . Finally, the final estimate of , , is obtained by replacing in with .
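The profiling idea described above can be illustrated on a deliberately simplified problem. The sketch below is written in Python rather than the R used for the paper's experiments, and it is not the authors' implementation. It assumes a static model y = λWy + g(xβ) + v with β treated as known, so that only the spatial coefficient λ is profiled. For each trial λ the unknown function is smoothed out with a local linear fit, the variance is concentrated out, and the concentrated log-likelihood is maximized over a grid. The ring weight matrix, the function g(u) = sin(2u), the bandwidth, and all names are assumptions made for illustration.

```python
import numpy as np

def ring_weights(n):
    """Hypothetical row-normalized weights: each unit has two ring neighbours."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx + 1) % n] = 0.5
    W[idx, (idx - 1) % n] = 0.5
    return W

def local_linear_smoother(u, h):
    """Matrix L such that (L @ z)[i] is the local linear fit of z at u[i]."""
    d = u[None, :] - u[:, None]                  # d[i, j] = u_j - u_i
    k = np.exp(-0.5 * (d / h) ** 2)              # Gaussian kernel weights
    s0, s1, s2 = k.sum(1), (k * d).sum(1), (k * d ** 2).sum(1)
    return k * (s2[:, None] - s1[:, None] * d) / (s0 * s2 - s1 ** 2)[:, None]

def concentrated_loglik(lam, y, W, L):
    """Concentrated Gaussian log-likelihood in lam (additive constants dropped)."""
    n = y.size
    y_tilde = y - lam * (W @ y)                  # remove the spatial lag for this lam
    resid = y_tilde - L @ y_tilde                # profile out g by smoothing on the index
    sigma2 = resid @ resid / n                   # variance estimate given lam
    _, logdet = np.linalg.slogdet(np.eye(n) - lam * W)
    return -0.5 * n * np.log(sigma2) + logdet, sigma2

def profile_estimate(y, W, u, h, grid):
    """Grid-search maximizer of the concentrated log-likelihood, then plug back in."""
    L = local_linear_smoother(u, h)
    values = [concentrated_loglik(lam, y, W, L)[0] for lam in grid]
    lam_hat = float(grid[int(np.argmax(values))])
    sigma2_hat = concentrated_loglik(lam_hat, y, W, L)[1]
    g_hat = L @ (y - lam_hat * (W @ y))          # final estimate of g at the sample points
    return lam_hat, sigma2_hat, g_hat

def simulate(n, lam0, beta, sigma, seed):
    """Toy data: y = (I - lam0*W)^{-1} (sin(2*x@beta) + v)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n, beta.size))
    u = x @ beta
    W = ring_weights(n)
    v = sigma * rng.standard_normal(n)
    y = np.linalg.solve(np.eye(n) - lam0 * W, np.sin(2.0 * u) + v)
    return y, W, u

y, W, u = simulate(n=300, lam0=0.4, beta=np.array([0.6, 0.8]), sigma=0.2, seed=0)
lam_hat, sigma2_hat, g_hat = profile_estimate(y, W, u, h=0.15, grid=np.linspace(-0.9, 0.9, 91))
print(lam_hat, sigma2_hat)
```

A grid search is used only to keep the sketch short; an iterative optimizer would be the natural choice once several parameters are profiled jointly.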
The specific steps of the profile maximum likelihood method are as follows.
- Step 1: Considering as known, we obtain that is an initial estimate of using local polynomial estimation.
We denote and , and (1) can be written as . At , the p-order Taylor expansion of is:
Therefore, we can use the samples near to perform weighted regression to estimate and its higher-order derivatives, i.e., solving the following minimization problem:
where , is the multivariate kernel function and h is the bandwidth. To simplify the theoretical derivation, all variables in this paper share the same bandwidth; the conclusions remain valid when different bandwidths are used.
For the convenience of the following matrix operations, we denote . Then, we denote and , so is a matrix. We can rewrite the objective function (4) as:
After minimizing, the estimator of is:
The first component of is an estimate of . We denote , , and , and the initial estimate of is
where . Then, the initial estimate of is
where .
Step 2: By substituting for in (1), we obtain the approximate value of the logarithmic likelihood function as:
The that can maximize the above formula is the estimate of , i.e.,
Computationally and analytically, it is convenient to work with the concentrated log-likelihood by concentrating out the . Based on the log-likelihood function, the initial estimate of is
The concentrated log-likelihood function of is
The estimate of can be defined as:
Equation (10) displays a nonlinear optimization problem, which can be solved using an iterative method in the actual estimation. After is obtained, the final estimate of can be obtained by replacing in Equation (8) with :
Step 3: By using obtained in Step 2 to replace the parameters in the model, we describe the final estimate of the non-parametric part as:
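The local polynomial fit used in Step 1, and reused in Step 3 with the final parameter estimates, amounts to a kernel-weighted least-squares regression on powers of the distance to the evaluation point. The following Python sketch shows a univariate version under stated assumptions: a Gaussian kernel, a scalar index, and a single bandwidth h. The function name `local_polynomial` and the test function are hypothetical, and the code is not the authors' implementation.

```python
import numpy as np
from math import factorial

def local_polynomial(u, z, u0, h, p=1):
    """Estimate g(u0), g'(u0), ..., g^(p)(u0) by a kernel-weighted
    least-squares fit of a degree-p polynomial in (u - u0)."""
    d = u - u0
    w = np.exp(-0.5 * (d / h) ** 2)               # Gaussian kernel weights
    X = np.vander(d, N=p + 1, increasing=True)    # columns 1, d, d^2, ..., d^p
    sw = np.sqrt(w)
    # Weighted least squares via ordinary least squares on sqrt-weighted data.
    coef, *_ = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)
    # coef[j] estimates g^(j)(u0) / j!, so rescale to obtain the derivatives.
    return np.array([coef[j] * factorial(j) for j in range(p + 1)])

rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 400)
z = np.sin(2.0 * u) + 0.1 * rng.standard_normal(400)
est = local_polynomial(u, z, u0=0.2, h=0.2, p=1)
print(est[0], np.sin(0.4))        # fitted value against the true function value
print(est[1], 2.0 * np.cos(0.4))  # fitted slope against the true derivative
```

The first component of the returned vector plays the role of the function estimate, and the remaining components estimate the derivatives that appear in the Taylor expansion.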
3. Profile Likelihood Estimators and Their Asymptotic Properties
To analyze the asymptotic properties of the estimators, we need the following assumptions:
Assumption 1.
is a constant spatial weight matrix and its diagonal elements satisfy for . In addition, is uniformly bounded in the row and column sums in an absolute value (for short, UB).
Assumption 2.
The disturbances and , are across and with a zero mean, a variance of , and for some .
Assumption 3.
is invertible for all , where is compact and is in the interior of . Furthermore, is UB.
Assumption 4.
is an independent and identically distributed random sequence and . has marginal density functions , where is continuously differentiable near and is the support set of .
Assumption 5.
has continuous derivatives and , where is a positive constant.
Assumption 6.
When , and .
Assumption 7.
The kernel function is a bounded continuous non-negative function whose support set is bounded and closed: , where is a constant, i.e., only if . In addition, , , and are UB.
Assumption 8.
is an even function and . For any positive odd number , ; also, .
Assumption 9.
is the inner point of , where is a convex compact set. and the first component of vector quantity is positive, where is the Euclidean norm.
Assumption 1 is a standard normalization assumption in spatial econometrics. Here, the matrix is UB, meaning that there is a non-negative constant such that and . The constant is defined differently in the following sections depending on the matrices. Assumption 2 provides regularity conditions for . The invertibility and compactness conditions in Assumption 3 follow Kelejian and Prucha (1998, 2001) and have been used in many articles on spatial correlation. When exogenous variables are included in the model, it is convenient to assume that the exogenous regressors are uniformly bounded, as in Assumption 4. Assumption 5 is a necessary condition for the Taylor expansion (3). Assumptions 6–8 state the conditions required for kernel estimation. The bandwidth of the kernel function, , is an important parameter that affects the kernel estimation result. Kernel functions that satisfy Assumptions 7–8 exist, such as the product kernel, , where is a symmetric kernel of one variable on the closed interval . Assumption 9 makes model (1) identifiable.
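Some of these conditions are easy to check numerically for a concrete design. The Python sketch below is illustrative only. It assumes a row-normalized ring weight matrix, a scalar spatial coefficient λ with |λ| < 1, and the usual single-index normalization (unit Euclidean norm with a positive first component). It checks the zero diagonal and the UB property of W and of (I − λW)^{-1}, and shows why the index parameter needs a normalization: rescaling β and the connection function together leaves the model unchanged. All names are hypothetical.

```python
import numpy as np

def ring_weights(n):
    """Hypothetical row-normalized weights: each unit has two ring neighbours."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx + 1) % n] = 0.5
    W[idx, (idx - 1) % n] = 0.5
    return W

def is_uniformly_bounded(A, c):
    """True if every absolute row sum and column sum of A is at most c."""
    A = np.abs(A)
    return bool(A.sum(axis=1).max() <= c and A.sum(axis=0).max() <= c)

def normalize_index(beta):
    """Scale beta to unit Euclidean norm and make its first component positive."""
    beta = np.asarray(beta, dtype=float)
    if beta[0] == 0.0:
        raise ValueError("first component must be non-zero")
    beta = beta / np.linalg.norm(beta)
    return beta if beta[0] > 0 else -beta

n, lam = 20, 0.4
W = ring_weights(n)
S_inv = np.linalg.inv(np.eye(n) - lam * W)
print(np.all(np.diag(W) == 0.0))                             # zero diagonal
print(is_uniformly_bounded(W, 1.0))                          # W is UB with c = 1
print(is_uniformly_bounded(S_inv, 1.0 / (1.0 - lam) + 1e-9)) # inverse is UB as well

# Without a normalization, (beta, g) and (c*beta, g(./c)) give the same model.
x = np.random.default_rng(0).uniform(-1.0, 1.0, size=(5, 2))
beta, c = np.array([0.6, 0.8]), 3.0
print(np.allclose(np.sin(x @ beta), np.sin((x @ (c * beta)) / c)))
print(normalize_index(c * beta))                             # recovers [0.6, 0.8]
```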
For the concentrated log-likelihood Function (4) divided by the sample size , the corresponding expected value function is , which is
To show the consistency of , we need the following uniform convergence results.
Lemma 1.
Under Assumptions 1–9, for an nonstochastic UB matrix ,
where , , and .
Lemma 2.
Let Θ be any compact parameter space. Then, under Assumptions 1–9, is uniform in .
Lemma 3.
Let Θ be any compact parameter space. Then, under Assumptions 1–8, is uniformly equicontinuous for .
Before obtaining the information matrix, we need to compute the first and second derivatives of the logarithmic likelihood function. The asymptotic distribution of the QMLE can be derived from the Taylor expansion of around . We define and at the true value and . The first-order derivative of the concentrated likelihood function involves both linear and quadratic functions of , which are as follows:
where and . Then, the second-order derivatives are:
The information matrix is as follows:
where and .
Assumption 10.
.
Assumption 10 is an important condition for the non-singularity of the limiting information matrix in addition to the global identification in Lemma 4 and Theorem 1.
Lemma 4.
The information matrix is non-singular.
Theorem 1.
Under Assumptions 1–10, is globally identifiable and is a consistent estimator of (similar to Yu (2008)).
Theorem 2.
Under Assumptions 1–10, is globally identifiable and if for (similar to Yu (2008)).
Lemma 5.
Under Assumptions 1–10, .
Lemma 6.
Under Assumptions 1–10, .
Lemma 7.
Under Assumptions 1–9, .
Theorem 3.
Under Assumptions 1–9, .
4. Monte Carlo Results
In this section, Monte Carlo experiments are carried out on the estimation method for the Si-SDPD model constructed above, and the simulation results are used to assess the finite-sample performance of the PML method and the practical value of the PMLEs. All experiments are implemented in R, and the plots are produced with the 'ggplot2' package.
For the parametric part, we generate samples from (1) and use and , where . The component of the two-dimensional random variable and the random error term are generated from the uniform distribution and the independent normal distribution , respectively. The spatial weight matrix that we use is the matrix, which is one of the main types of spatial weight matrix in spatial econometrics. For the non-parametric part, we use the commonly used Gaussian kernel function, , where . As it is difficult to select the optimal window width, we simply select the window width using the rule-of-thumb method. Finally, we use as the sample size and as the number of periods. For each set of and , the sampling observations are generated with the Metropolis–Hastings sampling algorithm.
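A data-generating step of this kind can be sketched as follows. The code is in Python, not the R used for the paper, and it rests on loudly stated assumptions: the model is taken to have the form Y_t = λWY_t + γY_{t−1} + ρWY_{t−1} + g(X_tβ) + V_t, which is the standard SDPD structure with the linear regressor term replaced by a single-index term and may differ from model (1). The ring weight matrix, the regressor and error distributions, the burn-in length, and all names are likewise hypothetical.

```python
import numpy as np

def ring_weights(n):
    """Hypothetical row-normalized weights: each unit has two ring neighbours."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx + 1) % n] = 0.5
    W[idx, (idx - 1) % n] = 0.5
    return W

def simulate_si_sdpd(n, T, lam, gamma, rho, beta, g, sigma, seed, burn=50):
    """Simulate the assumed model and return W plus Y, X, V for periods 0..T."""
    if abs(lam) + abs(gamma) + abs(rho) >= 1.0:
        raise ValueError("sufficient stability condition |lam|+|gamma|+|rho| < 1 violated")
    rng = np.random.default_rng(seed)
    W = ring_weights(n)
    S_inv = np.linalg.inv(np.eye(n) - lam * W)   # reduced form multiplier
    A = gamma * np.eye(n) + rho * W              # time lag and space-time lag
    total = T + burn + 1
    X = rng.uniform(-1.0, 1.0, size=(total, n, len(beta)))
    V = sigma * rng.standard_normal((total, n))
    Y = np.zeros((total, n))
    for t in range(1, total):
        Y[t] = S_inv @ (A @ Y[t - 1] + g(X[t] @ beta) + V[t])
    # Drop the burn-in so that the retained series starts near its stationary regime.
    return W, Y[burn:], X[burn:], V[burn:]

W, Y, X, V = simulate_si_sdpd(n=25, T=10, lam=0.3, gamma=0.2, rho=0.1,
                              beta=np.array([0.6, 0.8]), g=np.tanh,
                              sigma=1.0, seed=0)
print(Y.shape, X.shape)   # (T + 1, n) and (T + 1, n, 2)
```

Row 0 of the returned arrays serves as the initial period, so that T periods with a lagged value are available for estimation.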
The evaluation of simulation results should also be divided into parametric and non-parametric parts. In the parametric part, for each estimator, we calculate the standard deviation (Std) and the root-mean-squared error (RMSE), where is the number of simulations and are the parameter estimates obtained from each simulation. In order to accurately estimate the parameter values, according to Su (2012), we take the window width here, where represents the standard deviation of sequence . In the non-parametric part, we refer to Chen (2012) when choosing the mean absolute deviation error (MADE) as the evaluation standard, which is where is the fixed grid points selected within the support set of . We select 20 fixed lattice points in , namely . When estimating the non-parametric part, we use the method of leave-one-out cross-validation to select the window width, i.e., the window width minimizes , where is the ith element of the after the estimated value and is the estimate obtained with the observation value other than the ith observation.
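The evaluation criteria named above can be written down in a few lines. The Python sketch below uses the standard definitions, which are assumed here because the exact formulas are not reproduced in the text: Std as the root mean squared deviation from the Monte Carlo mean, RMSE as the root mean squared deviation from the true value, and MADE as the mean absolute error over the grid points. It also shows a brute-force leave-one-out cross-validation for the bandwidth of a local linear smoother. All function names are hypothetical.

```python
import numpy as np

def std_and_rmse(estimates, truth):
    """Std around the Monte Carlo mean and RMSE around the true value."""
    est = np.asarray(estimates, dtype=float)
    std = np.sqrt(np.mean((est - est.mean()) ** 2))
    rmse = np.sqrt(np.mean((est - truth) ** 2))
    return float(std), float(rmse)

def made(g_hat, g_true):
    """Mean absolute deviation error over a set of grid points."""
    return float(np.mean(np.abs(np.asarray(g_hat) - np.asarray(g_true))))

def local_linear_fit(u, z, u0, h):
    """Local linear estimate of the regression function at the point u0."""
    d = u - u0
    k = np.exp(-0.5 * (d / h) ** 2)
    s0, s1, s2 = k.sum(), (k * d).sum(), (k * d ** 2).sum()
    return float(np.sum(k * (s2 - s1 * d) * z) / (s0 * s2 - s1 ** 2))

def loo_cv_score(u, z, h):
    """Average squared error when each point is predicted without itself."""
    n = u.size
    keep = np.ones(n, dtype=bool)
    total = 0.0
    for i in range(n):
        keep[i] = False
        total += (z[i] - local_linear_fit(u[keep], z[keep], u[i], h)) ** 2
        keep[i] = True
    return total / n

def select_bandwidth(u, z, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out score."""
    scores = [loo_cv_score(u, z, h) for h in candidates]
    return float(candidates[int(np.argmin(scores))])

rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 150)
z = np.sin(2.0 * u) + 0.2 * rng.standard_normal(150)
print(select_bandwidth(u, z, np.array([0.05, 0.1, 0.2, 0.4, 0.8])))
print(std_and_rmse([0.38, 0.41, 0.43], truth=0.4))
```

With this definition of Std, the squared RMSE equals the squared Std plus the squared bias, which makes the two columns of a results table easy to cross-check.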
For different cases of and , 100 simulations are carried out in R. In each simulation, the Metropolis–Hastings sampling algorithm is used to draw 1000 samples in the PML function. To allow the chain to reach its stationary state, the first 200 draws are discarded as burn-in. With two different values of for each and , the finite-sample properties of both estimators are summarized in Table 1 and Table 2, in which we report the means, variances (Vars), root-mean-squared errors (RMSEs), and coverage probabilities (CPs).
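The sampling scheme described here, 1000 draws with the first 200 discarded, can be sketched with a generic random-walk Metropolis–Hastings sampler. The Python code below is not the authors' implementation and does not reproduce how the sampler enters the PML function. It targets a toy one-dimensional log-density chosen purely for illustration, and the step size and all names are assumptions.

```python
import numpy as np

def metropolis_hastings(log_density, start, n_draws, burn_in, step, seed):
    """Random-walk Metropolis-Hastings; returns the draws kept after burn-in
    and the acceptance rate over all n_draws iterations."""
    rng = np.random.default_rng(seed)
    current = float(start)
    current_lp = log_density(current)
    draws = np.empty(n_draws)
    accepted = 0
    for i in range(n_draws):
        proposal = current + step * rng.standard_normal()   # symmetric proposal
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, density ratio), compared on the log scale.
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        draws[i] = current
    return draws[burn_in:], accepted / n_draws

def toy_log_density(x):
    """Log-density of N(1, 0.5^2) up to an additive constant (illustrative target)."""
    return -0.5 * ((x - 1.0) / 0.5) ** 2

kept, rate = metropolis_hastings(toy_log_density, start=0.0, n_draws=1000,
                                 burn_in=200, step=1.0, seed=0)
print(kept.size, rate, kept.mean())
```

Discarding the early draws reduces the influence of the arbitrary starting value on the retained sample.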
Table 1.
The performance of spatial coefficients estimators with .
Table 2.
The performance of spatial coefficients estimators with .
For each case, the estimated value of the parameter, i.e., the mean, is relatively close to the real value, and we can see that for each given , when is larger, the variance of estimators will be smaller; for each given , when is larger, the biases between the real value and the estimators will be nearly the same, but the variance will be smaller. When both and are maximized, i.e., , the variance and RMSEs of the parameter estimators are the smallest in all cases, which indicates that the parameter estimators will converge with the increase in the value, which is consistent with the large sample property, as demonstrated. For different values of , the variances are almost all less than 0.01, which indicates that the fitting error is small and the fitting results are good. In addition, the variance and root mean square error have little changes and will not change with an increase in the simulation time and the sample size, which further indicates that the variance of parameters is stable. Due to the small estimation variance, the mean hardly fluctuates around the true value, and the range of the confidence interval is also relatively stable. In only a few cases, the confidence intervals do not cover the true value, and with an increase in and , the coverage degree becomes higher and higher, i.e., the CP gradually approaches 1.
Table 3 shows the average absolute error and variance of unknown function estimators under different samples. As shown in Table 3, with an increase in the sample size and total number of periods under the same parameter setting, both the value and estimation error of the unknown function at 20 fixed lattice points decrease, indicating that the estimation of the non-parametric part is convergent. This is consistent with the theoretical results of Theorem 3.
Table 3.
The performance of unknown function estimators .
5. Conclusions
In this paper, we propose a Si-SDPD model that overcomes the “curse of dimensionality” and handles spatial and temporal correlation. For this model, we construct a PML method for situations in which the traditional maximum likelihood estimation method is not applicable. The theoretical results show that, under certain regularity conditions, both the parametric and nonparametric estimators are consistent. Numerical simulation results show that the PMLEs have good finite-sample performance and that estimation accuracy improves as the sample size and the number of time periods increase. Our results enrich the estimation methods available for single-index panel data models in spatial econometrics and provide a new research tool and perspective for applied research in related disciplines.
Proving the asymptotic normality of the PML estimators is a complicated problem because the Si-SDPD model contains a large number of parameters and unknown quantities, and the representation of the covariance matrix becomes very involved. Cross-sectional heteroskedasticity (space-varying error variances) in the Si-SDPD model is another interesting extension to consider. Most estimators are generally inconsistent in the presence of an unknown form of heteroskedasticity in the disturbance term of the SDPD model, and even more so in the Si-SDPD model, which has more unknown components. These problems are considerably more challenging than those addressed in this paper and will be the topics of our future research.
Author Contributions
Conceptualization, M.Z. and B.T.; methodology, M.Z. and B.T.; software, M.Z.; validation, M.Z. and B.T.; formal analysis, M.Z. and B.T.; investigation, M.Z. and B.T.; resources, M.Z.; data curation, M.Z.; writing—original draft preparation, M.Z.; writing—review and editing, M.Z. and B.T.; visualization, M.Z. and B.T.; supervision, B.T.; project administration, B.T.; funding acquisition, B.T. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China (grant number 91646106).
Data Availability Statement
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the requirements of related projects supported by the National Natural Science Foundation of China.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Some Basic Lemmas
Proof of Lemma 2.
Based on , we have . Hence,
where, using and ,
Using Lemma 1,
As and are bounded in Θ, we have that is uniform in θ and Θ. Using the fact that is bounded away from zero in Θ and ,
uniformly in . □
Proof of Lemma 3.
(similar to Ref. [3]). Given and , we have
And , where
According to Lemma 1, the third term is , and is uniform in and Θ because it is a polynomial function in and Θ is a bounded set. The second term is equal to , where , which are polynomial functions of and are uniform in Θ. Using in the first term, we have
where
To prove that is uniformly equicontinuous for , the following four conditions must be true. (1) is uniformly continuous; is bounded away from zero in , so (1) is obvious. (2) is uniformly equicontinuous; we know , where falls between and . Because is UB, and is uniform in , is bounded. Hence, (2) is true. (3) The first term, i.e., , is uniformly equicontinuous; both and are bounded, with , meaning (3) is true. (4) is uniformly equicontinuous given that
and , we have
And because and are UB, (4) is true. □
Proof of Lemma 4.
Following Lee (2004), the result can be proved by contradiction. First, we assume where and are scalars. Next, for , we need to prove that implies . If this is true, then the columns of are linearly independent and is nonsingular.
From (21),
where , , , and . Hence, implies
The first and third equations imply, respectively, and . By eliminating and , the second equation becomes .
Under Assumption 10, we assume that and that is nonsingular, meaning that the above formula is only true if , i.e., . The information matrix is nonsingular. □
Proof of Lemma 5.
As known from the previous definition,
From Assumption 4, is an independent, identically distributed random sequence, so is an independent, identically distributed sequence.
where is a variable substitution, and falls between and .
Similarly, = and .
Moreover, = .
According to the Khintchine law of large numbers, E and
Based on Assumption 6, when , , meaning that
□
Proof of Lemma 6.
As , we describe . Based on Assumptions 2 and 4, since and are independent, and are independent, and, . Furthermore, and denote an independent, identically distributed sequence, so
Based on Assumption 5, and , so . Then, from Lemma 5, . □
Proof of Lemma 7.
As (6),
Based on Assumptions 2 and 4,
where falls between and . And then,
Similarly,
In that way, and . This suggests that
And then, according to Assumption 4,
□
Appendix B. Proof of Theoretical Results
Proof of Theorem 1.
As , at , (10) implies . Then, we have
We consider the process , for a period t, the log-likelihood function of which is
By letting be the expectation operator for , we have
Based on the information inequality, . Thus, for any . Also, is a quadratic function of and . Under the condition that is nonsingular, whenever , so is globally identified, given that is a unique maximizer of . Hence, is globally identified. Combined with the uniform convergence and equicontinuity in Lemmas 2 and 3, the consistency follows. □
Proof of Theorem 2.
From the proof of Theorem 1, . When is singular, and cannot be identified from . Global identification requires that the limit of is strictly less than zero. As based on information inequality, is equivalent to
(see Lee (2004)). After and are identified, given , can be identified from . Combined with the uniform convergence and equicontinuity in Lemmas 2 and 3, the consistency follows. □
Proof of Theorem 3.
As (6), we have = = = = = , so that, .
From Theorems 1 and 2, . And from Lemmas 6 and 7, . Otherwise determined by the boundedness in the assumptions, . □
References
- Ai, C.; Chen, X. Efficient estimation of models with conditional moment restrictions containing unknown functions. Econometrica 2003, 71, 1795–1843.
- Baltagi, B.H.; Song, S.H.; Koh, W. Testing panel data regression models with spatial error correlation. J. Econom. 2003, 117, 123–150.
- Chen, J.; Gao, J.T.; Li, D.G. Estimation in partially linear single-index panel data models with fixed effects. J. Bus. Econ. Stat. 2013, 31, 315–330.
- Elhorst, J.P. Specification and estimation of spatial panel data models. Int. Reg. Sci. Rev. 2003, 26, 244–268.
- Wickham, H. ggplot2: Elegant Graphics for Data Analysis; Springer: New York, NY, USA, 2016.
- Jin, B.; Wu, Y.; Rao, C.R.; Hou, L. Estimation and model selection in general spatial dynamic panel data models. Proc. Natl. Acad. Sci. USA 2020, 117, 5235–5241.
- Lee, L.F.; Yu, J. Estimation of spatial panel model with fixed effects. J. Econom. 2010, 154, 165–185.
- Lee, L.F.; Yu, J. A spatial dynamic panel data model with both time and individual fixed effects. Econom. Theory 2010, 26, 564–597.
- Lee, L.F.; Yu, J. Some recent developments in spatial panel data models. Reg. Sci. Urban Econ. 2010, 40, 255–271.
- Pang, Z.; Xue, L.G. Estimation for the single-index models with random effects. Comput. Stat. Data Anal. 2012, 56, 1837–1853.
- Parent, O.; LeSage, J.P. A space-time filter for panel data models containing random effects. Comput. Stat. Data Anal. 2011, 55, 475–490.
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2022.
- Su, L.; Yang, Z. QML Estimation of Dynamic Panel Data Models with Spatial Errors; Singapore Management University: Singapore, 2007.
- Su, L.; Jin, S.N. Profile quasi-maximum likelihood estimation of partially linear spatial autoregressive models. J. Econom. 2010, 157, 18–33.
- Wang, J.L.; Xue, L.G.; Zhu, L.X.; Chong, Y.S. Estimation for a partial-linear single-index model. Ann. Stat. 2010, 38, 246–274.
- Yu, J.; de Jong, R.; Lee, L.F. Quasi-maximum likelihood estimators for spatial dynamic panel data with fixed effects when both n and T are large. J. Econom. 2008, 146, 118–134.
- Yu, J.; Lee, J.F. Efficient GMM estimation of spatial dynamic panel data models. J. Econom. 2010, 180, 174–197.
- Yu, J.; Lee, J.F. Estimation of unit root spatial dynamic panel data models. Econom. Theory 2010, 26, 1332–1362.
- Yu, J.; de Jong, R.; Lee, L.F. Estimation for spatial dynamic panel data with fixed effects: The case of spatial cointegration. J. Econom. 2012, 167, 16–37.
- Zhang, Y.Q.; Shen, D.M. Estimation of semiparametric varying-coefficient spatial panel data models with random effects. J. Stat. Plan. Inference 2015, 159, 64–80.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).