Modeling China’s Prefecture-Level Economy Using VIIRS Imagery and Spatial Methods

: Nighttime light (NTL) data derived from the Visible Infrared Imaging Radiometer Suite (VIIRS), carried by the Suomi National Polar Orbiting Partnership (NPP) satellite, has been widely used to evaluate gross domestic product (GDP). Nevertheless, due to the monthly VIIRS data ﬂuctuation and missing data (excluded by producers) over high-latitude regions, the suitability of VIIRS data for longitudinal city-level economic estimation needs to be examined. While GDP distribution in China is always accompanied by regional disparity, previous studies have hardly considered the spatial autocorrelation of the GDP distribution when using NTL imagery. Thus, this paper aims to enhance the precision of the longitudinal GDP estimation using spatial methods. The NTL images are used with road networks and permanent resident population data to estimate the 2013, 2015, and 2017 3-year prefecture-level (342 regions) GDP in mainland China, based on eigenvector spatial ﬁltering (ESF) regression (mean R 2 = 0.98). The ordinary least squares (OLS) (mean R 2 = 0.86) and spatial error model (SEM) (mean pseudo R 2 = 0.89) were chosen as reference models. The ESF regression exhibits better performance for root-mean-square error (RMSE), mean absolute relative error (MARE), and Akaike information criterion (AIC) than the reference models and e ﬀ ectively eliminated the spatial autocorrelation in the residuals in all 3 years. The results indicate that the spatial economic disparity, as well as population distribution across China’s prefectures, is decreasing. The ESF regression also demonstrates that the population is crucial to the local economy and that the contribution of urbanization is growing.


Introduction
Gross domestic product (GDP) refers to the final outcome of the production activities of all permanent units in a certain period of time, calculated according to the market price. GDP is a crucial comprehensive statistical indicator in the accounting system and is also a core indicator in the national economic accounting system. It reflects the economic strength and market size of a city (or region). As a vital indicator, the GDP serves as an important reference function for political decision-making and national development [1]. The GDP is typically estimated by the government after a series of statistical procedures, but problems of inadequate measurement and slow publication remain [2]. Therefore, new methods are needed to provide timely and precise GDP estimates.
Nighttime light (NTL) data is an effective proxy for socioeconomic activity and has been widely used in many aspects, including socioeconomic statistical index estimation, urbanization, energy consumption, and ecological environments due to its availability and veracity [3][4][5][6][7][8][9][10][11][12]. Currently, two kinds of NTL data are widely applied to estimate GDP. The NTL data from the American Defense Meteorological Satellite Program (DMSP) Operational Linescan System (OLS) has approximately 2.7 km spatial resolution and has been used for large scale and time series GDP estimation, despite its relatively coarse spatial resolution and oversaturation problems [13][14][15][16][17][18][19][20][21][22][23]. The second is the new generation NTL data, collected by the Visible Infrared Imaging Radiometer Suite (VIIRS), carried by the Suomi National Polar Orbiting Partnership (NPP) satellite and made available for free by the Earth Observation Group (EOG) of Colorado School of Mines. With a resolution of approximately 500 m, the NPP-VIIRS data offers better data quality due to the wider radiometric detection and on-board calibration [24]. There are two configurations of VIIRS monthly composite NTL data, the "vcm" version that excludes any data impacted by stray light and the "vcmsl" version that includes the stray-light corrected data. Many scientists have used VIIRS data to map socioeconomic parameters [25][26][27][28][29][30][31], especially GDP. For example, Li et al. [32] explored the potential of NPP-VIIRS imagery for modeling China's GDP in 31 provinces and 393 counties by linear regression. The VIIRS data also exhibits better performance than the DMSP-OLS data. Shi et al. [33] compared the effect of DMSP and VIIRS in 31 provinces and 268 prefectures using linear regression and showed that the corrected VIIRS data offers higher average R 2 than DMSP in estimating GDP and electric power consumption. Zhao et al. [34] produced a pixel-level GDP map in south China and proved that the quadratic polynomial relationship offers a better fit than simple linear correlations between total nighttime light and economic activity at both the prefecture and county levels.
However, there are several problems in GDP studies using VIIRS data. First, researchers have found that the values of VIIRS imagery could be influenced by many factors, such as the observing angle and human activity intensity [35][36][37]. The VIIRS vcm version monthly composite images also have data impacted by stray light excluded over high-latitude areas of the northern hemisphere in summer. The fluctuations and "missing data" may influence the longitudinal measurements, so whether the vcm monthly VIIRS data is suitable for annual regional GDP estimation needs to be examined for smaller geographical units than the province level. Relevant experiments at the prefecture or county level for mainland China are still lacking. Second, many methods applied to estimate the GDP have not taken the spatial effect into consideration, which may decrease the accuracy and level of spatial explanation. The GDP distribution is accompanied by regional disparities, which depend on the study area and scale. The reasons NTL images shed light on economic characteristics are not only their availability and low cost but also the location information. Some of the information is lost when using NTL data to analyze the GDP distribution using non-spatial methods. Third, models using the NTL data are often difficult to interpret. Nonlinear models, such as the backpropagation neural network and quadratic polynomial model, lack explanatory information for multiple variables. Both precision and interpretation need to be considered when constructing GDP models incorporating spatial effects.
To solve the aforementioned problems, the main model we used here was eigenvector spatial filtering (ESF) regression. As a general linear model, eigenvector spatial filtering regression can reduce the spatial effects in model residuals by extracting eigenvectors from the adjacent matrix and linearly combining them in a spatial filtering function. ESF regression was developed by Griffith et al. and is becoming recognized because of its accuracy and usability [38]. ESF regression has been used to model PM2.5, the normalized difference vegetation index (NDVI), and landslide risk [39][40][41][42][43].
In this paper, a method using NTL combined with permanent resident population and OpenStreetMap (OSM) road networks based on eigenvector spatial filtering regression is proposed to estimate the GDP distribution in 342 Chinese regions (mostly prefecture-level) over 3 years: 2013, 2015, and 2017. Ordinary least squares (OLS) as a classical method and the spatial error model (SEM) as a popular spatial method were chosen as reference models. The aims of this study were to examine the efficacy of using VIIRS data to build a sustainable and precise method for estimating prefecture GDP across mainland China and how the significant variables in the ESF regression varied from one region of China to another. To illustrate the importance of NTL in GDP estimation, several experiments with and without NTL data were conducted and the effect of different versions of the VIIRS monthly data on GDP estimates was estimated.

Study Area
The research scale focused on the mainland city level in China, including 329 prefectures, 4 municipalities, and 9 county-level cities, which are directly under the jurisdiction of the Hubei, Henan, and Xinjiang provinces. The study area excluded Hong Kong, Macao, and Taiwan. Several coastal cities and regions that lacked borders on the mainland, such as the city of Zhou Shan and the whole of Hainan province, were omitted because their geographical relations were difficult to define. Following these adjustments, the number of regions was 342. The study area and nighttime light image in 2013 are shown in Figure 1.

Study Area
The research scale focused on the mainland city level in China, including 329 prefectures, 4 municipalities, and 9 county-level cities, which are directly under the jurisdiction of the Hubei, Henan, and Xinjiang provinces. The study area excluded Hong Kong, Macao, and Taiwan. Several coastal cities and regions that lacked borders on the mainland, such as the city of Zhou Shan and the whole of Hainan province, were omitted because their geographical relations were difficult to define. Following these adjustments, the number of regions was 342. The study area and nighttime light image in 2013 are shown in Figure 1.

Data Collection
The research data include population and economic statistical, nighttime light image, land cover, and OSM data. The population data refers to the actual population that lived in an area for a certain period (more than half a year). The 2013 prefecture-level city population and GDP data were relatively easy to collect as they were included in the 2014 China Statistical Yearbook for Regional Economy (http://tongji.cnki.net/kns55/Navi/HomePage.aspx?id=N2015070200&name=YZXDR&floor=1). Most of the 2015 and 2017 statistical data can be collected from the corresponding provincial statistical yearbooks, which can be downloaded from the China National Knowledge Infrastructure (CNKI) (http://tongji.cnki.net/kns55/), while population and GDP data in many remote regions were obtained from local statistics bureau websites.
The original "vcm" and "vcmsl" version monthly nighttime light (NTL) data were obtained from the Earth Observation Group, Colorado School of Mines (https://payneinstitute.mines.edu/eog/viirs/). The NTL data used here was the "vcm" NPP-VIIRS day/night band (DNB) cloud-free monthly average data, which had been filtered for stray light. Monthly composite NPP-VIIRS average radiance images for the years 2013, 2015, and 2017 were acquired to match the time window used for this project.

Data Collection
The research data include population and economic statistical, nighttime light image, land cover, and OSM data. The population data refers to the actual population that lived in an area for a certain period (more than half a year). The 2013 prefecture-level city population and GDP data were relatively easy to collect as they were included in the 2014 China Statistical Yearbook for Regional Economy (http://tongji.cnki.net/kns55/Navi/HomePage.aspx?id=N2015070200&name=YZXDR&floor=1). Most of the 2015 and 2017 statistical data can be collected from the corresponding provincial statistical yearbooks, which can be downloaded from the China National Knowledge Infrastructure (CNKI) (http://tongji.cnki.net/kns55/), while population and GDP data in many remote regions were obtained from local statistics bureau websites.
The original "vcm" and "vcmsl" version monthly nighttime light (NTL) data were obtained from the Earth Observation Group, Colorado School of Mines (https://payneinstitute.mines.edu/eog/viirs/). The NTL data used here was the "vcm" NPP-VIIRS day/night band (DNB) cloud-free monthly average data, which had been filtered for stray light. Monthly composite NPP-VIIRS average radiance images for the years 2013, 2015, and 2017 were acquired to match the time window used for this project.
The 2013 and 2015 global land cover (LC) data were produced by the European Space Agency (ESA) Climate Change Initiative (CCI). The CCI-LC data has a 300 m spatial resolution from 1992 to 2015 (http://maps.elie.ucl.ac.be/CCI/viewer/download.php).
The 2014, 2016, and 2018 January road lengths were obtained from the OSM (http://download. geofabrik.de/asia/china.html) and used to represent the 2013, 2015, and 2017 end-of-year road lengths.
The administrative boundary map was downloaded from the Resource and Environment Data Cloud Platform (http://www.resdc.cn/Default.aspx), which includes some towns.

Dataset description
The chosen independent variables are the permanent resident population of the prefecture-level regions, denoted as Permanent; the average brightness of the region is denoted as MEAN; and the road length is denoted as Road. The calculation of the MEAN variable is shared in the next section.  The tables show that all of the elements displayed have experienced varying rates of growth. The GDP and road length increased quickly, while the population remained more stable.

Methodology
The experiments included 3 steps: (1) the pre-processing of the input data; (2) the construction of the ESF regressions; and (3) the selection of the reference models and evaluation metrics.

Data Pre-Processing
To calculate the regional mean brightness, denoted as MEAN, original images need calibration. The 2013, 2015, and 2017 annual data were calculated from monthly images. Although the Colorado School of Mines Earth Observation Group (EOG) released 2015 annual NTL images, we chose to use monthly data to maximize data consistency. After assessing the quality of the NPP-VIIRS monthly data, we found that the May to August images contained missing data in high latitude regions, such as northeastern China. Since the study extent is the Chinese mainland, vcm version NPP-VIIRS data for the remaining 8 months were averaged, and the null values and negative values were converted to 0. The 12-month calibrated VIIRS data (vcmsl version) were processed in the same way and the results of the difference experiments are discussed in Section 4.1. According to previous research, a ceiling value needs to be obtained by comparing the highest value of the most economically advanced metropolitan area, such as Beijing, Shanghai, or Guangzhou [28]. The ceiling values for the 3 years of images and their corresponding city names are shown in Table 4. The water body and snow mask were derived from CCI-LC data to eliminate the moonlight reflection effects of the snow and water, and the 2015 land cover data were used with 2015 and 2017 annual VIIRS images because it provided the closest ICC-LC data to 2017. Finally, the regional average brightness was calculated using the calibrated nighttime images: where DN i refers to the region and n denotes the total pixel numbers in each of the administrative regions. When extracting roads from the OSM, it should be specified that the whole road length includes primary roads, secondary roads, tertiary roads, cycle ways, and service roads. The OSM data were reprojected from the WGS-84 geographic coordinate system to the Asia North Albers Equal Area geographical coordinate system to calculate lengths of the roads included in the 2013 China road network data reproduced in Figure 2.
Remote Sens. 2020, 11, x FOR PEER REVIEW 5 of 19 to 0. The 12-month calibrated VIIRS data (vcmsl version) were processed in the same way and the results of the difference experiments are discussed in Section 4.1. According to previous research, a ceiling value needs to be obtained by comparing the highest value of the most economically advanced metropolitan area, such as Beijing, Shanghai, or Guangzhou [28]. The ceiling values for the 3 years of images and their corresponding city names are shown in Table 4. The water body and snow mask were derived from CCI-LC data to eliminate the moonlight reflection effects of the snow and water, and the 2015 land cover data were used with 2015 and 2017 annual VIIRS images because it provided the closest ICC-LC data to 2017. Finally, the regional average brightness was calculated using the calibrated nighttime images: where refers to the region and denotes the total pixel numbers in each of the administrative regions.
When extracting roads from the OSM, it should be specified that the whole road length includes primary roads, secondary roads, tertiary roads, cycle ways, and service roads. The OSM data were reprojected from the WGS-84 geo  Logarithmic transformation was used to process all of the collected variables, since OLS regression requires normally distributed data. For the comparability of different variables, all of the collected independent variables in models were standardized. The 2015 and 2017 data were processed in the same way as the 2013 data.

ESF Regression Construction
The ESF regression uses the linear combination of selected eigenvectors calculated from the spatial weights matrix to filter out the spatial autocorrelation. The basic equation of the ESF regression is as follows: where Y is an n × 1 vector of the dependent variables, X is an n × p matrix containing the independent factors, E is an n × m matrix consisting of m filtered eigenvectors, α and β are the corresponding regression coefficients, and ε is an independent and identically distributed random error.
Building the GDP ESF regression model includes the following procedures: (1) Building a spatial weights matrix according to the adjacent relationships of the 342 regions.
(2) Centering the spatial weights matrix and calculating the eigenvectors, as well as the corresponding eigenvalues. (3) Selecting the eigenvectors by two methods. First, by singling out the eigenvectors, whose corresponding eigenvalues divided by the maximum eigenvalue were larger than 0.1 [43,44], which means selecting the most important eigenvectors. The second method used a stepwise procedure with all of the covariates to simplify the model [45,46]. The independent variables, such as average brightness or population, will be dislodged with this method if they are not reasonable. (4) Calculating the regression coefficients in the OLS estimate.
The rook criterion, meaning that the different city regions were defined as adjacent if they shared a common side, was used to construct the spatial weights matrix C 0 . Because of the irregular regional boundaries, the neighbor relation is almost the same, regardless of whether queen or rook adjacency rules are used. It is imperative to center the spatial weights matrix to calculate the eigenvectors. C 1 is the matrix after C 0 is centered, as follows: where 1 is an n × 1 vector of ones, T is the matrix transpose operator, such that 11 T equals an n × n matrix, whose elements are one and n is 342, and I is an n × n identity matrix, whose diagonal elements are 1 and the others are 0.
The final ESF regression model with the best model fitness had the following form: where β 0 is the intercept, β i (i = 1,2,3) are the regression coefficients, E k is an n × k matrix consisting of the selected eigenvectors, and β k E k represents the spatial autocorrelation.

Reference Model and Evaluation Standards
Several models were chosen to compare model performances. The classic ordinary least squares (OLS) is a non-spatial method and was chosen to compare spatial and non-spatial models. The spatial error model (SEM) was chosen to represent spatial methods that endeavor, like ESF, to eliminate the spatial autocorrelation in the regression residuals.
The SEM produces pseudo − R 2 values, which cannot be compared with R 2 . The model fitness was evaluated from five perspectives: R 2 and adjusted R 2 , root-mean-square error (RMSE), mean absolute relative error (MARE), the Akaike information criterion (AIC), and Morans' I for the regression residuals. Although the GDP used a logarithmic transformation, to make the results more intuitive, the calculation of RMSE and MARE used the predicted GDP values as follows: where GDP estimate i is the estimated GDP value in the ith region and GDP i is the official statistical GDP value of the corresponding region. RE i is the absolute value of regional relative errors. The Morans' I of the residuals is used to measure the spatial autocorrelation in the residuals. In addition to the above standards, leave-one-out cross validation was also needed to test the model robustness.

Correlation Analysis and Multi-Collinearity test
To test whether or not the independent variables are qualified to estimate the GDP, the Pearson correlation coefficients between the GDP and other covariates were calculated. Table 5 shows that all of the chosen independent variables passed the significance test, and the correlation results demonstrate that the covariates had a strong relationship with the GDP. The minimum correlation value was 0.68 between the GDP and road length. The average correlation values of Permanent, MEAN, and Road with the GDP were 0.83, 0.70, and 0.70, respectively, showing that the GDP and population had the closest connection. Generally, all the chosen variables were qualified to estimate the prefecture-level GDP.  Both the ESF and OLS regression models passed the multi-collinearity test in each of the 3 years, as the variance inflation factor (VIF) values shown in Tables 6 and 7. The results in this pair of tables shows that none of the VIF values were larger than 10. This means that the two linear models passed the multi-collinearity test in all 3 years and that these coefficients were able to explain the influence between the GDP and independent variables.

Regression results
The top 3 ESF, OLS, and SEM regression models are summarized in Equations (8) where E k is the eigenvectors that were selected to filter the spatial autocorrelation, so the residual ε could satisfy the independent identity distribution requirement. The β k × E k in ESF regressions can filter the spatial effect out of the residuals and the coefficients of the collected variables included in the aforementioned results were significant (α = 0.001, two-tailed test).

Model Fitness
The various metrics noted earlier can be used to assess different aspects of model performance and to decide which model is the best for estimating mainland China GDP at the prefecture level. The RMSE results reproduced in Tables 8-10 use 10 8 yuan.  Although pseudo − 2 and 2 cannot technically be compared because their statistical methods are different, they are still included in the tables, since they can reveal the difference between the ESF and OLS regressions.
From Tables 8-10, it is difficult to obtain intuitive knowledge of regression fitness. The visualization of the models is more telling. All of the regression outputs for 3 years are displayed in     Take Figure 4 as an example, it shows the three models with different regression characteristics in 2015. All of the GDP distribution images used the same segmentation scale, so the corresponding   Take Figure 4 as an example, it shows the three models with different regression characteristics in 2015. All of the GDP distribution images used the same segmentation scale, so the corresponding Take Figure 4 as an example, it shows the three models with different regression characteristics in 2015. All of the GDP distribution images used the same segmentation scale, so the corresponding color is more illustrative, and the four images can be directly compared. The OLS fit was poor in many regions, such as the inner Mongolia region, Yunnan Province, and mid-south of China. The SEM is accurate in most regions, except those in the west. The best regression result comes from the ESF model, which, for the most part, reduces differences to the original GDP distribution with only parts of the southeastern regions not fitting reality.

Cross Validation.
Cross validation is necessary because the model needs to simulate the unknown situation to a certain extent, and this provides a test of model robustness. Tables 11 and 12 show the leave-one-out cross validation (LOOCV) for the RMSE and MARE results of the OLS and ESF models, because the leave-one-out cross validation cannot be processed for the SEM model. The RMSE units are 10 8 Chinese yuan (as before). The results in this pair of tables show that the ESF regression is better than OLS in terms of RMSE and MARE in all 3 years, proving the robustness of the ESF regression. Moreover, there is a trend of quickly declining accuracy for the OLS, while the MARE of the ESF regression is relatively stable.

VIIRS Data Evaluation
During the NPP-VIIRS data processing, the May to August monthly average composite images were filtered for missing data and the three annual images were derived from the other months. As shown in Tables 2 and 3, the average brightness of the 2015 annual image was slightly higher than that of the 2013 annual image. The mean value of MEAN increased rapidly from 1.16 to 1.44 in the 2017 annual image.
The NTL data was obviously important to GDP estimation. The same experiments were processed without the NTL variable and it turned out that nearly all of the accuracy metrics for the three models declined. The results are summarized in Appendix A Tables A1-A5 and include the leave-one-out cross validations. Although the ESF regression could maintain the R 2 at a relatively high level, the MARE and RMSE of all methods increased dramatically, which indicates the importance of NTL data. Tables A4 and A5 suggest that the model robustness dropped sharply because the minimum value of the ESF regression LOOCV MARE was >30%, while the maximum MARE value of the ESF regression with the NTL variable was only 23.92%. These differences show that the NTL data was indispensable to GDP estimation in our research.
The MEAN factor was calculated from the 12 months' vcmsl version VIIRS data using the same procedures, and the results of the parallel experiments for 2015 and 2017 are shown in Appendix A Tables A6-A8. These results reveal that the vcmsl version NTL is appropriate for evaluating the GDP distribution across China. The vcm version performed a little better than the vcmsl version in terms of RMSE and MARE. For example, in 2015 and 2017 regressions, 10 of the 12 regression accuracy metrics, as well as 5 of the 8 LOOCV metrics of vcm version's RMSE and MARE, were a little better than the vcmsl version results. But overall, the differences between the NTL versions were much smaller than those between models and ESF regression performed better than other methods.
Although the vcm version VIIRS data contain fluctuations and "missing data", it is still possible to estimate GDP at the prefecture scale across multiple years in combination with other variables, and it has great impact on GDP estimates. From Tables 9-11, it is evident that the three models have similar fitness for different years. Apart from the AIC, which cannot be compared over the years, and the RMSE, which increased due to the GDP growth, the other aspects of each model displayed similar accuracies. For instance, the maximum difference for R 2 of the OLS in the 3 years was only 0.02 and for Ad justed R 2 was 0.03. The spatial models were more stable. The ESF regression, for example, maintained R 2 at 0.98 in all 3 years. Consequently, the main difference of the study came from the model instead of NPP-VIIRS, proving the ability of these data to estimate GDP over multiple years at the prefecture scale over all of mainland China.

Accuracy Assesment
In terms of overall model fitness, spatial models made different degrees of progress compared to OLS, and the ESF regression achieved significant improvement over the other models. In 2013, the R 2 of the ESF regression was 0.98 and the Ad justed R 2 was 0.96. Compared to the OLS regression, the ESF regression improved 12.64% in R 2 and 10.34% in Ad justed R 2 . The SEM model showed progress compared to OLS by reducing AIC values from 366.13 to 285.75. The difference between the AIC values in the SEM and ESF regression models was only 28.49. The improvement was also noticeable in the other 2 years.
In terms of regression error, the precision of ESF regression was obvious. The average RMSEs of the OLS, SEM, and ESF models were 1135.63, 1047.96, and 463.41, respectively. The average MARE of the ESF regression over the 3 years was 12.57% compared to 29.05% for SEM. This difference means that the ESF regression was better for estimating the prefecture-level GDP in China. Interestingly, the ESF regression was very stable, given that the 3-year ESF regressions of some areas obtained similar MARE results. The MARE of over 200 regions was less than 10% for 3 years and 11 repeat regions, whose MARE was larger than 30% over the 3 years. The average regional RE i (Equation (4)) values were calculated for the three models and are shown in Figure 6. All images used the same segmentation scale, and it was observed that larger the light areas were, the better the regression effect. The ESF regression obviously maintained relative errors <30% in most regions, while the other methods did not work well. In terms of the maximum relative error, the OLS and SEM both reached over 200% compared to 83.78% for the ESF regression.
The last accuracy metric was Morans' I, which measures the spatial autocorrelation in the regression model residuals. The OLS had the largest residual Morans' I, since the nonspatial model could not remove the spatial effects in the residuals. All of the spatial regressions had better outcomes than OLS. Over 3 years, the SEM could reduce the Morans' I to nearly 0, and, although the ESF regression was not as good as the SEM, it still eliminated most of the spatial effects in the residuals by lowering the absolute values of Morans' I values 82.86, 82.76, and 81.25% from OLS.
Although the ESF regression did not have the best performance in eliminating the spatial autocorrelation effects in the residuals, it was a close second. The ESF regression had great accuracy in R 2 , Ad justed R 2 , AIC, RMSE, and MARE over the 3 years that were used. To conclude, the ESF regression had the best fitness of the chosen models here.

Variable Coefficients Analysis
The three variables in all of the regressions positively correlated with prefecture GDP. However, the individual influences were very different. Different aspects can be discussed to explain the contribution or potential meaning of the independent variables. All of the chosen variables were standardized so that they could be compared using their coefficients.
The coefficient of population was the largest, with the exception of the 2017 ESF regression, which suggests that Permanent was the most important variable. The coefficients of population increased from 0.66 to 0.72 over the three OLS regressions. The population of permanent residents undoubtedly influenced local economic development. This may explain why many cities, such as Wuhan, Hangzhou, and Chengdu, have sought to improve city competitiveness by trying to attract people to work and live there.
The NTL data has a strong relationship with local urbanization [47]. The average regional brightness can reveal the regional construction activity, and we use the MEAN factor to represent urbanization. The MEAN factor has the second largest coefficient in most of the regressions, suggesting the importance of urbanization level to economic activity. According to the 3-year ESF regression results, the coefficient of the MEAN increased, which indicates that the contribution of urbanization to regional economic activity is increasing.
There is an interesting common point that the sum of the coefficients of the Mean and Permanent increase with time in all three models. The sum increased from 0.99 to 1.11 to 1.13 for the ESF regression, from 1.11 to 1.14 and then to 1.19 for the OLS regression, and from 1.06 to 1.10 and then to 1.14 for the SEM regression. This situation may not have been a coincidence. The combination of these two variables contributed to most of the variables, meaning that population and urbanization were significant in GDP development. These factors play increasingly important roles in regional economic activity, which is consistent with the reality of the politics of attracting talent, especially in second-tier cities with continuous urban construction. These increasing coefficients may demonstrate that the policy will continue to rely on talent recruitment and city construction to maintain economic growth.

Spatial Autocorrelation Analysis
The regional results point to development inequalities. The GDP is a fine indicator of regional development degree. Morans' I was used to explore whether the GDP distribution in China is accompanied by spatial autocorrelation, and the results are summarized in Table 13. The mean Morans' I of China prefecture GDP for the 3 years is 0.25, showing that the distribution of prefecture GDP exhibits a positive spatial autocorrelation relationship. The economically advanced regions are clustered in the southern and eastern coastal zones, and the same is evident in the relatively underdeveloped areas, such as the Tibet and Xinjiang regions. It could explain why the two spatial models performed better than OLS, since spatial effects had an impact on model accuracy. However, the Morans' I value declines from 0.26 to 0.23 from 2013 to 2017, indicating that China's prefecture development inequalities are slowly declining.
The population shows the same trend as GDP (Table 14). From 2013 to 2017, the Morans' I declined from 0.28 to 0.23. The positive autocorrelation degree of permanent resident population across the prefecture regions is declining.

Conclusions
Despite the presence of signal fluctuations and missing data in the vcm version NPP-VIIRS monthly cloud free average composite data, it is still suitable to estimate prefecture GDP across all of mainland China and multiple years. Our study indicates that the main differences in the results depends on the choice of model instead of the VIIRS monthly data version.
The main advantages of the ESF regression in estimating the GDP is its reliability and explanatory power. The reliability of the ESF regression in estimating the GDP includes precision and stability. The mean R 2 of the ESF regression is approximately 0.98, and the MARE varies from 11.83 to 13.83% in 2013 and 2017, respectively. The cross validation also proved the robustness of the ESF regression. In addition, the method was very stable over the 3 years, which means that the ESF regression could be used in time series studies.
With reliable results, the interpretation of the GDP distribution is important because it reveals the contribution of the variables over time. The results of the ESF regression demonstrate that the population is crucial to local economic activity and that the contribution of urbanization is growing. Furthermore, the spatial autocorrelation of China prefecture GDP distribution and permanent resident population decreased from 2013 to 2017, indicating that regional economic disparities are weakening.
Since the relationship between NTL and GDP is very complicated and the GDP distribution is influenced by various factors, future work will concentrate on improving the precision of GDP ESF regression by including more accurate independent variables to describe economic structures and land use. We also plan to use more precise data sources, such as Baidu map or Amap, not only to promote accuracy but also to provide real-time updates.
Author Contributions: J.C. collected and processed the data, performed analysis, and wrote the paper. J.P.W. helped to write and edit the article. J.C., Y.C., and H.T. analyzed the results. J.Y. and Z.X. contributed to the validation. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Tables A1-A5 summarize the results for the experiments without the NTL variable. Tables A6-A8 summarize the results using the two versions of the VIIRS monthly data for GDP estimation.