Estimation and Analysis of PM2.5 Concentrations with NPP-VIIRS Nighttime Light Images: A Case Study in the Chang-Zhu-Tan Urban Agglomeration of China

Rapid economic and social development has caused serious atmospheric environmental problems. The temporal and spatial distribution characteristics of PM2.5 concentrations have become an important research topic for monitoring sustainable social development. Based on NPP-VIIRS nighttime light images, meteorological data, and SRTM DEM data, this article builds a PM2.5 concentration estimation model for the Chang-Zhu-Tan urban agglomeration. First, the partial least squares method is used to analyze the correlation of nighttime light radiance, meteorological elements (temperature, relative humidity, and wind speed), and topographic elements (elevation, slope, and topographic undulation) with PM2.5 concentration. Second, we construct seasonal and annual PM2.5 concentration estimation models, including multiple linear regression, random forest, support vector regression, and Gaussian process regression, with different factor sets. Finally, the accuracy of the PM2.5 concentration estimation results for the Chang-Zhu-Tan urban agglomeration is analyzed, and the spatial distribution of the PM2.5 concentration is inverted. The results show that meteorological elements correlate most strongly with PM2.5 concentration and topographic elements most weakly. In terms of seasonal estimation, the spring results of both the multiple linear regression and machine learning models are the worst, the winter results of the multiple linear regression models are the best, and the annual results of the machine learning models are the best. The study also found a significant difference in the temporal and spatial distribution of PM2.5 concentrations.
The methods in this article overcome the high cost and spatial resolution limitations of traditional large-scale PM2.5 concentration monitoring, to a certain extent, and can provide a reference for the study of PM2.5 concentration estimation and prediction based on satellite remote sensing technology.


Introduction
In recent years, with the rapid development of China's industrialization and urbanization, air quality problems have intensified. In 2012, the Chinese government included PM2.5 concentration as an important pollutant indicator in the national ambient air quality standards [1,2]. PM2.5 can remain in the air for a long time. It not only causes serious environmental problems, such as haze [3][4][5][6][7][8], and has a certain negative impact on meteorological changes, but also has many health effects, such as premature mortality [9,10], hypertension [11], and burden of disease [12,13].

The Chang-Zhu-Tan urban agglomeration is located in the middle-eastern part of Hunan Province (Figure 1). It has a mid-subtropical monsoon climate with four distinct seasons, short winters, long summers, and abundant rainfall. As the core growth pole of economic development in Hunan Province, the Chang-Zhu-Tan urban agglomeration has seen rapid industrial development in recent years [46]. At the same time, air pollution has become increasingly prominent, and the concentration of various air pollutants in the urban agglomeration remains high [47,48]. Its air quality ranks last in the province year round. Regional air pollution seriously affects public health and ecological safety, and the serious haze problem has attracted great attention from all walks of life [49]. In recent years, the air pollution control measures of the Chinese government have significantly reduced the PM2.5 concentration in the Chang-Zhu-Tan urban agglomeration, effectively improving its air quality [50].

Data Sources
The data for this research include PM2.5 concentration data, meteorological data, NPP-VIIRS nighttime light images, and Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) data for the Chang-Zhu-Tan urban agglomeration in 2015 and 2018.
PM2.5 concentration data: The PM2.5 concentration data used in this article came from the national urban air quality real-time release platform of the China Environmental Monitoring Station (CEMS, http://106.37.208.233:20035/ (accessed on 15 October 2019)). The quarterly and annual average PM2.5 concentrations were derived from the hourly monitoring data of 24 ambient air quality assessment monitoring points in the Chang-Zhu-Tan urban agglomeration (Figure 2a). To ensure the accuracy, continuity, and integrity of PM2.5 concentration measurements, the Chinese government stipulates that automatic monitoring equipment must run continuously, 365 days a year, and that a valid daily average requires at least 20 h of hourly average concentration values or sampling time. The PM2.5 concentration measurements in this paper were obtained by the continuous automatic monitoring method. The Chinese government also stipulates that automatic monitoring methods based on different principles can only be used to measure PM2.5 if their results are consistent with those of the manual gravimetric method. Therefore, the PM2.5 concentration measurements used in this paper are subject to strict quality control and are valid.
Meteorological data: The meteorological data came from the National Meteorological Science Data Sharing Service Platform (NMSDSSP, http://data.cma.cn (accessed on 15 October 2019)) and mainly include precipitation, temperature, relative humidity, and wind speed. The quarterly and annual average weather data were derived from the daily average values of meteorological stations in the Chang-Zhu-Tan urban agglomeration (Figure 2a). Meteorological factors have a great impact on the spatial distribution of PM2.5 in the Chang-Zhu-Tan urban agglomeration [47]. The meteorological information for the air quality monitoring stations comes from four ground meteorological stations. Because the air quality monitoring stations are located in the plain area and are concentrated near these four ground meteorological stations, meteorological factors are assumed to be uniform over a small range [51]. It is therefore feasible in this study to assign the meteorological information of the four ground meteorological stations to the air quality monitoring stations.
NPP-VIIRS nighttime light images: The NPP-VIIRS nighttime light images were obtained from the Earth Observation Group (EOG). This article used the monthly NPP-VIIRS nighttime light images for 2015 and 2018, with a resolution of 500 m (Figure 2b). Each monthly image is an average radiance image composited from the cloud-free nighttime light observations of that month, and it has also been corrected for stray light. The processed monthly NPP-VIIRS nighttime light images can effectively monitor the current state of regional socioeconomic development [52][53][54]. Nighttime light images can effectively reflect the development status of human society and provide more spatial details of human activities [55,56].
SRTM DEM data: The DEM data of the experimental area came from the SRTM data of the U.S. Space Shuttle Endeavour. The dataset was generated from the latest SRTM V4.1 data, which were collated and mosaicked into a DEM with a resolution of 90 m (Figure 2a). Topography not only affects the spatial distribution of pollutant emissions by affecting the intensity of human activities but also has a profound impact on the diffusion of PM2.5, which makes it an important factor affecting the spatial distribution of PM2.5 [43,57].

Correlation Analysis between Remote Sensing Data and PM 2.5 Concentration
Based on radiative transfer theory, a model relating nighttime light radiance to the near-surface PM2.5 concentration can be constructed [40]. First, it is assumed that the distribution of surface features (especially buildings and city lights) around each ground air quality monitoring site does not change. The upward-emitted light, after reflection and scattering by various physical media, is then treated as coming from a Lambertian body whose radiance is constant over time but differs from place to place [40]. Assuming negligible multiple scattering by aerosols, the nighttime light radiance reaching the sensor follows Beer's law. Assuming further that the aerosol extinction coefficient profile in the boundary layer is stable at night and that PM2.5 is uniformly mixed up to the effective height, a relationship between PM2.5 and nighttime light radiance can be established [40]. In this paper, the average nighttime light value within 2 km of each environmental monitoring site was extracted as that site's nighttime light radiance value.
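The 2 km extraction step can be illustrated with a minimal sketch. It assumes the radiance image is already loaded as a NumPy array on a 500 m grid and that each station has been converted to a row and column index; the function name `mean_radiance_within` and the synthetic grid are hypothetical and are not taken from the paper.

```python
import numpy as np

def mean_radiance_within(radiance, row, col, radius_m=2000.0, pixel_m=500.0):
    """Mean of the raster cells whose centres lie within radius_m of cell (row, col)."""
    r = int(np.ceil(radius_m / pixel_m))              # search half-width in pixels
    n_rows, n_cols = radiance.shape
    r0, r1 = max(row - r, 0), min(row + r + 1, n_rows)
    c0, c1 = max(col - r, 0), min(col + r + 1, n_cols)
    rr, cc = np.mgrid[r0:r1, c0:c1]                   # cell indices of the search window
    dist_m = np.hypot(rr - row, cc - col) * pixel_m   # centre-to-centre distance in metres
    window = radiance[r0:r1, c0:c1]
    mask = (dist_m <= radius_m) & np.isfinite(window) # circular buffer, skipping no-data cells
    return float(window[mask].mean())

# Synthetic 20 x 20 radiance grid with one bright cell at the (assumed) station location.
grid = np.full((20, 20), 5.0)
grid[10, 10] = 18.0
site_value = mean_radiance_within(grid, 10, 10)
```

A circular mask is used here rather than a square window so that the averaged area matches a 2 km radius; whether the original study used a circular or square neighbourhood is not stated.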
Meteorological elements are important factors influencing changes in PM2.5 concentration [44,58-60]. Wang et al. [58] examined whether meteorological elements affect PM2.5 concentrations and found that elements such as humidity and air temperature can affect their temporal and spatial distributions. In addition, topographic elements affect the change in regional PM2.5 concentration to a certain extent [43,45,57]. He et al. [45] added information extracted from DEM data to a PM2.5 estimation model, and the results showed that a model with topography, meteorology, and other elements can better estimate PM2.5 concentrations. A PM2.5 concentration estimation model that accounts for multiple factors, such as weather and topography, at the same time can therefore obtain higher-precision simulation results. Accordingly, the characteristic factors determined in this paper are nighttime light radiance I, elevation E, slope S, precipitation R, temperature T, relative humidity RHU, and wind speed W.

Selection of Characteristic Factors for the PM 2.5 Concentration Estimation Model
The correlation analysis was carried out by constructing a partial least squares model of Factor Set A and PM2.5 concentration. The partial least squares method decomposes and screens the data information in the model, extracts the comprehensive variables with the strongest explanatory power for the dependent variable, and can calculate the importance of each factor. The partial least squares method can better solve the factor collinearity problem and obtain more objective and accurate factor importance results [61]. The variable importance in projection (VIP) value of partial least squares is used as the factor importance result [62], and the VIP value is calculated as follows:

$$\mathrm{VIP}_j = \sqrt{\frac{p \sum_{k=1}^{h} c_k^{2}\, t_k^{\top} t_k\, w_{jk}^{2}}{\sum_{k=1}^{h} c_k^{2}\, t_k^{\top} t_k}} \qquad (1)$$

where $\mathrm{VIP}_j$ is the VIP value of the j-th variable; $p$ is the number of variables participating in the analysis; $h$ is the number of iteration calculations; $c_k^{2}\, t_k^{\top} t_k$ is the degree to which the k-th mapping result of the independent variables explains the dependent variable; and $w_{jk}^{2}$ is the weight of variable j in the k-th iteration.
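The VIP calculation described above can be sketched with a small single-response partial least squares routine. This is an illustrative NIPALS-style implementation on synthetic data, not the authors' code; the function name `pls_vip`, the number of components, and the data are assumptions.

```python
import numpy as np

def pls_vip(X, y, n_components=2):
    """VIP scores from a single-response PLS fit (NIPALS-style sketch)."""
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardise factors
    y = (y - y.mean()) / y.std(ddof=1)                 # standardise response
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))               # unit-norm weight vectors w_k
    explained = np.zeros(n_components)                 # c_k^2 * t_k' t_k per component
    Xk, yk = X.copy(), y.copy()
    for k in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w                                     # component scores t_k
        tt = t @ t
        c = (yk @ t) / tt                              # regression of y on t_k
        loadings = Xk.T @ t / tt
        Xk = Xk - np.outer(t, loadings)                # deflate X
        yk = yk - c * t                                # deflate y
        W[:, k] = w
        explained[k] = c ** 2 * tt
    return np.sqrt(n_vars * (W ** 2 @ explained) / explained.sum())

# Synthetic example: 7 hypothetical factors, the first one drives the response.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 7))
y_demo = 2.0 * X_demo[:, 0] + 0.1 * rng.normal(size=200)
vip = pls_vip(X_demo, y_demo, n_components=2)
```

Because each weight vector has unit length, the squared VIP scores sum to the number of variables, which is why a score near or above 1 is commonly read as an above-average contribution.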

Construction of the PM 2.5 Concentration Estimation Model
Simple models are limited in their ability to simulate complex, multi-factor geographic phenomena at high precision [63]. Zhang et al. [63] found that simple models cannot effectively estimate the spatial distribution of PM2.5 concentrations affected by multiple factors. In this paper, following the research results of Wang et al. [40], a multiple linear regression model was selected to construct PM2.5 concentration estimation Model I for the Chang-Zhu-Tan urban agglomeration, which has 24 air quality monitoring stations:

$$\mathrm{PM}_{2.5} = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n \qquad (2)$$

where $\mathrm{PM}_{2.5}$ is the estimated PM2.5 concentration at the air quality monitoring site; $X_1, X_2, \ldots, X_n$ are the 1st, 2nd, ..., n-th estimation model factors; and $\beta_1, \beta_2, \ldots, \beta_n$ are the corresponding regression coefficients.

When there is no definite method for estimating PM2.5 concentration, machine learning can extract key feature information to find relationships within known datasets, and a model trained with a large amount of data can be used for accurate prediction. Machine learning methods have been increasingly used in socioeconomic parameter estimation and geographic phenomenon inversion, including PM2.5 concentration estimation. Many of these studies use random forest models [64][65][66], and other machine learning models are gradually being applied as well [67,68]. Based on the PM2.5 concentration data from ground stations and the known data for nighttime light radiance I, elevation E, slope S, precipitation R, temperature T, relative humidity RHU, and wind speed W, three machine learning PM2.5 concentration estimation models were constructed in this paper: random forest Model II, support vector machine Model III, and Gaussian process regression Model IV. These three are among the more commonly used and more mature machine learning regression models, and each has its own advantages. Support vector machines can handle small-sample learning problems and capture nonlinear relationships between variables well. Ensemble trees can, to a certain extent, balance errors on unbalanced data sets. Gaussian process regression can quantify prediction uncertainty in a principled way.
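A multiple linear regression of this kind can be fitted by ordinary least squares. The sketch below uses synthetic data with five hypothetical factors; the function names are illustrative, and the constant term is an assumption, since the text lists only the factor coefficients.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares fit; returns [intercept, beta_1, ..., beta_n]."""
    A = np.column_stack([np.ones(len(X)), X])   # assumed constant term plus factors
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply fitted coefficients to a factor matrix."""
    return coef[0] + X @ coef[1:]

def r_squared(y_true, y_pred):
    """Goodness of fit, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic, noise-free example with 24 "stations" and 5 hypothetical factors.
rng = np.random.default_rng(0)
true_coef = np.array([40.0, 1.5, -2.0, 0.5, 3.0, -1.0])
X_demo = rng.normal(size=(24, 5))
y_demo = true_coef[0] + X_demo @ true_coef[1:]
coef = fit_mlr(X_demo, y_demo)
fit_r2 = r_squared(y_demo, predict_mlr(coef, X_demo))
```

With real data, the columns of `X_demo` would hold the selected factor set (for example I, T, RHU, R, and W), and `fit_r2` would be compared across factor sets.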
In this paper, the three machine learning estimation models were trained with multiple samples, and fivefold cross-validation was used to test model accuracy. For each model, the parameter setting that gave the highest goodness of fit (R²) was retained, and the resulting key parameters are listed in Table 1. The tuned parameter of random forest Model II is the minimum leaf size, and the tuned parameter of support vector machine Model III and Gaussian process regression Model IV is the kernel function.
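The fivefold cross-validation and parameter selection can be sketched generically. To stay dependency-free, the example below uses ridge regression with a hypothetical penalty `alpha` as a stand-in for the paper's models, whose tuned parameters are the minimum leaf size and the kernel function; all names and data here are assumptions.

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def cross_val_r2(X, y, fit, predict, k=5, seed=0):
    """R^2 of out-of-fold predictions pooled over all k folds."""
    folds = kfold_indices(len(y), k, seed)
    y_hat = np.empty(len(y), dtype=float)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_idx], y[train_idx])
        y_hat[test_idx] = predict(model, X[test_idx])
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def make_ridge(alpha):
    """Stand-in model: ridge regression with an unpenalised intercept."""
    def fit(X, y):
        A = np.column_stack([np.ones(len(X)), X])
        penalty = alpha * np.eye(A.shape[1])
        penalty[0, 0] = 0.0
        return np.linalg.solve(A.T @ A + penalty, A.T @ y)
    def predict(coef, X):
        return coef[0] + X @ coef[1:]
    return fit, predict

# Synthetic data: 96 samples, 5 hypothetical factors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(96, 5))
y_demo = 40.0 + X_demo @ np.array([1.5, -2.0, 0.5, 3.0, -1.0]) + rng.normal(scale=0.5, size=96)

# Keep the candidate setting with the highest cross-validated R^2.
scores = {alpha: cross_val_r2(X_demo, y_demo, *make_ridge(alpha)) for alpha in (0.01, 1.0, 100.0)}
best_alpha = max(scores, key=scores.get)
```

Passing `fit` and `predict` as callables keeps the validation loop independent of the model, so the same loop could wrap a random forest, support vector, or Gaussian process regressor.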

Importance Analysis of PM 2.5 Concentration Estimation Model Factors
To explore the influence of the characteristic factors on the model estimation results, nighttime light radiance, elevation, slope, precipitation, air temperature, relative humidity, and wind speed were selected as Factor Set A. The more relevant factors in Factor Set A were then selected as Factor Set B. Finally, the commonly used meteorological elements in Factor Set A (precipitation, temperature, relative humidity, and wind speed) were selected as Factor Set C.
In this paper, the partial least squares method was used to analyze the importance of the model factors. The VIP score of each factor, obtained from formula (1), determines the correlation between that factor and the PM2.5 concentration. The results (Table 2) showed that the four meteorological factors (air temperature T, relative humidity RHU, precipitation R, and wind speed W) had high VIP scores. Their mean VIP scores over the seasonal and annual models were 1.552, 0.795, 0.835, and 1.100, respectively, which makes air temperature T the most important factor affecting the temporal and spatial distribution of PM2.5 concentration. There was a high correlation between nighttime light radiance I and PM2.5 concentration, with an average VIP score of 0.504. The topographic factors (elevation E and slope S) had a low correlation with the PM2.5 concentration, with average VIP scores of 0.320 and 0.304, respectively. Therefore, this paper selected temperature T, relative humidity RHU, precipitation R, wind speed W, and nighttime light radiance I as Factor Set B.

The Results and Accuracy Evaluation of the PM 2.5 Concentration Estimation Model for the Chang-Zhu-Tan Urban Agglomeration
Based on the multiple linear regression model and three machine learning regression models, combined with the environmental monitoring site data of the Chang-Zhu-Tan urban agglomeration, model verification was carried out for the four seasons as well as annually (see Tables 3 and 4).
Since the temporal and spatial distribution of PM2.5 concentration is a complex geographical phenomenon, the way PM2.5 concentration varies under the action of multiple factors may differ between time periods. Therefore, this paper uses a variety of models to analyze the relationship between PM2.5 concentration and the factors, in order to improve estimation accuracy. The results showed obvious differences between the estimation models for different seasons. The spring model had the worst results, with an R² value significantly lower than those of the other three seasonal models and the annual model. The annual estimation model performed best, followed by the winter, summer, and autumn estimation models, which performed similarly.
There were also obvious differences between models. The multiple linear regression models gave better estimates of the seasonal PM2.5 concentration, while the machine learning models gave better estimates of the annual PM2.5 concentration. The seasonal and annual models were built from different numbers of sample points: each seasonal model had only one-fourth as many sample points as the annual model. As a result, the relative accuracy of the multiple linear and machine learning models was reversed between the seasonal and annual estimates.
The estimation model based on Factor Set B performed obviously better than that based on Factor Set C, indicating that adding nighttime light image information can effectively improve the performance of the estimation model. In addition, the model based on Factor Set A performed better than that based on Factor Set B, which shows that adding topographic information can also effectively improve the estimation ability of the model. This paper also plotted the annual estimated PM2.5 concentrations against the actual values (Figure 3). The two were highly correlated, with R² values of 0.87 in 2015 and 0.92 in 2018, indicating good estimation results for the PM2.5 concentration.

Spatial Analysis of the PM 2.5 Concentration in the Chang-Zhu-Tan Urban Agglomeration
In this paper, kriging interpolation analysis was performed on the seasonal PM 2.5 concentration of the Chang-Zhu-Tan urban agglomeration in 2018, and the continuous spatial interpolation of PM 2.5 concentration was realized. The results are shown in Figure 4. According to the inversion results, the temporal and spatial distributions of seasonal PM 2.5 concentrations in the Chang-Zhu-Tan urban agglomeration were analyzed. The results showed that the PM 2.5 concentration of the Chang-Zhu-Tan urban agglomeration in winter was significantly higher than that in the other three seasons, with the lowest PM 2.5 concentration in summer and similar PM 2.5 concentrations in spring and autumn.
The study area is located in the subtropical monsoon region. The northerly wind prevails in the Chang-Zhu-Tan urban agglomeration in winter, the atmospheric structure is stable, and the meteorological conditions are not conducive to the diffusion of PM2.5 and other particles. The study area is prone to temperature inversion in winter, which makes PM2.5 particles gradually accumulate on the surface. In addition, the burning of a large amount of coal for heating in winter increases the PM2.5 concentration.
In summer, the southerly wind prevails, and the meteorological conditions are conducive to the diffusion of PM2.5 and other particles. In summer, strong winds are more likely to lead to the diffusion of PM2.5. In addition, it is rainy and humid in summer, and it is difficult for PM2.5 particles to stay in the air. The high temperature in summer makes it less likely for temperature inversion to occur, and the atmosphere is prone to convection, which is conducive to the diffusion of PM2.5 particles. Therefore, the concentration of PM2.5 is relatively high in winter and low in summer. At the same time, there are differences in the spatial distribution of PM2.5 concentrations. The PM2.5 concentration in the northwestern part of the Chang-Zhu-Tan urban agglomeration is relatively high, and the PM2.5 concentration in some central areas is low, which is significantly different from the adjacent areas.
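The kriging interpolation used for the seasonal maps can be illustrated with a minimal ordinary kriging sketch. The exponential variogram and its sill and range are assumptions chosen for illustration (the paper does not report its variogram settings), and the station coordinates and concentrations are synthetic.

```python
import numpy as np

def exp_variogram(h, sill=1.0, range_m=20000.0):
    """Exponential semivariogram without nugget (assumed form and parameters)."""
    return sill * (1.0 - np.exp(-h / range_m))

def ordinary_kriging(xy, z, xy_new, variogram=exp_variogram):
    """Ordinary kriging predictions at xy_new from samples (xy, z)."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)        # sample-to-sample distances
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0                                                       # Lagrange multiplier block
    d0 = np.linalg.norm(xy[:, None, :] - xy_new[None, :, :], axis=-1)   # sample-to-target distances
    b = np.ones((n + 1, xy_new.shape[0]))
    b[:n, :] = variogram(d0)
    weights = np.linalg.solve(A, b)[:n, :]                              # one weight column per target
    return weights.T @ z, weights

# Synthetic stations in a 60 km x 60 km box (coordinates in metres).
rng = np.random.default_rng(1)
stations = rng.uniform(0.0, 60000.0, size=(24, 2))
pm25 = rng.uniform(30.0, 80.0, size=24)
gx, gy = np.meshgrid(np.linspace(0.0, 60000.0, 5), np.linspace(0.0, 60000.0, 5))
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])
pred, w = ordinary_kriging(stations, pm25, grid_xy)
```

The last row of the kriging system forces the weights for each target point to sum to one, which keeps the estimate unbiased under the constant-mean assumption; with no nugget, the interpolated surface passes through the station values.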

Discussion
With the rapid development of industry and the increasing number of vehicles, the problem of air pollution is becoming increasingly serious [69]. Monitoring the spatial and temporal distribution of air pollutants is the key to solving this problem, and PM2.5 has always been one of the main air pollutants monitored. At present, the models used to estimate PM2.5 concentration from daytime satellite remote sensing are relatively mature and handle the spatial processing of large-scale PM2.5 concentrations well. Human production and living activities greatly affect the temporal and spatial distributions of PM2.5 concentrations, and human social activities at night reflect, to a certain extent, the intensity of human activities and the state of human production and living. Therefore, this paper added nighttime light image information to the PM2.5 concentration estimation model. The results showed that the accuracy of the PM2.5 concentration estimates improved somewhat, indicating that nighttime light images are of practical significance for PM2.5 concentration estimation.
In this paper, the partial least squares method was used to calculate the importance of each factor for the PM2.5 concentration. The method can better handle multicollinearity while retaining all factors, and it extracts as much information genuinely related to PM2.5 concentration as possible, giving a more objective and reliable correlation between the factors and PM2.5 concentration. Compared with other factor analysis methods, it therefore allows the factor VIP scores to be calculated with the multicollinearity problem more effectively addressed.
In this paper, the multivariate linear model was used to obtain the estimated value of the seasonal PM 2.5 concentration, and scatter plots ( Figure 5) of the estimated value and the actual value of the PM 2.5 concentration in the four seasons were constructed. The results showed that the estimated value and the actual value of the PM 2.5 concentration in the four seasons was very close to y = x, indicating that the error distribution of the model, underestimating and overestimating PM 2.5 concentration, was relatively balanced. The estimated R 2 value of the PM 2.5 concentration model in spring was significantly lower than that in the other three seasons, while the estimated R 2 value of the PM 2.5 concentration model in winter was significantly higher than that in the other three seasons, indicating that the model estimation accuracy had seasonality. than that in the other three seasons, while the estimated R 2 value of the PM2.5 concentration model in winter was significantly higher than that in the other three seasons, indicating that the model estimation accuracy had seasonality. In addition, the spatial distribution of PM2.5 concentration is a complex geographic phenomenon, and the spatial characteristics of different air quality monitoring stations are different, resulting in obvious spatial differences in the accuracy of PM2.5 concentration estimation models. In this paper, the multivariate linear estimation model, with high estimation accuracy of seasonal PM2.5 concentration, was used to obtain the estimated PM2.5 concentration in the four seasons, and the estimated PM2.5 concentration in the four seasons was compared with the actual value ( Figure 6). The results showed that the estimated and actual PM2.5 concentrations in the four seasons had similar trends, indicating that the overall effect of the model estimation was good, but there were still obvious local differences. 
The estimated value of the PM2.5 concentration, at some stations, In addition, the spatial distribution of PM 2.5 concentration is a complex geographic phenomenon, and the spatial characteristics of different air quality monitoring stations are different, resulting in obvious spatial differences in the accuracy of PM 2.5 concentration estimation models. In this paper, the multivariate linear estimation model, with high estimation accuracy of seasonal PM 2.5 concentration, was used to obtain the estimated PM 2.5 concentration in the four seasons, and the estimated PM 2.5 concentration in the four seasons was compared with the actual value ( Figure 6). The results showed that the estimated and actual PM 2.5 concentrations in the four seasons had similar trends, indicating that the overall effect of the model estimation was good, but there were still obvious local differences. The estimated value of the PM 2.5 concentration, at some stations, was quite different from the actual value. To further analyze the spatial difference in model estimation accuracy, this paper also analyzed the actual error of PM 2.5 concentration estimation at the stations. At the same time, it can be seen from the figure that the spring PM 2.5 concentration of most air quality monitoring stations in the Chang-Zhu-Tan urban agglomeration was higher than the Level 1 standard but lower than the Level 2 standard. The summer PM 2.5 concentration of most air quality monitoring stations was lower than the Level 1 standard, and the autumn PM 2.5 concentration of air quality monitoring stations was similar to spring but significantly higher than the spring PM 2.5 concentration. The PM 2.5 concentration of air quality monitoring stations in winter was significantly higher than that of the other three seasons, and the winter PM 2.5 concentration of most air quality monitoring stations was higher than the Level 2 standard. was quite different from the actual value. 
To further analyze the spatial difference in model estimation accuracy, this paper also analyzed the actual error of PM2.5 concentration estimation at the stations. At the same time, it can be seen from the figure that the spring PM2.5 concentration of most air quality monitoring stations in the Chang-Zhu-Tan urban agglomeration was higher than the Level 1 standard but lower than the Level 2 standard. The summer PM2.5 concentration of most air quality monitoring stations was lower than the Level 1 standard, and the autumn PM2.5 concentration of air quality monitoring stations was similar to spring but significantly higher than the spring PM2.5 concentration. The PM2.5 concentration of air quality monitoring stations in winter was significantly higher than that of the other three seasons, and the winter PM2.5 concentration of most air quality monitoring stations was higher than the Level 2 standard. In this paper, a total of 48 air quality monitoring stations, in 2015 and 2018, were analyzed for the real error of PM2.5 concentration, and the average estimation errors of 48 stations in the four seasons were calculated (Figure 7). The results showed that the estimation error fluctuated greatly between stations, and there was an obvious uneven spatial distribution of model estimation errors. The total average error of 48 stations in the four seasons was 4.22 μg.m −3 , and the estimation error of 23 stations was greater than the total average error. The spatial distribution of these 23 stations was further analyzed. Among them, 15 and 8 stations in 2015 and 2018, respectively, had estimation errors greater than the total average error, indicating that the estimation errors of PM2.5 concentrations, at stations in 2015, were relatively large.  Generally, an error higher than 4.22 μg.m −3 is a high error site, and an error lower than 4.22 μg.m −3 is a low error site. 
By analyzing the spatial locations of the 23 stations with large estimation errors, it can be found that the high-error stations in 2015 and 2018 were mostly distributed in Xiangtan and Zhuzhou, where economic development was much slower than in Changsha (Figure 8). The GDP of Changsha is 2.30 times the combined GDP of Xiangtan and Zhuzhou, and the nighttime light area of Changsha is also much larger than that of Xiangtan and Zhuzhou. In addition, most stations located in areas that are dark at night had larger estimation errors, which is similar to the conclusion of Wang et al. [40]. The estimation models tended to underestimate PM2.5 concentrations in darker nighttime areas.
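One way to check for underestimation in darker areas is to compare the signed bias (estimated minus actual) of stations grouped by nighttime light radiance. The sketch below is illustrative only: the radiance threshold separating "dark" from "bright" stations, the function name, and all sample records are hypothetical and do not come from the paper.

```python
from statistics import mean


def mean_bias_by_brightness(records, radiance_threshold):
    """Mean signed bias (estimated - actual) for dark and bright stations.

    records: iterable of (radiance, estimated, actual) tuples.
    A negative bias means the model underestimates the concentration.
    """
    dark = [est - act for rad, est, act in records if rad < radiance_threshold]
    bright = [est - act for rad, est, act in records if rad >= radiance_threshold]
    return {
        "dark": mean(dark) if dark else None,
        "bright": mean(bright) if bright else None,
    }


# Hypothetical records: (radiance, estimated PM2.5, actual PM2.5).
records = [
    (2.0, 40.0, 48.0),
    (3.5, 52.0, 57.0),
    (25.0, 60.0, 61.0),
    (40.0, 58.0, 56.0),
]

# ASSUMPTION: a radiance threshold of 10.0 is chosen only for illustration.
bias = mean_bias_by_brightness(records, radiance_threshold=10.0)
print(bias)
```

A clearly negative mean bias in the dark group, alongside a bias near zero in the bright group, would be consistent with the underestimation pattern described above.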
In this paper, a variety of estimation models for seasonal and annual PM2.5 concentrations were constructed based on nighttime light images, meteorological data, and topographic data. Except for spring, the models achieved high estimation accuracy, but further research is needed in terms of temporal and spatial resolution. In terms of temporal resolution, follow-up research should be refined to the hourly scale; nighttime light images, meteorological data, and topographic data can meet the requirements of this scale. However, in terms of spatial resolution, the small number of meteorological stations limits the spatial resolution of the meteorological conditions, making it difficult to support high-precision inversion of PM2.5 concentrations. At the same time, the spatial resolution of the nighttime light images used in this paper is low, at only 500 m, and subsequent research should attempt to use images with higher spatial resolution.

Conclusions
Based on multisource data and monitoring station PM2.5 concentration data, this paper constructed a variety of PM2.5 concentration estimation models for the Chang-Zhu-Tan urban agglomeration. The seasonal and annual PM2.5 concentrations of the Chang-Zhu-Tan urban agglomeration in 2015 and 2018 were estimated, and the correlation between characteristic factors and PM2.5 concentrations was analyzed. The results showed that, among the seasonal PM2.5 concentration models, the spring estimation results were the worst and the winter estimation results were the best. Owing to the larger number of samples in the annual PM2.5 concentration model, the annual estimation results of the machine learning models were better than their seasonal estimation results. In terms of correlation with PM2.5 concentration, meteorological elements had the strongest correlation, followed by nighttime light radiance, and terrain elements had the weakest. This paper proposes a PM2.5 concentration estimation method based on multisource data. At the same time, there are some limitations in multisource data fusion and in the inversion of continuous surface PM2.5 concentrations, so further exploration is needed in subsequent research.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: No new data were created or analyzed in this study. Data sharing is not applicable to this paper.

Conflicts of Interest: The authors declare no conflict of interest.