Exploring Spatial Influence of Remotely Sensed PM2.5 Concentration Using a Developed Deep Convolutional Neural Network Model

Currently, more and more remotely sensed data are being accumulated, and the spatial analysis methods for remotely sensed data, especially big data, are desiderating innovation. A deep convolutional network (CNN) model is proposed in this paper for exploiting the spatial influence feature in remotely sensed data. The method was applied in investigating the magnitude of the spatial influence of four factors—population, gross domestic product (GDP), terrain, land-use and land-cover (LULC)—on remotely sensed PM2.5 concentration over China. Satisfactory results were produced by the method. It demonstrates that the deep CNN model can be well applied in the field of spatial analysing remotely sensed big data. And the accuracy of the deep CNN is much higher than of geographically weighted regression (GWR) based on comparation. The results showed that population spatial density, GDP spatial density, terrain, and LULC could together determine the spatial distribution of PM2.5 annual concentrations with an overall spatial influencing magnitude of 97.85%. Population, GDP, terrain, and LULC have individual spatial influencing magnitudes of 47.12% and 36.13%, 50.07% and 40.91% on PM2.5 annual concentrations respectively. Terrain and LULC are the dominating spatial influencing factors, and only these two factors together may approximately determine the spatial pattern of PM2.5 annual concentration over China with a high spatial influencing magnitude of 96.65%.


Introduction
Remote sensing technology has developed rapidly since the 1960s [1], and an abundance of remote sensing data has been accumulated in this 50-year period. Although abundant remotely sensed data have been applied to many fields, such as ecology, environment, geography, etc., the spatial analysis method for remotely sensed lattice data desiderates innovation. A single spatial variable generally has autocorrelation [2] (i.e., spatial dependence [3,4]), and various spatial variables have correlation. Spatial autocorrelations can be analysed with local indicators of the spatial association (LISA) index [5] (e.g., local Moran I [6], local Geary c index [7]). The main objective of spatial analysis is to identify the natural relationships that exist between variables [8,9]. The mainstream classical spatial analysis models, e.g., spatial lag model [10,11], spatial error model [10,11], and Bayesian spatial regression model [12], can only evaluate the overall or average linear correlation feature over a whole study region, neglecting the details of local area. These methods ignore the consequences of spatial heterogeneity [13]. The majority of spatial analysing methods assume stationary space.
However, assuming spatial convariance structure to be stationary is not so reasonable [14]. The spatial influencing relationship can better be explored when the analysis is local and more detailed results can be yielded [15]. The inclusion of a spatial heterogeneity resulting from differences in environmental conditions, socioeconomic dynamics, and other factors reinforces the need for more regionalized spatial analyses in exposure assessment and public health [16]. Although, the geographically weighted regression (GWR) [17] method considers local details; however, it can only describe a linear or simple non-linear spatial influencing relationship. In the era of big data, the need for developing advanced spatial analysis methods (e.g., machine learning methods) for remotely sensed data is urgent.
Previously, there are several studies that have applied machine learning methods to address the influencing factors on PM 2.5 concentrations. Zheng et al. [18] used traditional artificial neural networks to model spatial correlation between Beijing's air qualities and influencing factors, e.g., meteorology, traffic flow, human mobility. Yan et al. [19] predicted the daily average PM 2.5 concentration in Nanjing, Beijing, and Sanya, combining meteorological and contaminant factors based on the Long Short-Term Memory (LSTM) model. Suleiman et al. [20] presented a machine learning model to predict the traffic-related PM 10 and PM 2.5 concentrations from various variables (e.g., traffic variables). Hsieh et al. [21] proposed a semi-supervised learning algorithm to optimize the monitoring locations of air quality in Beijing based on spatial correlation. Certainly, there are some studies which utilized typical methods to investigate the influence of satellite-based PM 2.5 . He et al. [22] used empirical orthogonal function (EOF) to analyse the relationship between remotely sensed PM 2.5 and climate circulation transformation in East China. Hajiloo et al. [23] employed geographical weight regression (GWR) to investigate impact of meteorological and environmental parameters on PM2.5 concentrations in Tehran, Iran. Yang et al. [24] quantified the influence of natural and socioeconomic factors on PM 2.5 pollution using the GeoDetector model [25,26].
This study proposed a spatial analysis method that exploits the spatial influencing feature of remotely sensed data based on the deep CNN. CNN is a mainstream deep learning method and can effectively extract the feature representations from a large number of images [27] and object detection [28]. Some researchers have applied deep CNN in remote sensing classification. Q. Zou et al. [29] and Zhao et al. [30] proposed a DBN method for high-solution satellite imagery classification. H. Liang and Q. Li, C. Tao et al., and F.P.S. Luus et al. [31], Nogueira et al. [32], Volpi et al. [33], Chen et al. [34], have employed deep CNN in hyperspectral imagery classification or feature extraction. Some researchers have also employed deep CNN in synthetic aperture radar (SAR) image classification, e.g., Du et al. [35] and Geng et al. [36]. To our knowledge, the research applying deep CNN into spatial influencing of remotely sensed lattice data is very rare. This study aimed to present a deep CNN model exploiting the magnitude of spatial influence of four factors-population, gross domestic product (GDP), terrain, and land-use and land-cover (LULC)-to remotely sense the annual mean concentration of PM 2.5 over China. This model not only considers local spatial heterogeneity but also has super nonlinear fitting ability. Therefore, the presented model is rooted in a deep learning framework and may reduce uncertainty of the results obtained from a simplistic correlation analysis or simple regression model, therefore giving better information to decision makers of public health.

Materials
The materials used in this research contain five types of data: remotely sensed PM 2.5 concentration, population spatial distribution density, GDP spatial distribution density, terrain data, and LULC in China. The remotely sensed PM 2.5 annual concentration dataset in 2010 was produced by the Atmospheric Physics Institute of Dalhousie University in Canada [37] with a resolution of 0.1 • × 0.1 • . The population density data in this paper are cited in the Gridded Population of the World (GPW), data of the UN-Adjust Population Density-v4 [38], published by a data centre in NASA's Earth Observing System Data and Information System (EOSDIS), with a resolution of 30 × 30 . GDP spatial distribution density, terrain data, and LULC datasets were drawn from the Resources and Environmental Science Centre of the Chinese Academy of Sciences (http://www.resdc.cn). All above-mentioned data were projected by Albers Conic Equal Area with WGS-84 datum, and the resolution was unified to 10 km × 10 km.

Methodology
The methodology in this paper consists of two modules: processing geospatial data and structuring the deep CNN model. The purpose of the former is to establish the dataset for the deep CNN model. The deep CNN model undertakes the mission of fitting the complex function of spatial correlation relationship.

Processing Geospatial Data
The deep CNN method is usually applied in image identification or classification, not directly transplanted into analysing geospatial data. In the geospatial issue, spatial correlation and geographical attribute need to be considered. Hence, geospatial data require technical processing to match the deep CNN model structure. The four influencing factors generate inputs. Each pixel location contains PM 2.5 concentrations as output and four influencing factors. In view of spatial correlation, the pixel location and the surrounding locations should be considered. The deep CNN model has the ability of processing big data; therefore the order of spatial correlation can be amplified. In this paper, the order of spatial correlation adopts n-order shape, (2n + 1) × (2n + 1) pixels (n = 1, 2, . . .). Figure 1 shows an illustration of 5-order shape of the spatial correlation extent, including 11 × 11 pixels. Subsequently, it can extract the corresponding four sets of influencing factor attribute data for a pixel location. Each dataset of influencing factors comprises the corresponding values of the surrounding (2n + 1) × (2n + 1) pixels. In short, the PM 2.5 annual concentration of a pixel location is affected by the four influencing factors of its own and the surrounding n-order spatial correlation extent, (2n + 1) × (2n + 1) pixels. The mathematic form can be expressed as follows: is the PM 2.5 annual concentration of the i-th pixel, Pop i|(2n+1)×(2n+1) , GDP i|(2n+1)×(2n+1) , Ter i|(2n+1)×(2n+1) , LULC i|(2n+1)×(2n+1) represent the four influencing factor attribute values of the i-th pixel and its surrounding (2n + 1) × (2n + 1) pixels, and ξ i represents the error. The spatial influencing function F(.) can be learned by the deep CNN model.

Methodology
The methodology in this paper consists of two modules: processing geospatial data and structuring the deep CNN model. The purpose of the former is to establish the dataset for the deep CNN model. The deep CNN model undertakes the mission of fitting the complex function of spatial correlation relationship.

Processing Geospatial Data
The deep CNN method is usually applied in image identification or classification, not directly transplanted into analysing geospatial data. In the geospatial issue, spatial correlation and geographical attribute need to be considered. Hence, geospatial data require technical processing to match the deep CNN model structure. The four influencing factors generate inputs. Each pixel location contains PM . concentrations as output and four influencing factors. In view of spatial correlation, the pixel location and the surrounding locations should be considered. The deep CNN model has the ability of processing big data; therefore the order of spatial correlation can be amplified. In this paper, the order of spatial correlation adopts n-order shape, (2n + 1) × (2n + 1) pixels (n = 1,2, …). Figure 1 shows an illustration of 5-order shape of the spatial correlation extent, including 11 × 11 pixels. Subsequently, it can extract the corresponding four sets of influencing factor attribute data for a pixel location. Each dataset of influencing factors comprises the corresponding values of the surrounding (2n + 1) × (2n + 1) pixels. In short, the PM . annual concentration of a pixel location is affected by the four influencing factors of its own and the surrounding n-order spatial correlation extent, (2n + 1) × (2n + 1) pixels. The mathematic form can be expressed as follows: is the PM . annual concentration of the i-th pixel, represent the four influencing factor attribute values of the i-th pixel and its surrounding (2n + 1) × (2n + 1) pixels, and represents the error. The spatial influencing function F(. ) can be learned by the deep CNN model.

A Developed Deep Convolutional Neural Network Model
CNN contains two categories of cells in the visual cortex, simple cells which exploit local features and complex cells which "pool" (e.g., maximizing, averaging) the outputs of simple cells within a neighbourhood. The structure of CNN model which has two special aspects of local connections and sharing weights is different from general deep learning models. A complete deep CNN stack three types of layers, convolutional layers, pooling layers, and full connected layers.
The commonly used CNNs are 2-Dimensional CNN and 3-Dimensional (3-D) CNN. Figure 2 shows a 3-D CNN illustration with m (m = 1, 2, . . . ) filters and k (k = 1, 2, . . . ) convolution kernels. The value of a neuron S xyz ij at position (x, y, z) of the j-th convolutional feature in the i-th layer can be expressed as follows [34]: where m indexes the convolutional feature in the (i − 1)th layer connected to the j-th convolutional feature, and P i and Q i are the height and the width of the convolutional kernel. K i is the size of the spatial influencing factors, w pqk ijm is the value of position (p, q, k) connected to the m-th convolutional feature, and b ij is the bias of the j-th convolutional feature in the i-th layer. The commonly used CNNs are 2-Dimensional CNN and 3-Dimensional (3-D) CNN. Figure 2 shows a 3-D CNN illustration with m (m=1,2,…) filters and k (k=1,2,…) convolution kernels. The value of a neuron S at position ( , , ) of the j-th convolutional feature in the i-th layer can be expressed as follows [34]: where m indexes the convolutional feature in the ( − 1)th layer connected to the j-th convolutional feature, and and are the height and the width of the convolutional kernel. is the size of the spatial influencing factors, is the value of position ( , , ) connected to the m-th convolutional feature, and is the bias of the j-th convolutional feature in the i-th layer. This paper designs a deep 3-D CNN model for exploiting spatial influencing feature of remotely sensed data. Figure 3 illustrates the presented deep CNN model architecture which contains including four convolutional layers, four polling layers, and three hidden layers. And the activation function for hidden layer adopted the Rectified Linear Unit (ReLU) function. The pooling mode employed average mode. The batch normalization was set in each layer except for the output layer. The dropout ratio and learning ratio were set as 25% and 1%, respectively. The dimension of the pre-processed input neural layer is (2n + 1) × (2n + 1) × 4, including four sets of influencing factors with the i-th pixel and its surrounding (2n + 1) × (2n + 1) pixels. Table 1 lists the experimental results when the spatial correlation parameter n was assigned various values. It shows that the validation accuracy reaches the highest when the spatial correlation parameter, n, is taken 9, although the training accuracy is improved along with the increase of the parameter, n. Considering that the validation accuracy is better indicator representing the accuracy of a model. Hence, the spatial correlation parameter, n, is assigned with nine. Then the input layer contains 19 × 19 × 4 neurons with the four factors' attribute value. The number of the output neuron is 11, labelled by PM . annual concentration with 11 categories: <10 μg/m , 10~20 μg/m , 20~30 μg/m , 30~40 μg/m , 40~50 μg/m , 50~60 μg/m , 60~70 μg/ m , 70~80 μg/m , 80~90 μg/m , 90~100 μg/m , > 100 μg/m . This paper designs a deep 3-D CNN model for exploiting spatial influencing feature of remotely sensed data. Figure 3 illustrates the presented deep CNN model architecture which contains including four convolutional layers, four polling layers, and three hidden layers. And the activation function for hidden layer adopted the Rectified Linear Unit (ReLU) function. The pooling mode employed average mode. The batch normalization was set in each layer except for the output layer. The dropout ratio and learning ratio were set as 25% and 1%, respectively. The dimension of the pre-processed input neural layer is (2n + 1) × (2n + 1) × 4, including four sets of influencing factors with the i-th pixel and its surrounding (2n + 1) × (2n + 1) pixels. Table 1 lists the experimental results when the spatial correlation parameter n was assigned various values. It shows that the validation accuracy reaches the highest when the spatial correlation parameter, n, is taken 9, although the training accuracy is improved along with the increase of the parameter, n. Considering that the validation accuracy is better indicator representing the accuracy of a model. Hence, the spatial correlation parameter, n, is assigned with nine. Then the input layer contains 19 × 19 × 4 neurons with the four factors' attribute value. The number of the output neuron is 11, labelled by PM 2.5 annual concentration with 11 categories: <10 µg/m 3 , 10 ∼ 20 µg/m 3 , 20 ∼ 30 µg/m 3 , 30 ∼ 40 µg/m 3 , 40 ∼ 50 µg/m 3 , 50 ∼ 60 µg/m 3 , 60 ∼ 70 µg/m 3 , 70 ∼ 80 µg/m 3 , 80 ∼ 90 µg/m 3 , 90 ∼ 100 µg/m 3 , >100 µg/m 3 .

Results
The remotely sensed PM . annual concentration and influencing factors possessed 96,337 pixels, among which, 86,903 pixels (accounting for the ratio of 90%) were used for deep learning, and the remaining 9434 pixels (accounting for 10%) were reserved for validation. Training accuracy is defined as the accuracy applied to the training data (i.e., 86,903 pixels), while validation accuracy is the accuracy for the remaining data (i.e., 9434 pixels), and estimated accuracy is the accuracy for the total data (i.e., 96,337 pixels). To investigate the integrated and respective spatial influence of the four various factors, we exploited the congregate magnitude of spatial influence from the four factors and the separate influencing magnitude from one or two factors.

Pop
Ter LULC GDP

Results
The remotely sensed PM 2.5 annual concentration and influencing factors possessed 96,337 pixels, among which, 86,903 pixels (accounting for the ratio of 90%) were used for deep learning, and the remaining 9434 pixels (accounting for 10%) were reserved for validation. Training accuracy is defined as the accuracy applied to the training data (i.e., 86,903 pixels), while validation accuracy is the accuracy for the remaining data (i.e., 9434 pixels), and estimated accuracy is the accuracy for the total data (i.e., 96,337 pixels). To investigate the integrated and respective spatial influence of the four various factors, we exploited the congregate magnitude of spatial influence from the four factors and the separate influencing magnitude from one or two factors.

Integrated Spatial Influencing Feature
If the four impact factors were all fed into the input layer, after 1000 epochs of learning, the training accuracy of 86,903 pixels were 98.71%, and the validation accuracy of the remaining 9434 pixels reached 93.29%. Figure 4 illustrates the spatial distribution of the original and estimated PM 2.5 annual concentration of the total 96,337 pixels using the trained deep learning model fed with the four influencing factors.

Integrated spatial influencing feature
If the four impact factors were all fed into the input layer, after 1,000 epochs of learning, the training accuracy of 86,903 pixels were 98.71%, and the validation accuracy of the remaining 9,434 pixels reached 93.29%. Figure 4 illustrates the spatial distribution of the original and estimated PM . annual concentration of the total 96,337 pixels using the trained deep learning model fed with the four influencing factors.  The estimated spatial distribution of PM 2.5 annual concentration was nearly the same, except for very few pixel locations. It indicated that the four factors (population spatial density, GDP spatial density, terrain, and LULC) can almost determine the PM 2.5 annual concentration. Furthermore, Table 2 listed the corresponding confusion matrix between the original and estimated PM 2.5 annual concentration of the total 96,337 pixel locations by the four factors using the trained deep CNN model. The result showed that although there are some incorrect estimated pixel values which were close to the correct values, that is an obvious narrow diagonal band. The overall estimated accuracy is 97.85%. The estimated accuracy of the first category of PM 2.5 annual concentration, <10 µg/m 3 , reaches a maximum of 99.38%. The minimum and the second minimum predicted accuracies are 90.81% and 95.48% respectively, occurring on the eighth and eleventh category of 70 ∼ 80 µg/m 3 and 90 ∼ 100 µg/m 3 . The estimated accuracy can be regarded as the spatial influencing magnitude of the influencing factors on PM 2.5 annual concentration. A high estimated accuracy reflects directly a high spatial influencing feature. The results show that there is a strong correlation between PM 2.5 annual concentration and the four factors. Especially while the trained deep CNN evaluated the total 96,337 pixels, the overall estimated accuracy has reached up to 97.85%, indicating the spatial influencing magnitude of the four factors on PM 2.5 annual concentration.  0  0  0  0  0  0  0  0  0  10~20  112  11,792  70  3  0  1  0  0  0  0  0  20~30  2  83  21,891  86  12  1  0  2  0  0  0  30~40  0  3  101  10,804  89  3  0  0  0  0  0  40~50  0  6  18  115  13,971  60  5  0  0  0  0  50~60  0  0  1  5  62  4103  57  4  0  0  0  60~70  0  0  0  0  0  49  3650  62  0  0  0  70~80  0  0  2  0  0  1  142  3598  159  0  0  80~90  0  0  0  0  0  0  14  296  4563  60  0  90~100  0  0  0  0  0  0  0  0  38  1332  7  >100  0  0  0  0  0  0  0  0  1  3

Single Spatial Influencing Feature
The spatial influencing magnitude of the single factor can be measured by the deep CNN model proposed in this paper. We have implemented other deep CNNs whose input layer contains 19 × 19 neurons with a single factor attribute value; the other parameters are the same as above. After 1000 epochs of learning, the training accuracy and validation accuracy of population spatial density and GDP spatial density were 47.12% and 36.13%, 50.07% and 40.91%. Furthermore, the results show that PM 2.5 annual concentration has strong spatial correlation with terrain or LULC, as the validation accuracies of terrain and LULC were up to 83.17% and 72.37%. The result showed that although the overall estimated accuracies of population and GDP over China were relatively low, the two factors could have determined the severe PM 2.5 polluted region. Furthermore, the result indicated that terrain and LULC are the main spatial influencing factors on PM 2.5 annual concentration over China.
In addition, we also have implemented the deep CNN with an input layer containing 19 × 19 × 2 neurons describing terrain and LULC. The learning result shows that the training accuracy and validation accuracy of the two factors, terrain and LULC, were up to 90.69% and 87.95%. Table 3 listed the corresponding confusion matrix between the original and estimated PM 2.5 annual concentration produced by the trained deep CNN fed by terrain and LULC data on the total 96,337 pixel locations. Except for the eleventh category (>100 µg/m 3 ) of PM 2.5 annual concentration, the other ten categories' estimated accuracies are more than 91%. Furthermore, the overall estimated precision can reach up to 96.65%.

Comparation with the GWR Prediction
To verify the advantage of the deep CNN model presented in this paper, we conducted the GWR in the same dataset. Figure 4C is the estimated spatial distribution of PM 2.5 annual concentrations over China in 2010 by the GWR model. It can be seen that the GWR estimated results have obvious bias comparing with the origin data ( Figure 4). Furthermore, the lowest and highest PM 2.5 concentrations were particularly misestimated by the GWR model. And the overall estimated accuracy was 72.81% which is more less than the estimated accuracy of the deep CNN model, 97.85%. Comparing the Figure 4B,C, it indicated that although the overall spatial structure estimated by the GWR is generally similar with the origin spatial structure of PM 2.5 annual concentrations, there were some deviations in detail. The cause of the difference of the two models could be that, the deep CNN model has super strong non-linear fitting ability which can train very complicated non-linear function, however, the GWR is still a linear regression model which cannot catch complicated non-linear variation effects. Inaccurate correlativity between PM 2.5 concentration and other influencing factors could lead to biased public policies. Scientific public policy-making need more fine and accurate analysing evidences.

Discussion
This study proposed a deep CNN model to exploit spatial influencing magnitude for the annual mean concentration of remotely sensed PM 2.5 over China. In consideration of the influencing mechanism and the availability of the dataset, this study investigated the spatial influence of the four factors (population, GDP, terrain, and LULC) on the annual concentration of PM 2.5 over China. The influencing factors of PM 2.5 pollution are known to include natural and anthropogenic activities [39]. Among the four factors selected for this paper, terrain represented natural elements, population and GDP reflected anthropogenic activities, and LULC could be regarded as a mixture of natural and anthropogenic activities. The presented deep CNN method fully considered the local spatial heterogeneity, and a wider spatial correlated scope could be considered by more than one-order shape extent, which benefited from the strong ability of the deep CNN to process big data. This paper bridged the gap between spatial analysis and deep CNN technology with the idea of reprocessing or reorganizing remotely sensed data for deep CNN input. The deep CNN method was commonly used to extract the feature representations from a mass of labelled images [27,28]. As aforesaid, few researchers applied the deep CNN model when analysing spatial influence of multiple variables. From a different view, combining a geospatial reprocess, this study designed a 3D deep CNN structure in which the input and output neurons were influencing factors and PM 2.5 concentration, respectively. The strong non-linear function fitting ability of a deep CNN model could then detect complicated non-linear spatial influencing effect, and the deep CNN model might consider local spatial heterogeneity. From the results, the developed deep CNN model can fully consider spatial relationship and can calculate on each pixel location. Hence, the results can effectively describe the spatial influencing feature on every pixel location. Although the GWR method can also investigate the local correlation on each pixel location, only a linear or simple non-linear regression can be implemented, and the capability of processing big data is not very strong. From the above, the deep CNN model can not only process big data well but can also fit or learn very complicated correlativity. This paper demonstrated that the deep CNN technology could be applied in exploiting the spatial influence feature of geospatial or remotely sensed data, and its advantages could be fully performed. The spatial influencing magnitude of the four factors on the annual concentration of PM 2.5 was investigated employing the presented deep CNN model. This model was not only used in exploring spatial influence of remotely sensed PM 2.5 concentration, but also in other fields, such as detecting risk factors of some kind of epidemic based on remotely sensed data. Through the model, the risk level of risk factors in public health could be quantificationally assessed. In other words, the developed deep CNN model has the potential to expand the field of spatial analysis of remotely sensed lattice data. Despite all this, this research has some limitations. Firstly, the spatial dependent variable, PM 2.5 annual concentration, is classified into 11 categories, not as a continuous variable. Secondly, the deep CNN model can learn a very complicated function structure, but the mathematical mechanism is currently not clear, namely mysterious "black boxes" [40], and it is difficult to explain in a geographical process.

Conclusions
Population spatial density, GDP spatial density, terrain, and LULC can almost determine the spatial pattern of PM 2.5 annual concentration with an overall estimated precision of 97.85% over China. Furthermore, terrain and LULC are the main spatial influencing factors on PM 2.5 annual concentration among the four factors. And the overall spatial influencing magnitude of the two factors, terrain and LULC, reached up to 96.65%, nearly equal to all four factors' spatial influencing magnitude on PM 2.5 annual concentration.