Estimation of PMx Concentrations from Landsat 8 OLI Images Based on a Multilayer Perceptron Neural Network

The estimation of PMx (incl. PM10 and PM2.5) concentrations using satellite observations is of great significance for detecting environmental issues in many urban areas of north China. Recently, aerosol optical depth (AOD) data have been being used to estimate the PMx concentrations by implementing linear and/or nonlinear regression analysis methods. However, a lot of relevant research based on AOD published so far have demonstrated some limitations in estimating the spatial distribution of PMx concentrations with respect to estimation accuracy and spatial resolution. In this research, the Google Earth Engine (GEE) platform is employed to obtain the band reflectance (BR) data of a large number of Landsat 8 Operational Land Imager (OLI) remote sensing images. Combined with the meteorological, time parameter and the latitude and longitude zone (LLZ) method proposed in this article, a new BR (band reflectance)-PMx (incl. PM10 and PM2.5) model based on a multilayer perceptron neural network is constructed for the estimation of PMx concentrations directly from Landsat 8 OLI remote sensing images. This research used Beijing, China as the test area and the conducted experiments demonstrated that the BR-PMx model achieved satisfactory performances for the PMx-concentration estimations. The coefficient of determination (R2) of the BR-PM2.5 and BR-PM10 models reached 0.795 and 0.773, respectively, and the root mean square error (RMSE) reached 20.09 μg/m3 and 31.27 μg/m3. Meanwhile, the estimation results have been compared with the results calculated by Kriging interpolation at the same time point, and the spatial distribution is consistent. Therefore, it can be concluded that the proposed BR-PMx model provides a new promising method for acquiring accurate PMx concentrations for various cities of China.


Introduction
With the rapid development of the economy in China, the processes of industrialization and urbanization have increased the environmental burden, and air pollution has become increasingly serious.Aerosols, which not only have an impact on global climate change but also on the environmental quality of the atmosphere and human health, have become the primary pollutant affecting ambient air quality in most parts of China.Simultaneously, it has been shown that increasing the concentration of aerosol particles is an important reason for the frequent "haze" weather in cities and suburban areas [1][2][3].Therefore, the estimation of aerosol PM x (including PM 10 and PM 2.5 ) has become a popular research topic in recent years [4][5][6][7].Aerosols have the characteristics of continuous occurrence and large spatial variability in concentration.Due to the uneven distribution of ground monitoring stations, it is difficult to obtain the accurate distributions of aerosol data together with their change trends for an entire city by the methods of spatial interpolation and/or numerical simulation based on the data from monitoring stations [8,9].Taking advantages of high timeliness, wide coverage and high resolution, the technique of satellite remote sensing makes it possible to monitor the aerosol conditions on a larger spatial scale [10][11][12].
Meanwhile, a number of previous studies have been conducted to investigate the quantitative relationships between AOD (aerosol optical depth) and PM x concentrations [20][21][22].The AOD-PM x model is constructed by linear or nonlinear regression, which can be employed to accurately estimate ground-level particulate matter, partly by accounting for relative humidity (RH), wind speed (WS), temperature (TEMP) and planetary boundary layer height (PBLH) [23][24][25].More recently, Artificial Neural Networks (ANNs) have proven to be an advanced mathematical model to achieve better regression results compared with simple linear/nonlinear regression analysis [26,27].
Although some AOD-PM x models can be used to estimate PM x well in some areas, it is still essential to evaluate the robustness of these methods.Paciorek and Liu [28].highlighted the limitations of AOD in predicting the spatial distribution of PM 2.5 ; Kumar [29] summarized some of the factors that affect the AOD-PM 2.5 association, such as mismatch in spatial-temporal resolution, decomposition of AOD by aerosol types, collocation of AOD and PM 2.5 data, and control for spatial-temporal structure in the statistical model.In view of the relative weaknesses between AOD and PM 2.5 in the study of Paciorek and Liu [28], some of the findings need further analysis.
This research is dedicated to developing a so-called BR (band reflectance)-PM x model based on the algorithm of a multilayer perceptron neural network for the estimation of the PM x (including PM 10 and PM 2.5 ) concentrations, where the various meteorological factors and the time parameter are jointly considered.As a result, the PM x -concentration estimations with high spatial resolution and complete spatial coverage have been realized.

Materials and Methods
According to several previous research works concerning the aerosol retrievals from satellite remote sensing images [19,30], the band 1 (0.43-0.45 µm), band 3 (0.53-0.59 µm) and band 7 (2.11-2.29 µm) of Landsat 8 OLI have been considered to be relevant to PM x concentrations.Considering that the band reflectance can be affected by the land surface reflectance, normalized difference vegetation index (NDVI) has been also employed in the proposed BR-Model to realize a more reasonable evaluation of the atmospheric contribution to this reflectance.In this research, the Google Earth Engine (GEE) cloud computing platform has been employed to obtain a large number of consecutive historical satellite image data (Landsat 8 OLI) over the whole area of China from May 2014 to May 2018.Meanwhile, (a) the hourly PM x concentration extracted from the China Environmental Monitoring Center (CEMC) and (b) the meteorological parameters, such as relative humidity (RH), temperature (TEMP), wind direction (WD), wind speed (WS) and atmospheric pressure (PRS) have also been implemented in this research.It should be mentioned that the data of (b) (viz.the meteorological parameters) were initially captured by China Meteorological Administration and had been sent to the National Centers for Environmental Information of National Oceanic and Atmospheric Administration (NOAA) U.S. Department of Commerce for further disseminations.In order to facilitate the description, these meteorological parameters will be termed as 'NOAA data' in the following paragraphs of this article.Moreover, a new method has been developed to explore the 'optimal' latitude and longitude range of the subset for ANN trainings.This method is termed as an LLZ (latitude and longitude zone) method and can be employed by the proposed BR-PM x model to enhance the estimation precision of the PM x concentrations in different research areas.

Data Collection and Preprocessing
As illustrated in Table 1, three types of data are employed in this research: (a) Landsat 8 OLI remote sensing images from the Google Earth Engine (GEE) platform, (b) hourly PM x concentrations from the China Environmental Monitoring Station (CEMS) and (c) meteorological parameters of RH, PRS, TEMP, WD and WS from the National Oceanic and Atmospheric Administration (NOAA).The Landsat 8 satellite, which carries the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS), was successfully launched by NASA on 11 February 2013.The OLI covers all nine bands of the ETM+ sensor.Due to the generic nature of higher resolution and wider band coverage, the Landsat 8 data have been widely used in Earth resource exploration, natural disaster, agriculture, forestry, animal husbandry management and environmental pollution monitoring.
The Google Earth Engine (GEE) cloud computing platform is a facility for processing remote sensing images in bulk under Google Inc. in Mountain View, CA, USA.As one of the most advanced platforms for the analysis and visualization of geographical data, GEE is able to provide users with a large amount of historical satellite image data for consecutive years (e.g., the MODIS, Landsat, etc.).Meanwhile, GEE also provides algorithms and tools for processing and analyzing petabyte data for information mining [31][32][33].As one significant subset of Landsat 8 OLI/TIRS data, the " USGS Landsat 8 Collection 1 Tier 1" has been employed as the primary data for the estimation of PM x concentrations in this study considering that (a) the data in this subset can accurately calibrate the top-of-atmosphere (TOA) reflectance and (b) its temporal resolution is 16 days in China, which is shorter than some other subsets of Landsat 8 OLI/TIRS.Then, a JavaScript program was developed in GEE to obtain, adjust and correct the data from thousands of Landsat 8 OLI images by the method of orthorectification and geographical registration.After that, the selected images are processed using the following steps: (a) remove cloud by the function of mask algorithm (FMask) provided by GEE, (b) establish a circular buffer around each CEMC monitoring station, where the radius of the buffer is 15 meters, viz. the half resolution of Landsat 8 OLI data (30 m), (c) extract the band reflectance of bands 1, 3 and 7 from Landsat 8 OLI images of all the pixels falling inside each established buffer, (d) calculate NDVI from band 4 and band 5 of the Landsat 8 OLI data and (e) calculate the average values of the band reflectance and NDVI in each established buffer.After the data processing from step (a) to (e), the calculated average values together with the imaging time properties were assigned to their corresponding CEMC monitoring stations, which can be regarded as the primary source data for the proposed BR-PM x models.It should be mentioned that the imaging time property will be used as temporal reference in the following processes.As one of the public institutions under the Ministry of Ecological Environment, CEMC undertakes the primary tasks of national environmental monitoring.The widespread monitoring stations enable continuous aerosol measurements by using the methods of DUSTTRAK DRX and TEOM [34], which sample ambient air hourly and provide simultaneous measurements of PM 2.5 and PM 10 .
Considering the reliability of data quality [35], the research time interval is set from 1 May 2014 to 1 May 2018.As depicted in Figure 1, there are 1497 CEMC monitoring stations in total distributed within the study area.A time buffer interval of ±0.5 h was used to match the dataset of CEMC ground observations with the Landsat 8 OLI satellite transit phase.Thus, the data of 99,404 PM x observations were compiled.As one of the public institutions under the Ministry of Ecological Environment, CEMC undertakes the primary tasks of national environmental monitoring.The widespread monitoring stations enable continuous aerosol measurements by using the methods of DUSTTRAK DRX and TEOM [34], which sample ambient air hourly and provide simultaneous measurements of PM2.5 and PM10.
Considering the reliability of data quality [35], the research time interval is set from 1 May 2014 to 1 May 2018.As depicted in Figure 1, there are 1497 CEMC monitoring stations in total distributed within the study area.A time buffer interval of ±0.5 h was used to match the dataset of CEMC ground observations with the Landsat 8 OLI satellite transit phase.Thus, the data of 99,404 PMx observations were compiled.

Other Influence Parameters
PMx concentration has been demonstrated to be affected by pollutant emission and meteorological situations, which exhibit significant time-varying characteristics [36].Pollution source parameters are not discussed due to the limitations of the data source.In this study, two types of parameters were involved: the meteorological data and time data.According to the imaging time property, the Landsat 8 OLI data is assigned a 'month number (1, 2, …, 12)'.The 'month number' will be regarded as a time parameter in the BR-PMx models, allowing seasonal influences to be taken into

Other Influence Parameters
PM x concentration has been demonstrated to be affected by pollutant emission and meteorological situations, which exhibit significant time-varying characteristics [36].Pollution source parameters are not discussed due to the limitations of the data source.In this study, two types of parameters were involved: the meteorological data and time data.According to the imaging time property, the Landsat 8 OLI data is assigned a 'month number (1, 2, . . ., 12)'.The 'month number' will be regarded as a time parameter in the BR-PM x models, allowing seasonal influences to be taken into account.The meteorological data are primarily archived from the National Center for Environmental Information of NOAA, which involve qualified daily, monthly, seasonal, and yearly measurements of temperature, wind, precipitation, etc.The daily measurement produces data every 3 h starting at 0:00 UTC (viz.8:00am Beijing time), of which the nearest neighbor time method was used for data selection.The meteorological variables selected in the proposed BR-PM x model are RH, TEMP, WD, WS, and atmospheric pressure (PRS).Over the study area (ref.Figure 1), there are 372 NOAA stations with 949,531 meteorological observations during the period.
In the process of geospatial analysis, it is essential to ensure that all of the relevant elements are at the same or very approximate temporal stamps.The imaging time property of the Landsat 8 OLI data is regarded as the temporal reference for the conducted research.Due to the different time periods of the CEMC and NOAA measurements, the time buffer intervals of ±0.5 h and ±1 h are settled for data from CEMC and NOAA, respectively, to obtain the required parameter datasets.In this way, the data from Landsat 8 OLI, CEMC and NOAA can be harmonized together.account.The meteorological data are primarily archived from the National Center for Environmental Information of NOAA, which involve qualified daily, monthly, seasonal, and yearly measurements of temperature, wind, precipitation, etc.The daily measurement produces data every 3 h starting at 0:00 UTC (viz.8:00am Beijing time), of which the nearest neighbor time method was used for data selection.The meteorological variables selected in the proposed BR-PMx model are RH, TEMP, WD, WS, and atmospheric pressure (PRS).Over the study area (ref.Figure 1), there are 372 NOAA stations with 949,531 meteorological observations during the period.

Methodology
In the process of geospatial analysis, it is essential to ensure that all of the relevant elements are at the same or very approximate temporal stamps.The imaging time property of the Landsat 8 OLI data is regarded as the temporal reference for the conducted research.Due to the different time periods of the CEMC and NOAA measurements, the time buffer intervals of ±0.5 h and ±1 h are settled for data from CEMC and NOAA, respectively, to obtain the required parameter datasets.In this way, the data from Landsat 8 OLI, CEMC and NOAA can be harmonized together.Before inputting to the proposed BR-PMx model, the data of (b) and (c) need to be integrated together by the Near Analysis Algorithm (NAA) (Section 2.2.1, below).The estimation of the PMx concentrations are primarily based on multilayer perceptron (MLP) neural network (Section 2.2.2, below).In order to enhance the estimation accuracy for the PMx concentrations in an entire city, the latitude and longitude zone method (Section 2.2.3, below) has been developed as well as employed by the proposed BR-PMx model, which can help to settle the 'optimal' region of the 'MLP neural network training' for the specific research area.Before inputting to the proposed BR-PM x model, the data of (b) and (c) need to be integrated together by the Near Analysis Algorithm (NAA) (Section 2.2.1, below).The estimation of the PM x concentrations are primarily based on multilayer perceptron (MLP) neural network (Section 2.2.2, below).In order to enhance the estimation accuracy for the PM x concentrations in an entire city, the latitude and longitude zone method (Section 2.2.3, below) has been developed as well as employed by the proposed BR-PM x model, which can help to settle the 'optimal' region of the 'MLP neural network training' for the specific research area.

Near Analysis Algorithm (NAA)
Due to the different number of monitoring stations and geographic locations, it is complex to directly obtain the meteorological information around the CEMC stations.Therefore, the collocation of various meteorological parameters of the investigated CEMC stations has become a priority condition for PM x estimation.
To integrate the CEMC observations and NOAA datasets into the processes of estimating PM x concentrations seamlessly, the NAA has been employed to compute the distance from each feature in the analytic objects (viz.the 'Input Elements' of CEMC) to the nearest features in the neighboring objects (viz.the 'Proximity Elements' of NOAA), within the settled search radius (e.g., 100 km).Thus, the nearest dataset of the monitoring stations can be calculated.Then, the necessary meteorological variables around one CEMC station can be acquired from its nearest NOAA station.

Multilayer Perceptron (MLP) Neural Network
Physical models are considered to be the most suitable predictors of small-scale PM x concentrations.However, the complexity of physical models and some other influential factors make it hard to achieve the macroscopic estimation of PM x .Statistical regression and ANN approaches are considered a promising method for estimating PM x concentrations.
A neural network is an important technology for pattern recognition and machine learning.It is the basis of deep learning that simulates the neural network of the human brain to realize artificial intelligence.It has been widely used for diverse remote sensing applications [37,38].The neural network consists of many interconnected processing units.These units are usually linearly arranged into groups, which can be simulated by electronic circuits or computer programs.Multilayer perceptron (MLP) is a forward-structured ANN, where the input vectors are mapped to output vectors, and the representative models involve a multilayer BPN network [39], RBF network [40], the Hopfield model [41] and so on.
In this paper, the BPN (back-propagation network) is selected for PM x concentration estimation.BPN is a multilayered network for weights' training with forward nonlinear differentiable functions.The BPN neural network consists of input layer, multiple hidden layers and the output layer (see Figure 2).Each of these layers is connected to all of the cells in the adjacent layer.Simultaneously, there is no connection between the cells of the same layer.Once a pair of learning samples is provided to the network, the activation value of the neuron propagates from the input layer to the output layer, and the neurons in the output layer obtain the input response of the network.After that, the connection weights are modified layer by layer from the output layer through the middle layer and then back to the input layer.
In Figure 3, X i is the input of node i of the input layer, i = 1 . . .N, W ij is the weight between node j of the hidden layer and the node i of the input layer, j = 1 . . .p, θ j is the threshold of the node j of the hidden layer; θ k is the threshold of the node k of the hidden layer; φ(X) is the excitation function of the hidden layer; W jk is the weight between node k of the hidden layer and node j of the hidden layer, k = 1 . . .q. W kl is the weight between node l of the output layer and node k of the hidden layer, l = 1 . . .M. O l represents the threshold of node l of the output layer, l = 1 . . .M, ϕ(X) is the excitation function of the output layer, and Y(PM X ) is the output layer node.
developed to get the optimal training region (Section 2.2.3, below).The input variables of the proposed BPN model refer to the 'band parameter', the meteorological data and the corresponding time parameter, while the output variables are the measured PMx concentrations at CEMC stations.To improve the computational speed and PMx concentration estimation accuracy, the input and output data must be normalized uniformly to the data range of [−1,1].For the topologic design of the neural network, the tansig function (tan-sigmoid transfer function, see Figure 4a) is utilized for transmitting the neurons of each layer [42].The purelin function (linear transfer function, see Figure 4b) is employed as the transfer function of the output layer [43,44].The number of intermediate nodes of the hidden layers is enlarged from 5 to 20 in a stepwise fashion using the Kolmogorov theorem to establish the nodes of the network structure [45].Based on the growth method of the network structure, the best number of nodes of the hidden layers is selected as (15,15) for the continuous training of the network.In the process of neural network operation, the input data are randomly divided into three subsets.The first subset is for training (60%), the second is for validation (20%), and the third is for testing (20%).The data in the training subset are used to train and adjust the weights on the four- In this research, a 4-layer BPN neural network has been developed to estimate the PM x concentration.In the process of determining the input and output factors of the model, (a) the band reflectance of band 1, 3, 7 and the band value of NDVI at the CEMC monitoring stations from the Landsat 8 OLI images are considered as the 'band parameter'; (b) the NAA method has been utilized to obtain the meteorological data (viz.RH, PRS, TEMP, WD and WS) around the CEMC stations based on the NOAA data; and (c) the dataset of 'actual' values of PM x concentration have been measured by CEMC stations.In addition, the latitude and longitude zone (LLZ) method has been developed to get the optimal training region (Section 2.2.3, below).The input variables of the proposed BPN model refer to the 'band parameter', the meteorological data and the corresponding time parameter, while the output variables are the measured PM x concentrations at CEMC stations.To improve the computational speed and PM x concentration estimation accuracy, the input and output data must be normalized uniformly to the data range of [−1,1].

Input layer Output layer
For the topologic design of the neural network, the tansig function (tan-sigmoid transfer function, see Figure 4a) is utilized for transmitting the neurons of each layer [42].The purelin function (linear transfer function, see Figure 4b) is employed as the transfer function of the output layer [43,44].The number of intermediate nodes of the hidden layers is enlarged from 5 to 20 in a stepwise fashion using the Kolmogorov theorem to establish the nodes of the network structure [45].Based on the growth method of the network structure, the best number of nodes of the hidden layers is selected as (15,15) for the continuous training of the network.
In the process of neural network operation, the input data are randomly divided into three subsets.The first subset is for training (60%), the second is for validation (20%), and the third is for testing (20%).The data in the training subset are used to train and adjust the weights on the four-layer neural networks.The validation subset is required to minimize overfitting.The data in the testing subset are implemented to predict the PM x concentration data by the BPN model, for which the optimal connection weights should be determined during the training process.The performance of the established neural network model can be primarily evaluated according to the factors of root mean square error (RMSE) and coefficient of determination (R 2 ) between the predicted and the observed value.
For the topologic design of the neural network, the tansig function (tan-sigmoid transfer function, see Figure 4a) is utilized for transmitting the neurons of each layer [42].The purelin function (linear transfer function, see Figure 4b) is employed as the transfer function of the output layer [43,44].The number of intermediate nodes of the hidden layers is enlarged from 5 to 20 in a stepwise fashion using the Kolmogorov theorem to establish the nodes of the network structure [45].Based on the growth method of the network structure, the best number of nodes of the hidden layers is selected as (15,15) for the continuous training of the network.In the process of neural network operation, the input data are randomly divided into three subsets.The first subset is for training (60%), the second is for validation (20%), and the third is for testing (20%).The data in the training subset are used to train and adjust the weights on the four-

Latitude and Longitude Zone (LLZ) Method
As mentioned earlier, the Latitude and Longitude Zone (LLZ) method can be implemented to compute the 'optimal' training region for one specific research area (viz.one city), which can be characterized by six steps: (a) the geometric center of the research area is settled as the center point of the latitude and longitude zone (LLZ); (b) the initial latitude and longitude width can be calculated by Equation ( 1): where Lon width and Lat width respectively represent the longitude and latitude width of the initial LLZ, Lon area and Lat area represent the longitude and latitude widths of the minimum enclosing rectangle (MER) of the research area; and ξ is a coefficient that can be empirically settled between 0.5 and 1. (c) the LLZ is progressively growing at 0.1 degrees along both of the longitude and latitude directions; (d) the monitoring datasets falling inside the LLZ are selected and inputted to the proposed BR-PM x models for the MLP Neural Network training; (e) The eigenvalue R 2 will be calculated after the MLP Neural Network training, which can reflect the current accuracy of the PM x concentration estimation; (f) the steps (c) to (e) will be continuously operated until the area of LLZ becomes K times larger than the MER of the specific research area, where K can be empirically settled as 2, 3, . . .N. (g) the R 2 of all of the LLZs will be compared to each other, and then the LLZ with the largest R 2 will be picked out as the 'optimal' training region of the specific research area.
As illustrated by Figure 5, the 'optimal' training region of LLZ is relevant to distribution of the monitoring stations, which could be smaller, larger or quite similar to the MER of the research area.It is worth mentioning that it is not desired if the LLZ is much smaller than the MER of the research area.Such undesired conditions could be avoided by the empirical coefficient ξ settled in step (b).
be picked out as the 'optimal' training region of the specific research area.
As illustrated by Figure 5, the 'optimal' training region of LLZ is relevant to distribution of the monitoring stations, which could be smaller, larger or quite similar to the MER of the research area.It is worth mentioning that it is not desired if the LLZ is much smaller than the MER of the research area.Such undesired conditions could be avoided by the empirical coefficient ξ settled in step (b).

PMx Estimations Based on MLP Neraul Network in the Whole Area of Mainland China
Based on the MLP neural network, the PM x concentrations have been estimated in the whole area of mainland China.Several previous studies have demonstrated the temporal patterns of PM x concentrations, i.e., the PM x concentrations may vary with months and/or seasons [46,47].Hence, the R 2 between PM est and PM rea is calculated for different months (from January to December) and seasons (from spring to winter).Then, the scatter plots of PM est and PM rea are drawn from the 70,393 samples for the BR-PM 2.5 model and 67,265 samples for the BR-PM 10 model by the line regression functions provided by MATLAB 2016, MathWorks Inc, Natick, USA (see Figure 6).With the BR-PM 2.5 model, the R 2 is 0.3647, and the root mean square error (RMSE) is 25.34 (µg/m 3 ), while with the BR-PM 10 model, the R 2 is 0.2928 and the RMSE is 36.66(µg/m 3 ).Obviously, neither result is satisfactory.
To further investigate the temporal aerosol variations, the proposed BR-PM x models are implemented to estimate the concentrations of PM 2.5 and PM 10 in each month and season from 2014 to 2018.The relevant regression coefficients of the estimation results are illustrated in Table 2.For all monthly and seasonal regression calculations, the R 2 of the BR-PM 2.5 model is generally slightly higher than that of the BR-PM 10 model, while the RMSE of the BR-PM 2.5 model is much smaller.For the BR-PM 2.5 model, R 2 exhibited a seasonal behavior characterized by a maximum during autumn (0.4280), lower values in spring (0.3910) and summer (0.3813), and a minimum in winter (0.3754), which roughly agrees with the findings of Engel-Cox et al. [48].For the BR-PM 10 model, the maximum value of R 2 also exists in autumn (0.3407), followed by winter (0.3233), summer (0.2866) and spring (0.2646).R 2 between PMest and PMrea is calculated for different months (from January to December) and seasons (from spring to winter).Then, the scatter plots of PMest and PMrea are drawn from the 70,393 samples for the BR-PM2.5 model and 67,265 samples for the BR-PM10 model by the line regression functions provided by MATLAB 2016, MathWorks Inc, Natick, USA (see Figure 6).With the BR-PM2.5 model, the R 2 is 0.3647, and the root mean square error (RMSE) is 25.34 (μg/m 3 ), while with the BR-PM10 model, the R 2 is 0.2928 and the RMSE is 36.66(μg/m 3 ).Obviously, neither result is satisfactory.

Caculation of the Optimal LLZ for the Specific Research Area
To improve the accuracy of PM x concentration estimation and reduce the values of RMSE, the LLZ method (described in Section 2.2.2) is integrated into the MLP neural network algorithm in the proposed BR-PM x models.The capital of China, Beijing (116.39E, 39.92 W) is selected as the test area, and the LLZ center point is set at the geometric center of Beijing, growing at 0.1 degrees along the longitude and latitude direction.Then, the monitoring datasets inside the latitude and longitude zone are continuously tested by the BR-PM x models to explore the optimal bandwidth.Figures 7 and 8 show the variation of the coefficient of determination (R 2 ) between PM est and PM rea at different spatial ranges obtained by the LLZ method.It should be noted that the origin point of the x, y axis is (116.39,39.92) in both Figures 7 and 8. concentrations increases to 0.795 and 0.773, respectively.The RMSE is reduced by 14.6% (36.65 to 31.27) and 20.7% (25.34 to 20.09) for PM10 and PM2.5, respectively (see Figure 9), which indicates that the LLZ method is very helpful for estimating the PMx concentrations.According to Figures 7 and 8, it is found that the variation of the R 2 between PM est and PM rea is not very drastic in a range with the longitude width within 2 • and the latitude width within 1 • .Once the latitude/longitude range is enlarged to a certain width, the R 2 values decrease dramatically.The optimal values for the coefficient R 2 of the BR-PM X model exceed 0.7 with a longitude range of 2 • and a latitude range of 1 • .
Using the LLZ method, the data located inside the bandwidth of ca. 2 • longitude and ca. 1 • latitude with Beijing's geometric center as the center point are inputted to the BR-PM x models.Both the BR-PM 10 and BR-PM 2.5 models demonstrate significant improvements for the calculated estimates of the PM x concentrations after the combination with the LLZ method.The R 2 of PM 2.5 and PM 10 concentrations increases to 0.795 and 0.773, respectively.The RMSE is reduced by 14.6% (36.65 to 31.27) and 20.7% (25.34 to 20.09) for PM 10 and PM 2.5 , respectively (see Figure 9), which indicates that the LLZ method is very helpful for estimating the PM x concentrations.
concentrations increases to 0.795 and 0.773, respectively.The RMSE is reduced by 14.6% (36.65 to 31.27) and 20.7% (25.34 to 20.09) for PM10 and PM2.5, respectively (see Figure 9), which indicates that the LLZ method is very helpful for estimating the PMx concentrations.

Validation Analysis
As depicted in Section 2.1.1,the remote sensing images imported to the BR-PM x model need to be processed for removing of clouds.For the mainland of China, winter is the season with the most serious air pollution, while the air quality is much better in summer.Then, the data of (a) Landsat 8 OLI remote sensing images at UTC 2:54 (viz.10:54 am Beijing time), on 30 December 2016 (winter) (see Figure 10, below) and UTC 2:54, on 24 August 2016 (summer) (see Figure 11, below) in Beijing and (b) NOAA meteorological data for the same region and at the approximate time points, are selected and retained.All the other data in December and August of 2016 were removed from the BR-PM x models.
respect to the spatial distribution.The higher PMx values are distributed in the south, while the lower values are located in the north, which are not only related to the special terrain of Beijing but also to the industrialization and population density.Furthermore, the minimum (Min), maximum (Max) and the mean values of the PMx concentrations, which are respectively generated by Kriging interpolations and the BR-PMx models, have been illustrated as well as compared in Tables 3 and 4 (below).Tables 3 and 4 demonstrate that, under both of the conditions with (a) heavy air pollution at UTC 2:54 on 30 December 2016 and (b) mild air pollution at UTC 2:54 on 24 August 2016, the coefficient of determination (R 2 ) between Kriging interpolation and BR-PM2.5 model reached 0.80817 and 0.82954 respectively, while the R 2 of PM10 are respectively 0.77442 and 0.75519: the distribution trend of the PMx concentrations graph generated by Kriging interpolation and BR-PMx models remained consistent regardless of whether the air is heavily polluted or not.
After the validation analysis, it can be concluded that the proposed BR-PMx models are feasible for the PMx concentration estimations with the air pollution in different degrees, and thereby have a certain potential for real-world applications.To obtain the continuous distribution graph of the PM x concentrations of the research area, one of the common methods is to operate the spatial interpolation based on the discrete PM x values at various points.Figure 10b,d and Figure 11b,d are such spatial distribution graphs calculated by the Kriging interpolations based on the monitoring PM x values from the 12 CEMC stations located in the area of Beijing, China.Figure 10c,e and Figure 11c,e are the spatial distribution graphs of the PM 10 and PM 2.5 concentrations estimated by the proposed BR-PM x models in this research.All of these graphs have the spatial resolution of 30 m.
The estimation results of the BR-PM x models and the interpolation results are consistent with respect to the spatial distribution.The higher PM x values are distributed in the south, while the lower values are located in the north, which are not only related to the special terrain of Beijing but also to the industrialization and population density.Furthermore, the minimum (Min), maximum (Max) and the mean values of the PM x concentrations, which are respectively generated by Kriging interpolations and the BR-PM x models, have been illustrated as well as compared in Tables 3 and 4 (below).Tables 3 and 4 demonstrate that, under both of the conditions with (a) heavy air pollution at UTC 2:54 on 30 December 2016 and (b) mild air pollution at UTC 2:54 on 24 August 2016, the coefficient of determination (R 2 ) between Kriging interpolation and BR-PM 2.5 model reached 0.80817 and 0.82954 respectively, while the R 2 of PM 10 are respectively 0.77442 and 0.75519: the distribution trend of the PM x concentrations graph generated by Kriging interpolation and BR-PM x models remained consistent regardless of whether the air is heavily polluted or not.After the validation analysis, it can be concluded that the proposed BR-PM x models are feasible for the PM x concentration estimations with the air pollution in different degrees, and thereby have a certain potential for real-world applications.

Discussion
Taking advantage of the wide spatial and temporal coverage, it has been generally proved that the various satellite remote sensing images could be employed for the estimation of the PM x concentrations in large geographic areas.In order to obtain the continuous spatial distribution of PM x concentrations in various cities of China, this research develops new BR-PM x models for the estimation of PM x (including PM 10 and PM 2.5 ) concentrations based on the algorithm of a multilayer perceptron neural network combined with LLZ method.In the BR-PM x models, the primary source data were from Landsat 8 OLI and in the meantime, various meteorological factors and the time parameter have been taken into consideration.To the best of our knowledge, the Landsat 8 OLI satellite data have been seldom investigated for the PM x concentration estimations in the literature published so far.
With an explore process with stepwise enlarged areas, the optimal LLZ for Beijing has been identified.Then, the coefficient of determination (R 2 ) of the BR-PM 2.5 and BR-PM 10 models respectively reached 0.795 and 0.773; synchronously, the RMSE values of the BR-PM 2.5 and BR-PM 10 models reduced to 31.27 µg/m 3 and 20.09 µg/m 3 , respectively.The estimation accuracy is satisfactory and higher than many established models based on the AOD method [19,20,24,49].To make the validation analysis, the BR-PM x models have been implemented to obtain the spatial distribution graph of the PM x concentrations in Beijing at two different times: one is in winter with heavily polluted air and the other is in summer with much better air quality.The distribution graphs generated by the BR-PM x models are significantly better than that of Kriging interpolations in terms of resolution and clarity.
Furthermore, as the Landsat 8 OLI remote sensing images have been employed as the primary source data in this research, the distribution graphs of PM x concentrations estimated by the PM x models have a much higher spatial resolution than many other research works, e.g., the estimated PM x concentrations calculated by the AOD method in [19][20][21]24,50], which is based on the MODIS data with the highest resolution of 250 m.
Therefore, the proposed BR-PM x models can be used as one of the possible methods to estimate the general spatial distributions of PM x concentrations in various cities in China that have been monitored by CEMC stations.

Conclusions
In this research, we developed a new BR-PM x model to estimate the PM x concentration, which is based on the spectral data of Landsat 8 OLI remote sensing images and the PM x concentration data from the ground monitoring stations.During the process of analyzing the correlations between PM est and PM rea in terms of temporal dimensions with the data from May 2014 to May 2018, satisfactory estimation results cannot be obtained if the whole area of China is set as an estimation region and the regression coefficient R 2 between PM est and PM rea for the models BR-PM 2.5 and BR-PM 10 just reach ca.0.3647 and 0.2928, respectively.The estimation results represent typical characteristics of seasonal variations.The period with the highest estimation accuracy is in autumn for both the BR-PM 2.5 and BR-PM 10 models.
To improve the estimation performance of the BR-PM x model, the factor of latitude and longitude zone (LLZ) has been integrated into the model to obtain the optimal training area for the specific research area of Beijing, China.The results demonstrate that the LLZ method performs promisingly as the coefficient of determination (R 2 ) of the BR-PM 2.5 and BR-PM 10 models reached 0.795 and 0.773, respectively; their RMSE reached 31.27µg/m 3 for the BR-PM 2.5 model and 20.09 µg/m 3 for the BR-PM 10 model.
In general, based on the Landsat 8 OLI remote sensing images and the PM x concentration data, the MLP neural network combined with the LLZ method produced reasonable estimation results of PM x concentration in the test area of Beijing, China.As known, there are some deficiencies in the LLZ method due to the nonuniform distribution of the observation stations.In addition, it is difficult to determine the optimal bandwidth accurately since the optimal bandwidth can vary from one area to another.Due to the fact that the regional characteristics of aerosols can be taken into account by the LLZ method, the LLZ method is very useful to solve the problems of missing or having few observation stations in the research area by acquiring the training data from the surrounding stations.The BR-PM x models after the training in the optimal LLZs can then be utilized to analyze the air pollution characteristics of the investigated research areas.Meanwhile, the high-resolution (30 m) estimations of PM x concentrations with complete spatial coverage were derived in this research.From the perspective of daily coverage, the estimation results are quite close to the observation values.Hence, the estimation results can help to (a) understand the formation process of regional PM x pollution episodes, (b) obtain accurate PM x sources and distributions with high spatial resolution and (c) establish pollution control measures.
In this research, the meteorological factors of RH, TEMP, WD, WS and PRS were considered in the developed BR-PM x models.In our future work, some other parameters (e.g., the planetary boundary layer) and the sensitivity analysis of the employed parameters will be further investigated.Moreover, it should make some sense if the AOD method and the proposed BR-PM x models can be integrated together.

2. 1 . 2 .
PMx Concentration Data of the China Environmental Monitoring Center (CEMC) Hourly PM x data can be collected from the China urban air quality real-time release platform (http://106.37.208.233:20035/)maintained by the China Environmental Monitoring Center (CEMC).

Figure 1 .
Figure 1.Research area and the spatial distribution of CEMC (red points) and NOAA (yellow points) stations used in this study.

Figure 1 .
Figure 1.Research area and the spatial distribution of CEMC (red points) and NOAA (yellow points) stations used in this study.

Figure 2
Figure 2 shows the workflow used to estimate the PM x concentrations, which requires four groups of source data, viz.(a) the 'time parameter' extracted from Landsat 8 OLI, (b) the preprocessed Band 1, 3, 7 and NDVI from Landsat 8 OLI, (c) meteorological parameters (viz.RH, PRS, TEMP, WD, WS) at NOAA stations, and (d) PM x (PM 2.5 and PM 10 ) monitoring values at CEMC stations.

Figure 2 .
Figure 2. The workflow of PM x concentration estimations.

Figure 3 .
Figure 3.The diagram of the back-propagation network (BPN).

Figure 4 .
Figure 4. Transfer functions utilized in the BPN: (a) the tansig function and (b) the purelin function.

Figure 3 .
Figure 3.The diagram of the back-propagation network (BPN).

Figure 4 .
Figure 4. Transfer functions utilized in the BPN: (a) the tansig function and (b) the purelin function.

Figure 4 .
Figure 4. Transfer functions utilized in the BPN: (a) the tansig function and (b) the purelin function.

Figure 5 .
Figure 5.The diagrammatic sketch of the 'optimal' LLZ which could be (a) smaller, (b) larger or (c) quite similar to the MER of the specific research area; rectangle with solid line: MER of the research area; rectangle with dashed line: optimal LLZ; blue dots: the monitoring stations.

Figure 5 .
Figure 5.The diagrammatic sketch of the 'optimal' LLZ which could be (a) smaller, (b) larger or (c) quite similar to the MER of the specific research area; rectangle with solid line: MER of the research area; rectangle with dashed line: optimal LLZ; blue dots: the monitoring stations.

Figure 6 .
Figure 6.Scatterplots between PMest and PMrea of (a) the BR-PM2.5 model and (b) BR-PM10 model in the whole area of mainland China, where the solid line represents the fit line obtained.

Figure 6 .
Figure 6.Scatterplots between PM est and PM rea of (a) the BR-PM 2.5 model and (b) BR-PM 10 model in the whole area of mainland China, where the solid line represents the fit line obtained.

Figure 7 .
Figure 7.The coefficient of determination (R 2 ) between PMest and PMrea of the BR-PM2.5 model at different spatial obtained by the LLZ method.

Figure 7 .
Figure 7.The coefficient of determination (R 2 ) between PM est and PM rea of the BR-PM 2.5 model at different spatial obtained by the LLZ method.

Figure 7 .
Figure 7.The coefficient of determination (R 2 ) between PMest and PMrea of the BR-PM2.5 model at different spatial obtained by the LLZ method.

Figure 8 .Figure 9 .
Figure 8.The coefficient of determination (R 2 ) between PMest and PMrea of the BR-PM10 model at different spatial ranges obtained by the LLZ method.

Figure 8 .
Figure 8.The coefficient of determination (R 2 ) between PM est and PM rea of the BR-PM 10 model at different spatial ranges obtained by the LLZ method.

Figure 7 .
Figure 7.The coefficient of determination (R 2 ) between PMest and PMrea of the BR-PM2.5 model at different spatial obtained by the LLZ method.

Figure 8 .
Figure 8.The coefficient of determination (R 2 ) between PMest and PMrea of the BR-PM10 model at different spatial ranges obtained by the LLZ method.

Figure 9 .
Figure 9. Scatterplots between PMest and PMrea of (a) BR-PM10 model and (b) BR-PM2.5 models combined with the LLZ method, where the solid line is the fitting line obtained by the line regression method provided by MATLAB functions and the dashed line is the reference of the linear function y = x.

Figure 9 .
Figure 9. Scatterplots between PM est and PM rea of (a) BR-PM 10 model and (b) BR-PM 2.5 models combined with the LLZ method, where the solid line is the fitting line obtained by the line regression method provided by MATLAB functions and the dashed line is the reference of the linear function y = x.

Figure 10 .
Figure 10.Spatial distributions of the PM10 and PM2.5 concentrations (μg/m 3 ) over the Beijing area at UTC 2:54 on 30 December 2016: (a) the Landsat 8 OLI image of RGB true color; (b) the PM10 concentration calculated by the method of Kriging interpolation; (c) the results estimated by the BR-PM10 model; (d) the PM2.5 concentration calculated by the method of Kriging interpolation; and (e) the results estimated by the BR-PM2.5 model.

Figure 10 .Figure 11 .
Figure 10.Spatial distributions of the PM 10 and PM 2.5 concentrations (µg/m 3 ) over the Beijing area at UTC 2:54 on 30 December 2016: (a) the Landsat 8 OLI image of RGB true color; (b) the PM 10 concentration calculated by the method of Kriging interpolation; (c) the results estimated by the BR-PM 10 model; (d) the PM 2.5 concentration calculated by the method of Kriging interpolation; and (e) the results estimated by the BR-PM 2.5 model.

Table 1 .
Details of all datasets used in this study.
Hourly PMx data can be collected from the China urban air quality real-time release platform (http://106.37.208.233:20035/)maintained by the China Environmental Monitoring Center (CEMC).

Table 2 .
Regression coefficients of BR-PM x models using the samples in each month/season from 2014 to 2018.

Table 3 .
Comparison of the PM x -concentration distributions generated by Kriging interpolation and the BR-PM x models at UTC 2:54 on 30 December 2016 (in winter).

Table 4 .
Comparison of the PM x -concentration distributions generated by Kriging interpolation and the BR-PM x models at UTC 2:54 on 24 August 2016 (in summer).