A Full-Coverage Daily Average PM 2.5 Retrieval Method with Two-Stage IVW Fused MODIS C6 AOD and Two-Stage GAM Model

: Current PM 2.5 retrieval maps have many missing values, which seriously hinders their performance in real applications. This paper presents a framework to map full-coverage daily average PM 2.5 concentrations from MODIS C6 aerosol optical depth (AOD) products and ﬁll missing pixels in both the AOD and PM 2.5 maps. First, a two-stage inversed variance weights (IVW) algorithm was adopted to fuse the MODIS C6 Terra and Aqua AOD products, which ﬁlls missing data in MODIS standard AOD data and obtains a high coverage daily average. After that, using the fused MODIS daily average AOD and ground-level PM 2.5 in all grid cells, a two-stage generalized additive model (GAM) was implemented to obtain the full-coverage PM 2.5 concentrations. Experiments on the Yangtze River Delta (YRD) in 2013–2016 were carefully designed to validate the performance of our proposed framework. The results show that the two-stage IVW could not only improve the spatial coverage of MODIS AOD against the original standard product by 230%, but could also keep its data accuracy. When compared with the ground-level measurements, the two-stage GAM can obtain accurate PM 2.5 concentration estimates (R 2 = 0.78, RMSE = 19.177 µ g / m 3 , and RPE = 28.9%). Moreover, our method performs better than the inverse distance weighted method and kriging methods in mapping full-coverage daily PM 2.5 concentrations. Therefore, the proposed framework provides a good methodology for retrieving full-coverage daily average PM 2.5 concentrations from MODIS standard AOD products.


Introduction
The rapid growth of the extensive economy in China during last three decades brings about many ecological and environmental problems, especially the air pollution problem [1][2][3]. Most of the developed regions in China have been experiencing severe particulate matter pollution problems, e.g., the North China Plain (NCP) [4,5], the Yangtze River Delta region (YRD) [6,7] and the Pearl River Delta region (PRD) [8,9]. Particulate matter with an aerodynamic diameter less than 2.5 µm (PM 2.5 ) is one of most harmful particulate matters [10,11]. A high PM 2.5 concentration in the air results in a greater probability of lung cancer and mortality from it [12,13]. Therefore, monitoring PM 2.5 concentrations is of great importance for making effective air pollution control measures to reduce its harms.
Satellite-based remote sensing has been becoming a widely used technique to monitor the PM 2.5 concentrations [14][15][16]. It can provide a large spatial coverage of PM 2.5 concentrations over long periods, and this distinct advantage is unavailable for ground monitoring stations. Many satellite sensors like to obtain a full-coverage PM 2.5 estimation. However, many of the above works have some specific prerequisites, which might limit their performances in realistic applications.
The YRD region is one of the fast-growing economic areas in China and has been experiencing severe PM 2.5 pollution. Ma presented a nested linear mixed effects (LME) model to estimate the high resolution PM 2.5 concentrations in the YRD of China from 3 km MODIS AOD [36]. The LME model has been proven to successfully describe the nested month-, week-, and day-specific random effects of PM 2.5 -AOD relationships. Unfortunately, the LME model does not handle the problem of missing AOD values and cannot obtain full-coverage of PM 2.5 concentrations in the YRD. Later, Xiao proposed a multiple imputation (IM) method to fill missing AOD values and predicted the full-coverage PM 2.5 concentrations in the YRD from 2013 to 2014 [52]. The IM combines the multiangle implementation of atmospheric correlation (MAIAC) AOD with community multiscale air quality (CMAQ) simulations to fill missing AOD values, and it implements a two-stage statistical model to estimate the daily ground PM 2.5 concentrations. The specific MAIAC AOD product and the complicated CMAQ model limit the applicability of the IM method. In this paper, we would like to propose an innovative framework to retrieve full-coverage daily average PM 2.5 concentrations from MODIS C6 AOD data in the YRD for 2013-2016. Our idea is to fill partial missing MODIS AOD values and then predict the full-coverage PM 2.5 concentrations. Both the MODIS C6 AOD data and the idea of the proposed framework are different from above studies. More specifically, our framework is formulated on the statistical relationships between PM 2.5 concentrations and MODIS AOD, and it is much easier to implement than others. The spatial-temporal correlations between MODIS DB and DT AOD data from Terra and Aqua satellites provide a high probability of accurately estimating the missing AOD values in the standard product. Therefore, a two-stage version of the inverse variance weights (IVW) [53] algorithm is presented to iteratively fuse the MODIS DB and DT AOD from the Terra and Aqua satellites. By considering the divergent weights of different MODIS AOD data in the fusion procedure, the first-stage IVW fuses the DB and DT AOD data from the Terra and Aqua satellites, respectively. The second IVW performs a further fusion of the filled Terra and Aqua data to continue improving the spatial coverage of MODIS AOD data. After that, using the fused daily average MODIS AOD data, a two-stage generalized additive model (GAM) is implemented to obtain the full-coverage PM 2.5 concentrations. The GAM model considers complicated nonlinear relations between ground-level PM 2.5 and AOD and other explanatory variables (e.g., meteorological and geographic factors), and can automatically select smooth functions to fit explanatory variables and formulate the proper retrieval model [54]. The first-stage GAM investigates the temporal variations of ground PM 2.5 and predicts the PM 2.5 concentrations from the fused MODIS AOD, and the second-stage GAM adopts more spatial smooth factors to retrieve PM 2.5 concentrations for the pixels without the corresponding MODIS AOD.
The rest of our paper is arranged as follows. Section 2 describes the study area and the data used in the study. Section 3 presents the main methodology. Section 4 presents the experiments on the YRD region over four years, i.e., 2013-2016. Section 5 discusses the experimental results and Section 6 draws the conclusions.

Ground-Level PM 2.5 Observations
Our study area is located in the YRD region of China, including Shanghai city, Zhejiang, Jiangsu, and Anhui provinces. The YRD region covers an area of 350,600 km 2 , taking 3.63% of the total area of China [1,2]. It belongs to a typical monsoon climate, having warm and humid weather in summer and cold and dry weather in winter. Ground-level hourly PM 2.5 concentration observations from 2013-2016 were downloaded from the China air quality real-time release system of the Chinese Ministry of Environmental Protection (available at http://106. 37.208.233:20035). The PM 2.5 concentrations were measured by tapered element oscillating microbalances (TEOM) or the beta attenuation method (BAM or β-gauge). The data has an uncertainty less than 0.75%, with its accuracy reaching up to ±1.5 µg/m 3 Remote Sens. 2019, 11, 1558 4 of 18 for the hourly average, and hence it is accurate enough as ground truth for PM 2.5 concentration measurements. Figure 1 illustrates the 245 ground monitoring stations of PM 2.5 in the YRD in 2016.

MODIS C6 and AERONET AOD Data
The latest MODIS AOD product version Collection 6 is constructed from the MODIS imagery using both the enhanced DB and DT algorithm, and the AOD product is adaptable for both dark and bright surfaces. In this study we utilized the MODIS C6 DB/DT AOD products from Level 2 and the Atmosphere Archive & Distribution System of NASA, and the DB/DT AOD products have two versions of MOD04 (Terra) and MYD04 (Aqua) [55] (available at https://ladweb.nascom.nasa.gov/search/). Considering the data quality of the MODIS C6 AOD product, the data with flags from 1 to 3 were used, and the data with flag = 0 were discarded.
The ground AERONET AOD data was used to validate the accuracy of retrieved AOD from MODIS satellite imagery [56]. The AERONET collaboration network provides globally distributed observation of AOD (available at https://AERONET.gsfc.nasa.gov/), and the AERONET data has three data quality levels: Level 1.0 (unscreened), Level 1.5 (cloud-screened), and level 2.0 (cloud-screened and quality-assured). Six AERONET sites exist in the YRD region as shown in Figure 1, i.e., Xuzhou_CUMT, SONET_Zhoushan, SONET_Nanjing, SONET_Shanghai, SONET_Hefei and Taihu. The 440 nm and 675 nm AERONET AOD with level 1.5 in 2013-2016 was used in the study because of its larger coverage and relatively better data quality in the YRD.

MODIS C6 and AERONET AOD Data
The latest MODIS AOD product version Collection 6 is constructed from the MODIS imagery using both the enhanced DB and DT algorithm, and the AOD product is adaptable for both dark and bright surfaces. In this study we utilized the MODIS C6 DB/DT AOD products from Level 2 and the Atmosphere Archive & Distribution System of NASA, and the DB/DT AOD products have two versions of MOD04 (Terra) and MYD04 (Aqua) [55] (available at https://ladweb.nascom. nasa.gov/search/). Considering the data quality of the MODIS C6 AOD product, the data with flags from 1 to 3 were used, and the data with flag = 0 were discarded.
The ground AERONET AOD data was used to validate the accuracy of retrieved AOD from MODIS satellite imagery [56]. The AERONET collaboration network provides globally distributed observation of AOD (available at https://AERONET.gsfc.nasa.gov/), and the AERONET data has three data quality levels: Level 1.0 (unscreened), Level 1.5 (cloud-screened), and level 2.0 (cloudscreened and quality-assured). Six AERONET sites exist in the YRD region as shown in Figure 1, i.e., Xuzhou_CUMT, SONET_Zhoushan, SONET_Nanjing, SONET_Shanghai, SONET_Hefei and Taihu. The 440 nm and 675 nm AERONET AOD with level 1.5 in 2013-2016 was used in the study because of its larger coverage and relatively better data quality in the YRD.

Meteorological Data
Eight meteorological factors were used to help retrieve the PM2.5 concentrations, including planetary boundary layer height (PBLH), precipitation (PRECTOT), wind speed (WS), air pressure (PS), temperature at 2m above the ground (T2M), wind speed and direction at 10 m above the ground (U10M, V10M), cloud fraction (CLOUD), and relative humidity (RH). The majority of particle mass loading resides in the lower troposphere, and the particle mass distribution below the planetary boundary layer tends to be more homogeneous due to the active mixing. The PM2.5 is usually well mixed in the planetary boundary layer, and accordingly the PM2.5 is large for a small mixed layer height corresponding to the same AOD [57]. The precipitation in wet seasons increases soil moisture and suppresses wind-induced PM2.5 emissions from the ground. The relative humidity affects the AOD-PM2.5 through changing the optical properties of the aerosols. An increase in humidity reflects an updraft of boundary layer air masses to the 3 km level, leading to a higher level of air pollution. High surface temperature or high air pressure accelerates the atmospheric vertical motion to transport ground pollutants into higher places. Wind speed is an effective index of quantifying

Meteorological Data
Eight meteorological factors were used to help retrieve the PM 2.5 concentrations, including planetary boundary layer height (PBLH), precipitation (PRECTOT), wind speed (WS), air pressure (PS), temperature at 2m above the ground (T2M), wind speed and direction at 10 m above the ground (U10M, V10M), cloud fraction (CLOUD), and relative humidity (RH). The majority of particle mass loading resides in the lower troposphere, and the particle mass distribution below the planetary boundary layer tends to be more homogeneous due to the active mixing. The PM 2.5 is usually well mixed in the planetary boundary layer, and accordingly the PM 2.5 is large for a small mixed layer height corresponding to the same AOD [57]. The precipitation in wet seasons increases soil moisture and suppresses wind-induced PM 2.5 emissions from the ground. The relative humidity affects the AOD-PM 2.5 through changing the optical properties of the aerosols. An increase in humidity reflects Remote Sens. 2019, 11, 1558 5 of 18 an updraft of boundary layer air masses to the 3 km level, leading to a higher level of air pollution. High surface temperature or high air pressure accelerates the atmospheric vertical motion to transport ground pollutants into higher places. Wind speed is an effective index of quantifying surface motions of air flow because it affects the horizontal transport of ground pollutants. When the wind speed is greater than zero, wind direction is another factor to revise the weights calculated by distances.
Daily meteorological data from 2013-2016 was MERRA (The Modern Era Retrospective-analysis for Research and Applications) assimilation data, which were obtained from the Goddard Earth Sciences Data and Information Services Center (https://gmao.gsfc.nasa.gov/GMAOproducts/).

Geographic Data
Several widely used geographic data were also manually selected to retrieve the PM 2.5 concentrations, including population density (Pop), road coverage (Road) and forest coverage (Fore). The increasing population density, building coverage and road coverage correlate closely with the PM 2.5 emissions in China that result from rapid urbanization in the YRD region. The reason for selecting forest coverage is that the forest ecosystems can block and capture PM 2.5 from the air. In particular, forests with luxuriant foliage are most effective in removing PM 2.5 from the air.
The population density data were obtained from the China Global Change Science Research Data Publishing & Repository (available at http://geodoi.ac.cn/WebCn/doi.aspx?Id=131), which is 1 m grid cell data cropped from the Chinese Gregorian Grid Population Distribution Data Set. Forest coverage and building coverage data were extracted from the 300 m global land cover product Glob Cover 2009, and the grid data were made by the European Space Agency (ESA) (available at http://due.esrin.esa.int/page_globcover.php). Road coverage is estimated from the vector data of roads of the YRD region, which was obtained from the National Geomatics Center of China (available at http://ngcc.cn/). Table 1 summarizes the data used in our study. The ground-level PM 2.5 data, AOD data, meteorological and geographic data have different formats, sources and spatio-temporal resolutions. and some preprocessing works are required to unify all the datasets into the same spatial and temporal frameworks. All the above datasets were transformed into the WGS84 geographic coordinate system, and the YRD region was digitized into grid cells with a fixed grid size of 1 degree. The passing times of MODIS Terra and Aqua satellites are around 10:30 and 14:00 in the local time of the YRD region, which is almost the same as Beijing time. Accordingly, the 3-h meteorological data in Beijing time 10:00-15:00 was averaged and then resampled into grid cells of 1 degree. The population density rate was estimated by the total population in the area of each grid cell, and we adopted a similar method to obtain the road coverage rate and forest coverage rate in all grid cells.

Data Pre-Processing and Integration
The AERONET AOD does not have the same band of 550 nm as the MODIS AOD, and the Angstrom algorithm was employed to interpolate the AERONET AOD data at 550 nm from those of both 440 nm and 576 nm. After that, a spatial buffer of 10 km around all six AERONET sites was created, and the average of MODIS AOD within each buffer was registered into the nearest AERONET AOD site. The median AERONET AOD in Beijing time 10:00-15:00 was chosen as the ground truth of its corresponding MODIS AOD. Finally, using overlay analysis, the averages of MODIS AOD, AERONET AOD and ground PM 2.5 within each grid cell were assigned as corresponding values of its grid cell.
Previous studies show that some outliers negatively affect the accuracy and robustness of PM 2.5 retrieval modelling [49][50][51]58]. Therefore, in our study, we excluded PM 2.5 and AOD data in three conditions: (1) the AOD < 2.5; (2) the AOD > 0.5 and PM 2.5 < 10 µg/m 3 ; and (3) the PM 2.5 < 3 µg/m 3 . Conditions (1) and (2) were to eliminate the effects from cloud contamination, and condition (3) was to exclude the ineffective and small PM 2.5 from modelling. Moreover, the daily threshold of 30 records was manually chosen to guarantee a sufficient number of validated daily records of AOD and ground-level Remote Sens. 2019, 11, 1558 6 of 18 PM 2.5 concentrations. The reason for that is a very small number of daily records could not reflect the real spatial coverage of PM 2.5 concentrations on the same day.

Methodology
Our methodology included two main steps: (1) fusing MODIS C6 Terra and Aqua AOD using the two-stage IVW algorithm; (2) retrieving PM 2.5 with the fused MODIS daily average AOD and two-stage GAM model. Figure 2 illustrates the flowchart of our methodology. The first step was to fill missing data in MODIS standard AOD and obtain high coverage daily average MODIS AOD; and the second step was to obtain full coverage PM 2.5 concentrations using the interpolated daily average MODIS AOD and two-stage GAM.

Fusing MODIS Terra and Aqua AOD via Two-Stage IVW
The original MODIS C6 AOD product has many missing pixels due to cloud contamination and other factors, which severely impacts the daily PM 2.5 retrievals and its further application. Fortunately, the MODIS Terra and Aqua satellites have the same revisiting period, same bands and a small difference in the passing time. Meanwhile, the DT AOD is made for a dark surface, and the second generation of DB AOD is created for both dark and bright surfaces. These produce strong spatial-temporal correlations between DB and DT AOD from two satellites in the same area, and provide a high probability of making spatial complementarity. On the other hand, the different data quality of AOD will greatly affect the fusion result, and accordingly the weights of AOD data accuracy should be incorporated into the procedure. Accordingly, we propose the two-stage IVW to fuse the four AOD products (Terra DB, Terra DT, Aqua DB and Aqua DT) and improve the spatial coverage of daily MODIS AOD product. The Two-stage IVW includes the following three steps.
(1) The DB and DT AOD products of Terra and Aqua with different QA (i.e., 1, 2, and 3) were evaluated by comparing them with the AERONET AOD. The data evaluation is to guarantee the proper input for the next step of fusion. In the procedure, several popular accuracy indicators are utilized, i.e., root mean square (RMSE), coefficient of determination (R 2 ), relative percentage error (RPE) and expected error (EE). The EE is a deviation interval between MODIS C6 and AERONET AOD, and the interval in this study is manually set as 0.05 + 0.2 × AERONET AOD [59].
Remote Sens. 2019, 11, x FOR PEER REVIEW 7 of 17 fuse the four AOD products (Terra DB, Terra DT, Aqua DB and Aqua DT) and improve the spatial coverage of daily MODIS AOD product. The Two-stage IVW includes the following three steps.
1) The DB and DT AOD products of Terra and Aqua with different QA (i.e., 1, 2, and 3) were evaluated by comparing them with the AERONET AOD. The data evaluation is to guarantee the proper input for the next step of fusion. In the procedure, several popular accuracy indicators are utilized, i.e., root mean square (RMSE), coefficient of determination (R 2 ), relative percentage error (RPE) and expected error (EE). The EE is a deviation interval between MODIS C6 and AERONET AOD, and the interval in this study is manually set as 0.05 + 0.2 × AERONET AOD [59]. 2) The first stage of the IVW algorithm was implemented to fuse the DB and DT AOD data for both Terra and Aqua satellites separately. Taking the Terra satellite as an example, with over 10 pairs of evaluated daily MODIS Terra DT and DB AOD on the same grid cells, two linear regression equations can be separately formulated as [16]: where and are daily Terra MODIS DB and DT AOD, respectively, 1 and 2 are intercepts of regression functions, respectively, and 1 and 2 are slopes, respectively. For the grid cells with DB AOD, Equation (1) is to predict and fill the missing DT AOD with the DB AOD; for the grid cells with DT AOD, Equation (2) is to fill the missing DB AOD with the DT AOD. The two equations are formulated using the linear correlations between DB AOD and DT AOD, but they could not simultaneously be used to predict the AOD on the same grid cells. After that, the filled DB AOD and DT AOD were compared with the AOD of its nearest AERONET sites. The standard variance of filled DB AOD and DT AOD within the same season shows the degree of deviation of the data from its expected values (i.e., AERONET AOD), and a larger variance indicates a worse data quality of the MODIS AOD. Accordingly, taking the variance reciprocals of MODIS DB and DT AOD as weights, the IVW fuses MODIS DB and DT AOD as: (2) The first stage of the IVW algorithm was implemented to fuse the DB and DT AOD data for both Terra and Aqua satellites separately. Taking the Terra satellite as an example, with over 10 pairs of evaluated daily MODIS Terra DT and DB AOD on the same grid cells, two linear regression equations can be separately formulated as [16]: where τ DB and τ DT are daily Terra MODIS DB and DT AOD, respectively, β 1 and β 2 are intercepts of regression functions, respectively, and α 1 and α 2 are slopes, respectively. For the grid cells with DB AOD, Equation (1) is to predict and fill the missing DT AOD with the DB AOD; for the grid cells with DT AOD, Equation (2) is to fill the missing DB AOD with the DT AOD. The two equations are formulated using the linear correlations between DB AOD and DT AOD, but they could not simultaneously be used to predict the AOD on the same grid cells. After that, the filled DB AOD and DT AOD were compared with the AOD of its nearest AERONET sites. The standard variance of filled DB AOD and DT AOD within the same season shows the degree of deviation of the data from its expected values (i.e., AERONET AOD), and a larger variance indicates a worse data quality of the where τ c is the IVW fused Terra daily AOD from DB and DT; τ DB_ f and τ DT_ f are filled DB and DT AOD using linear regressions in (1) and (2), respectively; Var DB and Var DT are the variances of DB and DT AOD against AERONET AOD in the same season, respectively.
(3) Following Step 2, the fused AOD data of Terra and Aqua was also fused with the IVW. The two linear regression functions between fused Terra AOD and Aqua AOD were separately formulated to predict and fill the missing Aqua AOD and Terra AOD in the same grid cells. After that, the weights of filled Aqua AOD and Terra AOD were computed by comparing with the nearest AERONET AOD. The daily average MODIS AOD was finally obtained by linearly fusing the filled Aqua and Terra AOD with the variance reciprocals as weights.

Retrieving PM 2.5 from Fused MODIS AOD and Two-Stage GAM
The above fusion procedures improved the spatial coverage of daily MODIS AOD, but could not accurately estimate all the missing AOD. The missing AOD still degrades the utility performance of PM 2.5 retrievals. The GAM could better consider nonlinear correlations between PM 2.5 concentrations and geographic and meteorological data. Therefore, we adopted the two-stage GAM model to retrieve the PM 2.5 concentrations and obtain full coverage PM 2.5 data in the study area. Among all the datasets, the PM 2.5 and AOD data had missing values, and accordingly we manually divided the two datasets in all the grid cells into four groups: (i) PM 2.5 and AOD; (ii) AOD but missing PM 2.5 ; (iii) PM 2.5 but missing AOD; and (iv) missing PM 2.5 and AOD. The two-stage GAM incorporates the following two procedures.
(1) Group (i) was used to model the first-stage GAM model and predict the PM 2.5 with the AOD in group (ii). Except the fused daily average AOD, the first-stage GAM utilizes meteorological data as auxiliary factors to reflect the time variability of PM 2.5 concentrations as: (4) where (PM2.5 GAM1 ) ij is the predicted average PM 2.5 in the grid cell j on the data i from the first-stage GAM; µ 0 is the constant term; f (PBLH) j , f (RH) j , f (T2M) j and f (U10M, V10M) j are smooth functions of planetary boundary layer height, relative humidity, temperature at 2 m above the ground, and the wind speed and direction at 10 m above the ground in the grid cell j, respectively; and ε ij is the residual error of grid cell j at data i.
(2) Group (i) and the predicted PM 2.5 in group (ii) were integrated with geographic data to model the second-stage GAM. In our study, the second-stage GAM adopts the smooth functions of population, forest coverage, road coverage and spatial locations as main factors to predict the spatial variability of PM 2.5 in group (iv). The second-stage GAM model was formulated as: where (PM2.5 GAM2 ) ij is the predicted average PM 2.5 in the grid cell j on the data i from the second-stage GAM; f (Lon, Lat) j , f (log(Fore)) j , f (log(Road)) j and f (log(Pop)) j are smooth functions of spatial locations (i.e., longitude, latitude), forest coverage rate, road coverage rate and population density respectively; µ 1 is the constant term and ij is residual error of grid cell j at data i.

Model Evaluation and Validation
In this study, we implemented two main schemes to evaluate the performance of both two-stage IVW in fusing daily MODIS AOD data and two-stage GAM in retrieving PM 2.5 concentrations. For the two-stage IVW, the fused daily average MODIS AOD was compared with its nearest AERONET AOD, and several popular indicators were used to quantify the accuracy, including RMSE, R 2 , and RPE. Meanwhile, the spatial coverage rate (SCR) of fused daily average MODIS AOD was also computed and compared with those from the first-stage IVW and the original MODIS AOD product.
The performance of two-stage GAM was evaluated from its two stages separately. For the first-stage GAM, the ten-fold cross validation (10-fold CV) was utilized, and the adopted indicators were scatter plots of true and predicted PM 2.5 , RMSE, R 2 and RPE. The modelling performance of second-stage GAM was assessed from three different aspects. The feasibility evaluation of geographic factors in the second-stage GAM was evaluated by modelling fitting with MODIS AOD and PM 2.5 data in group (i). Meanwhile, ten-fold CV was used to evaluate the second-stage GAM on the integration of group (i) and predicted PM 2.5 in group (ii). Moreover, the PM 2.5 in group (iii) was implemented as the ground truth to compare with the predicted PM 2.5 from two-stage GAM to further validate its performance. Table 2 lists the R 2 of MODIS standard AOD, the fused MODIS Terra and Aqua AOD from first-stage IVW and the fused daily average MODIS AOD from second-stage IVW in the YRD in 2013-2016. The R 2 was estimated by comparing the above three AOD products with those of AERONET AOD sites. The first-stage IVW slightly lowers the accuracy of MODIS standard AOD, and the second-stage promotes the accuracy of fused MODIS Terra and Aqua AOD. The average accuracy of MODIS standard AOD during 2013-2016 did not change much after the two-stage IVW fusion.  Table 3 lists the spatial coverage rates of MODIS standard AOD, fused MODIS AOD from first-stage IVW and fused daily average MODIS AOD from two-stage IVW during 2013-2016. The average SCR of original MODIS standard AOD is as low as 11.88%, and the first-stage IVW greatly improves the spatial coverage rate of original MODIS AOD product up to 29.19%. Moreover, after the second-state IVW, the SCR of fused daily average MODIS AOD reaches up to 39.29%, and improves about 230% against the original MODIS AOD. Moreover, Figure 3 compares the annual coverage information of MODIS standard AOD and fused daily average MODIS AOD from two-stage IVW. The annual coverage days of MODIS AOD after two-stage IVW have been greatly promoted, particularly in the north region of YRD. The promotion of coverage days in Taihu Lake and coastal areas of south YRD is not obvious, and the explanation for that is their rain and cloudy weather easily reduce the effectiveness of MODIS AOD data.

Experiments on PM2.5 Retrieved from Two-Stage GAM
This section explores the performance of two-stage GAM in retrieving full-coverage PM2.5 in the YRD region during 2013-2016. We used the Pearson correlation on meteorological factors (T2M, PBLH, RH, V10M, U10M) and geographic factors (Pop, Fore, and Road) to avoid collinearity problems among them. All the above factors showed good independent statistical behaviors and were used in the experiments. The geographic data were transformed into logarithmic formats to alleviate their spatial homogeneity.

Experiments on PM 2.5 Retrieved from Two-Stage GAM
This section explores the performance of two-stage GAM in retrieving full-coverage PM 2.5 in the YRD region during 2013-2016. We used the Pearson correlation on meteorological factors (T2M, PBLH, RH, V10M, U10M) and geographic factors (Pop, Fore, and Road) to avoid collinearity problems among them. All the above factors showed good independent statistical behaviors and were used in the experiments. The geographic data were transformed into logarithmic formats to alleviate their spatial homogeneity. Figure 4 depicts the regression results with zero intercept between our predicted PM 2.5 of group (i) against observed PM 2.5 measures using the 10-fold CV scheme in 2013-2016. The first-stage GAM showed slight overfitting throughout all the four years when compared with their model fitting results. Fortunately, the higher R 2 , low RMSE and RPE proves the robustness and reliability of the model, and therefore we suggest the predicted PM 2.5 in group (ii) can be used in the second-stage GAM model. Combing the data in groups (i) and (ii), we further implemented the PM2.5 and fused MODIS AOD to verify the second-stage GAM model in Equation (5). Figure 5 shows the scatter plots between the predicted and observed PM2.5 in groups (i)+(ii), where the predicted PM2.5 of group (ii) in the firststage GAM was utilized as the ground truth in this model. The results show the second-stage GAM model behaves well in model fitting and the 10-fold CV test, and no clear overfitting exists in the model. The explanation for this is the involvement of PM2.5 in group (i) and the predicted PM2.5 in group (ii) improves the robustness of the second-stage GAM model. Meanwhile, Figure 6 shows the scatter plots of predicted PM2.5 from second-stage GAM and the observed PM2.5 in group (iii). The high averaged R 2 = 0.78, low averaged RMSE = 19.177 μg/m 3 and averaged RPE = 28.9% also support the above observations of the second-stage GAM model.

The Performance Verification of Second-Stage GAM
Combing the data in groups (i) and (ii), we further implemented the PM 2.5 and fused MODIS AOD to verify the second-stage GAM model in Equation (5). Figure 5 shows the scatter plots between the predicted and observed PM 2.5 in groups (i)+(ii), where the predicted PM 2.5 of group (ii) in the first-stage GAM was utilized as the ground truth in this model. The results show the second-stage GAM model behaves well in model fitting and the 10-fold CV test, and no clear overfitting exists in the model. The explanation for this is the involvement of PM 2.5 in group (i) and the predicted PM 2.5 in group (ii) improves the robustness of the second-stage GAM model. Meanwhile, Figure 6 shows the scatter plots of predicted PM 2.5 from second-stage GAM and the observed PM 2.5 in group (iii). The high averaged R 2 = 0.78, low averaged RMSE = 19.177 µg/m 3 and averaged RPE = 28.9% also support the above observations of the second-stage GAM model.

Comparison with other spatial interpolation methods
We further compared our full-coverage PM2.5 concentration results with those of two widely used spatial interpolation methods; the inverse distance weighted method (IDW) and ordinary kriging (OK). IDW interpolates values at unknown grid cells using weighted known neighborhood PM2.5 measures, and the weights are inversely proportional to the distance to the target grid cells. Different from other kriging approaches, the most commonly used OK estimates the unknow PM2.5 on certain grid cells by using weighted linear combinations of the neighboring measures. We did not implement more complicated methods because there are too many prerequisites in essential datasets. For example, the multiple imputation model [52] by Xiao requires the atmospheric correction AOD with community multiscale air quality (CMAQ) simulations, whereas the parameters of CMAQ are unavailable for our study areas.
We then implemented the three methods on the ground observations to obtain full-coverage PM2.5 in the YRD in 2013-2016. We benchmarked the performance of all three methods using fivefold cross validation. In detail, we randomly removed 20% of the ground stations and predicted their PM2.5 concentrations as measured at the stations. Each model was trained using 80% of the ground PM2.5 concentrations, and then tested on the remaining 20% of the observations. Table 4 lists the comparison results of all three methods, where RMSE, RPE, R² and mean absolute percentage error (MAPE) were used to quantify the differences between retrieved PM2.5 concentrations and ground observations. Our method performed the best, achieving the largest R² and the smallest MAPE and RPE, particularly in the year 2013. Table 4. Geospatial prediction models evaluation.

Comparison with Other Spatial Interpolation Methods
We further compared our full-coverage PM 2.5 concentration results with those of two widely used spatial interpolation methods; the inverse distance weighted method (IDW) and ordinary kriging (OK). IDW interpolates values at unknown grid cells using weighted known neighborhood PM 2.5 measures, and the weights are inversely proportional to the distance to the target grid cells. Different from other kriging approaches, the most commonly used OK estimates the unknow PM 2.5 on certain grid cells by using weighted linear combinations of the neighboring measures. We did not implement more complicated methods because there are too many prerequisites in essential datasets. For example, the multiple imputation model [52] by Xiao requires the atmospheric correction AOD with community multiscale air quality (CMAQ) simulations, whereas the parameters of CMAQ are unavailable for our study areas.
We then implemented the three methods on the ground observations to obtain full-coverage PM 2.5 in the YRD in 2013-2016. We benchmarked the performance of all three methods using five-fold cross validation. In detail, we randomly removed 20% of the ground stations and predicted their PM 2.5 concentrations as measured at the stations. Each model was trained using 80% of the ground PM 2.5 concentrations, and then tested on the remaining 20% of the observations. Table 4 lists the comparison results of all three methods, where RMSE, RPE, R 2 and mean absolute percentage error (MAPE) were used to quantify the differences between retrieved PM 2.5 concentrations and ground observations. Our method performed the best, achieving the largest R 2 and the smallest MAPE and RPE, particularly in the year 2013.

Discussion
The above three experiments comprehensively investigated the performance of our proposed framework in the retrieval of full-coverage daily average PM 2.5 during 2013-2016. The first experiment verifies the accuracy and SCR of fused daily average MODIS AOD from two-stage IVW. Experimental results show that the R 2 of fused daily average MODIS AOD offers no clear difference from the original MODIS standard AOD product, but the SCR (i.e., spatial coverage rate) was greatly improved by 230% when compared with the original MODIS AOD. Therefore, the two-stage IVW could satisfy the fusing requirements of MODIS DB and DT AOD from Terra and Aqua satellites and yield high-coverage daily average MODIS AOD data.
The fused daily average MODIS AOD and ground PM 2.5 were classified into four groups, and the two-stage GAM were designed to retrieve the ground PM 2.5 in group (ii) and (iv) to obtain full-coverage daily average PM 2.5 in the YRD region. The data in group (i) was used to model the first-stage GAM to predict the ground PM 2.5 in group (ii). The accuracy evaluation result in the first-stage GAM shows that the predicted PM 2.5 in group (i) has high R 2 , low RMSE and RPE and could satisfy the practical requirements in predicting that of group (ii). The performance of second-stage GAM was evaluated by comparing the predicted PM 2.5 in groups (i) and (ii) with the observed, where the predicted PM 2.5 in group (ii) in the first-stage GAM was regarded as the observed in the second-stage GAM. Experimental results show that the second-stage GAM model behaves well, with no clear overfitting, but a higher accuracy makes it useful in predicting PM 2.5 in group (iv). Meanwhile, the accuracy evaluation of second-stage GAM on the data in group (iii) also supports the above observations. The two-stage GAM model has good performance in retrieving the ground PM 2.5 in groups (ii) and (iv), and the full-coverage daily average PM 2.5 can be obtained via the combination of the above results.
The third experiment compared the performance of our methods with two widely used spatial interpolation methods IDW and OK. The observations show that all three methods could obtain full-coverage PM 2.5 concentration maps, but our retrieved results are superior to those of IDW and OK. This verifies the advantages of combining satellite AOD and ground-observed PM 2.5 concentrations against the single data source of ground measures in mapping full-coverage PM 2.5 concentrations, which coincides with previous studies [46,52]. The IDW and OK methods highly rely on the spatial distributions of ground stations, and the sparse or nonexistent PM 2.5 measures in the YRD would negatively affect the estimated full-coverage map. In contrast, the implemented GAM models in our presented framework could describe the spatial inference of PM 2.5 concentrations well, and accordingly perform better than IDW and OK.
Unfortunately, our study has some limitations. The first is regarding the data quality of MODIS DB/DT AOD. We used the MODIS standard AOD with QA = 1, 2 and 3 in the two-stage IVW model to ensure sufficient AOD data. The poor data quality of QA = 2 and 3 negatively affects the fusion procedure and may make the two-stage IVW ineffective in improving the accuracy of daily average MODIS AOD. The AOD with good quality, with QA = 1, in long series, will be implemented to obtain daily average MODIS AOD with a higher accuracy. The second is regarding the selection of meteorological and geographic factors in the two-stage GAM model. In our experiment, we followed previous studies, selected the most popular factors and did not carefully analyze the specific characteristic of the YRD region. The future work will involve more meteorological and geographic factors. The third is that we only implemented the MODIS C6 AOD product, which might not capture the variability of daily PM 2.5 distribution on the intra-urban scale. The newly released MODIS MAIAC AOD [60][61][62] with 1 km spatial resolution will be used to further investigate the performance of our framework. The fourth is that we only compared our proposed framework with the IDW and OK. More state-of-the-art full-coverage PM 2.5 concentrations like machine learning models [49][50][51] will be used to make comparisons with our method on much larger study areas, e.g., the whole of China or the whole continent.

Conclusions
Satellite-based remote sensing techniques have been widely used in mapping PM 2.5 concentrations in the YRD region, particularly using the MODIS AOD product. However, current PM 2.5 concentrations retrieved from the MODIS AOD product have severe missing data, and cannot obtain a full-coverage PM 2.5 concentration map. This severely degrades its utility performance in many realistic applications. This paper proposes an innovative framework to retrieve full-coverage daily average PM 2.5 from the MODIS C6 AOD product. The two-stage IVW algorithm was utilized to fuse MODIS DB and DT AOD from Terra and Aqua satellites. This greatly improves the coverage rate of the MODIS AOD product. After that, the two-stage GAM model was employed to retrieve ground PM 2.5 concentrations from the fused daily average MODIS AOD data, and the full-coverage daily PM 2.5 map was finally achieved. Using the MODIS AOD data from 2013-2016, two groups of experiments were carefully designed in order to comprehensively testify the performances of our proposed framework in both fusing MODIS AOD and retrieving full-coverage PM 2.5 . Experimental results show that the two-stage IVW could improve spatial coverage of MODIS AOD against the original standard product by 230%, meanwhile maintaining the data accuracy of MODIS AOD data. The average spatial coverage ratio of daily MODIS AOD in 2013-2016 of YRD reaches up to 39.29% after the fusion operation of second-state IVW. The annual coverage days of MODIS AOD after two-stage IVW has been clearly improved, especially in the north of the YRD region. Moreover, the experiments on two-stage GAM show that the first-stage GAM can predict the ground PM 2.5 of group (ii) (i.e., AOD but missing PM 2.5 ) well, with slight overfitting throughout the years of 2013-2016. That explains the robustness and reliability of the first-stage GAM model. Meanwhile, experimental results show that the second-stage GAM predicts the PM 2.5 concentrations in groups (iii) (PM 2.5 but missing AOD) and (iv) (i.e., missing PM 2.5 and AOD) well. Furthermore, we compared our framework with two widely used spatial interpolation methods, i.e., IDW and OK, in mapping the full-coverage daily PM 2.5 concentrations. The results show that our retrieved full-coverage PM 2.5 result has higher accuracies than those of IDW and OK, achieving the largest R 2 and the smallest MAPE and RPE in the year of 2013. Therefore, our proposed framework provides a good methodology for mapping full-coverage daily average PM 2.5 from the MODIS AOD product.