Merging MODIS and Ground-Based Fine Mode Fraction of Aerosols Based on the Geostatistical Data Fusion Method

With the rapid development of the economy and society, fine particulate matter (PM2.5) has not only caused severe environmental problems but also posed a threat to public health. The fine mode fraction (FMF), a key input parameter of the PM2.5 remote sensing method (PMRS), carries significant errors and must be improved to raise the accuracy of PM2.5 estimates. In this study, we merge FMF observations from the Moderate Resolution Imaging Spectroradiometer (MODIS), the Aerosol Robotic Network (AERONET) and the Sun-sky radiometer Observation Network (SONET) using the universal kriging (UK) method to obtain an accurate FMF distribution over eastern China. PM2.5 mass concentration is then estimated with the PMRS model from both the fusion and the MODIS FMF distributions. The results show that the variogram parameters are relatively stable, except for significant differences in correlation lengths in summer. For the Winter of 2015, leave-one-out cross-validation shows that the mean error of FMF decreases from 0.38 for MODIS to 0.13 after fusion, and the maximum error decreases from 0.75 to 0.34, indicating that the UK method can provide better estimates of FMF. We also find that PM2.5 estimated from the FMF fusion results is closer to the in situ PM2.5 from the Ministry of Environmental Protection (MEP) (87.2 vs. 88.9 μg/m3).


Introduction
In recent years, atmospheric particulate matter has reached high levels due to human activities. PM2.5 (atmospheric particulate matter with a mass median diameter less than 2.5 µm), which strongly influences both air quality and human health, has drawn wide attention [1][2][3]. To obtain the spatial distribution of PM2.5 mass concentration near the ground, studies have developed three kinds of methods based on remote sensing observations: the statistical method [4][5][6][7][8][9], the simulated method coupled with an atmospheric chemical model [10][11][12], and the physical dependent method [13]. The statistical method is constrained by model validation, since it needs a long data record and must be re-validated in each study area. The simulated method is affected by many factors in the atmospheric chemical model, such as the simulation scale, chemical mechanisms, and emission inventories. The physical dependent method is a semi-physical model that estimates PM2.5 mass concentration near the ground from remote sensing observations, and it effectively addresses the limitations of the other two methods. In this method, the fine mode fraction (FMF, defined as the ratio of the fine-mode aerosol optical depth to the total aerosol optical depth (AOD)) is a key parameter for distinguishing PM2.5 from the total suspended particles. However, the high uncertainty of satellite FMF (mainly from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensors) is a problem that demands a prompt solution [10].
Data fusion approaches can effectively improve the accuracy of current FMF products. Many studies have attempted to merge the AOD from multiple sensors (e.g., MODIS, Multi-angle Imaging SpectroRadiometer (MISR), etc.) [14][15][16] with ground-based data (e.g., Aerosol Robotic Network (AERONET), etc.) [17][18][19] using prevalent fusion approaches, including least squares [20], the maximum likelihood estimate method [21][22][23][24], the universal kriging (UK) method [25], and the spatial statistical data fusion (SSDF) method [26]. Compared with the other approaches listed, the UK method is not only linear and unbiased, but also takes spatial correlation into account. Moreover, it is simple to operate and has low computational complexity [27]. However, past UK-based studies mainly focus on merging AOD data from multiple sensors, or from multiple sensors and an AERONET product, with less attention to FMF fusion.
To improve the accuracy of MODIS FMF over land, we study the feasibility of merging satellite data (MODIS) with ground-based data (AERONET and the Sun-sky radiometer Observation Network (SONET)). MODIS FMF and ground-based FMF observations over eastern China in the Winter (December, January, and February) of 2015 are merged with the UK method to test its applicability. The FMF fusion results are then applied to the PM2.5 remote sensing method (PMRS) to estimate PM2.5 mass concentration near the ground. Section 2 describes the methods and data in detail. Section 3 presents the spatiotemporal variability analysis, the FMF fusion test and cross-validation, and the estimated PM2.5 results. Section 4 gives the discussion, conclusion, and suggestions for possible future improvements.

Comparison of Theoretical FMFs
Currently, in space-borne remote sensing, FMF is available only from the MODIS product. To merge the FMF from satellite and ground-based products, we first compare the theoretical FMFs given by the different retrieval methods. FMF can be calculated by three retrieval methods: the mode combination method, the truncation radius method, and the spectral identification method [28][29][30][31][32]. In this study, only two of them (mode combination and spectral identification) are used to calculate the FMFs. The mode combination method, used for MODIS, determines FMF from the combination of fine and coarse modes that matches the observed spectral reflectance:

$$\mathrm{FMF} = \frac{\tau_f^{0.55}}{\tau_{tot}^{0.55}} \tag{1}$$

where $\tau_f^{0.55}$ is the fine-mode optical depth at 0.55 µm and $\tau_{tot}^{0.55}$ is the total aerosol optical depth at 0.55 µm. The spectral identification method, also called the spectral deconvolution algorithm (SDA), calculates FMF from the particle spectral characteristics. It is the standard algorithm of the AERONET FMF product. The SDA method is expressed as follows:

$$\mathrm{FMF} = \frac{\alpha - \alpha_c}{\alpha_f - \alpha_c} \tag{2}$$

where $\alpha$ is the Ångström index and the subscripts f and c represent the fine and coarse modes, respectively. To capture the differences between the mode combination and the SDA method, we calculate the FMFs of three typical aerosol models: the water-soluble (WS), biomass burning (BB), and dust (DU) models. The aerosol size distribution is:

$$\frac{dV}{d\ln r} = \sum_{i=f,c} \frac{C_i}{\sqrt{2\pi}\,\sigma_i} \exp\left[-\frac{(\ln r - \ln R_i)^2}{2\sigma_i^2}\right] \tag{3}$$

where $dV/d\ln r$ (in units of µm3/µm2) is the aerosol volume particle size distribution, $R_i$ is the median radius of the aerosols (µm), $\sigma_i$ is the standard deviation, and $C_i$ is the volume concentration of particulate matter (µm3/µm2). The subscripts f and c again represent the fine and coarse modes, respectively. The parameters of the fine and coarse modes are listed in Table 1. Table 2 shows the FMF results of the three aerosol models based on the two algorithms. The FMFs obtained for the same model are very close to each other. The maximum difference of FMF (0.04) occurs in the WS model, while the minimum difference, in the DU model, is only 0.01. This suggests that the theoretical difference of FMF between the two algorithms can be ignored, so that ground-based FMF and MODIS FMF can be merged.
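As an illustration of the bimodal lognormal size distribution described above, the following minimal Python sketch evaluates dV/dlnr and checks that it integrates to the total volume concentration. The mode parameters are placeholders chosen for illustration only (they are not the values of Table 1), and the function names are hypothetical. The volume fraction computed at the end is not the optical FMF compared in Table 2, which would require a scattering calculation.

```python
import math

def lognormal_mode(r, C, R, sigma):
    """dV/dlnr of a single lognormal mode with volume concentration C,
    median radius R (um) and standard deviation sigma."""
    z = (math.log(r) - math.log(R)) / sigma
    return C / (math.sqrt(2.0 * math.pi) * sigma) * math.exp(-0.5 * z * z)

def bimodal_dV_dlnr(r, fine, coarse):
    """Sum of the fine and coarse modes; each mode is a (C, R, sigma) tuple."""
    return lognormal_mode(r, *fine) + lognormal_mode(r, *coarse)

def integrate_over_lnr(func, r_min=1e-3, r_max=1e3, n=4000):
    """Trapezoidal integration of func(r) with respect to ln r."""
    a, b = math.log(r_min), math.log(r_max)
    step = (b - a) / n
    total = 0.5 * (func(math.exp(a)) + func(math.exp(b)))
    for k in range(1, n):
        total += func(math.exp(a + k * step))
    return total * step

# Placeholder (C, R, sigma) values, NOT taken from Table 1 of the paper.
FINE = (0.10, 0.15, 0.45)
COARSE = (0.05, 2.50, 0.65)

total_volume = integrate_over_lnr(lambda r: bimodal_dV_dlnr(r, FINE, COARSE))
fine_volume_fraction = FINE[0] / (FINE[0] + COARSE[0])
```

Because each mode is normalized in ln r, the integral recovers C_f + C_c, which is a quick sanity check on the normalization factor.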

Fusion Method for Merging FMF
Our fusion method includes three steps: (i) analysis of the experimental variogram; (ii) estimation of the parameters in the experimental variogram; (iii) merging of space-borne and ground-based FMF using the UK method. The UK method, which was proposed by G. Matheron in 1969 and has been widely used in several fields (e.g., mining, agriculture, environmental, and atmospheric sciences) [33][34][35], is employed to merge the FMF. The flowchart is shown in Figure 1.
Atmosphere 2017, 8, 117
(i) Analysis of the experimental variogram: For all of the FMF pairs at any two locations (times) from the MODIS products, the experimental variogram is evaluated as:

$$\gamma(h_x, h_t) = \frac{1}{2}\left[\mathrm{FMF}(x_i, t_i) - \mathrm{FMF}(x_j, t_j)\right]^2 \tag{4}$$

where $\mathrm{FMF}(x_i, t_i)$ is the FMF observation at location $x_i$ and time $t_i$; $h_x$ is the spatial distance, which can be calculated from the longitudes and latitudes of the spatial points; and $h_t$ is the time lag in days between two temporal points. Thus, $h_x$ can be expressed as follows:

$$h_x = R_e \arccos\left[\sin\varphi_i \sin\varphi_j + \cos\varphi_i \cos\varphi_j \cos(\theta_i - \theta_j)\right] \tag{5}$$

where $R_e$ is the mean radius of the Earth, $\varphi$ and $\theta$ are latitude and longitude, respectively, and the subscripts i and j refer to the observation locations $x_i$ and $x_j$. If the temporal changes are neglected, the exponential model represents the spatial autocorrelation well and is often used to fit the experimental variogram:

$$\gamma_{theo}(h_x) = \sigma_n^2 + \sigma_b^2\left[1 - \exp\left(-\frac{h_x}{l}\right)\right], \quad h_x > 0 \tag{6}$$

where $\gamma_{theo}(h_x)$ represents the theoretical spatial semivariance at a separation distance $h_x$, and $\sigma^2 = \sigma_n^2 + \sigma_b^2$ represents the variance of FMF observations beyond the correlation length ($3l$). $\sigma_n^2$ is the nugget, representing both the measurement error and the micro-variability at distances smaller than one pixel. $\sigma_b^2$ represents the variance of the FMF within the correlation length ($3l$). $l$ is the range parameter.
(ii) Estimation of the parameters in the experimental variogram: A least squares method is employed to estimate the parameters in the experimental variogram (Equation (6)). First, the FMF distribution from MODIS is used to fit the initial variance and correlation length. Then, the MODIS FMFs are matched at the ground-based sites to generate the FMF pairs. Using these FMF pairs, the final parameters are fit to the experimental variogram. Generally, a higher variance indicates greater overall variability, and a shorter correlation length indicates greater spatial variability at smaller scales.
(iii) Merging space-borne and ground-based FMF using the UK method: Given the ground-based FMF measurements at n locations and times, the estimated FMF distribution s at m locations and times can be expressed as the sum of an unknown trend component $X_s\beta$ and a zero-mean stochastic component $\nu$:

$$s = X_s \beta + \nu \tag{7}$$

where $X_s$ (m × 2) defines the model of the trend, and $\beta$ is a 2 × 1 vector of drift coefficients that defines the weights of the two variables in the model of the trend. Since FMFs are measured by ground-based sensors as well as by MODIS, a linear relationship between them is the simplest and most reasonable model, yielding a linear model of the trend:

$$X_s = \begin{bmatrix} \mathbf{1} & \mathrm{FMF}_{\mathrm{MODIS}} \end{bmatrix} \tag{8}$$

The model of the trend includes a column of ones and a column of MODIS FMF at all estimated locations and times. The column of ones multiplies the first element of $\beta$, which represents an overall constant. Analogous to a regression model, the constant term captures a net offset not captured by MODIS. At the ground-based (n) locations, the stochastic component $\nu$ represents the observed spatial and temporal residuals between the ground-based FMF measurements and the weighted MODIS FMF observations. At the estimation (m) locations, this component represents the predicted residuals between the true FMF and the weighted MODIS FMF at those locations/times. Further derivation, including the covariance of these residuals and the UK equations, is described in Appendix A.
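The merging step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the derivation of Appendix A: it uses the textbook universal kriging system with a two-column trend (a constant and the MODIS FMF), a covariance derived from an exponential variogram, and the haversine form of the great-circle distance. The function names, the site coordinates and the FMF values are hypothetical; the default variogram parameters are the Winter 2015 values reported in Section 3.1.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def covariance(h, nugget, sill_b, l):
    """Covariance implied by an exponential variogram; nugget only at h = 0."""
    return sill_b * math.exp(-h / l) + (nugget if h == 0.0 else 0.0)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def uk_predict(sites, ground_fmf, modis_at_sites, target, modis_at_target,
               nugget=0.0018, sill_b=0.0141, l=475.0):
    """Universal kriging estimate at one target with trend [1, MODIS FMF].
    Needs at least two sites with different MODIS FMF values."""
    n = len(sites)
    size = n + 2
    a = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            a[i][j] = covariance(haversine_km(sites[i], sites[j]), nugget, sill_b, l)
        # Unbiasedness constraints for the constant and the MODIS column.
        a[i][n] = a[n][i] = 1.0
        a[i][n + 1] = a[n + 1][i] = modis_at_sites[i]
    rhs = [covariance(haversine_km(s, target), nugget, sill_b, l) for s in sites]
    rhs += [1.0, modis_at_target]
    weights = solve(a, rhs)[:n]  # the last two unknowns are Lagrange multipliers
    return sum(w * z for w, z in zip(weights, ground_fmf)), weights

# Hypothetical sites (lat, lon), ground-based FMF and collocated MODIS FMF.
sites = [(39.9, 116.4), (32.0, 118.8), (30.5, 114.3), (36.1, 120.4)]
ground = [0.85, 0.70, 0.90, 0.60]
modis = [0.40, 0.90, 0.20, 1.00]
estimate, weights = uk_predict(sites, ground, modis, (34.0, 117.0), 0.5)
```

The two constraint rows force the weights to sum to one and to reproduce the MODIS FMF at the target, which is what makes the estimator unbiased under the linear trend. In this sketch the nugget is also applied when the target coincides with a site, so the estimate reproduces the ground observation there; treating the nugget purely as measurement noise would smooth instead.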

PMRS Model
PMRS, proposed by Zhang and Li (2015) [13], is a remote sensing model to estimate dry PM2.5 mass concentration near the ground using the following formula:

$$\mathrm{PM}_{2.5} = \mathrm{AOD}\,\frac{\mathrm{FMF}\cdot VE_f \cdot \rho_{f,dry}}{\mathrm{PBLH}\cdot f_0(\mathrm{RH})} \tag{9}$$

where $\rho_{f,dry}$ is the density of dry fine particulate matter, PBLH is the planetary boundary layer height, RH is the relative humidity, $VE_f$ is the columnar volume-to-extinction ratio, which can be obtained using FMF, and $f_0$ is the particle drying factor depending on RH. $VE_f$ is expressed as a function of FMF in the PMRS model [13].
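To make the roles of the PMRS inputs concrete, the sketch below evaluates the relation PM2.5 = AOD · FMF · VE_f · ρ_f,dry / (PBLH · f_0(RH)), which is the form in which the PMRS model is usually written. The expressions for VE_f and f_0(RH) are not reproduced here, so both are passed in as precomputed numbers. The function name, the unit conventions and the input values are assumptions made for illustration.

```python
def pmrs_pm25(aod, fmf, ve_f_um, rho_dry_g_cm3, pblh_m, f0_rh):
    """Dry near-surface PM2.5 in ug/m^3 from the PMRS relation.

    aod            total aerosol optical depth (dimensionless)
    fmf            fine mode fraction (dimensionless, 0 to 1)
    ve_f_um        columnar volume-to-extinction ratio in um (um^3/um^2)
    rho_dry_g_cm3  density of dry fine particulate matter in g/cm^3
    pblh_m         planetary boundary layer height in m
    f0_rh          particle drying factor at the ambient RH (dimensionless)
    """
    if not 0.0 <= fmf <= 1.0:
        raise ValueError("FMF must lie between 0 and 1")
    if pblh_m <= 0.0 or f0_rh <= 0.0:
        raise ValueError("PBLH and f0(RH) must be positive")
    # Unit conversion: 1 um * 1 g/cm^3 / 1 m = 1e-6 m * 1e12 ug/m^3 / m = 1e6 ug/m^3.
    return 1e6 * aod * fmf * ve_f_um * rho_dry_g_cm3 / (pblh_m * f0_rh)

# Illustrative inputs only; none of these values come from the paper.
pm25_example = pmrs_pm25(aod=0.5, fmf=0.8, ve_f_um=0.2,
                         rho_dry_g_cm3=1.5, pblh_m=1000.0, f0_rh=1.5)
```

With these illustrative inputs the result is about 80 µg/m3. Because PM2.5 scales linearly with FMF in this relation (and further through VE_f), an FMF error of a few tenths translates directly into a large relative error in PM2.5, which is the motivation for improving FMF before applying PMRS.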

Data and Study Area
The AOD and FMF parameters are retrieved over land and ocean from MODIS (https://MODIS.gsfc.nasa.gov/), onboard Terra and Aqua, using the dark target method [36]. In this study, the Terra Collection 006 FMF (Optical_Depth_Ratio_Small_Land) is merged with ground-based data, and the merged FMF and the AOD (Optical_depth_Land_and_Ocean) are used to calculate PM2.5 with the PMRS model. The spatial resolution of AOD and FMF is 10 km.
For ground-based observations, both AERONET data (Version 2) and SONET data are used. AERONET is a global ground-based aerosol monitoring network that offers easy access to its data archive (http://aeronet.gsfc.nasa.gov/). It currently has more than 600 sites [37,38], but only six permanent (more than five years) sites in the study area of Eastern China (108 2). SONET has been operated by the Chinese Academy of Sciences since 2010 and provides an additional six permanent sites in the study area [39][40][41]. Combining the ground-based sites of the two networks (AERONET and SONET) effectively increases the volume of FMF data in the study area. Furthermore, the FMF from SONET is retrieved by the SDA method in the same way as for AERONET. We use both Lev 2.0 (i.e., Beijing-CAMS and Xianghe) and Lev 1.5 (other sites) AERONET data because of difficulties in calibration and maintenance in China. For SONET, we use the Lev 1.6 data, which are the best data set currently published at http://www.sonet.ac.cn.
In addition, the auxiliary RH and PBLH data are extracted from the National Centers for Environmental Prediction Final Operational Global Analysis data (NCEP FNL) (http://rda.ucar.edu/), which have a temporal resolution of 6 h and a spatial resolution of 1° × 1° [42]. The in situ PM2.5 measurements from the Ministry of Environmental Protection (MEP), China, are used to validate the PM2.5 estimates from the PMRS model using FMF fusion products.
The satellite and auxiliary data are resampled into 0.2° × 0.2° grid cells, applying bilinear interpolation to the RH and PBLH data and area averaging to the MODIS observations. The FMF data from MODIS and from ground-based sites cannot be compared directly, since they have different spatial and temporal resolutions. Therefore, the mean of the MODIS FMF observations within a 0.2° × 0.2° bounding box around each ground-based site is compared with the ground-based FMF data, which are themselves averaged within ±30 min of the Terra overpass. According to the spatiotemporal analysis (see Section 3.1), the FMF fusion is performed in one-week increments, using weekly averaged ground-based and MODIS FMF data. For each seven-day period, only the ground-based sites that have FMF data for at least three of the seven days and that have overlapping MODIS FMF data are used in this study. Two to five sites are used, depending on the week.
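The collocation rules described above can be sketched as follows. This is a minimal illustration with hypothetical function names and data layouts. It assumes that the bounding box is centred on the site, and that "overlapping MODIS FMF data" means at least one valid MODIS value for the site during the week; both are interpretations rather than details stated in the text.

```python
from datetime import datetime, timedelta
from statistics import mean

def modis_box_mean(pixels, site_lat, site_lon, box_deg=0.2):
    """Mean of valid MODIS FMF pixels inside a box centred on the site.
    pixels: iterable of (lat, lon, fmf); fmf is None where no retrieval exists."""
    half = box_deg / 2.0
    values = [f for lat, lon, f in pixels
              if f is not None
              and abs(lat - site_lat) <= half
              and abs(lon - site_lon) <= half]
    return mean(values) if values else None

def ground_window_mean(records, overpass, window=timedelta(minutes=30)):
    """Mean ground-based FMF within +/- window of the satellite overpass.
    records: iterable of (datetime, fmf)."""
    values = [f for t, f in records if abs(t - overpass) <= window]
    return mean(values) if values else None

def site_usable_for_week(daily_ground, daily_modis, min_days=3):
    """Apply the weekly selection rule to two lists of seven daily values,
    where None marks a day without data."""
    ground_days = sum(v is not None for v in daily_ground)
    has_modis = any(v is not None for v in daily_modis)
    return ground_days >= min_days and has_modis

# Hypothetical inputs for a single site and overpass.
pixels = [(40.00, 116.00, 0.4), (40.05, 116.05, 0.6), (40.50, 116.00, 0.9)]
overpass = datetime(2015, 1, 20, 3, 0)
records = [(datetime(2015, 1, 20, 2, 40), 0.80),
           (datetime(2015, 1, 20, 3, 20), 0.90),
           (datetime(2015, 1, 20, 4, 0), 0.50)]

modis_value = modis_box_mean(pixels, 40.0, 116.0)
ground_value = ground_window_mean(records, overpass)
```

Returning None when no valid value exists keeps missing data explicit, so that the weekly rule can count available days instead of silently averaging over gaps.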

Spatiotemporal Variability Analysis
To analyze the spatiotemporal variability of FMF, we calculate the raw variogram of each 16-day cycle from December 2014 to December 2015 using all of the data in the study area. Then, the raw variogram is binned with a spatial lag of 100 km and a temporal lag of one day. Figure 3a shows the spatial and temporal variograms for all of the MODIS data from 19 January to 3 February 2015. The horizontal axis represents the spatial distance between two observations, while the vertical axis represents the time difference between the two observations. On the time scale, the figure does not show any noticeable temporal autocorrelation in the day-to-day variability of the FMF distribution at the same distance lag for time lags of up to seven days. This is likely a result of the change in FMF distribution over these days, or of the fact that MODIS does not capture changes in the FMF over short periods in the area. Thus, the FMF fusion is performed in one-week increments. In contrast, the spatial autocorrelation is obvious, since the variances increase with the spatial distance. To show the spatial autocorrelation feature, all of the MODIS FMF data from 19 January to 3 February 2015 are examined in a single spatial variogram (Figure 3a,b). The maximum distance is set to 1000 km, since the number of pairs of observations in each bin decreases as the spatial distance increases. Figure 3b shows the spatial autocorrelation feature. The exponential model fits the experimental variogram well. In this example, the spatial correlation length of the FMF data is approximately 1425 km, and observations separated by more than this distance can be regarded as independent.
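The spatial binning described above can be sketched as follows. The snippet is a minimal, spatial-only illustration: the temporal lag is ignored, the haversine form of the great-circle distance is used, and the function names and sample observations are hypothetical. It loops over all pairs, so it is only practical for modest numbers of observations.

```python
import math
from collections import defaultdict

def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * radius_km * math.asin(min(1.0, math.sqrt(a)))

def binned_variogram(obs, lag_km=100.0, max_km=1000.0):
    """Spatial experimental variogram of (lat, lon, fmf) observations.
    Returns {bin centre in km: (mean raw semivariance, number of pairs)}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(obs)):
        for j in range(i + 1, len(obs)):
            h = haversine_km(obs[i][:2], obs[j][:2])
            if h >= max_km:
                continue  # pairs beyond the maximum distance are not binned
            k = int(h // lag_km)
            # Raw semivariance of one pair: half the squared FMF difference.
            sums[k] += 0.5 * (obs[i][2] - obs[j][2]) ** 2
            counts[k] += 1
    return {(k + 0.5) * lag_km: (sums[k] / counts[k], counts[k])
            for k in sorted(counts)}

# Hypothetical observations: (latitude, longitude, FMF).
sample = [(30.0, 110.0, 0.5), (30.0, 110.5, 0.7), (33.0, 110.0, 0.9)]
variogram = binned_variogram(sample)
```

Keeping the pair count per bin alongside the mean semivariance makes it easy to see why distant bins become unreliable: they are averaged over far fewer pairs.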

Figure 3c presents the parameters of the fitted theoretical spatial variograms for each repeat cycle from December 2014 to December 2015. The smaller correlation lengths during Winter and Spring indicate a significantly heterogeneous distribution of FMF, mainly because of the frequent occurrence of regional haze in Winter and frequent sandstorms in Spring in the study area. In Summer, the correlation lengths are abnormal (close to zero or larger than 1000 km), and the fit fails to yield reasonable parameters. The variance of the FMF observations over a long distance (beyond the correlation distance) ranges from 0.05 to 0.19 throughout the study period. The variance is slightly higher in Winter and Spring, corresponding to the relatively smaller correlation length. As a result, the fusion can be performed in one-week increments, and the spatial-only experimental variogram based on MODIS products is in good agreement with the exponential model.
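A least squares fit of the exponential model to a binned variogram can be sketched as below. The approach shown, a grid search over the range parameter l combined with a closed-form linear least squares solution for the nugget and the partial sill, is one simple way to perform such a fit and is not necessarily the solver used in this study. The function names and the candidate grid are hypothetical, and the synthetic data are generated from the model itself purely to show that the fit recovers known parameters.

```python
import math

def exp_variogram(h, nugget, sill_b, l):
    """Exponential variogram with nugget, partial sill and range parameter l."""
    return nugget + sill_b * (1.0 - math.exp(-h / l))

def fit_exp_variogram(lags_km, gammas, l_grid):
    """Least squares fit. For a fixed l the model is linear in (nugget, sill_b),
    so those two are solved in closed form and l is chosen by grid search."""
    n = len(lags_km)
    best = None
    for l in l_grid:
        g = [1.0 - math.exp(-h / l) for h in lags_km]
        sg, sgg = sum(g), sum(v * v for v in g)
        sy, sgy = sum(gammas), sum(v * y for v, y in zip(g, gammas))
        det = n * sgg - sg * sg
        if det <= 1e-12:
            continue  # regressors are numerically collinear for this l
        sill_b = (n * sgy - sg * sy) / det
        nugget = (sy - sill_b * sg) / n
        if nugget < 0.0:
            # Refit with the nugget fixed at zero to keep it non-negative.
            nugget, sill_b = 0.0, sgy / sgg
        if sill_b < 0.0:
            continue
        sse = sum((exp_variogram(h, nugget, sill_b, l) - y) ** 2
                  for h, y in zip(lags_km, gammas))
        if best is None or sse < best[0]:
            best = (sse, nugget, sill_b, l)
    if best is None:
        raise ValueError("no admissible fit on the supplied grid")
    return best[1], best[2], best[3]

# Synthetic check: bin centres every 100 km up to 1000 km.
lags = [50.0 + 100.0 * k for k in range(10)]
truth = (0.0018, 0.0141, 475.0)
gammas = [exp_variogram(h, *truth) for h in lags]
nugget, sill_b, l = fit_exp_variogram(lags, gammas, range(25, 2001, 25))
correlation_length_km = 3.0 * l
```

A fit that lands on the edge of the grid (l close to zero or at the upper bound) is a sign that the data do not constrain the range parameter, which is how an abnormal correlation length would show up in a sketch like this one.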
We use the parameters of the exponential model fitted to the MODIS products as the initial values of the variogram analysis, and then fit a set of parameters to the experimental variogram using both MODIS and ground-based FMF in the Winter of 2015. The nugget value (σ²n) is found to be close to zero (0.0018); σ²b is 0.0141; and the range parameter l is 475 km.

FMF Test Case
Figure 4a shows the mean MODIS FMF observations from 22 December to 28 December 2014. The white gaps indicate no observations in this period. The MODIS FMF observations vary from 0.1 to 1.0, and the discrepancy between neighboring FMF observations may reach 0.6. Higher FMF is mainly distributed in the Southwest of the study area. At the ground-based locations, the mean absolute error between MODIS FMF and ground-based observations is 0.48, and the maximum error reaches 0.64. The FMF fusion result for the same period is shown in Figure 4b. The fused FMF mainly ranges from 0.60 to 0.90, and higher values are distributed in the southern regions, in a similar fashion to the ground-based FMF observations. The mean absolute error at ground-based sites decreases to 0.002. The predicted uncertainties shown in Figure 4c are used to evaluate the performance of the method. The lowest uncertainty, about 0.03, occurs at the ground-based locations; the uncertainty is higher within the correlation length and highest beyond it, but does not exceed 0.15. These uncertainties are deemed to be within acceptable limits. In addition, the error propagated into the estimated FMF from the ground-based FMF is relatively larger than that from MODIS. However, because the ground-based FMF is highly accurate, the resulting errors remain small.

Comparison and Validation
Before merging the FMF data, a correlation analysis is performed between ground-based FMF and MODIS FMF in the Winter of 2015 to evaluate their temporal and spatial consistency. Following the spatial and temporal matching rules (see Section 2.4), we obtain the matching data pairs shown in Figure 5. MODIS FMF data have a wide range from 0.1 to 1.0 (values of zero are discarded), while ground-based FMF values are mainly concentrated between 0.6 and 0.9. The maximum absolute error (AE) between them is about 0.75. The correlation coefficient is very low, mainly because of the low accuracy of MODIS FMF data. It should be noted that approximately 30 percent of the data pairs are highly consistent (AE < 0.2), indicating that the MODIS FMF algorithm is not stable: only in some cases are reasonable FMF retrievals obtained. The cases of significant error between data pairs can be divided into two types: (1) MODIS FMF takes the highest value of 1.0, which is an overestimate, and (2) significant underestimation in other cases. This may be due to improper selection of the aerosol types in the MODIS FMF retrieval algorithm.
To validate the UK fusion approach and evaluate its performance, leave-one-out (LOO) cross-validation is applied to the FMF fusion results in the Winter of 2015. In each round, one of the ground-based locations is treated as unknown, and the other ground-based locations are treated as known and merged with the MODIS FMF data. The validation results are obtained by comparing the fusion results at the withheld ground-based locations with their measurements.
Comparing the pre-fusion data pairs and the leave-one-out validation results (Table 3), we find that MODIS FMF data have a wide range from 0.1 to 1.0 (values of zero are discarded), while ground-based FMF values are mainly concentrated between 0.50 and 0.95. The maximum absolute error (AE) between them is 0.75, which is a large error for a quantity bounded between zero and one. After fusion, the maximum absolute error decreases to 0.34. The point with this large error (0.34) results from a relatively high offset in the fusion. In addition, the mean absolute error between MODIS FMF and ground-based FMF is 0.38, which is much higher than the 0.13 obtained after fusion. In conclusion, the accuracy of FMF is markedly improved by the fusion, and the fused FMF is much more reliable.
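The leave-one-out procedure and the two error statistics reported above (mean and maximum absolute error) can be sketched generically. The helper below accepts any predictor as a callable; the `mean_of_others` predictor is only a hypothetical stand-in used to keep the example self-contained, and in practice the UK fusion would be called in its place. The site names and FMF values are invented for illustration.

```python
def leave_one_out(sites, observed, predict):
    """Leave-one-out cross-validation.

    predict(train_sites, train_values, target_site) must return the estimate
    at target_site using only the remaining sites.
    Returns (mean absolute error, maximum absolute error)."""
    errors = []
    for i in range(len(sites)):
        train_sites = sites[:i] + sites[i + 1:]
        train_values = observed[:i] + observed[i + 1:]
        estimate = predict(train_sites, train_values, sites[i])
        errors.append(abs(estimate - observed[i]))
    return sum(errors) / len(errors), max(errors)

def mean_of_others(train_sites, train_values, target_site):
    """Placeholder predictor: the average of the remaining sites."""
    return sum(train_values) / len(train_values)

# Hypothetical ground-based sites and weekly mean FMF values.
site_names = ["site_a", "site_b", "site_c"]
site_fmf = [0.6, 0.8, 0.7]
mae, max_ae = leave_one_out(site_names, site_fmf, mean_of_others)
```

Because each site is withheld in turn, the errors measure how well the method predicts FMF where no ground-based observation is available, which is the situation over most of the study area.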

PM2.5 Test Case
To assess the effectiveness of the FMF fusion results in deriving PM2.5 near the surface, the MODIS FMF and the FMF fusion results are used separately as the input parameters of the PMRS model. Figure 6 illustrates the two kinds of PM2.5 mass concentration results together with the PM2.5 measurements from the MEP for the period from 9 February to 15 February 2015. In Figure 6a, PM2.5 measurements in the whole study area are below 200 µg/m3, and most of them are no more than 100 µg/m3. Higher PM2.5 mass concentration is mainly distributed in the Southwest of the study area, with relatively lower values in other regions. In Figure 6b, higher values are distributed not only in the North but also in the Southwest of the study area, which is more consistent with the in situ PM2.5 measurements. In the North of the study area, a point located in Shandong Province shows a large discrepancy with the PM2.5 from the MEP, mainly because of a lack of sampling or because the satellite failed to observe the pollution process. In conclusion, PM2.5 from the FMF fusion results is more consistent with the measurements from the MEP.

Validation of PM2.5
To quantify the influence of merging FMF on estimating PM2.5, validations are performed by comparing the PM2.5 estimated from fusion FMF, the PM2.5 estimated from MODIS FMF observations, and the PM2.5 in situ measurements in the Winter of 2015. The results are shown in Figure 7. PM2.5 estimated from both MODIS observations and FMF fusion results is lower than the PM2.5 in situ measurements. Since the PM2.5 monitoring stations are mainly distributed in more polluted cities rather than rural areas, the in situ measurements tend to be higher than the remote sensing estimates. Furthermore, the PM2.5 mass concentration is generally high during haze days, but the satellite's ability to retrieve AOD and FMF is limited on highly polluted days, so some high PM2.5 mass concentrations are missed. Compared with the mean of the in situ PM2.5 measurements (88.9 µg/m3), the mean of PM2.5 estimated from FMF fusion results is much closer than that of PM2.5 estimated from MODIS FMF (87.2 µg/m3 vs. 49.8 µg/m3). Additionally, the mean error of PM2.5 estimated from FMF fusion results is smaller than that of PM2.5 estimated from MODIS FMF (41.8 µg/m3 vs. 45.3 µg/m3). In conclusion, the FMF fusion result provides FMF of higher accuracy as the input parameter of PMRS, which leads to a better characterization of PM2.5.
It should be noted that the PMRS model is a semi-physical model with several key parameters. Improving FMF with the UK method can effectively correct the systematic error of the model. However, the error caused by the coarse spatial resolution of RH and PBLH cannot currently be corrected.
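The two kinds of statistics reported in the validation paragraph are distinct: the offset between mean values compares two averages, whereas the mean error averages the pointwise differences. A short sketch using only the figures quoted in the text makes the arithmetic explicit; the variable and function names are illustrative and do not come from the study.

```python
def mean_offset(estimated_mean, reference_mean):
    """Absolute difference between two mean values (not a pointwise mean error)."""
    return abs(estimated_mean - reference_mean)

# Winter 2015 mean values quoted in the text, in µg/m3
in_situ_mean = 88.9
fusion_mean = 87.2
modis_mean = 49.8

fusion_offset = round(mean_offset(fusion_mean, in_situ_mean), 1)
modis_offset = round(mean_offset(modis_mean, in_situ_mean), 1)

# Pointwise mean errors quoted in the text, in µg/m3
mean_error_fusion = 41.8
mean_error_modis = 45.3
mean_error_reduction = round(mean_error_modis - mean_error_fusion, 1)

print(fusion_offset, modis_offset, mean_error_reduction)
```

The offset of the means shrinks far more than the mean error does, which indicates that fusion mainly removes the overall low bias while considerable point-to-point scatter remains.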


Spatio-Temporal Characteristics Analysis of FMF and PM2.5
To better understand the spatio-temporal features of FMF and PM2.5 in winter, the mean FMF and the mean PM2.5 derived from fusion FMF are shown in Figure 8. The mean FMF in the study area lies mainly between 0.6 and 0.9, and FMF in the North (mainly in Beijing and its surrounding area) and in the middle of the study area is relatively high compared with the surrounding area. As for PM2.5, most of the PM2.5 mass concentration is less than 200 µg/m3 in the study area, and higher PM2.5 occurs in the middle and South of the study area. Thus, high FMF does not necessarily coincide with high PM2.5. To examine the variability in Winter, we choose three regions (8 × 8 pixels) and calculate the mean FMF and PM2.5 in each for a detailed analysis. The regions are marked in Figure 8a. The FMF in Region A is relatively high. As for Regions B and C, a higher PM2.5 mass concentration is shown in Figure 8b.
The variabilities of FMF and PM2.5 in each region in Winter are shown in Figure 9. FMF in each region changes only slightly, while PM2.5 is rather changeable. Region A, which mainly consists of Beijing, Tianjin, and part of Hebei, shows low consistency in the variation trends not only between PM2.5 in situ measurements and PM2.5 estimated from fusion FMF, but also between the ground-based FMF measurements and the FMF fusion results. This is mainly because of a lack of samples resulting from the high surface albedo of this area in winter. Region C has the highest consistency between PM2.5 from in situ measurements and PM2.5 estimated from FMF fusion results. In Regions A and B, the averages of the in situ PM2.5 measurements and the PM2.5 estimates are close to each other in the first six time periods. In the last several time periods (namely 12 January 2015 to 28 February 2015), the errors between measurements and estimates in Regions A and B are higher, with a maximum value of 150 µg/m3, probably because the PM2.5 sites are mainly distributed in cities. As a whole, PM2.5 estimated from FMF fusion results is lower than that from the MEP, since the MEP sites are mainly distributed in cities.
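The regional statistics described above (mean, maximum, and minimum over an 8 × 8 pixel window) can be sketched as below. This is an illustrative example under stated assumptions, not the processing code of this study: the grid is synthetic, the window position is arbitrary, missing retrievals are assumed to be stored as NaN, and `region_stats` is a hypothetical helper name.

```python
import numpy as np

def region_stats(grid, row0, col0, size=8):
    """Mean, max, and min of a size x size window, ignoring NaN pixels."""
    window = grid[row0:row0 + size, col0:col0 + size]
    valid = ~np.isnan(window)
    if not valid.any():
        # No usable retrievals in this window
        return {"mean": np.nan, "max": np.nan, "min": np.nan, "n_valid": 0}
    return {
        "mean": float(np.nanmean(window)),
        "max": float(np.nanmax(window)),
        "min": float(np.nanmin(window)),
        "n_valid": int(np.count_nonzero(valid)),
    }

# Synthetic FMF grid for one time period, for illustration only
rng = np.random.default_rng(0)
fmf_grid = rng.uniform(0.6, 0.9, size=(40, 40))
fmf_grid[10:12, 10:12] = np.nan  # simulate pixels with no retrieval

stats_a = region_stats(fmf_grid, row0=8, col0=8)
print(stats_a)
```

Reporting the number of valid pixels alongside the statistics helps flag regions and periods where sparse sampling, such as that noted for Region A, may make the regional mean unrepresentative.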

Discussion and Conclusions
In this paper, a UK method is applied to the fusion of MODIS FMF data and ground-based FMF data over eastern China in the Winter of 2015. To test the applicability of this method, the fusion results have been validated using leave-one-out cross-validation and applied to derive PM2.5 using the PMRS model. The validation results show that the mean error of FMF decreases from 0.38 before fusion to 0.13 after fusion, and the maximum error decreases from 0.75 to 0.34. PM2.5 estimated from FMF fusion results is closer to PM2.5 from the MEP: the mean PM2.5 estimated from fusion FMF differs from the MEP mean by only about 1.7 µg/m3 (87.2 µg/m3 vs. 88.9 µg/m3). In conclusion, merging FMF using the UK method provides the input FMF parameter of the PMRS model with high accuracy, thus providing support for air quality monitoring.
Since only a limited number of AERONET sites have Lev 2.0 data and the method requires as many ground reference data as possible, we have used Lev 1.5 data as a supplement. Erroneous FMF values caused by cloud contamination cannot be used, because the ground-based data are combined with MODIS data.
In the future, the fusion can be performed on instantaneous FMF data to provide reliable data for short-term studies such as weather research and air quality monitoring. As FMF data from multiple sensors become available and reliable, FMF fusion can give better estimates and further our understanding of FMF and fine particulate matter.

Figure 1. The flowchart of the fusion method applied to merging MODIS FMF and ground-based FMF.

Atmosphere 2017, 8, 117

Figure 2. The study area and ground-based observation locations used in the study.

Figure 3. (a) The spatial and temporal variograms of FMF over a 16-day period from 18 January to 2 February 2015 for MODIS. The color bar indicates the semivariance; (b) The spatial variogram over the same period from MODIS; (c) Correlation length and variance of FMF for all 16-day periods from December 2014 to December 2015.

Figure 4. (a) Mean FMF observations from MODIS; (b) FMF fusion result obtained from universal kriging (UK); and (c) uncertainty associated with the UK estimates, expressed as a standard deviation. All data are for the period from 22 December to 28 December 2014. The colored solid circles in the figures are the ground-based locations and their values.

Figure 6. PM2.5 (atmospheric particulate matter with a mass median diameter less than 2.5 µm) mass concentration near the ground estimated from (a) FMF observations; and (b) FMF fusion results based on the UK method, for the time period from 9 February to 15 February 2015. The colored solid circles in the figures are PM2.5 measurements from the Ministry of Environmental Protection (MEP). The white gaps indicate no values.

Figure 7. (a) Comparison between in situ PM2.5 and PM2.5 estimated from MODIS FMF; (b) Comparison between in situ PM2.5 and PM2.5 estimated from FMF fusion results. Note that the red lines are the fitted regression lines.

Figure 8. (a) The mean FMF fusion results in the Winter of 2015 and the three regions in black squares; and (b) the mean PM2.5 estimated from the FMF fusion results in the Winter of 2015 (there is no result in the time period from 26 January 2015 to 1 February 2015 because of the lack of data pairs).

Figure 9. The variabilities of fusion results and ground-based FMF, and the variabilities of PM2.5 from in situ measurements and estimated from fusion results in each region, including the mean, maximum, and minimum value in each region. (a,d) are from Region A; (b,e) are from Region B; (c,f) are from Region C.

Table 1. The parameters of the water-soluble (WS), biomass burning (BB), and dust (DU) models in the fine mode fraction (FMF) calculation.
Note: n and k are the real part and imaginary part of the complex refractive index, respectively.

Table 2. The FMF differences between the Moderate Resolution Imaging Spectroradiometer (MODIS) algorithm and the spectral deconvolution algorithm (SDA) method in the WS, BB, and DU models.

Table 3. Comparison between the pre-fusion and leave-one-out validation results of Winter 2015 at ground-based locations a.
a Ground-based FMF is FMF from the Aerosol Robotic Network (AERONET) and the Sun-sky radiometer Observation Network (SONET); MODIS FMF is FMF from MODIS; FMF (leave-one-out (LOO)) is the cross-validation result of FMF; ∆ (MODIS) and ∆ (LOO) are the errors between ground-based FMF and MODIS FMF and between ground-based FMF and FMF (LOO), respectively.