Validation of OMI HCHO Products Using MAX-DOAS observations from 2010 to 2016 in Xianghe, Beijing: Investigation of the Effects of Aerosols on Satellite Products

Formaldehyde (HCHO) is one of the most abundant hydrocarbons in the atmosphere. Its absorption features in the 320–360 nm range allow its concentration in the atmosphere to be retrieved from space. There are two versions of HCHO datasets derived from the Ozone Monitoring Instrument (OMI)—one provided by the Royal Belgian Institute for Space Aeronomy (BIRA-IASB) and one provided by the National Aeronautics and Space Administration (NASA)—referred to as OMI-BIRA and OMI-NASA, respectively. We conducted daily comparisons of OMI-BIRA and multi-axis differential optical absorption spectrometry (MAX-DOAS), OMI-NASA and MAX-DOAS, and OMI-BIRA and OMI-NASA and monthly comparisons of OMI-BIRA and MAX-DOAS and OMI-NASA and MAX-DOAS. Daily comparisons showed a strong impact of effective cloud fraction (eCF), and correlations were better for eCF < 0.1 than for eCF < 0.3. By contrast, the monthly and multi-year monthly mean values yielded correlations of R2 = 0.60 and R2 = 0.95, respectively, for OMI-BIRA and MAX-DOAS, and R2 = 0.45 and R2 = 0.78 for OMI-NASA and MAX-DOAS, respectively. Therefore, use of the monthly mean HCHO datasets is strongly recommended. We conducted a sensitivity test for HCHO air mass factor (AMF) calculations with respect to the HCHO profile, the aerosol extinction coefficient (AEC), the HCHO profile–AEC combination, the aerosol optical depth (AOD), and the single scattering albedo (SSA) to explicitly account for the aerosol optical effects on the HCHO AMF. We found that the combination of AEC and HCHO profiles can account for 23–39% of the HCHO AMF variation. Furthermore, a high load of absorptive aerosols can exert a considerable effect (−53%) on the AMF. 
Finally, we used the HCHO monthly mean profiles from Goddard Earth Observing System coupled to Chemistry (GEOS-Chem), seasonal mean AECs from Cloud-Aerosol LIDAR with Orthogonal Polarization (CALIOP), and monthly climatologies of AOD and SSA from the OMAERUV (OMI level-2 near-UV aerosol data product) dataset at Xianghe station to determine the aerosol correction. The results reveal that aerosols can account for +6.37% to +20.7% of the HCHO monthly change. However, the changes are greatest in winter and weaker in summer and autumn, indicating that the aerosol correction is more applicable under conditions of high absorbing aerosol optical depth (AAOD) and that there may be other reasons for the significant underestimation of the satellite products relative to MAX-DOAS observations.


Introduction
In recent years, with air pollution control, PM2.5 has declined significantly in China (http://kjs.mep.gov.cn/). O3 is now the only pollutant whose mass burden has continued to increase in the 74 experimental cities of China, and the mass concentration is expected to continue to increase [1,2]. Furthermore, surface ozone has become the primary pollutant in many cities in China in the summer [3]. Volatile organic compounds (VOCs), as important precursors of ozone, play an important role in air quality [4,5]. Studies have shown that VOCs are the key to ozone control [2,6]. In addition, VOCs can generate secondary organic aerosols [7]. Formaldehyde (HCHO) is an intermediate in the oxidation of various VOCs in the atmosphere. Therefore, HCHO can be used as a tracer for VOCs in the absence of other VOC observations [8]. Background HCHO in the atmosphere is mainly derived from the oxidation of CH4. In continental regions, biomass burning, direct emissions from the industrial sector, and the oxidation of non-methane VOCs (NMVOCs) emitted by biological sources can result in regional features of the spatial and temporal distributions of HCHO [9,10]. Due to photochemical reactions and oxidation by OH radicals, the lifetime of HCHO is only 1.5 h [11].
HCHO has a unique and sufficiently strong absorption signature in the 320-360 nm range that can be used to detect HCHO concentrations from space. UV-Vis hyperspectral sensors, such as the Global Ozone Monitoring Experiment (GOME), have demonstrated that it is possible to monitor the HCHO column [12]. HCHO retrieval has also been conducted based on GOME-2 [13], the Scanning Imaging Absorption spectrometer for Atmospheric ChartographY (SCIAMACHY) [10], the Ozone Monitoring Instrument (OMI) [14,15], the Ozone Mapping and Profiler Suite (OMPS) [16,17], and the TROPOspheric Monitoring Instrument (TROPOMI) [18]. The OMI has been in orbit for 15 years, and its long-term observation record provides valuable HCHO information. There are two HCHO algorithms based on OMI data: the operational OMHCHO product published by the National Aeronautics and Space Administration (NASA) and the HCHO dataset released by the Belgian Institute for Space Aeronomy (BIRA). For convenience, these products are referred to as OMI-NASA and OMI-BIRA, respectively. With improvements in sensors, many users have already benefitted from these products for several applications, such as the study of long-term trends [10,19], top-down VOC inventory studies [20,21], sensitivity analysis of ozone production [8], and biomass burning source emission assessment [22].
According to previous studies, eastern China and the Pearl River Delta region in southern China are hotspots of HCHO. However, previous validation results in Xianghe show that the regression line between satellite observations and multi-axis differential optical absorption spectrometry (MAX-DOAS) strongly depends on the parameter settings used to calculate the air mass factor (AMF). Clear underestimation is observed when HCHO profiles from the IMAGES model are used as the a priori (−2.6 × 10^15 molec/cm^2); by contrast, the correction of the multi-year average data during the period 2008-2013 increases from 0.6 to 0.9 when the HCHO profiles from local MAX-DOAS are used, indicating an underestimation of −0.9 × 10^15 molec/cm^2 [15]. The validation results by Wang et al. [23] in Wuxi show that HCHO is biased by approximately −20%, 20%, and 12% based on OMI, GOME-2A and GOME-2B, respectively. Under clear-sky conditions, using the vertical profile of MAX-DOAS as the shape factor changes the AMF of HCHO by 11% and improves the correlation by 35%. However, the results show that the HCHO profile cannot fully explain the underestimation; other error sources should be identified to further improve the satellite retrieval algorithms.
The AMF calculation error is one of the main sources of uncertainty in the HCHO satellite retrieval products [18]. The AMF uncertainty is mainly due to assumptions about atmospheric conditions, especially the HCHO shape factor and the cloud-aerosol correction [24]. In existing HCHO inversion algorithms, the aerosol information is implicitly considered in the cloud information, whereas there is no explicit consideration of the aerosol contribution. However, the effect of aerosols on trace gas retrievals (e.g., NO2) has received increasing attention. Lin et al. [25,26] quantitatively assessed the effects of aerosols on cloud parameters and the NO2 AMF during NO2 retrieval in China and developed the POMINO algorithm, which can calculate the NO2 AMF online without look-up tables and considers aerosols explicitly. The results show that the correlation coefficient (R2) of the POMINO algorithm increased from 0.72 to 0.96, indicating that an explicit treatment of aerosols is important for space-based NO2 retrievals. However, because the aerosol parameters (extinction coefficient profile, aerosol optical depth (AOD), single scattering albedo (SSA), asymmetry factor, etc.) are complex, it is difficult to quantify them using the conventional look-up table method and, thus, the effects of aerosols on the HCHO AMF have not been explicitly considered. Both HCHO retrieval algorithms implicitly exploit aerosol information in cloud information. To some extent, the scattering effect of non-absorptive aerosols implicit in cloud information can offset the contribution of some aerosols to trace gas retrieval [18,27] because non-absorptive aerosols and clouds have similar radiation contributions. However, absorptive aerosols have the opposite effect of clouds: they reduce the spectral reflectance. Compared with an AMF that considers the absorptive aerosol contribution (AMF_a), an AMF that does not consider it (AMF_off) is overestimated, which in turn causes the HCHO vertical column to be underestimated.
In previous studies, the satellite-retrieved HCHO datasets produced by different research institutions were validated individually using MAX-DOAS or model simulations. No comparative evaluation of the applicability of different algorithms in specific regions has been performed. In this study, we validate daily and monthly averages of the HCHO data derived from OMI by BIRA and NASA using the MAX-DOAS observations in Xianghe. To account for the impact of clouds on both MAX-DOAS and satellite retrieval, we adopt the method used by Wang et al. [23], which separately evaluates the cloud effects on both satellite and MAX-DOAS observations. Additionally, the aerosol optical properties at Xianghe station derived from OMI are investigated, and the aerosol effect on the HCHO AMF is examined using SCIATRAN. Finally, the monthly mean HCHO vertical column density (VCD) is corrected based on the contribution of aerosols.
The paper is organized as follows: in Section 2, we describe the MAX-DOAS observations in Xianghe and the satellite products involved in this study. In Section 3, we compare the daily and monthly HCHO VCDs derived from BIRA and NASA with MAX-DOAS observations. In addition, we evaluate the effect of aerosols on the HCHO AMF and correct its effect on the monthly mean HCHO_BIRA and HCHO_NASA. Section 4 discusses the validation results. Section 5 presents the summary and conclusions.

MAX-DOAS Instrument and Data Analysis
The MAX-DOAS measurement technique has been developed to retrieve tropospheric trace gas total columns and profiles. The most recent generation of MAX-DOAS instruments allows the measurement of aerosols and a number of tropospheric pollutants, such as NO2, HCHO, SO2, HONO, O4 and CHOCHO [28][29][30]. The MAX-DOAS instrument used in this study was constructed and assembled by the Royal Belgian Institute for Space Aeronomy (BIRA-IASB) (see Clémer et al. [31]). It consists essentially of a telescope mounted on a sun-tracker (which can point at any angle and in any azimuthal direction) combined with two spectrographs: one for the UV spectrum (300-390 nm) and one for the visible spectrum (400-720 nm). HCHO retrieval is based on sequential observations at elevation angles of 2, 4, 6, 8, 10, 12, 15, 30 and 90°. During the period analysed in this study (2010-2016), the instrument was stationed approximately 55 km to the east-southeast of Beijing at the meteorological observatory in Xianghe (39.75° N, 116.96° E). The vertical profile retrievals can also be treated as optimal estimates because of the scientific-grade instruments installed in China [15]. Ground observations indicate a clear seasonal cycle in HCHO concentrations, with a winter minimum of approximately 5 × 10^15 molec/cm^2 and a summer maximum roughly five times higher [29]. In general, we expect that the VCDs from MAX-DOAS have much lower uncertainties than those from satellite observations, though the bias of MAX-DOAS HCHO VCDs is −23 ± 28% with respect to near-surface volume mixing ratios [23,29].

OMI-BIRA HCHO Product
The BIRA-IASB algorithm for the retrieval of HCHO columns was updated by De Smedt et al. [15]. To better account for O2-O2 and BrO absorption, the new version (V14) of the HCHO retrieval scheme includes three fitting intervals: 339-364 nm for the pre-fit of O4, 328.5-359 nm for the pre-fit of BrO and 328.5-346 nm for the fit of the HCHO slant column density (SCD). QDOAS software is used for the DOAS retrieval, with the earthshine radiances averaged in the equatorial Pacific as reference spectra. A destriping correction and background normalization are applied in the across-swath position. The 340-nm HCHO AMF is calculated using the LIDORT v3.3 radiative transfer model, which uses a priori HCHO profiles from a global 2.0° (latitude) × 2.5° (longitude) IMAGESv2 chemistry transport model simulation. Cloud effects are corrected using the independent pixel approximation [15]. No explicit correction is applied for aerosols, but the cloud correction scheme accounts for a large part of their scattering. Rows affected by the OMI row anomaly are identified using the fitting residuals. Pixels with fitting residuals more than 3 times the average fitting residual, effective cloud fractions >0.4, solar zenith angles larger than 70°, or individual vertical column errors larger than 3 times the column are assigned a poor-quality (<0) flag.
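The screening criteria listed above can be sketched as a simple pixel filter. This is only an illustration of the thresholds quoted in the text, not the BIRA-IASB implementation; the `Pixel` record, its field names, and the `screen` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pixel:
    """Hypothetical per-pixel record; field names are assumptions."""
    fit_residual: float  # DOAS fitting residual (arbitrary units)
    ecf: float           # effective cloud fraction, 0..1
    sza_deg: float       # solar zenith angle in degrees
    vcd: float           # vertical column, molec/cm^2
    vcd_error: float     # vertical column error, molec/cm^2


def is_poor_quality(p, mean_residual):
    """Apply the four thresholds quoted in the text to one pixel."""
    return (
        p.fit_residual > 3.0 * mean_residual  # residual > 3x the average
        or p.ecf > 0.4                        # too cloudy
        or p.sza_deg > 70.0                   # sun too low
        or p.vcd_error > 3.0 * abs(p.vcd)     # error > 3x the column
    )


def screen(pixels):
    """Return only the pixels that pass all four criteria."""
    if not pixels:
        return []
    mean_residual = sum(p.fit_residual for p in pixels) / len(pixels)
    return [p for p in pixels if not is_poor_quality(p, mean_residual)]


pixels = [
    Pixel(1.0, 0.1, 40.0, 1.0e16, 5.0e15),  # passes
    Pixel(1.0, 0.5, 40.0, 1.0e16, 5.0e15),  # cloud fraction too high
    Pixel(1.0, 0.1, 75.0, 1.0e16, 5.0e15),  # solar zenith angle too large
    Pixel(1.0, 0.1, 40.0, 1.0e15, 4.0e15),  # column error too large
]
kept = screen(pixels)
```

Here the average residual is taken over the pixels passed in; which set of pixels defines "the average fitting residual" in the operational product is not stated in the text.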

OMI-NASA HCHO Product
The official NASA HCHO product is provided by the updated Smithsonian Astrophysical Observatory (SAO) retrieval, as described in González Abad et al. [14] and evaluated in Zhu et al. [32]. HCHO SCDs are retrieved through a direct non-linear least-squares fitting of spectral radiances within the interval 328.5-356.5 nm. The retrieval algorithm includes dynamic calibration of solar and radiance wavelengths, an under-sampling correction, and a common mode residual spectrum. Daily radiances are used as reference spectra. The retrieved SCDs are converted to VCDs using AMFs taken from the look-up table pre-computed using the VLIDORT radiative transfer model, which uses a priori HCHO profiles from a global 2.0° (latitude) × 2.5° (longitude) GEOS-Chem chemistry transport model simulation. A daily post-processing normalization correction (as a function of latitude and detector row) is applied to reduce retrieval biases, minimize noise, and reduce cross-track striping. Only good pixels with a main data quality flag of 0 are used in this work. Barkley et al. [33] used the OMI-NASA HCHO data as a factor in the assessment of air quality changes over the Middle East during 2005-2014.

Statistical Evaluation Methods
In this study, the OMI-BIRA and OMI-NASA HCHO products are compared with MAX-DOAS observations and with each other. The root mean square error (RMSE) between the satellite products and MAX-DOAS columns is used to quantify the systematic biases and is calculated following Equation (1).
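Equation (1) is not reproduced in this text. As a minimal sketch, assuming the conventional definition of the RMSE and the Pearson correlation coefficient (the function names and sample values below are illustrative, not taken from the study):

```python
import math


def rmse(sat, ref):
    """Root mean square error between paired satellite and reference columns."""
    if len(sat) != len(ref) or not sat:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(sat, ref)) / len(sat))


def pearson_r(x, y):
    """Pearson correlation coefficient R; R**2 is the value quoted as R2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Illustrative columns in units of 10^15 molec/cm^2 (made-up numbers).
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```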

AMF Calculation
The AMF is defined as the ratio of the SCD to the VCD. A geometric AMF ($\mathrm{AMF}_G$) can be defined as $\mathrm{AMF}_G = \sec\theta_s + \sec\theta_v$, which is a function of the solar zenith angle $\theta_s$ and the satellite viewing angle $\theta_v$ in the absence of atmospheric scattering. In the actual atmosphere, the AMF also depends on the trace gas profile, air pressure, surface albedo, and aerosol profiles. In this work, the AMFs are calculated at 340 nm by SCIATRAN [34,35]. The HCHO profiles from GEOS-Chem, which feature maximum values concentrated near the ground and an exponential decrease in values with height, are used to specify the vertical distribution. High-resolution GEOS-Chem model runs were performed for 2007 (0.667° long. × 0.5° lat.) and 2014 (0.3125° long. × 0.25° lat.), and each run involved 36 layers, with ~10 layers below 3 km. The model profiles from this run were also used for NO2 retrievals over China [25,26]. The HCHO AMF is sensitive to the vertical distribution of atmospheric species, and a more sophisticated calculation of the HCHO AMF includes the effects of clouds [15,18] and aerosols [25,26]. The AMF is calculated by applying the following equation:

$$\mathrm{AMF}_i(\lambda) = -\frac{\ln\left(I_{+i}/I_{-i}\right)}{\mathrm{VOD}_i(\lambda)},$$

where $I_{+i}$ and $I_{-i}$ are the intensities with and without the absorber $i$, respectively. The vertical optical density $\mathrm{VOD}_i(\lambda)$ from the ground to the top of the atmosphere $h_0$ is defined as

$$\mathrm{VOD}_i(\lambda) = \int_0^{h_0} \rho_i(h)\,\sigma_i(\lambda)\,\mathrm{d}h,$$

where $h$ is the altitude, $\rho$ is the number density of the absorber, and $\sigma$ is the absorption cross-section at a given wavelength. This method is similar to [36] and is valid only for an optically thin atmosphere.
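The quantities defined in this section can be illustrated numerically. The sketch below assumes the standard optically thin form, AMF = −ln(I_with / I_without) / VOD, with the VOD obtained by integrating number density times cross-section over altitude; the function names and sample numbers are hypothetical, and in the study the intensities come from SCIATRAN rather than from a toy calculation.

```python
import math


def geometric_amf(sza_deg, vza_deg):
    """Geometric AMF: sec(solar zenith angle) + sec(viewing zenith angle)."""
    return 1.0 / math.cos(math.radians(sza_deg)) + 1.0 / math.cos(math.radians(vza_deg))


def vertical_optical_density(alt_cm, number_density, cross_section):
    """Trapezoidal integral of rho(h) * sigma from the ground to the top level.

    alt_cm: level altitudes in cm (increasing)
    number_density: absorber number density at each level, molec/cm^3
    cross_section: absorption cross-section at the wavelength, cm^2/molec
    """
    column = 0.0
    for k in range(len(alt_cm) - 1):
        dh = alt_cm[k + 1] - alt_cm[k]
        column += 0.5 * (number_density[k] + number_density[k + 1]) * dh
    return column * cross_section


def amf_from_intensities(i_with, i_without, vod):
    """AMF from intensities simulated with and without the absorber."""
    return -math.log(i_with / i_without) / vod


# Toy profile: three levels at 0, 1 and 2 km, densities decreasing with height.
vod = vertical_optical_density([0.0, 1.0e5, 2.0e5], [2.0e10, 1.0e10, 0.0], 1.0e-19)
# Intensities constructed so that the recovered AMF is 1.5.
amf = amf_from_intensities(math.exp(-1.5 * vod), 1.0, vod)
```

Because the logarithm of the intensity ratio is divided by the VOD, the expression only behaves as an air mass factor when the absorption is weak, which matches the optically thin restriction stated in the text.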

Validation of Satellite Datasets

Daily Comparisons
Based on the daily averaged data in Figure 2, comparisons were performed between the daily products of OMI-BIRA and MAX-DOAS, OMI-NASA and MAX-DOAS, and OMI-NASA and OMI-BIRA. To characterize the cloud effect on the comparisons, the effective cloud fraction (eCF), which is obtained from the OMCLDO2 cloud product [37], was classified using value ranges of 0-0.3, 0-0.2, and 0-0.1 for the three comparisons.
The number of coincident observation days for eCF < 0.3, eCF < 0.2 and eCF < 0.1 is 1003 (40.17%), 748 (29.96%) and 447 (17.90%), respectively, in Xianghe during the period from March 2010 to December 2016. Notably, a previous study found that cloud effects on MAX-DOAS results are negligible for satellite validation activities and that cloud effects on the HCHO products become substantial for eCF > 0.3 [23]. Figure 3a-c displays scatter plots of the daily averaged OMI-BIRA and OMI-NASA datasets versus the corresponding MAX-DOAS measurements for eCF bins of 0-0.3, 0-0.2 and 0-0.1, respectively. The correlation coefficient (R) for eCF < 0.1 is better than the R for eCF < 0.3 for the three groups of comparisons. The OMI-BIRA and OMI-NASA datasets have similar correlations with MAX-DOAS, i.e., R = 0.26 (R = 0.34) and R = 0.27 (R = 0.32), respectively, for eCF < 0.3 (eCF < 0.1). The correlations between OMI-NASA and OMI-BIRA are 0.48, 0.53 and 0.56 for eCF < 0.3, eCF < 0.2, and eCF < 0.1, respectively, based on the daily comparison. Figures A1 to A3 show daily comparisons between OMI-BIRA and MAX-DOAS, OMI-NASA and MAX-DOAS, and OMI-NASA and OMI-BIRA by season. Figure A1 shows that the correlation between OMI-BIRA and MAX-DOAS is the best in spring, with R = 0.25 (R = 0.26) for 0 < eCF < 0.2 (0 < eCF < 0.1). Tightening the upper limit of eCF from 0.3 to 0.2 changes the correlation between the satellite data and the MAX-DOAS observations for all seasons except winter; however, less variation is observed for stricter cloud screening (eCF < 0.1) in summer. In winter, there is a weak correlation or even a negative correlation between OMI-BIRA and MAX-DOAS, indicating that the winter satellite data need to be used cautiously. Figure A2 shows that OMI-NASA has a better correlation with MAX-DOAS in spring. When stricter eCF control is employed, the satellite dataset's correlation with the ground observations can be improved for spring and summer, but the two datasets have a poor correlation in summer and winter. OMI-NASA and OMI-BIRA have similar correlations except for winter, with R values ranging from 0.25 to 0.48 for eCF < 0.3 (Figure A3). The differences in dependence on eCF of OMI-BIRA and MAX-DOAS, OMI-NASA and MAX-DOAS, and OMI-BIRA and OMI-NASA will be discussed in Section 4.
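The cloud screening used for the daily comparison can be sketched as follows: for each upper limit on eCF, keep the coincident days below that limit and recompute the sample size and R. This is an illustration only; the function names and the five sample days are made up, and the strict inequality follows the eCF < 0.3, 0.2, 0.1 notation in the text.

```python
import math


def _pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_by_ecf(sat, ref, ecf, limits=(0.3, 0.2, 0.1)):
    """Return {limit: (number of days, R)} for days with eCF below each limit."""
    out = {}
    for limit in limits:
        keep = [k for k, c in enumerate(ecf) if c < limit]
        xs = [sat[k] for k in keep]
        ys = [ref[k] for k in keep]
        # R needs at least two points; report None otherwise.
        out[limit] = (len(keep), _pearson(xs, ys) if len(keep) >= 2 else None)
    return out


# Made-up daily columns (10^15 molec/cm^2) and cloud fractions.
sat = [1.0, 2.0, 3.0, 4.0, 10.0]
ref = [1.0, 2.0, 3.0, 4.0, 0.0]
ecf = [0.05, 0.08, 0.15, 0.25, 0.28]
stats = correlation_by_ecf(sat, ref, ecf)
```

In this toy case the cloudiest day is an outlier, so R improves once the limit drops below its cloud fraction, at the cost of fewer coincident days.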

Monthly Comparisons
This section describes the monthly comparisons of OMI-BIRA and MAX-DOAS, and OMI-NASA and MAX-DOAS. The R2 values of OMI-BIRA and MAX-DOAS, and OMI-NASA and MAX-DOAS are 0.60 and 0.45, respectively, and the corresponding RMSEs are 2.20 × 10^15 molec/cm^2 and 2.90 × 10^15 molec/cm^2. These results are significantly better than the daily correlations in Section 3.1. The multi-year monthly average is shown in Figure 4b. The R2 of OMI-BIRA (OMI-NASA) and MAX-DOAS is 0.95 (0.78), with an RMSE of 0.72 × 10^15 molec/cm^2 (1.64 × 10^15 molec/cm^2). The correlation is significant within the 95% confidence interval. The multi-year monthly averages of the three datasets are shown in Figure 5. The multi-year monthly average of OMI-BIRA is significantly lower than that of MAX-DOAS. The largest difference occurs in June, reaching −9.472 × 10^15 molec/cm^2 (−59.47%), and the smallest difference occurs in January, with a value of −1.337 × 10^15 molec/cm^2 (−15.05%), as indicated in Table 1. By contrast, the multi-year monthly average of OMI-NASA is higher than that of MAX-DOAS from October to February, with a range of +9.70% to +33.74%, and is slightly lower in March-October, with the smallest difference occurring in September. October has the largest negative difference, reaching −16.28%. However, the difference is less significant than that of OMI-BIRA.

Table 1. Absolute and relative differences between OMI-BIRA, OMI-NASA and MAX-DOAS based on multi-year averages.
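The averaging behind the monthly comparison can be sketched in two steps: daily values are first averaged per calendar month of each year, and those monthly means are then averaged across years. Whether the study averaged in exactly this order is an assumption; the function names and sample values are hypothetical.

```python
from collections import defaultdict


def multi_year_monthly_mean(records):
    """records: iterable of (year, month, value). Returns {month: mean}.

    Monthly means are computed per (year, month) first, then averaged
    across years so that every year carries equal weight.
    """
    per_month = defaultdict(list)
    for year, month, value in records:
        per_month[(year, month)].append(value)
    across_years = defaultdict(list)
    for (year, month), values in per_month.items():
        across_years[month].append(sum(values) / len(values))
    return {m: sum(v) / len(v) for m, v in across_years.items()}


def relative_difference_percent(sat, ref):
    """Relative difference of the satellite value with respect to the reference."""
    return 100.0 * (sat - ref) / ref


# Made-up daily columns in units of 10^15 molec/cm^2.
records = [(2010, 1, 2.0), (2010, 1, 4.0), (2011, 1, 6.0), (2010, 6, 10.0)]
means = multi_year_monthly_mean(records)
diff = relative_difference_percent(8.0, 10.0)
```

A negative relative difference, as reported for OMI-BIRA in every month, means the satellite column is lower than the MAX-DOAS column.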

Aerosol Effects on the Satellite Results
To further understand the impact of aerosols on the HCHO AMF calculation, we performed a sensitivity analysis to determine the dependence of the HCHO AMF calculation on the vertical distribution of the HCHO profile, vertical distribution of the aerosol extinction coefficient (AEC), AOD, and SSA.

HCHO AMF Dependence on the AEC
In this study, the AECs provided by the Regional Atmospheric Modeling System and Models-3 Community Multiscale Air Quality (RAMS-CMAQ) model and by Cloud-Aerosol LIDAR with Orthogonal Polarization (CALIOP) are used to test the effects of the AEC on the calculation of the HCHO AMF. The air quality modelling system RAMS-CMAQ has been used to investigate regional atmospheric pollution and environmental issues in many studies [2,38,39]. The regional model CMAQ can simulate various chemical and physical processes valuable for understanding atmospheric trace gas transformation and distribution [40]. CALIOP, which is aboard the CALIPSO satellite, is an elastic backscatter LIDAR that transmits polarized laser light at 532 and 1064 nm [41]. The LIDAR signal inversion begins at approximately 30 km above the ground and continues to the surface. Its orbit repeats every 16 days [42]. Wu et al. [43] noted that RAMS-CMAQ can reflect the vertical distribution of aerosols in China. The AEC peak is ~0.5 km above the ground, but it is underestimated (0.1 km^−1) in north-western China and overestimated (0.3 km^−1) in central and eastern China. Considering the low temporal resolution of CALIPSO, the RAMS-CMAQ and CALIOP datasets complement each other to some extent. The two datasets over China in 2013-2015 were used to perform cluster analysis, and Figure 7a,b shows the results of 5-8 categories (N_AEC) for RAMS-CMAQ and CALIOP, respectively. Figure 7 shows that the magnitude and height of the AEC peak differ between clusters and that the differences in the AEC among the classes are significantly larger for CALIOP than for RAMS-CMAQ. The RAMS-CMAQ classes show a similar vertical shape factor, which consistently decreases with increasing altitude. By contrast, CALIOP has a finer vertical resolution than RAMS-CMAQ and can reflect more complex features, e.g., the peak position changes from the surface to 2 km. The dependence of the AMF on the two AEC datasets is tested with the AOD set to 1 and the SSA set to 0.92. We find that the changes in the AMF caused by changes in the AEC types are relatively insignificant. As shown in Figure 8, for RAMS-CMAQ, the change caused by the AEC profile type is small (between 5% and 8%), and for CALIOP, the change is larger (between 12% and 23%). When more categories are used, more subtle differences can be reflected.
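The text does not name the clustering algorithm used to group the AEC profiles. As a minimal sketch, assuming plain k-means on the profile vectors with a deterministic farthest-point initialisation (both are assumptions made here, and the four toy profiles are invented):

```python
def _dist2(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_profiles(profiles, k, n_iter=50):
    """Group vertical profiles into k classes; returns (labels, centroids)."""
    # Farthest-point initialisation keeps the result reproducible.
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        far = max(profiles, key=lambda p: min(_dist2(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(n_iter):
        # Assign every profile to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: _dist2(p, centroids[j])) for p in profiles
        ]
        # Recompute each centroid as the level-by-level mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, new_labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Toy extinction profiles (km^-1) on three levels: two surface-peaked, two elevated.
profiles = [
    [0.80, 0.40, 0.10],
    [0.70, 0.35, 0.10],
    [0.10, 0.50, 0.60],
    [0.15, 0.55, 0.65],
]
labels, centroids = kmeans_profiles(profiles, k=2)
```

Each centroid is a class-mean profile of the kind shown per cluster in Figure 7, and increasing k corresponds to using more categories.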

HCHO AMF Dependence on the AEC and HCHO Profile Combination
When the HCHO vertical profiles of N_HCHO = 5 in Section 3.3.1 and the CALIOP AEC profiles of N_AEC = 8 in Section 3.3.2 are selected, the range of AMF variation in the group is 23-39% (Figure 9), substantially larger than the variability induced by changes in the vertical HCHO profile and AEC individually. Therefore, the relative positions of the HCHO and AEC profiles affect the AMF results. Consequently, in the AMF calculation, it is necessary to consider the combination of both.

HCHO AMF Dependence on AOD and SSA
As shown in Figure 10, when the aerosol SSA remains unchanged, the HCHO AMF decreases with increasing AOD. Furthermore, the smaller the SSA, the greater the effect of a change in AOD on the AMF. For SSA = 0.82, when AOD changes from 0.1 to 2, the AMF decreases by 53%, and the AMF differences lessen with increasing AOD. For SSA = 0.97, when AOD changes from 0.1 to 2, the AMF decreases by 14%. When AOD remains unchanged, the AMF shows an increasing trend with increasing SSA. The larger the AOD, the more obvious the increasing trend of the AMF. Figure 10b shows the percentage change in the AMF with SSA = 0.82 as a reference. The change in the AMF is largest for SSA = 0.97 and AOD = 2, up to 83%.
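Because the AMF is the ratio of the SCD to the VCD, a relative change in the AMF translates directly into a relative change in the VCD when the SCD is held fixed. The short derivation below illustrates this under the fixed-SCD assumption, with δ a symbol introduced here for the fractional AMF change; the resulting percentages are arithmetic consequences of the AMF changes quoted above, not values reported in this study.

```latex
\mathrm{VCD} = \frac{\mathrm{SCD}}{\mathrm{AMF}}
\quad\Longrightarrow\quad
\frac{\Delta \mathrm{VCD}}{\mathrm{VCD}} = \frac{1}{1+\delta} - 1,
\qquad \delta = \frac{\Delta \mathrm{AMF}}{\mathrm{AMF}}

\delta = -0.53:\quad \frac{1}{0.47} - 1 \approx +1.13 \;(+113\%)
\qquad
\delta = -0.14:\quad \frac{1}{0.86} - 1 \approx +0.16 \;(+16\%)
```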

Figure 9. AMF values corresponding to the combination of the HCHO profile and AEC clusters. The colours correspond to the AEC profiles in Figure 7 for each cluster. The change in the x-axis corresponds to the HCHO profiles in Figure 6a for N_HCHO = 5.

Aerosol Correction
Based on the results of the HCHO AMF sensitivity analysis, we extract monthly average HCHO profiles from GEOS-Chem for 2007 and 2014 in the Xianghe area and seasonal AEC averages for 2013-2015 as the initial input of the AMF calculation (Figure 11). At Xianghe station, the HCHO concentration decreases with altitude, and HCHO is concentrated mainly in the lowest 2 km. The maximum is 2.8 ppb in August, and the minimum is less than 1 ppb in April. However, in December, the HCHO concentration near the ground exceeds 2 ppb and is larger than the average of the adjacent months. The vertical distribution of aerosols is similar in summer, autumn and winter, with the maximum at 500 m. In spring, apart from the relatively high value at 500 m, the vertical distribution differs from that in the other seasons in the range of 2-4 km. These differences may be related to dust transport in spring.
The AOD and SSA parameters are derived from the OMI level-2 near-UV aerosol data product 'OMAERUV'. OMAERUV products are derived from the reflectance measured by OMI and climatological surface albedo data from the Total Ozone Mapping Spectrometer (TOMS) at 354 nm and 388 nm [44]. The validation results at Xianghe station in 2005-2008 indicate that the correlation of AOD is 0.87, with an RMSE of 0.17 [45]. Furthermore, the validation results in 2005-2014 show that the correlation of AOD is 0.79, with an RMSE of 0.21 [46]. Despite the coarse spatial resolution of 13 × 24 km², using near-UV observations for aerosol characterization has the advantage of reducing retrieval errors associated with land surface reflectance, owing to the low surface albedo in this spectral region. In this paper, monthly climatologies of AOD and SSA at Xianghe station are calculated. The regional heterogeneity of aerosol optical properties in China is large and varies with the seasons [47]. The monthly SSA ranges from 0.86 to 0.93. Starting in May, there is a marked rise in SSA, indicating a decrease in aerosol absorption. The seasonal cycle of AOD at 354 nm is the same as that in other regions: AOD increases significantly to a maximum of 1.30 in August and reaches a minimum of 0.15 in October.
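The validation statistics quoted above (a correlation and an RMSE between satellite and ground-based AOD) can be computed along the following lines. This is only a sketch: the paired AOD values are hypothetical, and the cited validation studies may have used different matching and filtering criteria.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical collocated AOD pairs, for illustration only.
satellite_aod = [0.20, 0.45, 0.80, 1.10, 0.35]
ground_aod = [0.25, 0.40, 0.95, 1.00, 0.30]

r = pearson_r(satellite_aod, ground_aod)
error = rmse(satellite_aod, ground_aod)
print(f"R = {r:.2f}, RMSE = {error:.2f}")
```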
Using the monthly HCHO vertical profile as the reference scene, the seasonal vertical extinction coefficient profile at Xianghe station and the monthly AOD and SSA parameters are included in the AMF calculation. Compared with the reference scene, which does not consider aerosol parameters, the relative changes were largest in November, December, April and March, i.e., 20.70%, 20.60%, 18.10%, and 10.31%, respectively. These changes were all positive, and the HCHO concentration increased after correction. The OMI-BIRA and OMI-NASA HCHO multi-year monthly mean data after aerosol correction are shown in Figure 12. After the correction, the OMI-NASA results are higher than the ground-based MAX-DOAS results from October to May, and the monthly averages from May to September are closer to the MAX-DOAS data. OMI-BIRA increased by a different proportion each month, and the correction reduced the discrepancy between the OMI-BIRA and MAX-DOAS data.
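The positive sign of the correction follows from the standard DOAS relation VCD = SCD / AMF: for a fixed slant column, an AMF lowered by absorbing aerosols yields a larger vertical column. The sketch below illustrates this with hypothetical numbers. The slant column and both AMF values are placeholders, and the paper does not state whether its quoted percentages were computed in exactly this way.

```python
def vertical_column(scd, amf):
    """Vertical column density from a slant column density and an AMF."""
    if amf <= 0:
        raise ValueError("AMF must be positive")
    return scd / amf


def vcd_change_percent(amf_reference, amf_aerosol):
    """Percentage change in VCD when the AMF is replaced, SCD held fixed."""
    return 100.0 * (amf_reference / amf_aerosol - 1.0)


# Hypothetical values for illustration only (not taken from the paper).
scd = 1.2e16           # slant column, molec/cm^2
amf_reference = 1.00   # AMF without explicit aerosols
amf_aerosol = 0.85     # AMF with absorbing aerosols included

vcd_ref = vertical_column(scd, amf_reference)
vcd_aer = vertical_column(scd, amf_aerosol)
print(f"VCD change: {vcd_change_percent(amf_reference, amf_aerosol):+.1f}%")
```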

Discussion
In this work, the correlations between the daily satellite datasets and the MAX-DOAS observations are weak compared with those reported for NO2 [25,26,48], indicating that the uncertainties in the satellite daily HCHO columns are large, especially in winter. The daily comparison results are in qualitative agreement with the validation results for Wuxi, where the OMI-BIRA and MAX-DOAS relationship has R² = 0.12 for eCF < 0.1 [23]. The daily comparison shows a larger random error, and the cloud effect can partly explain the bias in the whole dataset. However, at the seasonal scale, the random error remains significant, and the effect of cloud correction depends on the season. This dependence may be due to cloud properties in different seasons or to the cloud retrieval algorithm. At low eCF (eCF < 0.1, close to the detection limit of the cloud fraction [37,49]), cloud top pressure retrieval is highly unstable, which can introduce uncertainty in the HCHO retrieval [18]. This can also be seen in Figure A4, in which the correlation of the monthly comparison for eCF < 0.3 is better than that for eCF < 0.1. In addition, because OMI-BIRA and OMI-NASA adopt similar cloud correction methods, the correlation between the two depends less on eCF than do the correlations of the satellite datasets with the MAX-DOAS measurements. The poor correlation in winter can be attributed to the low winter HCHO levels over Xianghe and the detection limits of OMI-BIRA (7 × 10¹⁵ molec/cm²) and OMI-NASA (10 × 10¹⁵ molec/cm²). In summary, for both OMI-BIRA and OMI-NASA, the daily data should be used with caution. Use of the daily average data in winter is not recommended.
The monthly comparison shows that monthly averaging reduces the random error in the satellite products and ground-based data. OMI-BIRA has a larger systematic underestimation error than OMI-NASA, consistent with previous studies [32]. The GEOS-Chem and IMAGES chemical transport models supply the HCHO profiles (shape factors) for OMI-NASA and OMI-BIRA, respectively. The HCHO profiles assumed for the HCHO AMF calculation may be a key factor accounting for the systematic bias [15,23]. The other main factor related to the systematic difference between OMI-BIRA and OMI-NASA is background correction (see Table 2 for more detail). The inconsistent simulation of the HCHO concentration in the reference sector by the chemical transport models will dominate the monthly systematic bias. Furthermore, unlike for OMI-BIRA, no pre-processing related to BrO interference is conducted for OMI-NASA. The overestimate provided by OMI-NASA in winter months may be unrealistic and may be due to strong interference of BrO at high latitudes. Theys et al. [50] showed that the zonal mean of tropospheric BrO vertical columns was larger from September to April of the next year at 30-60°N. The monthly comparisons indicate that the monthly mean value of satellite observations can be used for air quality assessment. In addition, the RMSE of OMI-BIRA is smaller than that of OMI-NASA. This difference can be attributed to the consistency of the algorithm between the ground-based MAX-DOAS and the satellite-based OMI-BIRA.
With respect to the aerosol effect on the HCHO AMF, our results indicate that the combination of the HCHO profile and the AEC profile can make a considerable difference under polluted conditions. The SSA is clearly smaller on the North China Plain, where there are more absorptive aerosols. Therefore, aerosol parameters such as AOD and SSA need to be taken into account in the calculation of the AMF. The aerosol correction can partly eliminate the systematic underestimation, although other factors (the HCHO shape factor and the cloud effect) have a larger influence. The explicit consideration of the contribution of aerosols is necessary, especially under a high load of absorptive aerosols.

Figure 1
Figure 1 shows the spatial distribution of multi-year mean HCHO data from OMI-BIRA and OMI-NASA during the period 2010-2016. The daily averaged OMI-BIRA (squares) and OMI-NASA (circles) data within 50 km of Xianghe station and the daily averaged MAX-DOAS data (triangles) within the period 12:00-15:00 local time (LT) (corresponding to the satellite overpass time of 13:30) are shown in Figure 2. The black, red and green lines represent the monthly averaged OMI-BIRA, MAX-DOAS and OMI-NASA data, respectively. The vertical bars indicate one standard deviation. The comparison in the following section is based on these data.
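The spatial collocation described above (averaging satellite pixels within 50 km of the station) can be sketched as follows. The station coordinates are approximate and assumed here for illustration, the pixel values are hypothetical, and the function names are not from the paper. The 12:00-15:00 LT window applied to the MAX-DOAS data would be a separate time filter that is not shown.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_within_radius(pixels, station, radius_km=50.0):
    """Mean column of the pixels whose centre lies within radius_km of station.

    pixels: iterable of (lat, lon, column); station: (lat, lon).
    Returns None when no pixel qualifies.
    """
    selected = [column for lat, lon, column in pixels
                if haversine_km(lat, lon, station[0], station[1]) <= radius_km]
    return sum(selected) / len(selected) if selected else None


# Approximate station coordinates, assumed here for illustration.
XIANGHE = (39.75, 116.96)

# Hypothetical pixel centres and columns (molec/cm^2), not real OMI data.
pixels = [
    (39.80, 117.00, 1.0e16),  # a few km away: kept
    (39.60, 116.70, 1.4e16),  # roughly 28 km away: kept
    (41.00, 118.50, 3.0e16),  # well beyond 50 km: dropped
]
daily_mean = mean_within_radius(pixels, XIANGHE)
print(f"{daily_mean:.2e}")
```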

Figure 3.
Figures A1 to A3 show daily comparisons between OMI-BIRA and MAX-DOAS, OMI-NASA and MAX-DOAS, and OMI-NASA and OMI-BIRA by season. Figure A1 shows that the correlation between OMI-BIRA and MAX-DOAS is the best in spring, with R = 0.25 (R = 0.26) for 0 < eCF < 0.2 (0 < eCF < 0.1). The changes in eCF can change the correlation between the satellite data and the MAX-DOAS observations when the upper limit of eCF changes from 0.3 to 0.2 for all seasons except winter; however, less variation is observed for stricter cloud screening (eCF < 0.1) for

3.3.1. HCHO AMF Dependence on the HCHO Profile
Cluster analysis of the HCHO vertical profiles from GEOS-Chem in 2007 over China was performed, and the results for 5-8 categories (N_HCHO) are shown in Figure 6a. The AMF corresponding to each vertical HCHO distribution is shown in Figure 6b. The results show that the position and magnitude of the HCHO maximum are well separated between clusters and can represent background and hotspot regions (e.g., cluster5 (blue line) and cluster6 (brick-red line) for N_HCHO = 8 represent low and high HCHO concentrations near the surface, respectively), in addition to elevation information (e.g., cluster3 (magenta line) for N_HCHO = 6, cluster7 (orange line) for N_HCHO = 7 and cluster4 (green line) for N_HCHO = 8 indicate HCHO profiles of high elevation). N_HCHO = 8 can fully reflect the influence of the vertical HCHO distribution on the AMF calculation. The maximum difference in the AMF between different clusters is approximately 20%. The test results for N_HCHO > 8 show that the effect on the AMF reflected by the different classes is basically equivalent to that for N_HCHO = 8.
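The text does not name the clustering algorithm. As one plausible reading, the sketch below groups vertical profiles with a plain k-means, which is an assumption rather than the authors' stated method. The six four-level profiles (in ppb) are hypothetical, and the first k profiles are used as initial centres only to keep the example deterministic; a random or k-means++ initialisation is the more usual choice in practice.

```python
def kmeans(profiles, k, iterations=50):
    """Plain k-means on equal-length profiles (lists of floats).

    Initial centres are the first k profiles, so the result is deterministic.
    Returns (labels, centres).
    """
    centres = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest centre by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centres[j])))
            for p in profiles
        ]
        # Update step: each centre becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, new_labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centres


# Hypothetical HCHO profiles (ppb) on four altitude levels, surface first.
profiles = [
    [2.8, 1.5, 0.6, 0.2],  # surface-enhanced (hotspot-like)
    [0.5, 0.4, 0.3, 0.2],  # background
    [0.4, 0.9, 1.2, 0.5],  # elevated layer
    [2.6, 1.4, 0.5, 0.2],  # surface-enhanced
    [0.6, 0.4, 0.3, 0.2],  # background
    [0.5, 1.0, 1.1, 0.4],  # elevated layer
]
labels, centres = kmeans(profiles, k=3)
print(labels)  # prints [0, 1, 2, 0, 1, 2]
```

Each cluster centre would then serve as one representative profile for an AMF calculation, in the spirit of Figure 6.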

Figure 6. (a) HCHO profile clusters (N_HCHO: 5-8); (b) air mass factor (AMF) corresponding to the HCHO profiles in panel (a). Each colour bar corresponds to the HCHO profile of the same colour.

Figure 8. AMF values corresponding to the AEC clusters. The colours correspond to the AEC profiles in Figure 7 for each cluster.

Figure 9. AMF values corresponding to the combination of the HCHO profile and AEC clusters. The colours correspond to the AEC profiles in Figure 7 for each cluster. The change in the x-axis corresponds to the HCHO profiles in Figure 6a for N_HCHO = 5.

Figure 11. Monthly mean HCHO profile and seasonal mean AEC in Xianghe.

Figure 12. Before and after aerosol correction.

Table 1. Absolute and relative differences between OMI-BIRA, OMI-NASA and MAX-DOAS based on multi-year averages.