Improved Satellite Retrieval of Tropospheric NO 2 Column Density via Updating of Air Mass Factor ( AMF ) : Case Study of Southern China

Improving air quality and reducing human exposure to unhealthy levels of airborne chemicals are important global missions, particularly in China. Satellite remote sensing offers a powerful tool to examine regional trends in NO2, thus providing a direct measure of key parameters that strongly affect surface air quality. To accurately resolve spatial gradients in NO2 concentration using satellite observations and thus understand local and regional aspects of air quality, a priori input data at sufficiently high spatial and temporal resolution to account for pixel-to-pixel variability in the characteristics of the land and atmosphere are required. In this paper, we adapt the Berkeley High Resolution product (BEHR-HK) and meteorological outputs from the Weather Research and Forecasting (WRF) model to describe column NO2 in southern China. The BEHR approach is particularly useful for places with large spatial variabilities and terrain height differences such as China. There are two major objectives and goals: (1) developing new BEHR-HK v3.0C product for retrieving tropospheric NO2 vertical column density (TVCD) within part of southern China, for four months of 2015, based upon satellite datasets from Ozone Monitoring Instrument (OMI); and (2) evaluating BEHR-HK v3.0C retrieval result through validation, by comparing with MAX-DOAS tropospheric column measurements conducted in Guangzhou. Results show that all BEHR-HK retrieval algorithms (with R-value of 0.9839 for v3.0C) are of higher consistency with MAX-DOAS measurements than OMI-NASA retrieval (with R-value of 0.7644). This opens new windows into research questions that require high spatial resolution, for example retrieving NO2 vertical column and ground pollutant concentration in China and other countries.


Introduction
Air pollution is a worldwide issue.Atmospheric pollutants can have devastating effects on human health and ecosystems, including contributions to long term human diseases such as cancer and cardiovascular malfunctions [1,2].Poor air quality also causes deterioration of forest ecosystems and direct damage to trees and plants [3,4].In particular, nitrogen oxides (NO x = NO + NO 2 ) are trace gases that have both anthropogenic and natural sources, such as the burning of fossil fuels, emissions from motor vehicles, lightning and biomass burning [5][6][7][8].These trace gases have important effects on air quality and the production of ozone and aerosols within the atmosphere [9,10].NO x regulates surface ozone levels, initiates acid rain, and contributes to enhanced aerosol formation (e.g., NH 4 NO 3 ) and affects the global climate through chemistry of the greenhouse gases, ozone and methane [11,12].
In recent years, the Chinese government has implemented control policies and plans to reduce the amount of trace gases emitted by fossil fuel burning and vehicle emissions.These plans include clear targets and goals for the reduction of energy consumption, and emission of trace gases (such as NO 2 and SO 2 ) in Five-Year Plans (FYPs) 12 (2011-2015) and 13 (2016-2020) [13,14].These documents include targeted reduction percentages for each specific pollutant at both national and provincial levels.However, it is difficult to assess the effectiveness of control measures practiced at the provincial level, especially for NO 2 .This is mainly due to the lack of a complete ground monitoring network for the whole of China.Only Beijing, Guangzhou, Nanjing, Shanghai and a few more developed cities in China had ground monitoring stations for air quality research studies before 2013 [15], so there are no long-term and large-scale monitoring datasets from which to assess changes in pollutant emissions throughout previous years.To conduct in-situ NO x and NO 2 measurements, some Chinese local governments follow the example of many countries and have used chemiluminescence techniques for measuring ground surface NO 2 concentrations since 2010 [16,17], and the number of monitoring stations within the "Guangdong-Hong Kong-Macao-Pearl River Delta Regional Air Monitoring Network" has increased from 13 to 18 since its establishment in September 2014 [18].Satellite datasets and remote sensing techniques monitor the column density of pollutants within the lower troposphere, and ground pollutant concentrations are derived from inverse formulations [19] and nonlinear statistical models [20].These data provide a long record (more than 20 years) that can be used to evaluate changes in emissions and chemistry.
All NO 2 retrievals require a priori data, including NO 2 vertical profiles, surface pressure and surface reflectance, to correctly interpret the amount of NO 2 observed.Numerous studies have shown the importance of using high spatial and temporal resolution a priori data [22][23][24][25]; however, computational constraints limit the global retrievals to coarse (>1 • ) NO 2 profiles.
Regional products (covering only a part of the world) provide a means to use higher resolution a priori inputs.Two such products covering part of China are the POMINO [26] and HKOMI [27] retrievals.The POMINO retrieval uses NO 2 profiles at 0.5 • × 0.667 • resolution [26], which is better than OMI-NASA standard retrieval, but still coarser than necessary to capture urban NO x gradients well [22,23], for which emission changes occur on local and regional scales with gradients of change that are often small compared to standard approaches to interpreting satellite observations.Although HKOMI retrieval uses daily profiles at 3 km resolution [27], to our knowledge, retrieval has only been produced for four months (October 2006-January 2007), while the possibility of extending it to longer period is highly uncertain.
Therefore, we need an algorithm that has been applied to satellite retrieval for years or seasons.BEHR is one of these examples, and has been applied in different places within the United States (US) in previous years [24,28,29].Moreover, daily high resolution NO 2 profiles are necessary to correctly interpret the emissions and lifetimes implied by the observations, because the systematic errors without daily profiles can be up to 40% [30].
In this paper, we combine detailed high resolution land-use data for China with similarly high resolution meteorological and chemical information derived from the Weather Research and Forecasting (WRF) and Community Multiscale Air Quality (CMAQ) models to develop high spatial resolution NO 2 columns, using the BEHR [27][28][29] approach in deriving an air mass factor (AMF) and applying newly developed algorithm (BEHR-HK v3.0C) to the main part of southern China.Then, by combining the adjusted AMF spatial patterns with available satellite datasets and remote sensing techniques, we retrieve the tropospheric NO 2 VCD for the different seasons of 2015.Moreover, we also compare different versions of BEHR (in particular, BEHR-HK v3.0C) and OMI-NASA retrieval results with MAX-DOAS tropospheric column measurement datasets obtained in Guangzhou, and find that BEHR-HK VCDs show better agreement with the independent MAX-DOAS measurements.
This article is structured as follows: Section 2 includes the description of the BEHR-HK v3.0C product.Section 3 describes our study areas, the model and satellite datasets and the set-up of a common grid for easier manipulation and comparison of the big datasets required for our analysis.Section 4 presents the results and comparison of VCDs based on different model outputs as well as satellite products and retrieval algorithms.Section 5 includes more details regarding improvement and correction of BEHR v3.0B to BEHR-HK v3.0C, and provides a statistical comparison of these retrieval methods.MAX-DOAS measurements in Guangzhou are used for validation in Section 6, which verifies that the revised retrieval algorithm is an improvement compared with previous methods.Numerical accuracies of each NO 2 satellite retrieval method are discussed and highlighted for further research studies.Finally, Section 7 summarizes the main findings and concludes with some possible future research direction.

Principles of NO 2 Remote Sensing
In 2004, OMI was launched with satellite EOS-Aura to measure solar radiation backscattered by the Earth using 2D charge-coupled device (CCD) detectors [31,32].It orbits the Earth around every 99 min, crossing the equator on the sunlit side of the Earth at around 13:30 LT, with a spatial resolution of 13 × 24 km 2 at its nadir and a spectral resolution of 0.5 nm.The reflected sunlight is fit using Differential Optical Absorption Spectroscopy (DOAS) [32], based on Beer-Lambert law, that the light intensity at a specific wavelength can be expressed as in Equation (1).
The intensity can be separated into a component that varies slowly with wavelength (I 0 (λ)), and another that varies quickly with wavelength (the exponential part), with α i being the approximated absorption cross section of single trace gas i (temperature and wavelength dependent), and C i (x) stands for the concentration element in trace gas i.By using slowly varying component as baseline, the slant column density (SCD) can be computed for individual trace gases by multistep removal of instrumental noise and effects such as Rayleigh scattering [33].Afterwards, the stratospheric contribution of SCD is removed, with methods in [31,34] as the two main approaches, and the AMF is estimated from the a priori NO 2 profile, scattering weights and radiance cloud fraction obtained from the regional model, chemistry transport model and satellite products.The AMF is a necessary quantity that converts tropospheric SCD into vertical column density (VCD) to account for atmospheric light scattering [35].The AMF is conceptually the ratio of modeled slant to vertical column densities (Equation ( 2)): The tropospheric AMF (AMF trop ) is calculated as in Equation (3).
The AMF trop is used to calculate the tropospheric VCD (VCD trop ) with the aid of Equation ( 4), VCD trop = SCD trop AMF trop (4) where SCD trop represents the tropospheric slant column density, W is a scattering weight from TOMRAD, NO 2p represents the a priori NO 2 profile from the CMAQ simulation, and we assume "0" and X T to be the surface layer and tropospheric height, respectively.

BEHR Algorithm for NO 2 Column Retrieval
This paper adopts NASA OMI NO 2 Standard Product v3.0 (SPv3) (presented as OMI-NASA in the rest of this paper) [36] and a modified version of the BEHR product [28].BEHR computes tropospheric VCDs from the OMI-NASA tropospheric SCDs, using a custom AMF, therefore the methods of spectral fitting and stratospheric subtraction in BEHR algorithm are the same as for OMI-NASA retrieval.However, the BEHR algorithm uses higher resolution NO 2 profiles, surface reflectance and terrain pressure data compared with the NASA algorithm [37].For this domain, these NO 2 profiles are simulated by WRF-CMAQ, with its configuration explained and listed in Section 3.2.Throughout this paper, we only use pixels that have Cloud Fraction f ≤ 0.2, tropospheric AMF > 10 −6 , and satisfy Quality Flag constraints [36], and reject all low quality pixels.

Calculation of AMF and Tropospheric VCDs
The BEHR algorithm is described in detail in [28].Briefly, the BEHR algorithm uses a custom AMF to convert NO 2 tropospheric SCDs from the NASA standard product to tropospheric VCDs.The custom AMF relies on the ancillary datasets listed in Table 1, and is calculated as an integral of scattering weights and shape factor, as shown in Equations ( 5) and (6), where p 0 refers to the surface pressure for the clear-sky AMF and the cloud pressure for the cloudy-sky AMF, p trop represents tropopause pressure, and w(p) = w 0 (p)(1 − 0.003T(p) + 0.66) represents the scattering weight that depends on several meteorological factors: w 0 (p), obtained from look-up tables; T(p), taken from WRF; and S(p), the shape factor that depends on NO 2 vertical profile g(p).
For cloudy-sky conditions, the cloud radiance fraction f is taken into consideration.The tropospheric AMF for general atmospheric conditions is written as the linear combination of clear-sky AMF and cloudy-sky AMF, as shown in Equation (7).
Combining Equations ( 5) and (7), and letting w clear (p) and w cloudy (p) be the scattering weights of the clear and cloudy parts of the concerned pixel, the BEHR AMF can be evaluated by Equation (8), based on Section 2 of [28], for which the parameters are well-defined in previous paragraphs.
The BEHR retrieved tropospheric VCD is calculated by dividing the OMI-NASA tropospheric SCD by AMF BEHR-trop , as shown in Equation (9) [36,38].The AMF calculation requires several a priori inputs, including surface pressure and reflectivity, and NO 2 vertical profiles.Compared to the OMI-NASA product, the values used in the BEHR-HK AMF calculation are ingested at higher resolution and averaged to the OMI pixels.Vertical profiles of NO 2 , temperature, pressure, and geopotential are inputs for applying BEHR algorithm, and are obtained from WRF-CMAQ model simulations with 9 km spatial resolution, as described in details in Section 3.2.
BEHR-HK v3.0A used a fixed 200 hPa tropopause and calculated surface pressure from the GLOBE terrain height, assuming a 7.4 km scale height.However, in BEHR-HK v3.0B, the calculation of surface pressure makes use of the method described in [42] to combine the GLOBE terrain height with WRF surface pressure, while the tropopause pressure is calculated based on the WMO thermal tropopause definition [43], with temperature profiles obtained from WRF model.Discontinuities in the tropopause pressure are masked and filled in by an iterative interpolation algorithm, as the tropopause pressure is assumed to vary smoothly in space.Finally, the tropopause height is calculated from interpolated thermal tropopause pressure [44].The differences between BEHR-HK v3.0A and v3.0B retrievals are described in details in [28] and by the changelog in [45].
The implementation of calculating tropopause pressure in BEHR-HK v3.0B has been found to be sensitive to the spacing of the levels within temperature profiles, therefore here we test an experimental branch of BEHR-HK, v3.0C that uses the tropopause pressure available in the NASA Standard Product [31,36].

CMAQ Tropospheric VCD Simulation
A priori NO 2 profiles are necessary for running BEHR.A CMAQ modeling system driven by WRF outputs simulates the NO 2 concentration (in ppbv) of each pixel within each of the 26 vertical layers to provide the corresponding profiles.CMAQ simulations are based on corresponding WRF outputs and the Sparse Matrix Operator Kernel Emissions (SMOKE) modeling system, which provides emission data, and initial and boundary conditions [46,47].Our retrieved domain has a spatial resolution of 9 × 9 km 2 .Suppose we take C i to be the sum of concentration (ppmV) of all valid pixels within layer i, h i as the thickness (in m) of the ith vertical layer, and the room temperature as (273.15+ 25) K, the tropospheric NO 2 VCD of a particular timeslot is calculated by Equation (10), with the aid of the ideal gas equation (PV = nRT).
VCD CMAQ in molecules cm 2 = 12.187 × 1000 × 46.01 Here, 12.187 represents the inverse of Universal Gas Constant (R), 46.01 is the molecular weight (g/mol) of NO 2 , 7.64 × 10 −13 is the factor that converts µg/m 2 into molecules/cm 2 , and the term ∑ 26 i=1 9000 2 h i C i represents total tropospheric NO 2 column concentration in µg.

Region of Interest
We focused on OMI satellite data for part of southern China (including Dongguan, Foshan, Guangzhou, Huizhou, Jiangmen, Shenzhen and Zhuhai), with longitude from 109 • E to 117.5 • E and latitude from 19.5 • N to 25.5 • N, as shown in Figure 1.Most of these cities are located in the coastal region of southern China, which is characterized by complex topography and a variety of geographical features, including hills and islands.In particular, Shenzhen and Hong Kong are close to the sea and have steep gradients in land surface [48], thus sea-land breezes cause some changes in meteorological and climatic conditions in these areas [49,50].The region shown in Figure 1 is intentionally chosen to investigate and test the accuracy and validity of satellite remote sensing algorithms and chemical transport models in retrieving tropospheric NO 2 VCD, under conditions where high spatial resolution is expected to be important.Here, 12.187 represents the inverse of Universal Gas Constant (R), 46.01 is the molecular weight (g/mol) of NO2, 7.64 × 10 −13 is the factor that converts μg/m 2 into molecules/cm 2 , and the term ∑ 9000 2 ℎ

Region of Interest
We focused on OMI satellite data for part of southern China (including Dongguan, Foshan, Guangzhou, Huizhou, Jiangmen, Shenzhen and Zhuhai), with longitude from 109°E to 117.5°E and latitude from 19.5°N to 25.5°N, as shown in Figure 1.Most of these cities are located in the coastal region of southern China, which is characterized by complex topography and a variety of geographical features, including hills and islands.In particular, Shenzhen and Hong Kong are close to the sea and have steep gradients in land surface [45], thus sea-land breezes cause some changes in meteorological and climatic conditions in these areas [46,47].The region shown in Figure 1 is intentionally chosen to investigate and test the accuracy and validity of satellite remote sensing algorithms and chemical transport models in retrieving tropospheric NO2 VCD, under conditions where high spatial resolution is expected to be important.

Datasets
As mentioned in Section 2.2.2, BEHR is an algorithm that focuses on achieving high spatial resolution from satellite remote sensing of NO2.There are several required inputs to drive the BEHR algorithm for retrieving the vertical column of NO2.
In particular, all meteorology fields (for example, temperature, pressure and geopotential) in our

Datasets
As mentioned in Section 2.2.2, BEHR is an algorithm that focuses on achieving high spatial resolution from satellite remote sensing of NO 2 .There are several required inputs to drive the BEHR algorithm for retrieving the vertical column of NO 2 .
In particular, all meteorology fields (for example, temperature, pressure and geopotential) in our study are simulated by WRF version 3.2 based on [51].There are 38 vertical layers for most WRF output variables, with more than 10 layers positioned within planetary boundary layer (PBL).We used the Yonsei University PBL scheme [52], Grell-Devenyi ensemble cumulus parameterization scheme [53], the Noah Land Surface Model [54], and the WRF single-moment six-class microphysics scheme (WSM6) [55].For longwave and shortwave radiation, the Rapid Radiative Transfer Model (RRTM) [56] and Dudhia shortwave radiation scheme [57] were used.Throughout the study, we used two of the nested meshes of our WRF modeling system, namely Domain 1 (D1) and Domain 2 (D2).These two domains have spatial resolutions of 27 km and 9 km.We focused on D2 simulation and retrieval in this current study, as it is the smallest domain that covers most parts of southern China, including all 10 major cities in Pearl River Delta (PRD) region and the Guangdong Province.The CMAQ simulation was then driven based on WRF outputs to give a priori NO 2 profiles, therefore WRF domain is larger than the corresponding CMAQ domain (of same spatial resolution) to minimize boundary effects towards CMAQ simulation.
For the NO 2 slant column, we adopted the Version 3 Aura Ozone Monitoring Instrument (OMI) Nitrogen Dioxide (NO 2 ) Standard Product (OMNO2) from NASA [41]; for albedo, we used the Moderate Resolution Imaging Spectroradiometer (MODIS) Bidirectional Reflectance Distribution Function (BRDF) reflectance from MCD43D07-09 [39] (with spatial resolution of 0.05 • × 0.05 • , and calculation of 16-day average every 8 days).For terrain pressure, elevation data from the Global Land One-Kilometer Elevation (GLOBE) database (with a resolution of 1 km × 1 km) [40] was averaged within the OMI pixel bounds to yield the pixel elevation.In BEHR-HK v3.0B and onwards, base and perturbation geopotential profiles from WRF are also necessary to calculate the surface pressure, based on the approach outlined in [42].Table 1 shows the attributes of the aforementioned datasets, including their sizes, array structures, units and name of product (if applicable).

Grid Formation and Decomposition
As shown in Table 1, the different datasets mentioned in Section 3.2 have different array structures.In the native OMI resolution BEHR-HK data, the AMF and tropospheric VCD will be 2D arrays with latitude and longitude coordinates in every satellite-passing hour.For consistent projection and fair comparison of spatial plots, we created a rectangular grid within our domain of interest (from 108 • E to 118 • E and from 18 • N to 26 • N), with a mesh size of 0.1 • .In this grid, we decomposed the area into 8181 small grids, each with an area of 0.1 • × 0.1 • .For all satellite datasets, we first projected every available pixel onto this grid hour by hour, by weighting the contribution of each pixel to the grid cell based on amount of areas that overlaps, and then took the temporal average within each prescribed period.We then constructed spatial plots based on the longitude, latitude and information of concerned attribute.

Results
This section provides the retrieval results of tropospheric NO 2 column density obtained from OMI-NASA retrieval (Section 2.1), BEHR-HK retrieval (Section 2.2) and WRF-CMAQ simulation (Section 2.3).Corresponding AMF spatial distributions within Southern China for OMI-NASA and BEHR-HK retrieval for all four months are presented as part of the Supplementary Materials (Figures S1 and S2).

Comparison of OMI-NASA, BEHR-HK and WRF-CMAQ VCDs
We first compared NO 2 VCDs from the OMI-NASA product, BEHR-HK v3.0C product, and CMAQ simulation.BEHR-HK VCDs were computed with Equation ( 9) and CMAQ VCDs with Equation (10).Corresponding plots of the AMFs from the OMI-NASA and BEHR-HK products are available in Figures S1 and S2.
Figure 2 shows the monthly average tropospheric NO 2 VCD based upon the OMI-NASA retrieval.The tropospheric VCD in winter (represented by January 2015 in Figure 2a) is the highest out of all four seasons (many pixels exceed 10 16 molecules/cm 2 ), especially cities lying on coastal region such as Dongguan (DG), Foshan (FS), Guangzhou (GZ), Shenzhen (SZ) and Zhongshan (ZS), followed by spring and autumn (represented by April and October 2015), ranging 3-9 × 10 15 molecules/cm 2 .The average tropospheric VCD in summer is the lowest, at around 2-6 × 10 15 molecules/cm 2 .As mentioned in Section 2.2, BEHR makes use of higher spatial resolution a priori NO2 profiles for retrieval, in combination with meteorological variables within southern China from WRF simulations.Therefore, it is believed to capture differences between retrieved NO2 VCDs for neighboring pixels well, especially when a pixel representing a rural area is situated next to a pixel that represents an urban region.Within many provinces in China, for example in Guangdong, the land-use pattern can change abruptly, in particular, the spatial variability of Guangzhou (GZ) is illustrated in [58].Figure 3 shows the spatial distribution of BEHR-HK v3.0C retrieved tropospheric NO2 column density within four retrieval periods.Further statistical analysis and correlation between v3.0B and v3.0C retrieval are provided and discussed in Section 5.2.The spatial pattern looks similar As mentioned in Section 2.2, BEHR makes use of higher spatial resolution a priori NO 2 profiles for retrieval, in combination with meteorological variables within southern China from WRF simulations.Therefore, it is believed to capture differences between retrieved NO 2 VCDs for neighboring pixels well, especially when a pixel representing a rural area is situated next to a pixel that represents an urban region.Within many provinces in China, for example in Guangdong, the land-use pattern can change abruptly, in particular, the spatial variability of Guangzhou (GZ) is illustrated in [58].Figure 3 Remote Sens. 2018, 10, 1789 9 of 23 shows the spatial distribution of BEHR-HK v3.0C retrieved tropospheric NO 2 column density within four retrieval periods.Further statistical analysis and correlation between v3.0B and v3.0C retrieval are provided and discussed in Section 5.2.The spatial pattern looks similar to OMI-NASA retrieval (in Figure 2), except that BEHR-HK captures some relatively higher values of NO 2 VCD in the east and north of our domain, for example Meizhou (MZ) and Shanwei (SW) in the east, and Shaoguan (SG) in the north of Guangzhou.In BEHR-HK retrieval, these areas show significantly higher amounts of NO 2 VCD compared with OMI-NASA retrieval.Interestingly, BEHR-HK shows greater NO 2 VCDs in April than October, whereas OMI-NASA shows the reverse pattern.BEHR-HK v3.0C was developed while writing this manuscript to address challenges in correctly computing the tropopause pressure for this domain.We compared the BEHR-HK v3.0C tropospheric NO 2 VCD with that of the OMI-NASA product by calculating the percentage difference between the two, where percentage difference = (BEHR-HK VCD − OMI-NASA VCD)/OMI-NASA VCD × 100%.Corresponding spatial plots are illustrated in Figure 5. From these plots, we find that in cities located in coastal regions, such as Guangzhou and Shenzhen, BEHR-HK v3.0C retrieval gives lower NO 2 VCD in January and October 2015, while higher NO 2 VCD is obtained in April and July 2015.However, the trend is opposite for Chaozhou and Shanwei.For cities that are located in inland areas such as Hezhou, Qingyuan and Shaoguan, as well as Haikou (near Hainan Province), BEHR-HK retrieval gives higher NO 2 VCD than the OMI-NASA product in all four periods.
As shown in Figures S1 and S2, BEHR-HK v3.0C generally has lower AMF and higher tropospheric NO 2 column density compared with OMI-NASA, which could be caused by numerous factors.Previous studies have outlined several factors to account for such difference, including high-resolution NO 2 profiles, MODIS surface reflectance and aerosols [26,27].In contrast to previous regional retrievals over China [26,27], we do not treat aerosols explicitly, instead relying on the implicit correction through the aerosol effect on retrieved cloud properties [34].The largest effects, especially increases in VCDs in urban areas and decreases in rural areas, are most likely due to the increased resolution of the NO 2 profile [24,25,28,30].Other factors affecting the difference between OMI-NASA and BEHR-HK VCDs are the lower surface reflectance in BEHR-HK compared with OMI-NASA, and the lack of lightning NO x information in the WRF-CMAQ generated NO 2 a priori profiles.The latter factor is likely a major reason some locations distant from cities show higher BEHR-HK than OMI-NASA VCDs in July.This is not the expected effect due to higher resolution profiles; improved resolution generally decreases rural VCDs, as the high resolution rural profiles have less surface NO 2 when they are not averaged with urban profiles, which always happen when coarser profiles are adopted [59].
computing the tropopause pressure for this domain.We compared the BEHR-HK v3.0C tropospheric NO2 VCD with that of the OMI-NASA product by calculating the percentage difference between the two, where percentage difference = (BEHR-HK VCD − OMI-NASA VCD)/OMI-NASA VCD × 100%.Corresponding spatial plots are illustrated in Figure 5. From these plots, we find that in cities located in coastal regions, such as Guangzhou and Shenzhen, BEHR-HK v3.0C retrieval gives lower NO2 VCD in January and October 2015, while higher NO2 VCD is obtained in April and July 2015.However, the trend is opposite for Chaozhou and Shanwei.For cities that are located in inland areas such as Hezhou, Qingyuan and Shaoguan, as well as Haikou (near Hainan Province), BEHR-HK retrieval gives higher NO2 VCD than the OMI-NASA product in all four periods.As shown in Figures S1 and S2, BEHR-HK v3.0C generally has lower AMF and higher tropospheric NO2 column density compared with OMI-NASA, which could be caused by numerous factors.Previous studies have outlined several factors to account for such difference, including highresolution NO2 profiles, MODIS surface reflectance and aerosols [26,27].In contrast to previous regional retrievals over China [26,27], we do not treat aerosols explicitly, instead relying on the implicit correction through the aerosol effect on retrieved cloud properties [34].The largest effects, especially increases in VCDs in urban areas and decreases in rural areas, are most likely due to the increased resolution of the NO2 profile [24,25,28,30].Other factors affecting the difference between OMI-NASA and BEHR-HK VCDs are the lower surface reflectance in BEHR-HK compared with OMI-NASA, and the lack of lightning NOx information in the WRF-CMAQ generated NO2 a priori profiles.The latter factor is likely a major reason some locations distant from cities show higher BEHR-HK than OMI-NASA VCDs in July.This is not the expected effect due to higher resolution Nevertheless, comparing with WRF-CMAQ simulation in Figure 4, we can conclude that BEHR-HK retrieval is able to better capture the spatial variability in southern China, for example the changes of NO 2 VCD or NO x emission within neighboring cities in Guangdong province.Moreover, having a priori inputs and satellite products of higher spatial resolution generally lead to higher VCD in cities, but the situation is more variable in China due to its highly heterogeneous land use.Many cities in southern China contain both rural and urban areas, and it is believed that urban areas with higher transportation flow rate and more industrial activities will emit higher amount of NO 2 , while rural areas with less commercial activity produce lower NO 2 emissions.Complex mixture of urban and rural areas may explain why BEHR-HK VCDs are not uniformly greater than NASA VCDs in cities.

Correction of BEHR-HK v3.0B and Improvement to v3.0C
In the process of retrieval, we identified certain insufficiencies of BEHR-HK v3.0B when it is applied in southern China, therefore we developed a new tool (v3.0C) accordingly.As described below, the issue appears to be associated with coarse spacing of the WRF-derived temperature profiles in the upper troposphere for this domain.This resulted in a high bias in the WRF-derived tropopause pressure throughout this domain.This is not expected to significantly affect conclusions related to application of v3.0B to North America [28,29].In Section 5.1, we describe the abnormally large VCDs encountered in BEHR-HK v3.0B retrieval within July 2015 due to this bias and discuss ways of dealing with this issue.In Section 5.2, we focus on comparing BEHR-HK v3.0B and v3.0C retrieval, and provide comments regarding the potential effects after tropopause pressure formulation is altered.

Causes of Abnormally Large VCDs in BEHR-HK v3.0B
In previous literature based on OMI-NASA satellite retrieval [24,60], the reported tropospheric NO 2 VCD within southern China (see Section 3.1) is within the range of 10 15 -10 16 molecules/cm 2 .After updating AMF based on BEHR algorithm, all output files of BEHR-HK v3.0A, v3.0B and v3.0C give reasonable numerical values of tropospheric NO 2 VCD throughout our interested domain in January, April, and October 2015.However, some pixels of several dates in July 2015 have unreasonably high numerical values of tropospheric NO 2 VCD (>10 17 molecules/cm 2 ).Table 2 provides details of occurrences of these unexpectedly high values after regridding is conducted in BEHR-HK v3.0B retrieval.All of them >10 19 molecules/cm 2 , with 10 pixels >10 20 molecules/cm 2 These abnormally large VCDs occur in pixels where both p cloud < p trop and the cloud radiance fraction ( f ) is large.In BEHR-HK v3.0B and v3.0C, if p cloud < p trop , the cloudy-sky AMF is set to zero, representing a case where there is no above-cloud NO 2 .When the cloud fraction is also of large quantity, this will make the overall AMF (AMF trop ) very small by Equation (7).Since the VCDs are inversely proportional to the AMF by Equation ( 9), this results in extremely large VCDs.Fundamentally, this error occurs when most of the NO 2 column are obscured by clouds and the AMF calculation must attempt to reconstruct the total tropospheric NO 2 VCD with limited observational information.Such error is more pronounced in the southern China domain than the US domain, mainly because, in southern China domain, the tropopause pressure calculated from WRF temperature profiles in BEHR-HK v3.0B is biased high by 180 hPa comparing with the tropopause calculated by NASA, whereas the difference is only around 50 hPa in the US domain.This high bias increases the frequency of pixels with p cloud < p trop .This bias in the tropopause was traced to the coarse spacing of the WRF temperature profiles in the upper troposphere for the southern China domain, and can be corrected in BEHR-HK v3.0B retrieval by removing any pixels for which p cloud < p trop , however BEHR-HK v3.0C replaces the WRF-derived tropopause pressure with the NASA-calculated one.Figure 6 shows the spatial plots of average tropospheric NO 2 VCD in July 2015 before (Figure 6a) and after pixel filtering (Figure 6b) is adopted.Figure 6c shows the BEHR-HK v3.0C retrieval result.From the figures, we conclude that both of the aforementioned approaches successfully remove pixels with anomalously large VCDs.
domain, and can be corrected in BEHR-HK v3.0B retrieval by removing any pixels for which   <   , however BEHR-HK v3.0C replaces the WRF-derived tropopause pressure with the NASAcalculated one.Figure 6 shows the spatial plots of average tropospheric NO2 VCD in July 2015 before (Figure 6a) and after pixel filtering (Figure 6b) is adopted.Figure 6c shows the BEHR-HK v3.0C retrieval result.From the figures, we conclude that both of the aforementioned approaches successfully remove pixels with anomalously large VCDs.

BEHR-HK v3.0B vs. BEHR-HK v3.0C
Figure 7 shows the correlation plot between BEHR-HK v3.0B and v3.0C for four different months in 2015.For Figure 7c, we conduct filtering and eliminate 56 pixels affected by the issue described in

BEHR-HK v3.0B vs. BEHR-HK v3.0C
Figure 7 shows the correlation plot between BEHR-HK v3.0B and v3.0C for four different months in 2015.For Figure 7c, we conduct filtering and eliminate 56 pixels affected by the issue described in Section 5.1 in BEHR-HK v3.0B dataset and obtain the correlation plot for July 2015.Table 3 presents the statistical parameters that assess the relationship between NO 2 VCD for the two BEHR-HK retrievals.Readers can refer to Supplementary Information S3 for details of these statistical parameters.From the relative position of data points with respect to y = x (blue line) shown in Figure 7, together with the slope of best-fit line for the four months, we conclude that BEHR-HK v3.0B retrieval generally gives a higher amount of NO 2 VCD than BEHR-HK v3.0C.This happens in all 4 retrieval periods in 2015.For July 2015, even after filtering out problematic pixels in BEHR-HK v3.0B, around 85% of data points lie below the blue line (as shown in Figure 7c).This is the expected effect of adopting the smaller NASA tropopause pressure, as that includes more of the upper troposphere in the AMF integral, and since the scattering weights in the upper troposphere are generally large, this increases the AMF and therefore decreases the VCDs in BEHR-HK v3.0C compared with v3.0B.Since the thermal tropopause calculation overestimates the tropopause pressure because of coarse spacing of temperature profiles in the UT, the NO 2 VCDs obtained by v3.0C retrieval likely represents the actual situation better.Thus, we conclude that there is slight over-estimation in this domain when the v3.0B retrieval method is used.the v3.0B retrieval method is used.
Moreover, the R-value in winter and spring is slightly higher than that in summer and autumn; thus, the corresponding RMSE values in these seasons are lower.One possible reason is the changing dispersion rate of pollutants in different seasons.Moreover, winter tropopause is at a greater pressure and lower altitude, which is being brought to a level closer to model level, therefore calculations using our BEHR-HK products capture it better.Table 3.Comparison of statistical parameters for BEHR-HK v3.0C (y-axis) and v3.0B (x-axis) retrieval (Least Square Fitting by assuming linear regression holds).Moreover, the R-value in winter and spring is slightly higher than that in summer and autumn; thus, the corresponding RMSE values in these seasons are lower.One possible reason is the changing dispersion rate of pollutants in different seasons.Moreover, winter tropopause is at a greater pressure and lower altitude, which is being brought to a level closer to model level, therefore calculations using our BEHR-HK products capture it better.

Discussion
In this section, we compare the satellite retrieval algorithms with available raw measurement datasets and obtain corresponding uncertainties and errors for each method, thus providing a source to judge what kind of algorithm is feasible and applicable to NO 2 retrieval over a larger domain.

MAX-DOAS Validation in Guangzhou
For validation purposes, we compared satellite retrieval results for all algorithms with MAX-DOAS tropospheric measurements in Guangzhou.Within our retrieved domain, Guangzhou is the largest city.It has enormous commercial and industrial activities, and thus high NO x emission [61], so obtaining time series of both ground NO 2 concentrations and tropospheric NO 2 VCD is extremely important for imposing future control policies.We adopted the tropospheric measurements conducted in [62] for comparison, for which a MAX-DOAS system (comprised of CCD spectrograph) was installed on the roof of a nine-story building in Guangzhou Institute of Geochemistry of the Chinese Academy of Sciences (GIG-CAS) from March 2015 to March 2016 (Geographical location: latitude: 23 • 8 54 N, longitude: 113 • 21 32 E) [62].A spectrograph with spectral resolution of 0.35 nm captured radiation of wavelength 300-450 nm, as well as measuring sky radiance at different elevation angles [62,63].The tropospheric NO 2 VCD measurement figures are based on homogeneous spatial distribution of NO 2 within the lower troposphere along every azimuthal direction.Within our retrieval period, comparisons can only be made for April and July 2015, due to damage to the optical system [62].We first considered all valid daily measurements that are ±0.5 h of the satellite retrieval period, for example if the satellite retrieval period on a specific day is 06:00 and 07:00 UTC, then all MAX-DOAS measurements conducted during 05:30-07:30 UTC were considered.Based on all these available measurements, a daily average was calculated, which provided us MAX-DOAS tropospheric NO 2 VCD measurements for comparison purposes.Next, for our satellite-retrieved datasets, upon projection onto our preset grid (Section 3.3), we referred to the grid that represents Guangzhou (latitude: 23.1 • N, longitude: 113.3 • N) and obtained corresponding OMI-NASA and BEHR-HK satellite-retrieved tropospheric NO 2 VCDs for each day.Figures 8 and 9 show the comparison of MAX-DOAS, OMI-NASA, and BEHR-HK (v3.0C) retrieved NO 2 VCD for April and July 2015, together with their corresponding error bounds, obtained based on measurement set-ups (MAX-DOAS), regridding of satellite datasets into the grid described in Section 3.3 (OMI-NASA), while the error bounds for BEHR-HK v3.0B and v3.0C NO 2 TVCD retrieval are ±40% (for April 2015) and ±36% (for July 2015), respectively.The uncertainty of each month was derived based on the approach mentioned in [37], and detailed contribution and estimates are provided in Figure S4.The corresponding version showing comparison of BEHR-HK v3.0B, BEHR-HK v3.0C and MAX-DOAS measurements are shown in Figure S5a,b.
As shown in Figures 8 and 9, the OMI-NASA product underestimates the tropospheric NO 2 VCD in Guangzhou throughout the whole retrieval period, although its time trend is in line with MAX-DOAS measurements.In contrast, both the magnitude and temporal pattern of the BEHR-HK VCDs is similar to that observed by the MAX-DOAS instrument.This is true for BEHR-HK v3.0A and v3.0B as well (not shown).Within the satellite passing hours of some dates (e.g., 10 April, 22 April, 5 July, 18 July, 20 July, and 30 July), BEHR-HK v3.0A retrieval gives relatively higher NO 2 VCD than BEHR-HK v3.0B, but the situation is reversed or no different for most other dates.In Guangzhou, BEHR-HK v3.0B gives slightly higher NO 2 VCD than v3.0C in most of the dates, especially in the main peaks.

Numerical Uncertainties of Each Satellite Retrieval Method
We investigated the correlation between each satellite retrieval with MAX-DOAS measurements in Figure 10.The 30 data points represent 12 dates in April 2015 and 18 dates in July 2015 (as shown in the data points of Figures 8 and 9).In each of these plots, the blue line represents y = x.In Figure 10a, we can see that there will be a large underestimation if we use the OMI-NASA product to retrieve NO 2 VCD within China.Based on Figure 10b-d, we see that all versions of the BEHR-HK algorithm may over-or underestimate the VCDs comparing with the MAX-DOAS measurements, but do not systematically underestimate the NO 2 VCD.However, due to updating of AMF, NO 2 VCDs of some dates have become overestimated in BEHR-HK v3.0A retrieval.One possible reason is that we assumed a fixed tropopause pressure, which is not true in general.By using variable tropopause top pressure, there is a clear improvement, as shown in Figure 10c (BEHR-HK v3.0B) and Figure 10d (BEHR-HK v3.0C),where the R-values of the data pairs are 0.9338 and 0.9839, respectively, which are significantly higher than that in Figure 10a,b, with R-values of 0.7644 and 0.8468, respectively.The improvement can also be seen in the decrease in RMSE, from 8.961 × 10 15 molecules/cm 2 in the OMI-NASA retrieval to 3.992 ×10 15 molecules/cm 2 in BEHR-HK v3.0B retrieval, and then a further decrease to 2.083 ×10 15 molecules/cm 2 for BEHR-HK v3.0C retrieval.The improvement in RMSE between version 3.0B and 3.0C is especially impressive and beyond what we expected to find.The small RMSE suggests that even if there is an absolute bias due to omitting lightning from the model used to calculate the prior, the bias is likely uniform across large swaths of the domain.
Here, the absolute sign is to make the percentage difference non-negative for comparison purposes.A bar chart of the distribution of percentage differences is shown in Figure 11, together with the assumption that MAX-DOAS measurements give accurate results, we can conclude that, for OMI-NASA product, degree of accuracy in southern China is 30-90%, while ~86.7% of dates for As noticed in Figure 10c,d, most valid data points are closer to the y = x line for BEHR-HK v3.0B and v3.0C retrieval, demonstrating that the BEHR-HK NO 2 VCDs are more accurate than the NASA VCDs in Guangzhou.This is mainly due to the use of higher resolution a priori NO 2 profile, together with albedo and terrain pressure satellite products in BEHR-HK retrieval algorithms, comparing with OMI-NASA product.Current success suggests that the BEHR-HK VCDs better represent the NO 2 concentration in southern China, especially within Guangdong province.(Table 4) Based on the data values, we compute the percentage difference between each satellite retrieved NO 2 VCD and MAX-DOAS measurement value by Equations ( 11) and ( 12 Here, the absolute sign is to make the percentage difference non-negative for comparison purposes.A bar chart of the distribution of percentage differences is shown in Figure 11, together with the assumption that MAX-DOAS measurements give accurate results, we can conclude that, for OMI-NASA product, degree of accuracy in southern China is 30-90%, while ~86.7% of dates for BEHR-HK v3.0C are within 20% of MAX-DOAS VCD.analysis.This will enable more thorough assessments of different satellite retrieval methods, and estimations of spatial and temporal uncertainties within different cities around the world, and within different months or seasons throughout a year.

Conclusions
We adapted the BEHR-HK approach to update AMF distribution within southern China and conducted NO2 retrievals.In particular, the tropospheric NO2 VCDs in four selected months within 2015 were retrieved.We considered the meteorological outputs from WRF model, and identified

Figure 1 .
Figure 1.This study's region of interest (the main part of southern China), with MAX-DOAS measurement site indicated (blue triangle).

Figure 1 .
Figure 1.This study's region of interest (the main part of southern China), with MAX-DOAS measurement site indicated (blue triangle).

Figure 4
Figure 4 provides the spatial distribution of tropospheric NO 2 VCD obtained by WRF-CMAQ simulation, based upon the formulation discussed in Section 2.3.

Figure 6 .
Figure 6.(a) Average tropospheric NO2 VCD of southern China by BEHR-HK v3.0B retrieval for July 2015 (before filtering pixels); (b) average tropospheric NO2 VCD of southern China by BEHR-HK v3.0B retrieval for July 2015 (after filtering pixels); and (c) average tropospheric NO2 VCD of southern China by BEHR-HK v3.0C retrieval for July 2015 (without pixels being filtered).The units for all plots are in molecules/cm 2 .

Figure 6 .
Figure 6.(a) Average tropospheric NO 2 VCD of southern China by BEHR-HK v3.0B retrieval for July 2015 (before filtering pixels); (b) average tropospheric NO 2 VCD of southern China by BEHR-HK v3.0B retrieval for July 2015 (after filtering pixels); and (c) average tropospheric NO 2 VCD of southern China by BEHR-HK v3.0C retrieval for July 2015 (without pixels being filtered).The units for all plots are in molecules/cm 2 .

Figure 7 .
Figure 7. BEHR-HK v3.0C vs. BEHR-HK v3.0B.(a) January 2015; (b) April 2015; (c) July 2015 (based on filtered datasets in BEHR-HK v3.0B); and (d) October 2015.The blue lines represent the line y = x, while the data points represent the pixel value at each small grid based upon the two retrievals.Both x and y axes are tropospheric NO2 VCD; units are molecules/cm 2 .

Figure 7 .
Figure 7. BEHR-HK v3.0C vs. BEHR-HK v3.0B.(a) January 2015; (b) April 2015; (c) July 2015 (based on filtered datasets in BEHR-HK v3.0B); and (d) October 2015.The blue lines represent the line y = x, while the data points represent the pixel value at each small grid based upon the two retrievals.Both x and y axes are tropospheric NO 2 VCD; units are molecules/cm 2 .

Figure 8 .
Figure 8. Tropospheric NO2 VCD in April 2015 obtained through MAX-DOAS measurements (Purple Triangles), and satellite retrieval: OMI-NASA (Red Squares), BEHR-HK v3.0C (Green Circles) for 12 dates for which all datasets have available information.The error bound indicates the uncertainty estimates of each measurement or retrieval result, based on description provided in Section 6.1.

Figure 9 .
Figure 9. Tropospheric NO2 VCD in July 2015 obtained through MAX-DOAS measurements (Purple Triangles), and satellite retrieval: OMI-NASA (Red Squares), BEHR-HK v3.0C (Green Circles) for 18 dates for which all datasets have available information.The error bound indicates the uncertainty estimates of each measurement or retrieval result, based on description provided in Section 6.1.

Figure 8 .
Figure 8. Tropospheric NO 2 VCD in April 2015 obtained through MAX-DOAS measurements (Purple Triangles), and satellite retrieval: OMI-NASA (Red Squares), BEHR-HK v3.0C (Green Circles) for 12 dates for which all datasets have available information.The error bound indicates the uncertainty estimates of each measurement or retrieval result, based on description provided in Section 6.1.

Figure 8 .
Figure 8. Tropospheric NO2 VCD in April 2015 obtained through MAX-DOAS measurements (Purple Triangles), and satellite retrieval: OMI-NASA (Red Squares), BEHR-HK v3.0C (Green Circles) for 12 dates for which all datasets have available information.The error bound indicates the uncertainty estimates of each measurement or retrieval result, based on description provided in Section 6.1.

Figure 9 .
Figure 9. Tropospheric NO2 VCD in July 2015 obtained through MAX-DOAS measurements (Purple Triangles), and satellite retrieval: OMI-NASA (Red Squares), BEHR-HK v3.0C (Green Circles) for 18 dates for which all datasets have available information.The error bound indicates the uncertainty estimates of each measurement or retrieval result, based on description provided in Section 6.1.

Figure 9 .
Figure 9. Tropospheric NO 2 VCD in July 2015 obtained through MAX-DOAS measurements (Purple Triangles), and satellite retrieval: OMI-NASA (Red Squares), BEHR-HK v3.0C (Green Circles) for 18 dates for which all datasets have available information.The error bound indicates the uncertainty estimates of each measurement or retrieval result, based on description provided in Section 6.1.

Table 1 .
Ancillary datasets used to calculate BEHR-HK Tropospheric VCDs.

Table 2 .
Dates and pixels that return unusually high tropospheric VCDs after running BEHR-HK v3.0B.

Table 4 .
Comparison of statistical parameters for satellite retrieval algorithms (y-axis) and MAX-DOAS tropospheric measurements (x-axis) (Least Square Fitting by assuming linear regression holds).