Estimation of Surface NO2 Volume Mixing Ratio in Four Metropolitan Cities in Korea Using Multiple Regression Models with OMI and AIRS Data

Surface NO2 volume mixing ratio (VMR) at a specific time (13:45 Local time) (NO2 VMRST) and monthly mean surface NO2 VMR (NO2 VMRM) are estimated for the first time using three regression models with Ozone Monitoring Instrument (OMI) data in four metropolitan cities in South Korea: Seoul, Gyeonggi, Daejeon, and Gwangju. Relationships between the surface NO2 VMR obtained from in situ measurements (NO2 VMRIn-situ) and tropospheric NO2 vertical column density obtained from OMI from 2007 to 2013 were developed using regression models that also include boundary layer height (BLH) from Atmospheric Infrared Sounder (AIRS) and surface pressure, temperature, dew point, and wind speed and direction. The performance of the regression models is evaluated via comparison with the NO2 VMRIn-situ for two validation years (2006 and 2014). Of the three regression models, a multiple regression model shows the best performance in estimating NO2 VMRST and NO2 VMRM. In the validation period, the average correlation coefficient (R), slope, mean bias (MB), mean absolute error (MAE), root mean square error (RMSE), and percent difference between NO2 VMRIn-situ and NO2 VMRST estimated by the multiple regression model are 0.66, 0.41, −1.36 ppbv, 6.89 ppbv, 8.98 ppbv, and 31.50%, respectively, while the average corresponding values for the other two models are 0.75, 0.41, −1.40 ppbv, 3.59 ppbv, 4.72 ppbv, and 16.59%, respectively. All three models have similar performance for NO2 VMRM, with average R, slope, MB, MAE, RMSE, and percent difference between NO2 VMRIn-situ and NO2 VMRM of 0.74, 0.49, −1.90 ppbv, 3.93 ppbv, 5.05 ppbv, and 18.76%, respectively.


Introduction
The main anthropogenic source of nitrogen dioxide (NO2) is fossil fuel combustion, while natural sources of NO2 include lightning, forest fires, and soil emissions [1,2]. In particular, since NO2 is emitted in large quantities in automobile exhaust gas, it is often used as an indicator of traffic-related air pollution in urban areas [3]. In terms of its effect on human health, long-term NO2 exposure can lead to respiratory depression and respiratory illness [4–8]. In addition, it is a precursor of aerosol nitrate, tropospheric ozone, and the hydroxyl radical (OH), the main atmospheric oxidant [9]. It is therefore important to measure NO2, and various methods are used; chemiluminescence is a well-known technique for measuring the surface NO2 volume mixing ratio (VMR) [10]. In situ measurements such as the chemiluminescence method are, in general, more accurate than remote sensing techniques, but require a large number of in situ instruments to provide the spatial distribution of the NO2 VMR at high resolution [11]. In recent years, NO2 vertical column density (VCD) has been measured from satellites that can monitor NO2 at global scale over a short time scale. Space-borne sensors that have observed global distributions of NO2 are the Global Ozone Monitoring Experiment (GOME) aboard European Remote Sensing-2 (ERS-2) (1995-2003), the Scanning Imaging Absorption Spectrometer for Atmospheric Chartography/Chemistry (SCIAMACHY) aboard Environmental Satellite (Envisat) (2002-2012), the Ozone Monitoring Instrument (OMI) aboard EOS-AURA (2004-present), and GOME-2 aboard the Meteorological Operational satellite (MetOp)-A (2007-present) and MetOp-B (2012-present) [12–17]. In many countries, air quality regulation requires surface NO2 VMR, so the NO2 VCD obtained from satellites cannot be used directly. In recent years, studies have been conducted to investigate the feasibility of estimating the surface NO2 VMR using the NO2 VCD obtained from satellite measurements and, in particular, the correlation between the NO2 VCD obtained from satellite measurements and the surface NO2 VMR.
Ordónez et al. [18] reported the correlation between the tropospheric NO2 VCD measured by GOME and the tropospheric NO2 VCD derived from ground-based in situ measurements in Milan. Kharol et al. [3] estimated the annual average ground-level NO2 concentrations in North America using chemical transport model (GEOS-Chem) data and OMI NO2 columns, and also reported the annual trend of the estimated ground-level NO2 concentrations. However, no studies have attempted to estimate the surface NO2 VMR at higher temporal resolutions, such as hourly and monthly, using the NO2 VCD measured by satellites.
In the present study, we estimate for the first time the surface NO2 VMR at a specific time (13:45 Local time (LT)) (NO2 VMRST) and the monthly mean surface NO2 VMR (NO2 VMRM) using two linear regression models and a multiple regression model with the tropospheric NO2 VCD obtained from OMI (Trop NO2 VCDOMI) in four metropolitan cities. In addition, the performance of each regression method is evaluated by comparing the estimated surface NO2 VMRs with those obtained from in situ measurement (NO2 VMRIn-situ).

Study Area and Period
A large amount of anthropogenic NOX is emitted in Northeast Asia, including China, Korea and Japan [19]. In particular, the annual mean NO2 tended to increase in Seoul from 1995 to 2009 [20]. The study areas were selected from Korean metropolitan cities where the surface NO2 VMR is continuously measured (Figure 1). Metropolitan cities such as Busan and Incheon, where the OMI pixel covers both sea and land, are excluded since there are no surface NO2 data available over the sea. Therefore, the selected areas are Seoul, Gyeonggi, Daejeon, and Gwangju. Seoul is covered by four OMI pixels and is divided into eastern and western areas (West Seoul and East Seoul). The study period is the nine years from 2006 to 2014. This is split into a seven-year training period (2007-2013), used to determine the coefficients of the regression models, and two validation years (2006 and 2014), for which the surface NO2 VMRs estimated from the resulting three regression models are evaluated by comparison with the in situ data. The three regression models used in this study are described in detail in Section 3.
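The training/validation split described above can be sketched as follows. This is a minimal illustration only: the function name `split_by_year` and the sample records are hypothetical, and the values are placeholders rather than measurements.

```python
from datetime import date

# Years taken from the text: training 2007-2013, validation 2006 and 2014.
TRAINING_YEARS = set(range(2007, 2014))
VALIDATION_YEARS = {2006, 2014}


def split_by_year(records):
    """Split (date, value) records into training and validation lists.

    Records outside the 2006-2014 study period are dropped.
    """
    training, validation = [], []
    for day, value in records:
        if day.year in TRAINING_YEARS:
            training.append((day, value))
        elif day.year in VALIDATION_YEARS:
            validation.append((day, value))
    return training, validation


# Hypothetical placeholder records (not real measurements).
sample = [
    (date(2006, 5, 1), 20.0),
    (date(2010, 5, 1), 25.0),
    (date(2014, 5, 1), 18.0),
    (date(2015, 1, 1), 30.0),  # outside the study period, dropped
]
train, valid = split_by_year(sample)
```

Holding out the first and last years, rather than a random subset of days, keeps the validation data temporally separate from the data used to fit the coefficients.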

Data
The data used in this study are Trop NO2 VCDOMI and Atmospheric Infrared Sounder (AIRS) boundary layer height (BLHAIRS), atmospheric temperature (TempAIRS) and pressure (PressAIRS), together with in situ measurements of NO2 VMRIn-situ, surface temperature (TempIn-situ), surface pressure (PressIn-situ), surface dew point (DewpointIn-situ), surface wind speed (WSIn-situ), and surface wind direction (WDIn-situ) (see Table 1). The Trop NO2 VCDOMI data were obtained from OMI Level-3 NO2 Daily Data (OMNO2d) provided by the NASA Goddard Earth Sciences Data and Information Services Center (http://disc.sci.gsfc.nasa.gov/Aura/data-holdings/OMI) [17,21,22]. OMI is a nadir-viewing UV-visible (270-500 nm) spectrometer aboard the Aura platform launched in July 2004 [23]. Aura is a polar orbiting satellite with an overpass time of 13:45 LT. The spectral resolution of OMI is about 0.5 nm and the spatial resolution is 13 × 24 km at nadir. Cloud-screened NO2 data (Level-3 OMI NO2 Cloud-Screened Total and Tropospheric Column NO2 (V003)) are used in the present study (cloud fraction < 30%).

In Situ NO 2 Data
The NO2 VMRIn-situ data were obtained from Air Korea (http://www.airkorea.or.kr/last_amb_hour_data). Since NO2 VMRIn-situ is available hourly, the average of the values at 13:00 and 14:00 LT is used to be closer to the OMI overpass time. In a previous study [18], the in situ measurements were grouped into five different NO2 levels: clean, slightly polluted, averagely polluted, polluted, and heavily polluted. Many stations are located close to roads and are exposed to emissions. In addition, in that study the in situ NO2 data from stations within GOME pixels (320 × 40 km) were averaged, since in situ measurements are only representative of a small fraction of the satellite ground scene. In the present study, the NO2 VMRIn-situ obtained from in situ measurement sites located close to streets were excluded. We used the average of three or more NO2 VMRIn-situ from stations located at least 2 km from each other.
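The averaging described above (mean of the 13:00 and 14:00 LT values, then a mean over three or more stations) can be sketched as below. The data layout, the function name `overpass_mean`, and the handling of missing hours (a station is skipped unless both hours are present) are assumptions made for illustration; the text does not specify them.

```python
def overpass_mean(station_hourly, min_stations=3):
    """Area-mean NO2 VMR (ppbv) near the OMI overpass time.

    station_hourly: dict mapping a station id to a dict {hour: NO2 VMR in ppbv}.
    Each station contributes the mean of its 13:00 and 14:00 LT values.
    Assumption: stations missing either hour are skipped, and None is
    returned when fewer than `min_stations` stations remain.
    """
    per_station = []
    for hours in station_hourly.values():
        if 13 in hours and 14 in hours:
            per_station.append((hours[13] + hours[14]) / 2.0)
    if len(per_station) < min_stations:
        return None
    return sum(per_station) / len(per_station)


# Hypothetical placeholder values (not real measurements).
demo = {
    "A": {13: 20.0, 14: 22.0},
    "B": {13: 30.0, 14: 28.0},
    "C": {13: 10.0, 14: 14.0},
}
area_mean = overpass_mean(demo)
too_few = overpass_mean({"A": demo["A"], "B": demo["B"]})
```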

In Situ Meteorological Data
The TempIn-situ, PressIn-situ, DewpointIn-situ, WSIn-situ, and WDIn-situ used in this study are Automatic Weather System (AWS) data provided by the Korea Meteorological Administration (http://sts.kma.go.kr/jsp/home/contents/statistics/newStatisticsSearch.do?menu=SFC&MNU=MNU). Since meteorological data are available hourly, the average of the data at 13:00 LT and 14:00 LT is used. The surface wind data, especially wind direction, can be affected by local topography and interference.

Methodology
In this study, NO2 VMRST and NO2 VMRM were estimated using three regression models with Trop NO2 VCDOMI. Table 2 summarizes the three models. In Table 2, the AIRS pressure and temperature are boundary layer mean values, the gas constant is R = 8.314472 m^3 Pa K^-1 mol^-1, and the Avogadro constant is NA = 6.022 × 10^23 mol^-1.

M1
M1 is the linear regression equation where Trop NO2 VCDOMI is used as the independent variable. Figure 2 shows the linear regression between Trop NO2 VCDOMI and NO2 VMRIn-situ at 13:45 LT during the training period, with R^2 (coefficient of determination), slope, and intercept of 0.47, 0.80 and 11.47, respectively. Figure 3 shows the linear regression between monthly mean Trop NO2 VCDOMI and monthly mean NO2 VMRIn-situ during the training period, with R^2, slope, and intercept of 0.62, 0.77, and 10.95, respectively. The final form of the M1 equation for estimating NO2 VMRST is shown in Table 3, and that for estimating NO2 VMRM in Table 4.
Tables 3 and 4 show the equations M1, M2, M3, and M4 with the regression coefficients determined from the training period.
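A simple linear fit of the M1 type can be sketched with NumPy as follows. The demonstration data are synthetic and were constructed to lie exactly on a line with the slope (0.80) and intercept (11.47) reported for the 13:45 LT training fit, so the resulting R^2 of 1 does not reflect the real scatter (reported R^2 of 0.47). The function names and the scaling of the column values are assumptions.

```python
import numpy as np


def fit_linear(x, y):
    """Least-squares fit of y = slope * x + intercept; also returns R^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    return slope, intercept, r2


def predict(x, slope, intercept):
    """Apply the fitted line to new values of the independent variable."""
    return slope * np.asarray(x, dtype=float) + intercept


# Synthetic, noise-free demo: x stands in for Trop NO2 VCD_OMI (scaling assumed),
# y for in situ NO2 VMR in ppbv.
x_demo = np.array([2.0, 5.0, 8.0, 12.0, 20.0])
y_demo = 0.80 * x_demo + 11.47
slope, intercept, r2 = fit_linear(x_demo, y_demo)
estimate = predict(10.0, slope, intercept)
```

In practice the coefficients would be fitted on the training years only and then applied unchanged to the validation years.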

M2
In metropolitan cities, where a significant amount of NOX is emitted, most of the tropospheric NO2 column is expected to reside within the planetary boundary layer (PBL). A minor fraction of the tropospheric NO2 column can also be present in the upper troposphere, particularly because of lightning. This fraction might cause a small or negligible reduction in the correlation between the OMI NO2 VCD and the surface NO2 VMR, and the contribution of the upper part of the troposphere (free troposphere) is assumed to be negligible [28]. Therefore, assuming Trop NO2 VCDOMI is mostly present within the PBL, the relationship between Trop NO2 VCDOMI and the surface NO2 VMR may change as the PBL height varies. To reflect the BLH in the regression equation, Trop NO2 VCDOMI is first divided by BLHAIRS to calculate the NO2 concentration in the PBL and then converted to the NO2 mixing ratio in the PBL (BLH NO2 VMROMI) using TempAIRS and PressAIRS [29], as shown in Table 2. Only a single OMI pixel contained completely within an AIRS pixel was used. Figure 4 shows the linear regression between BLH NO2 VMROMI and NO2 VMRIn-situ at 13:45 LT during the training period. Here R^2, slope and intercept are 0.38, 1.58, and 14.30, respectively. Figure 5 shows the corresponding linear regression for the monthly mean data, with R^2, slope and intercept of 0.59, 1.71, and 12.75, respectively. The final form of equation M2 to estimate NO2 VMRST is shown in Table 3, and for the monthly values in Table 4.
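The conversion from column density to a boundary layer mixing ratio can be sketched as below, following the Table 2 expression with its factor of 10^13. The input units are an assumption that is consistent with that factor but not stated explicitly in the text: column in molecules cm^-2, BLH in m, temperature in K, and pressure in Pa, giving a result in ppbv. The function name and the example inputs are hypothetical.

```python
R_GAS = 8.314472   # gas constant, m^3 Pa K^-1 mol^-1 (value from the text)
N_A = 6.022e23     # Avogadro constant, mol^-1 (value from the text)


def blh_no2_vmr_ppbv(trop_vcd_molec_cm2, blh_m, temp_k, press_pa):
    """NO2 mixing ratio (ppbv), assuming the column is well mixed in the PBL.

    Unit bookkeeping behind the 1e13 factor (assumed input units):
      molecules cm^-2 * 1e4      -> molecules m^-2
      / BLH (m)                  -> molecules m^-3
      / N_A                      -> mol m^-3
      * R * T / P                -> mole fraction
      * 1e9                      -> ppbv
    """
    return trop_vcd_molec_cm2 * R_GAS * temp_k * 1e13 / (N_A * blh_m * press_pa)


# Hypothetical inputs chosen only to show the order of magnitude.
vmr = blh_no2_vmr_ppbv(1.0e16, blh_m=1000.0, temp_k=290.0, press_pa=95000.0)
vmr_deeper = blh_no2_vmr_ppbv(1.0e16, blh_m=2000.0, temp_k=290.0, press_pa=95000.0)
```

For a fixed column, doubling the boundary layer height halves the mixing ratio, which is why the column-to-surface relationship is expected to vary with BLH.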
Remote Sens. 2017, 9, 627

M3 and M4
M3 and M4 are multiple regression equations for estimating NO2 VMRST and NO2 VMRM. Multiple regression equations consist of a dependent variable, independent variables, and their regression coefficients. In addition to Trop NO2 VCDOMI and BLHAIRS, meteorological factors (surface temperature, dew point, atmospheric pressure, wind direction, and wind speed) are used as candidate independent variables for the multiple regression equation in the present study. In a previous study [30], these meteorological factors were also used as candidate independent variables to estimate surface SO2 concentration in Shanghai, China. Temperature, pressure, boundary layer height, wind speed, and wind direction were selected as candidates for independent variables since they are known to either directly or indirectly affect the spatial mixing of NO2 molecules in the boundary layer. Furthermore, temperature and dew point were selected as candidates for independent variables as they affect the boundary layer height [31].
The multiple regression equation can be defined by the following equations:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n$$

where $\hat{y}$ and $\beta_0$ are the dependent variable (NO2 VMRST or NO2 VMRM) and the intercept, respectively, and $x_n$ and $\beta_n$ are the independent variables and their regression coefficients. The residual for each data point is

$$\varepsilon_j = y_j - \hat{y}_j, \quad j = 1, \ldots, m$$

where $y_j$ is the observed value with m data points. By minimizing the sum of $\varepsilon_j^2$, the regression coefficients can be derived. This least squares fitting technique is based on the following assumptions: a linear relationship, and normally distributed residuals with equal variance. Least squares regression is sensitive to the presence of points with excessively large or small values in the training data [32]. To determine the independent variables ($x_n$) and regression coefficients ($\beta_n$) included in the final form of equations M3 and M4, we considered the variance inflation factor (VIF) and the p-value to ensure their statistical significance. First, we examined the VIF, which quantifies the multicollinearity of a candidate independent variable with regard to the other candidate independent variables. The VIF of the j-th independent variable is expressed as:

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}$$

where $R_j^2$ is the coefficient of determination for the regression of $x_j$ against the other independent variables (a regression that does not involve the dependent variable y). The VIF indicates how strongly $x_j$ is correlated with the other candidate variables. A candidate independent variable with a very high VIF can be considered redundant and should be removed from the multiple regression equations. Candidate independent variables that do not satisfy the criterion VIF < 10 [33] were excluded from the independent variables. The p-value was also used to select independent variables. Sellke et al. [34] showed the highest p-level that is still regarded as statistically significant to be 5%. Among the independent variables that satisfy the VIF criterion, those that also satisfy p-value < 0.05 are selected as final independent variables in the multiple regression equations. The independent variables selected for equations M3 and M4 are shown in Table 5. The final form of equation M3 to estimate NO2 VMRST is shown in Table 3, and that for NO2 VMRM in Table 4.
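The least squares fit and the VIF screening step can be sketched with NumPy as follows. This is an illustrative sketch on synthetic data, not the code used in the study; the function names are hypothetical, and the p-value screening step is omitted because it additionally requires a t-distribution routine.

```python
import numpy as np


def fit_ols(X, y):
    """Return [beta_0, beta_1, ..., beta_n] minimizing the sum of squared residuals."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def r_squared(X, y):
    """Coefficient of determination of the OLS fit of y on the columns of X."""
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ fit_ols(X, y)
    total = y - y.mean()
    return 1.0 - (resid @ resid) / (total @ total)


def vif(X):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on all other columns."""
    X = np.asarray(X, dtype=float)
    values = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        values.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return np.array(values)


# Synthetic candidates: x3 is nearly a copy of x1, so both should get a large VIF.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.01 * rng.normal(size=200)
X_demo = np.column_stack([x1, x2, x3])

vifs = vif(X_demo)
keep = vifs < 10                            # screening criterion from the text
beta = fit_ols(X_demo[:, [1]], 2.0 + 3.0 * x2)  # exact line, recovers [2, 3]
```

When two candidates are nearly collinear, both receive a large VIF; which of the two is dropped is a modelling decision that this sketch does not make.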

Daily Estimates
Figure 6 shows the day-to-day variations of NO2 VMRIn-situ and NO2 VMRST estimated at 13:45 LT in West Seoul and East Seoul using M1, M2 and M3 in Table 3 for 2006 and 2014. A slightly larger difference in magnitude is found between NO2 VMRIn-situ and NO2 VMRST obtained with M3 compared to those between NO2 VMRIn-situ and NO2 VMRST obtained with M1 and M2. However, NO2 VMRST obtained from M3 showed moderate agreement with NO2 VMRIn-situ in the form of the day-to-day variation. Results for Daejeon, Gwangju, and Gyeonggi are included in the Supplementary Materials.
Figure 7 shows the R, slope, mean bias (MB), mean absolute error (MAE), root mean square error (RMSE) and percent difference between NO2 VMRST and NO2 VMRIn-situ for the validation period (2006 and 2014). The R obtained with M1 ranges from 0.49 to 0.71, showing better agreement than that with M2 (0.47 < R < 0.65). M3 showed the best correlation with NO2 VMRIn-situ (0.67 < R < 0.90). The slopes from both M1 and M2 are close to one in East Seoul, whereas they are lower in the other cities. The MB from M1, M2, and M3 ranges from −7.74 to 5.80 ppbv. In all study areas, the MAE of M3 (5.79 ppbv < MAE < 8.25 ppbv) is lower than those of M1 and M2 (6.58 ppbv < MAE < 11.41 ppbv), which means that NO2 VMRST estimated from M3 shows moderate agreement with NO2 VMRIn-situ in terms of magnitude. The NO2 VMRST from M3 also has the lowest RMSE in all study areas (7.21 ppbv < RMSE < 11.37 ppbv). In addition, the percent differences between NO2 VMRST estimated from M3 and NO2 VMRIn-situ are lower in all study areas than those from M1 and M2. In estimating NO2 VMRST, M3, which is a multiple regression method with various independent variables as inputs, generally showed good statistical performance except for MB.
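The validation statistics named above can be computed as in the following sketch. Several definitions are assumptions, since the text does not spell them out: MB is taken as the mean of estimate minus in situ value, the slope is that of the estimates regressed on the in situ values, and the percent difference is taken as the mean absolute difference relative to the in situ value. The function name and demo numbers are hypothetical.

```python
import numpy as np


def validation_metrics(estimated, observed):
    """R, slope, MB, MAE, RMSE and percent difference (definitions assumed)."""
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    diff = est - obs                               # assumed sign convention
    slope, _intercept = np.polyfit(obs, est, 1)    # estimates regressed on in situ
    return {
        "R": float(np.corrcoef(obs, est)[0, 1]),
        "slope": float(slope),
        "MB": float(diff.mean()),
        "MAE": float(np.abs(diff).mean()),
        "RMSE": float(np.sqrt((diff ** 2).mean())),
        # Assumed definition: mean absolute relative difference, in percent.
        "percent_difference": float(100.0 * np.mean(np.abs(diff) / obs)),
    }


# Placeholder values in ppbv (not real measurements).
obs_demo = [10.0, 20.0, 30.0, 40.0]
est_demo = [12.0, 18.0, 33.0, 37.0]
metrics = validation_metrics(est_demo, obs_demo)
```

In this demo the mean bias is zero even though the MAE is 2.5 ppbv, because positive and negative errors cancel in MB, so a small MB alone does not indicate a good estimate.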

Monthly Estimates
Figure 8 shows the temporal variation of monthly mean NO2 VMRIn-situ and NO2 VMRM estimated using M1, M2 and M4 of Table 4 in West Seoul and East Seoul using monthly mean independent variables during the validation period (see the detailed input data in Section 2.1). Figure 8 shows good agreement in terms of the temporal pattern between the estimated NO2 VMRM and monthly mean NO2 VMRIn-situ. However, we found a large difference between NO2 VMRIn-situ and NO2 VMRM in periods when there was a jump in NO2 VMRIn-situ between successive months; in those periods, no model calculated NO2 VMRM that was similar to NO2 VMRIn-situ.
Figure 9 shows the R, slope, MB, MAE, RMSE and percent difference between NO2 VMRM and monthly mean NO2 VMRIn-situ in 2006 and 2014. In general, NO2 VMRM agreed better with NO2 VMRIn-situ than did NO2 VMRST. The value of R between NO2 VMRM from M1, M2 and M4 and monthly mean NO2 VMRIn-situ ranged from 0.68 to 0.82 in all areas. MB was close to 0 in most study areas. MAE was less than 5 ppbv in Daejeon, Gwangju, Gyeonggi, and East Seoul, where there is good agreement between NO2 VMRM from M1, M2, and M4 and monthly mean NO2 VMRIn-situ, whereas MAEs in West Seoul ranged from 5.66 to 6.79 ppbv. RMSEs between NO2 VMRIn-situ and NO2 VMRM from M1, M2, and M4 are found to be lower than 7 ppbv in the study areas except for West Seoul. In addition, the three models showed percent differences of less than 30% except for the value estimated from M1 in Gwangju.

Discussion
In a previous study [18], tropospheric NO2 VCDs obtained from GOME were compared with tropospheric NO2 VCDs calculated using NO2 concentrations obtained from both in situ measurements and the Model of Ozone and Related Tracers 2 (MOZART-2). There are also several previous studies estimating surface NO2 VMR using satellite data [3,35]. Among them, Kharol et al. [3] estimated the annual variation of ground-level NO2 concentrations using both GEOS-Chem data and OMI data. However, in the present study, NO2 VMRST and NO2 VMRM were estimated for the first time at higher temporal resolution using three regression models with Trop NO2 VCDOMI as input.

Estimation of Surface NO2 VMRs at a Specific Time (13:45 LT)

• Among the three regression models, the multiple regression model M3 performed best in estimating NO2 VMRST. The linear regression model (M2), in which BLH is used as an independent variable in addition to Trop NO2 VCDOMI, has comparable performance to that of the model (M1) which uses Trop NO2 VCDOMI as the only independent variable. The BLH varies with latitude [36], but the latitudinal variation of BLH is not well represented since the spatial resolution of the AIRS data used in this study is coarser than that of OMI. The comparable performance might also be associated with the BLHAIRS data quality. We expect better results using BLH data obtained from LIDAR.

• The average difference was found to be 46.04% between NO2 VMRIn-situ and NO2 VMRST obtained from M1, 44.29% between NO2 VMRIn-situ and NO2 VMRST obtained from M2, and 31.50% between NO2 VMRIn-situ and NO2 VMRST obtained from M3 in all cities, while there was moderate agreement in the temporal pattern of NO2 variation between NO2 VMRIn-situ and NO2 VMRST obtained from M1, M2, and M3 (Figure 6).

• In terms of statistical evaluation with respect to the in situ data, M3 showed the best performance in general.

• The results produced by M2 are not improved compared to those by M1, which may imply that surface NO2 VMR is dominantly affected by the tropospheric NO2 column while the BLH effect could be negligible in the areas of the present study. It might also be associated with the AIRS BLH data quality.

Estimation of Monthly Mean Surface NO2 VMRs

• We found good agreement in the temporal pattern between the estimated NO2 VMRM and monthly mean NO2 VMRIn-situ (Figure 8). However, there was a large difference between NO2 VMRIn-situ and NO2 VMRM in the period when there was a clear change in NO2 VMRM between one month and the next. Despite the use of NO2 VMRIn-situ located away from streets, the in situ measurement sites in West Seoul are located closer to streets than the in situ measurement sites in Daejeon and Gwangju. This may explain why there are more periods when NO2 VMRIn-situ changes rapidly in successive months. It is difficult to estimate the rapid change of NO2 VMR near NO2 sources with regression models that reflect the relationship between the in situ measurements and the OMI sensor covering both source and non-source areas in a single pixel.

• In terms of statistical evaluation, the three regression models (M1, M2, and M4) were found to be similar (Figure 9).

• NO2 VMRM shows better agreement with NO2 VMRIn-situ than does NO2 VMRST. The better performance in the monthly mean estimation could be attributed to reduced errors in the monthly mean OMI data [37] as well as to fewer occasions with sudden monthly changes in NO2 VMRIn-situ than with rapid day-to-day changes in NO2 VMRIn-situ.

The present study provides results under the condition of a 2 km distance between the in situ NO2 measurement locations and NOX point sources. In a future study, the performance of the models needs to be investigated as a function of the distance between the in situ NO2 measurement sites and point sources. We expect that the regression methods used to estimate the surface NO2 VMR using Trop NO2 VCDOMI will be useful in providing information on surface NO2 VMR in metropolitan cities on a monthly timescale. In future research, the estimation of surface NO2 VMR may be attempted at higher time resolution with geostationary satellite sensors (e.g., Geostationary Environmental Monitoring Spectrometer (GEMS), Tropospheric Emissions: Monitoring of Pollution (TEMPO), and Sentinel-4). In further work, improvements are needed in the input data or the model formulation before the surface NO2 can be estimated on a daily basis.

Conclusions
In this study, monthly and specific time estimates of NO2 VMR were obtained for the first time using three regression models in four metropolitan cities for two years, 2006 and 2014. The multiple regression model (M3) was found to perform best in estimating NO2 VMRST in all cities. For surface NO2 estimates at the specific time (13:45 LT), M3 generally gives better R, MAE, RMSE, and percent difference than the other two models (M1 and M2). A comparison between monthly surface NO2 VMR estimates and those at the specific time showed that agreement with NO2 VMRIn-situ was better for monthly estimates. In estimating NO2 VMRM, the three regression models (M1, M2, and M4) showed similar performance. In estimating daily and monthly surface NO2 VMR variations, when the surface NO2 VMR changes rapidly, the difference between the surface NO2 VMR estimated from all models and NO2 VMRIn-situ is found to be large. In future studies, using higher spatial resolution satellites is expected to improve the relationship with in situ measurements. In addition, the use of other independent variables that may co-vary with rapid changes of surface NO2 VMR should be investigated.

Figure 1. Study areas in South Korea.

Figure 2. Scatter plot between Trop NO2 VCDOMI at 13:45 LT and NO2 VMRIn-situ to determine the regression coefficient for M1 for the training period 2007-2013.

Figure 3. As Figure 2 but for the monthly mean values.

Figure 4. Scatter plot between BLH NO2 VMROMI at a specific time (13:45 LT) and NO2 VMRIn-situ to determine the regression coefficient for M2 for the training period 2007-2013.

Figure 5. As Figure 4 but for the monthly mean values.

Figure 6. Time series of NO2 VMRIn-situ and NO2 VMRST at 13:45 LT estimated by M1, M2 and M3 in East Seoul and West Seoul for: 2006 (a,c); and 2014 (b,d).


Figure 8. Time series of NO2 VMRIn-situ and NO2 VMRM estimated by M1, M2, and M4 for 2006 and 2014.

Figure 9. R, slope, MB, MAE, RMSE and percent difference between NO2 VMRM and monthly mean NO2 VMRIn-situ in 2006 and 2014.

Table 1. Satellite and in situ data used in this study.

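Table 2. Regression models used for surface NO2 VMR estimation in this study.
(a) Trop NO2 VCDOMI: NO2 tropospheric vertical column density obtained from OMI; and (b)

$$\mathrm{BLH\ NO_2\ VMR_{OMI}} = \frac{\mathrm{Trop\ NO_2\ VCD_{OMI}} \times R \times \mathrm{Temp_{AIRS}} \times 10^{13}}{N_A \times \mathrm{BLH_{AIRS}} \times \mathrm{Press_{AIRS}}}$$

where R is the gas constant and $N_A$ is the Avogadro constant.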

Table 3. Final form of the regression models used for estimating surface NO2 VMR at a specific time and R^2 obtained from the regression between NO2 VMRIn-situ and the corresponding independent variable for the training period.
M3: NO2 VMRST = 0.000602 × Trop NO2 VCDOMI − 0.000107 × TempIn-situ − 0.000083 × DewpointIn-situ + 0.000061 × PressIn-situ − 0.000002 × BLHAIRS − 0.002435 × WSIn-situ + 0.001190 × WDIn-situ − 0.039996

Table 4. As Table 3 but for monthly mean surface NO2 VMR.

Table 5. Final independent variables included in multiple regression equations (M3 and M4).