Estimation of PM 2.5 Concentrations in New York State: Understanding the Inﬂuence of Vertical Mixing on Surface PM 2.5 Using Machine Learning

: In New York State (NYS), episodic high ﬁne particulate matter (PM 2.5 ) concentrations associated with aerosols originated from the Midwest, Mid-Atlantic, and Paciﬁc Northwest states have been reported. In this study, machine learning techniques, including multiple linear regression (MLR) and artiﬁcial neural network (ANN), were used to estimate surface PM 2.5 mass concentrations at air quality monitoring sites in NYS during the summers of 2016–2019. Various predictors were considered, including meteorological, aerosol, and geographic predictors. Vertical predictors, designed as the indicators of vertical mixing and aloft aerosols, were also applied. Overall, the ANN models performed better than the MLR models


Introduction
Fine particulate matter (PM 2.5 ) (particulate matter with aerodynamic diameter less than 2.5 µm) is one of the criteria of air pollutants because of its detrimental impacts on human health and the environment [1,2].Previous studies have reported that the exposure to high PM 2.5 concentrations can increase the risk of respiratory diseases and mortality [3,4].Many processes can affect ground-level PM 2.5 concentrations, including emissions, removal (e.g., deposition), transport, aerosol physical processes (e.g., nucleation), and atmospheric chemistry [5][6][7][8][9].These processes are potentially affected by meteorological conditions (e.g., surface temperature and horizontal winds) [10][11][12][13][14]. Yu et al. (2008) [9] used the Eta-Community Multiscale Air Quality (CMAQ) coupled model to estimate the surface PM 2.5 concentrations over the eastern United States (US) during the summer of 2004, and indicated that aerosol physical and chemical processes dominated PM 2.5 concentration.Dawson et al. (2007) [10] analyzed the sensitivities of PM 2.5 concentrations to meteorological variables, and showed that temperature, absolute humidity, wind speed, mixing layer height, and precipitation had the strongest effects on PM 2.5 concentrations.Additionally, atmospheric vertical mixing has been shown to significantly impact PM 2.5 concentrations [7,[15][16][17][18][19].In Zhang et al. (2020) [19], the radar wind profiler measurements showed weak vertical wind shears at a shallower planetary boundary layer (PBL) under polluted conditions, as evidence of weak vertical mixing leading to strong PM 2.5 accumulation in the PBL.They indicated that strong vertical wind shears were associated with strong vertical mixing and often accompanied with low PM 2.5 concentrations.Strong winds above the PBL were also favorable for the transport of aerosols, which could potentially affect surface PM 2.5 concentrations through vertical mixing.
Various approaches have been applied for surface PM 2.5 estimation, including chemistry transport models (CTMs) [9,20,21] and land use regression (LUR) models [22,23].Statistical approaches based on the relationships between satellite aerosol optical depth (AOD) and surface PM 2.5 concentrations have also been applied [24][25][26].Recently, machine learning (ML), an application of artificial intelligence (AI), has become an increasingly popular approach for PM 2.5 estimation [27][28][29].ML also provides insights into contributions of different influencing factors to PM 2.5 concentrations.For instance, Reid et al. (2015) [29] estimated the PM 2.5 concentrations during the Northern California wildfires in 2008.Various predictors were used in ML training, including meteorological variables, such as temperature and humidity, land use variables, such as the distance to the nearest emission source, geographic variables, such as site location and Julian date, satellite AOD measurements, and CTM simulated PM 2.5 concentrations.Meteorological, land use, and geographic variables provided the baseline PM 2.5 estimation.The application of AOD and CTM products further improved model performance by providing the estimation of total-column aerosol loading and prior knowledge of the aerosol physical processes and chemical reactions, respectively.Furthermore, several studies focused on the influence of AOD on PM 2.5 estimation and indicated that AOD could improve the accuracy of PM 2.5 estimation with increased correlation coefficients and decreased model errors [30][31][32][33].However, Yao et al. (2018) [34] reported an unchanged model performance when using AOD in ML training, probably due to complex terrain, uncertainties of cloud filtering and the presence of aloft aerosols.
In New York State (NYS), the PM 2.5 concentrations have been decreasing continuously over the past decades [35][36][37][38].Rattigan et al. (2015) [37] analyzed the 2000-2014 observed PM 2.5 concentrations at 16 air quality monitoring sites across the state and reported a downward trend with decreases of 4-7 µg m −3 on annual scale.The annual average PM 2.5 concentrations at 15 sites were 9-17 µg m −3 and 6-10 µg m −3 in 2000 and 2014, respectively, while the annual average PM 2.5 concentration at the Whiteface Mountain site decreased from about 6 µg m −3 in 2000 to 4 µg m −3 in 2014.In addition, according to the New York State Ambient Air Quality Report for 2019 (https://www.dec.ny.gov/docs/air_pdf/2019airqualreport.pdf), the 2019 PM 2.5 annual averages ranged from 5 to 9 µg m −3 , except for the Whiteface Mountain site with an annual average around 3 µg m −3 .
However, episodic high PM 2.5 concentrations across NYS have been reported.During wintertime, the high PM 2.5 concentrations have been attributed to local heating sources and lower mixing layer heights [36,38].As for summer, the high PM 2.5 concentrations have been attributed to local emissions [39,40], anthropogenic aerosols transported from the Midwest and the Great Lake region [39-43], and long-range transported smoke aerosols from the western US and Canada [44][45][46][47][48]. Climatically, NYS is affected by the prevailing westerlies.Additionally, the Bermuda High locating over the western Atlantic Ocean introduces southerly and southwesterly winds along the east coast in summertime.Under the influences of synoptic flows, air masses could transport aerosols from the polluted areas to NYS.These aerosols could be potentially transported from free troposphere to the surface, resulting in increased PM 2.5 concentrations.In Dutkiewicz et al. (2004) [43], 1-year observations of the concentrations of PM 2.5 sulfate showed that more than 40% of the high sulfate concentrations were associated with westerly flows and around 20-30% were associated with southwesterly flows, reflecting the influences of transported pollutants from the Midwest and Mid-Atlantic states, respectively.The contribution of such transported pollutants was more significant at rural sites, as up to 60% of PM 2.5 sulfate was associated with transported pollutants on annual basis.On the other hand, a statewide event of high PM 2.5 concentrations in August 2018 caused by transported smoke aerosols was investigated in Hung et al. (2020) [45].Multi-platform measurements and model simulations demonstrated the long-range transport of smoke aerosols from the western states to NYS.These smoke aerosols transported in free troposphere before reaching NYS and descended to around 2 km a.g.l.driven by synoptic downward mixings.The PBL entrainment further brought these aloft aerosols to the surface, resulting in a threefold increase (from 8 to 24 µg m −3 ) in the average PM 2.5 concentrations.
Since transported aerosols make significant contributions to the high PM 2.5 concentrations in NYS, understandings of the relationships between the vertical mixing of these aloft aerosols and surface PM 2.5 concentrations are critical for air quality forecast, monitoring and management.Additionally, studies exploring the roles of vertical mixing in PM 2.5 estimation are needed.In this study, ML techniques were used to estimate the PM 2.5 mass concentrations at air quality monitor sites in NYS during summer seasons (July, August and September, JAS) of 2016-2019.Multiple predictors were applied, including meteorological, aerosol and geographic variables.Predictors associated with vertical mixing and aloft aerosols were also considered.The statistical correlations between selected predictors and PM 2.5 concentrations were investigated.Additionally, to understand the influences of predictors on the PM 2.5 concentration in NYS, the results at monitoring sites with different ambient conditions were discussed.

Datasets
The variables used in this study are summarized in Table 1 and are briefly described in this section.Details can be found in the cited references.The US Environmental Protection Agency (EPA) collects real-time air quality measurements from over 2000 surface monitoring sites nationwide maintained by state or local air quality agencies.In NYS, air quality data are collected and quality controlled by the NYS Department of Environmental Conservation (NYSDEC).In this study, hourly PM 2.5 mass concentrations from 21 monitoring sites across NYS were used (Figure 1; Table 2).According to the US EPA and Squizzato et al. (2018) [38], these sites consisted of 18 urban/suburban sites and 3 rural sites.The urban/suburban sites were divided into New York City (NYC) metropolitan sites and upstate NY (UNY) sites based on their locations defined by the US EPA core based statistical areas (CBSA).As a result, in this study, 21 selected sites were categorized into: (1) 5 UNY sites, which is a group of sites in Buffalo, Rochester, and Albany areas, (2) 3 rural sites, and (3) 13 NYC sites, which located in the New York-Newark-Jersey City area.The PM 2.5 concentration measurements at three sites (marked with asterisks in Table 2) are based on the Federal Equivalence Method (FEM), while others are based on the Tapered Element Oscillating Microbalances (TEOM) technology. https://lpdaac.usgs.gov/products/vnp13c2v001/.

Surface PM2.5 Observations
The US Environmental Protection Agency (EPA) collects real-time air quality measurements from over 2000 surface monitoring sites nationwide maintained by state or local air quality agencies.In NYS, air quality data are collected and quality controlled by the NYS Department of Environmental Conservation (NYSDEC).In this study, hourly PM2.5 mass concentrations from 21 monitoring sites across NYS were used (Figure 1; Table 2).According to the US EPA and Squizzato et al. (2018) [38], these sites consisted of 18 urban/suburban sites and 3 rural sites.The urban/suburban sites were divided into New York City (NYC) metropolitan sites and upstate NY (UNY) sites based on their locations defined by the US EPA core based statistical areas (CBSA).As a result, in this study, 21 selected sites were categorized into: 1) 5 UNY sites, which is a group of sites in Buffalo, Rochester, and Albany areas, 2) 3 rural sites, and 3) 13 NYC sites, which located in the New York-Newark-Jersey City area.The PM2.5 concentration measurements at three sites (marked with asterisks in Table 2) are based on the Federal Equivalence Method (FEM), while others are based on the Tapered Element Oscillating Microbalances (TEOM) technology.

Meteorological Predictors
Several meteorological predictors were obtained in this study, including 2 m temperature (T), 2 m relative humidity (RH), surface pressure (PS), planetary boundary layer height (PBLH), and the u and v components of 10 m horizontal winds (U, V).These variables were taken from the analysis fields of the High-Resolution Rapid Refresh (HRRR) [49], which is an atmospheric model developed by the NOAA/Earth System Research Laboratories (ESRL)/Global Systems Laboratory (GSL).HRRR has 3 km horizontal resolution and 51 vertical levels in hybrid coordinates, and provides hourly analysis over the contiguous US (CONUS) and Alaska.Details about HRRR can be found at https://rapidrefresh.noaa.gov/hrrr/and http://www.nco.ncep.noaa.gov/pmb/products/hrrr/.

Aerosol Predictors
AOD measurements from the Visible Infrared Imaging Radiometer Suite (VIIRS) sensor onboard the Suomi National Polar-orbiting Partnership (S-NPP) satellite, launched in October 2011, were used.VIIRS measures 22 spectrum channels in the range of 412-12,050 nm, including imagery bands, moderate resolution bands (M-bands) and the day-night band.The wide spectral range allows VIIRS to provide multiple land and atmosphere products and the M-bands are mainly used for aerosol retrieval [50].VIIRS provides daily global coverage with 750 m resolution and only daytime measurements are used for AOD retrieval.In this study, level 3 Environmental Data Record (EDR) daily gridded AOD at 550 nm products were used.
In addition, surface PM 2.5 mass concentrations from the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) [51] were used.MERRA-2 is a global atmospheric reanalysis developed by the National Aeronautics and Space Administration's (NASA's) Global Modeling and Assimilation Office (GMAO).MERRA-2 uses the Goddard Earth Observing System, Version 5 (GEOS-5) Atmospheric General Circulation Model (AGCM) coupled with the Goddard Chemistry Aerosol Radiation and Transport (GOCART) model [52].Aerosol and meteorological observations are jointly assimilated in MERRA-2.MERRA-2 aerosol reanalysis considers aerosol emissions, transport, removal processes, and chemistry.Details can be found in Buchard et al. (2017) [53] and Randles et al. (2017) [54].A brief comparison between HRRR and MERRA-2 meteorological fields (T, RH, PS, U and V) showed that two models are in good agreement with correlation coefficients higher than 0.7 (Appendix A).

Geographic Predictors
In this study, the monthly enhanced vegetation index (VI) [55,56] products from VIIRS S-NPP were used as terrestrial vegetation estimation.Weekday and geographic information (latitude, longitude and altitude) of selected monitor sites, referred from EPA AQS database, were also used.

Vertical Predictors
To describe the atmospheric vertical mixing, vertical wind shears (VWS) of three layers, including surface-850 hPa (Low-Level; L-VWS), 850-700 hPa (Mid-Level; M-VWS) and 700-500 hPa (High-Level; H-VWS) from HRRR analysis, were used in this study.VWS were defined as the gradients of horizontal winds between the top and bottom model levels.Note that this study only considered the magnitude of VWS, which was computed from the u and v components of VWS.The average vertical velocities of the layer of surface to 500 hPa (W_avg) were also used.
Furthermore, to indicate the presence of aloft aerosols, the daily change rates (R) of AOD and PM 2.5 concentrations at monitor sites were calculated as follows: where d is date.The day-by-day variation of AP_ratio reflects the change in aerosol vertical distribution.Since AOD and PM 2.5 concentrations present the aerosol loadings in the total-column atmosphere and near the surface, respectively, their daily change rates should be comparable if most of the aerosols are near the surface.R AOD should be higher than R PM2.5 if there are aerosols aloft.In contrast, under weak advection with no change in aloft aerosols, R PM2.5 could be higher than R AOD , since surface PM 2.5 concentration is mainly determined by local emissions.In this study, the ratio of R AOD to R PM2.5 (AP_ratio) was used as the indicator of aloft aerosols.

Data Processing
Datasets in the domain of 40.5 • N -45.5 • N, 72 • W-80 • W during the summer seasons (JAS) of 2016-2019, 368 days in total, were used.Four data processing steps were taken prior to training.First, spatial linear interpolations were applied for MERRA-2 data, and 3 × 3 grid averages were computed for satellite AOD and VI as the representative at the location of selected air quality sites.Second, the averages of daytime data during 0700-1800 LT (1200-2300 UTC), except for AOD and VI, were calculated.Afterward, days with missing data were removed.The outliers (i.e., values beyond average ±3 standard deviations) of PM 2.5 observation and AOD were also removed to exclude extreme cases (about 1.5% of the data).Lastly, aerosol predictors were transformed to become Gaussian distribution by taking square roots on PM 2.5 observation and AOD and taking a log of MERRA-2 PM 2.5 concentration.
The assumption of independence of predictors, required for ML algorithms, is examined by correlation matrix (Appendix B).While the assumption is not strictly met, the impact on training results may not be significant given only one pair exceeding 0.8.It is worth mentioning that, although some variables may be connected physically (e.g., T and PBLH), the correlations between these variables are relatively weak.

Model Configuration
Two ML algorithms were used in this study: multiple linear regression (MLR) and artificial neural network (ANN).MLR is a statistical technique which estimates the relationships between several explanatory variables (i.e., predictors) and a response variable (i.e., target) by fitting a linear regression to the ground truth (PM 2.5 observations in this study).It generates a linear regression of given variables, including the coefficients for predictors and intercept.In this study, the Python Scikit-Learn package [57] was used to build the MLR models.
An artificial neural network (ANN), one of the most popular ML algorithms, is effective for handling complex nonlinear problems, particularly in classification and prediction [58,59].An ANN model consists of a set of layers, including the input layer, multiple hidden layers, and the output layer.There is no specific rule to decide how many hidden layers an ANN model should have but one hidden layer with abundant neurons usually provides good results [58].In addition, previous studies showed that ANN models performed better with the number of neurons in a range of (2

√
n + ϕ,2n + 1), where n and ϕ are the numbers of predictors and targets, respectively [60].In this study, the Python Keras and Tensorflow packages [61] were used.Back-propagation models with one hidden layer obtaining 20 neurons were applied.The maximum number of iterations was set to 1000 and an EarlyStopping function (https://keras.io/api/callbacks/early_stopping/)was used to avoid overfitting.Variables were randomly split into the training set (80%) and validation set (20%) in the training process, and an additional testing set, which will be described in the following section, was applied for model evaluation.
To investigate the influence of vertical mixing and aloft aerosols on surface PM 2.5 concentration, two sets of predictors were applied.Set 1 contained meteorological, aerosol, and geographic predictors, while set 2 contained the same predictors as set 1 but also included vertical predictors.Therefore, four models were applied in this study:

Statistical Analysis
In this study, leave-one-out cross validation (LOOCV) [62] was used for model performance evaluation.All four models were trained on data from all monitor sites but one and were tested on the leave-out site.This process was repeated until all 21 sites served as the test site once.Therefore, LOOCV ensured independent evaluation of the trained model via comparison of the predicted and observed values at the locations that did not participate in the training.The involvement of spatial dependence allowed LOOCV to provide more reliable estimations of model performance [63].Three statistical scores were used for model evaluation, including mean bias (MB), coefficient of determination (R 2 ), and root mean square error (RMSE).These scores were calculated as follows: where n is number of data points, X is observation, Y is prediction, and X and Y are the averages of observation and prediction, respectively.The scores at test sites were then averaged as the estimated model errors.Although LOOCV is beneficial for estimating the spatial dependence within trained models, the imbalanced site distribution in NYS (over 60% of the sites are in NYC region) may introduce representativeness issues in the model performance.
To understand the relationships between predictors and PM 2.5 concentrations, the regression coefficients from the MLR models and permutation importance estimated from the ANN models were analyzed.The values of regression coefficients explained the change in target when applying one unit of change in predictors, and the signs explained the direction of such a change.Permutation importance is an approach for ranking variable importance [64,65].The goal of this approach is to estimate how model performance changes when breaking the correlations between predictors and target.This is done by randomly shuffling one predictor in the test data, and comparing the statistical scores of the shuffled model and unshuffled model.Larger percentage errors indicate higher importance.This process is repeated until all predictors have been shuffled once.In this study, the percentage error of the RMSE values of two models was regarded as the estimation of variable importance and is calculated as follows.

Model Performance
Figure 2 illustrates the LOOCV testing results at selected monitoring sites of the four models.The averages of statistical scores are shown in Table 3, and the statistical scores at individual sites are shown in Appendices C and D. Overall, the ANN models performed better than the MLR models with higher R 2 and lower RMSE, and the ANN models showed larger cross-site variations compared to the MLR models.The absolute values of averaged MB of the ANN models (0.63 and 0.29 µg m −3 ) were higher than the MLR models (0.03 and 0.06 µg m −3 ).However, the MBs at individual sites of the four models had a similar range of 2 µg m −3 , except for the MLR-2 model which had a wider range of ±3 µg m −3 .In addition, the near-zero averaged MBs compared to RMSEs indicated that there were negative and positive biases and those non-systemic biases cancelled out in averaging for the MLR and ANN models.model errors.Although LOOCV is beneficial for estimating the spatial dependence within trained models, the imbalanced site distribution in NYS (over 60% of the sites are in NYC region) may introduce representativeness issues in the model performance.
To understand the relationships between predictors and PM2.5 concentrations, the regression coefficients from the MLR models and permutation importance estimated from the ANN models were analyzed.The values of regression coefficients explained the change in target when applying one unit of change in predictors, and the signs explained the direction of such a change.Permutation importance is an approach for ranking variable importance [64,65].The goal of this approach is to estimate how model performance changes when breaking the correlations between predictors and target.This is done by randomly shuffling one predictor in the test data, and comparing the statistical scores of the shuffled model and unshuffled model.Larger percentage errors indicate higher importance.This process is repeated until all predictors have been shuffled once.In this study, the percentage error of the RMSE values of two models was regarded as the estimation of variable importance and is calculated as follows.

Model Performance
Figure 2 illustrates the LOOCV testing results at selected monitoring sites of the four models.The averages of statistical scores are shown in Table 3, and the statistical scores at individual sites are shown in Appendices C and D. Overall, the ANN models performed better than the MLR models with higher R 2 and lower RMSE, and the ANN models showed larger cross-site variations compared to the MLR models.The absolute values of averaged MB of the ANN models (0.63 and 0.29 µg m −3 ) were higher than the MLR models (0.03 and 0.06 µg m −3 ).However, the MBs at individual sites of the four models had a similar range of 2 µg m −3 , except for the MLR-2 model which had a wider range of ±3 µg m −3 .In addition, the near-zero averaged MBs compared to RMSEs indicated that there were negative and positive biases and those non-systemic biases cancelled out in averaging for the MLR and ANN models.For the MLR models, the application of vertical predictors introduced a neutral impact to the model performance since the differences of averaged statistical scores between MLR-1 and MLR-2 were relatively minor.The RMSE at two rural sites (sites 6 and 8) even increased by 0.5-1 µg m −3 when applying vertical predictors (Figure 2a), resulting in a slightly increased averaged RMSE (Table 3).
In contrast, the improvement due to vertical predictors was more significant for the ANN models (Figure 2b).The application of vertical predictors slightly increased the averaged MB for the MLR models (from 0.03 to 0.06 µg m −3 ), while it reduced the averaged MB for ANN models (from −0.63 to −0.29 µg m −3 ) (Table 3).Additionally, the range of RMSE changed from (1.92 to 3.89 µg m −3 ) for the ANN-1 model to (1.64 to 3.32 µg m −3 ) for the ANN-2 model (Appendix D), showing a general improvement in model performance.It is worth mentioning that the MB and RMSE at sites 6 and 8 decreased by around 2 µg m -3 , showing contrary results to the MLR models.

The Site-Variations of Model Performance
The model performance with the influence of vertical predictors on PM 2.5 prediction at different category of sites were investigated.Figure 3 demonstrates the differences in statistical scores between the models using set 1 and set 2 predictors at selected air quality sites.Table 4 shows the averages of statistic scores of each category of sites.The statistical scores at each site are listed in Appendices C and D. Overall, the results showed variations in model performance across the state, reflecting the different air quality characteristics.
For rural sites, the RMSEs of the MLR-2 model at site 6 and 8 increased from 3.22 and 3.29 µg m −3 to 4.13 and 3.65 µg m −3 , respectively, compared to the MLR-1 model.The MBs at the two sites also increased, showing that the model performance degraded when applying vertical predictors to the MLR model.In contrast, the performance of the ANN models at three rural sites showed significant improvement with increased R 2 and decreased MB and RMSE.The contrasting performance of the MLR and ANN models could be attributed to the complexity and nonlinearity of the influences of vertical predictors on PM 2.5 concentrations, which could not be learned by the MLR models.Additional predictors could even reduce the significance of the correlations between other predictors and PM 2.5 concentrations in the MLR models.It is worth noting that the testing results at site 6 (Whiteface Mountain site, the fifth highest mountain in NYS) showed the most significant differences after applying vertical predictors for both the MLR and ANN models.This is probably because the PM 2.5 concentrations at Whiteface Mountain are mainly affected by meteorological conditions and transported aerosols [43].
For NYC sites, both the averaged values (Table 4) and testing results at individual site (Appendix C) showed comparable statistical scores for two MLR models.Although the performance of the ANN models showed neutral to positive impacts when applying vertical predictors, the differences in statistical scores were less significant than those at rural sites.This is probably because the air quality of NYC sites is influenced by local anthropogenic emissions and photochemical reactions near the surface, and thus the influence of vertical mixing is limited.
For UNY sites, the statistical scores of the MLR models were comparable, showing limited influence of vertical predictors.As for the ANN models, model performance showed degradation with increased MB at most of the sites after applying vertical predictors (Appendix D).The reductions in RMSE at three sites were relatively minor compared to the increases at two sites, resulting in an increased average value.This may be due to the spatial variability among UNY sites.Unlike NYC sites, UNY sites are a group of urban/suburban sites across the state influenced by different local emissions.The influences of vertical mixing varied among these sites, leading to the degraded testing results on average.Additionally, to better understand the influence of vertical predictors on PM 2.5 prediction at each category of sites, the testing results at three sites with the lowest RMSEs were analyzed.The PS 314, Rockland County, and Rochester sites were selected as the representatives of NYC, rural and UNY sites, respectively.Additionally, since vertical predictors showed limited contributions in the MLR models in previous discussions, only the results from the ANN models were discussed.Figure 4 illustrates the scatter plots between observed and predicted PM 2.5 concentrations from the ANN models and the corresponding data plots at selected sites.For PS 314 site (Figure 4a), data plots showed that the ANN-2 model had better in predicting spikes with smaller differences between observations and predictions, compared to the ANN-1 model.Although the differences were small, the positive effects of vertical predictors on PM 2.5 concentrations at NYC sites were not neglectable.Similarly, the testing results of the ANN models at Rockland County site (Figure 4b) showed that the ANN-2 model had better capability of predicting spikes.Although the ANN-2 model still showed outliers in the scatter plot, it provided more accurate estimations with higher R 2 and lower RMSE.On the other hand, the testing results from two ANN models at Rochester site (Figure 4c) were comparable, showing a limited influence of vertical mixing.

The Contributions of Predictors to Surface PM2.5 Concentrations
In this section, the regression coefficients generated by the MLR models and variable importance estimated from the ANN models were investigated.Figures 5-7 show the signs of regression coefficients and variable importance at the PS 314, Rochester, and Rockland County sites, respectively.Note that Lat, Lon, and Alt are fixed values at each site, thus shuffling them did not affect the results.The values of regression coefficients are listed in Appendix E.

The Contributions of Predictors to Surface PM 2.5 Concentrations
In this section, the regression coefficients generated by the MLR models and variable importance estimated from the ANN models were investigated.Figures 5-7 show the signs of regression coefficients and variable importance at the PS 314, Rochester, and Rockland County sites, respectively.Note that Lat, Lon, and Alt are fixed values at each site, thus shuffling them did not affect the results.The values of regression coefficients are listed in Appendix E.      For all sites, MERRA-2 PM2.5 concentration and T showed the highest importance.The positive regression coefficients of MERRA-2 PM2.5 concentration were related to the positive effects of aerosol removal processes and chemical reactions on PM2.5 concentrations.The positive coefficients of T were consistent with previous studies [10,11], which showed warm condition was conductive for photochemical formation of secondary aerosols.Thus, the high importance of two predictors collectively indicated that aerosol removal processes and chemical reactions played a dominant role in PM2.5 concentration during the daytime.Furthermore, the high importance of PBLH at the PS 314 and Rochester sites indicated the significant dispersion effect in the PBL.During the daytime, radiative heating on the surface led to a sharp increase in PBLH, resulting in decreasing the PM2.5 concentration due to stronger dispersion.The weekday also showed significant importance at all sites.The negative coefficients between weekday and PM2.5 concentrations may be associated with the anthropogenic emissions from industries and traffic during weekdays.
Additionally, surface wind fields, U and V, and VWSs showed moderate importance.The positive coefficients of U and V indicated the positive effects of westerly (positive U) and southerly (positive V) winds, respectively.This was consistent with previous studies, which showed that high PM2.5 concentrations were associated with transported aerosols driven by westerly and southwesterly flows [41,42].Moreover, the negative coefficients of M-VWS and L-VWS showed that weak air mass exchanges above the PBL and stable conditions in the PBL potentially led to high PM2.5 concentrations, respectively.These results could be associated with the frontal systems caused by the westerly mid-latitude cyclones, which are common during summer in the Northeast US [42,66].Additionally, the strong VWS at higher troposphere could be beneficial for the downward transport of long-range transported aerosols (e.g., smoke aerosols), resulting in the positive correlation between H-VWS and PM2.5 concentrations.
At the PS 314 site (Figure 5), the VWS at three levels played a relatively weak role in PM2.5 concentrations compared to the surface conditions.This was consistent with previous discussion, which mentioned that the influence of local ambient conditions on PM2.5 concentrations were more significant than vertical mixing at NYC sites.In contrast, M-VWS and L-VWS at the Rochester site (Figure 6) showed comparable importance with surface winds and RH, indicating a similar, even stronger, influence of vertical mixing than surface ambient conditions.Similar phenomenon was also For all sites, MERRA-2 PM 2.5 concentration and T showed the highest importance.The positive regression coefficients of MERRA-2 PM 2.5 concentration were related to the positive effects of aerosol removal processes and chemical reactions on PM 2.5 concentrations.The positive coefficients of T were consistent with previous studies [10,11], which showed warm condition was conductive for photochemical formation of secondary aerosols.Thus, the high importance of two predictors collectively indicated that aerosol removal processes and chemical reactions played a dominant role in PM 2.5 concentration during the daytime.Furthermore, the high importance of PBLH at the PS 314 and Rochester sites indicated the significant dispersion effect in the PBL.During the daytime, radiative heating on the surface led to a sharp increase in PBLH, resulting in decreasing the PM 2.5 concentration due to stronger dispersion.The weekday also showed significant importance at all sites.The negative coefficients between weekday and PM 2.5 concentrations may be associated with the anthropogenic emissions from industries and traffic during weekdays.
Additionally, surface wind fields, U and V, and VWSs showed moderate importance.The positive coefficients of U and V indicated the positive effects of westerly (positive U) and southerly (positive V) winds, respectively.This was consistent with previous studies, which showed that high PM 2.5 concentrations were associated with transported aerosols driven by westerly and southwesterly flows [41,42].Moreover, the negative coefficients of M-VWS and L-VWS showed that weak air mass exchanges above the PBL and stable conditions in the PBL potentially led to high PM 2.5 concentrations, respectively.These results could be associated with the frontal systems caused by the westerly mid-latitude cyclones, which are common during summer in the Northeast US [42,66].Additionally, the strong VWS at higher troposphere could be beneficial for the downward transport of long-range transported aerosols (e.g., smoke aerosols), resulting in the positive correlation between H-VWS and PM 2.5 concentrations.
At the PS 314 site (Figure 5), the VWS at three levels played a relatively weak role in PM 2.5 concentrations compared to the surface conditions.This was consistent with previous discussion, which mentioned that the influence of local ambient conditions on PM 2.5 concentrations were more significant than vertical mixing at NYC sites.In contrast, M-VWS and L-VWS at the Rochester site (Figure 6) showed comparable importance with surface winds and RH, indicating a similar, even stronger, influence of vertical mixing than surface ambient conditions.Similar phenomenon was also found at Rockland County site (Figure 7), with higher importance of M-VWS and H-VWS than surface winds, RH and PBLH.In addition, the low importance of PBLH and L-VWS at the Rockland County site may indicate a weak PBL-PM 2.5 correlation, since the PM 2.5 concentrations at rural sites are relatively low and not dominated by local emissions.
The positive correlation between AOD and PM 2.5 concentrations was expected.Since AOD is an indicator of total-column aerosol loadings, higher AOD represents higher PM 2.5 concentration particularly when most of the columnar aerosol loading is present in the PBL.According to previous studies, AOD showed significant impacts on PM 2.5 concentrations with high variable importance [32].However, in this study, AOD had moderate importance and relatively weak influence compared to most of the other predictors.This could be due to the application of MERRA-2 PM 2.5 concentration.MERRA-2 is an aerosol reanalysis, which consists of both model simulations and observation assimilation.Therefore, MERRA-2 PM 2.5 concentration already includes AOD information from the assimilated AOD measurements.The AOD assimilation in MERRA-2 could contribute to the high importance of MERRA-2 PM 2.5 concentration.Since aerosol removal processes and chemical reactions were dominant during the daytime, MERRA-2 PM 2.5 concentration showed higher importance than AOD alone.The model results without MERRA-2 PM 2.5 concentration (not shown) also indicated the significant contributions of aerosol photochemical reactions with the highest importance of T.
Based on the definition, high AP_ratio reflects the presence of aloft aerosol layers and positive W_avg represents the atmospheric downward mixing.With strong downward mixings, aloft aerosols could descend to the surface, resulting in increasing the surface PM 2.5 concentrations.The positive coefficients at all sites indicated the positive effects of aloft aerosols on PM 2.5 concentrations.However, the low importance of the two predictors showed that the influence of aloft aerosols was relatively minor.It was probably because of the limited cases of the presence of high PM 2.5 concentrations and aloft aerosols, and the capability of AP_ratio in representing aloft aerosols.In addition, the low importance of PS could be due to its dependence on Alt.Atmospheric pressure provides information about circulation patterns, synoptic weather systems, and atmospheric stability conditions.Since PS decreases with height, such relation may dilute the connection between PS and meteorology, and the correlation between PS and PM 2.5 concentrations, especially for UNY and rural sites with a wide range of Alt values.

Conclusions
In NYS, episodic high PM 2.5 concentrations due to transported aerosols have been reported in summer.Driven by synoptic downward mixing and PBL entrainment, pollutants potentially transport from free troposphere into the PBL and affect PM 2.5 concentrations near the surface.In light of the contributions of transported aerosols to high PM 2.5 concentrations in NYS, understandings of the relationships between the vertical mixing of aloft aerosols and PM 2.5 concentrations are important.This study investigated the influences of various factors on the PM 2.5 concentrations in NYS by analyzing the testing results of multiple linear regression (MLR) and artificial neural network (ANN) models trained with two sets of predictors.Overall, the ANN models performed better than MLR models.Additionally, for the ANN models, the predictors of vertical mixing and aloft aerosols improved the MB, R 2 and RMSE by 0.34, 0.02 and 0.17 µg m −3 , respectively, on average.Although RMSEs around 3 µg m −3 were relatively high for NYS, as the RMSEs were one-third and/or close to the annual PM 2.5 average concentrations of 2019, the improvements in model performance were non-negligible.The leave-one-out cross-validation results showed significant site-variations and were able to differentiate predictor-PM 2.5 correlations at sites with different air quality characteristics.The model improvement due to vertical mixing and aloft aerosols was more significant at rural sites, where the PM 2.5 concentrations are mainly affected by meteorology and transported aerosols.The changes in model performance at UNY sites varied among sites, probably due to the spatial variability within a wide region of UNY.In addition, a joint analysis of regression coefficients and variable importance provided insights into the contributions of selected predictors to PM 2.5 concentration.The aerosol removal process and chemical reactions showed the highest importance at three categories of sites (UNY, NYC and rural), and the contribution of vertical mixing was more significant at UNY and rural sites.However, the influence of aloft aerosols was limited in the current results.Identifying the cases of high PM 2.5 concentrations associated with transported aerosols prior to training may provide more significant results and better understanding of the influence of aloft aerosols.

Figure 1 .
Figure 1.Topographic map with the PM2.5 monitoring sites used in this study.Orange, blue and red circles are UNY, rural and NYC sites, respectively.Site labels are referred from Table2.

Figure 1 .
Figure 1.Topographic map with the PM 2.5 monitoring sites used in this study.Orange, blue and red circles are UNY, rural and NYC sites, respectively.Site labels are referred from Table2.

Figure 2 .
Figure 2. Statistical scores of the testing results of the (a-1,a-2) MLR and (b-1,b-2) ANN models with (1) set 1 and (2) set 2 predictors.Site labels are referred from Table 2. Diamonds, crosses and circles are UNY, rural and NYC sites, respectively.

Figure 4 .
Figure 4. Scatter plots, with PM2.5 observation as x-axis and prediction as y-axis, and data plots, with available data point as x-axis and PM2.5 concentration as y-axis, of the testing results of the ANN models at (a-1,a-2) PS 314, (b-1,b-2) Rockland County and (c-1,c-2) Rochester sites.The data point in data plots are composited from four summers and each point represents one day.

Figure 4 .
Figure 4. Scatter plots, with PM 2.5 observation as x-axis and prediction as y-axis, and data plots, with available data point as x-axis and PM 2.5 concentration as y-axis, of the testing results of the ANN models at (a-1,a-2) PS 314, (b-1,b-2) Rockland County and (c-1,c-2) Rochester sites.The data point in data plots are composited from four summers and each point represents one day.

Atmosphere 2020, 11 , 1303 13 of 22 Figure 5 .
Figure 5. Variable importance of (a) set 1 and (b) set 2 models at PS 314 site.Values indicate the permutation importance estimated from the ANN models, and colors indicate the signs of regression coefficients in the MLR models.

Figure 6 .
Figure 6.Variable importance of (a) set 1 and (b) set 2 models at Rochester site.Values indicate the permutation importance estimated from the ANN models, and colors indicate the signs of regression coefficients in the MLR models.

Figure 5 . 22 Figure 5 .
Figure 5. Variable importance of (a) set 1 and (b) set 2 models at PS 314 site.Values indicate the permutation importance estimated from the ANN models, and colors indicate the signs of regression coefficients in the MLR models.

Figure 6 .
Figure 6.Variable importance of (a) set 1 and (b) set 2 models at Rochester site.Values indicate the permutation importance estimated from the ANN models, and colors indicate the signs of regression coefficients in the MLR models.

Figure 6 .
Figure 6.Variable importance of (a) set 1 and (b) set 2 models at Rochester site.Values indicate the permutation importance estimated from the ANN models, and colors indicate the signs of regression coefficients in the MLR models.

Figure 7 .
Figure 7. Variable importance of (a) set 1 and (b) set 2 models at Rockland County site.Values indicate the permutation importance estimated from the ANN models, and colors indicate the signs of regression coefficients in the MLR models.

7 .
Variable importance of (a) set 1 and (b) set 2 models at Rockland County site.Values indicate the permutation importance estimated from the ANN models, and colors indicate the signs of regression coefficients in the MLR models.

Table 1 .
List of variables used in this study.

Table 2 .
List of PM 2.5 monitor sites in NYS used in this study.Site labels are referred to in Figure 1.Sites using Federal Equivalence Method (FEM) are marked with asterisks *.Sites 1-5, 6-8 and 9-21 are UNY, rural and NYC sites, respectively.

Table 3 .
Averaged statistical scores of four models at 21 selected sites.

Table 4 .
Averaged statistical scores of four models at rural, NYC and UNY sites.

Table 4 .
Averaged statistical scores of four models at rural, NYC and UNY sites.