Evaluation of the Street Canyon Level Air Pollution Distribution Pattern in a Typical City Block in Baoding, China

Urban traffic pollution, which is strongly influenced by the complex urban morphology, has posed a great threat to human health. In this study, we performed a high-resolution simulation of traffic pollution in a typical city block in Baoding, China, based on the Parallelized Large-eddy simulation Model (PALM), to examine the distribution patterns of traffic-related pollutants and explore their relationship with urban morphology. Based on the model results, we conducted a multi-linear regression (MLR) analysis and found that the distribution of air pollutants inside the city block was dominated by both traffic emissions and urban morphology, which explained about 70% of the total variance in spatial distribution of air pollutants. Excluding the contribution of emissions, over 50% of the total variance can still be explained by the urban morphology. Among these urban morphological factors, the key factors determining the spatial distribution of air pollution are “Distance from the road” (DR), “Building Coverage Ratio” (BCR) and “Aspect Ratio” (H/W) of the street canyon. Specifically, urban areas with lower Aspect Ratio, lower BCR and larger DR are less affected by traffic pollution. Compiling these individual factors, we developed a complex Urban Morphology Pollution Index (UMPI). Each unit increase in UMPI is associated with a one percent increase of nearby traffic pollution contribution. This index can help urban planners to semi-quantitatively evaluate building groups which tend to trap or ventilate traffic pollution and thus help to reduce human exposure to street canyon level pollution through either traffic emission control or urban morphology amelioration.


Introduction
Over 60% of the Chinese population live in urban areas, where traffic emissions are a main source of air pollutants. Fine particulate matter (PM2.5) and ozone (O3) can lead to 650,000 and 70,000 premature deaths in urban areas each year, respectively [1,2]. These deaths are closely linked to the chronic exposure to traffic pollution experienced by urban residents who live close to street canyons. Therefore, understanding the sources, dispersion and distribution of air pollutants in urban areas is key to reducing the exposure of urban residents and protecting human health.
The shape and clustering of building groups (including the form of street canyons) play an important role in determining air flow, pollution dispersion and the accumulation of pollutants in urban areas. Deep street canyons have been found to be associated with the formation of vortices, which remarkably strengthen the accumulation of pollutants at the ground level [3]. Fu et al. [4] found that canyons with building heights over 40 m and a high asymmetric ratio lead to a significant increase in human exposure to traffic pollution. However, there is still substantial uncertainty regarding the effects of building groups on the dispersion processes and the resulting pollution distribution inside them. To assess the contributions of building group shapes to air pollution distribution, a set of parameters describing the urban morphology has been proposed. For example, Edussuriya et al. [5] identified the correlations between NOx, CO, PM2.5 and urban morphological indicators (e.g., the aspect ratio, the building compactness and mean building height) in Hong Kong and found that in-site fabrics have strong impacts on air quality. Yang et al. [6] revealed that the pedestrian wind speed ratio correlates well with several parameters, including the building density and average height. However, most previous studies focused mainly on the correlation with a single factor and ignored the overall effects of the urban morphology.
To investigate the spatiotemporal distribution patterns of air pollutants within urban block areas, we apply the Parallelized Large-eddy simulation Model (PALM) to simulate a typical urban area in Baoding, China. Large Eddy Simulation (LES) models have been successfully applied in urban areas. For example, Idrissi et al. [7] applied an LES model to study air pollution dispersion within an area of 600 m × 580 m with complex urban morphology. Xavier et al. [8] used LES models to compare the performance of air pollution simulations on different types of urban layouts. With the model results, we focus on the correlations between the urban morphological parameters and the spatiotemporal distribution of air pollutants, based on which we then introduce a new prediction index for quickly locating the areas highly affected by air pollution in urban blocks. This index, calculated from the characteristics of urban forms, is aimed at quick semi-quantitative estimation of urban air pollution distribution, and will help policy makers and urban planners to optimize the planning of urban building groups to reduce urban air pollution.
The model and its performance are described in Section 2. We then investigate how the pollutant distributions are constrained by both traffic emissions and different urban morphology parameters in Section 3.1. Section 3.2 introduces our newly developed urban morphology index. Finally, conclusions are drawn in Section 4.

Model Description
In this study, we simulate the air pollution dispersions inside urban street canyons based on the Parallelized Large-eddy simulation Model (PALM) [9]. PALM has been used for a variety of boundary layer studies over the last 15 years, such as heterogeneously heated convective boundary layers [10,11], urban canopy flows [12,13] and cloudy boundary layers [14,15]. The participation in the first intercomparison of LES models in terms of the stable boundary layer also proved its ability to perform simulations with a resolution of down to 1 m [16].
PALM solves the non-hydrostatic, filtered, incompressible Navier-Stokes equations in the Boussinesq-approximated form. An upwind-biased fifth-order differencing scheme [17,18] and a third-order Runge-Kutta scheme [19][20][21][22] were chosen for space and time discretization, respectively. More details about the PALM governing equations and the model structure can be found in Maronga et al. [23].
In this study, the PALM model is applied for ultra-fine-resolution, urban-scale air pollution simulation. PALM scales to up to 50,000 processor cores, which enables simulations with ultra-fine grids. Several previous studies have applied the PALM model for high-resolution simulations in urban atmospheric environments. With the newly developed PALM-4U (short for PALM for urban atmospheric boundary layers) components, the PALM model is especially suitable for the simulation of complex urban layouts such as the one investigated in this study.

Model Configuration and Evaluation
Study Area and Time
A 1 km × 1 km × 200 m city block in the downtown area of Baoding, China, was chosen as the study area (see Figure 1). Baoding city is located about 200 km south of Beijing. It is an important source of air pollution over the Beijing-Tianjin-Hebei (BTH) region and also suffers from severe air pollution problems itself. Previous studies have shown that Baoding is the major contributor to air pollution over the entire BTH area [24,25]. Therefore, Baoding is a representative city for urban pollution studies. As shown in Figure 1, the study area consists of three main streets and many buildings with different heights (see Figure S1 for the satellite map). We divided the study area into 200 × 200 × 40 grids, with each grid measuring 5 m × 5 m × 5 m. The simulation period ranged from 22 July 2018 to 29 July 2018. The time resolution of the model output was 60 s.

Model Configuration and Initialization
The PALM model system, version 6.0, revision 4233, is used in this study. The default "clear-sky" scheme is used for calculating radiation fluxes. The clear-sky scheme is a simple model calculating the shortwave incoming, shortwave outgoing, longwave incoming, longwave outgoing and, consequently, the net radiation at the surface. The land surface model and the urban surface model of PALM are used, respectively, for natural-type surfaces (e.g., vegetation, soil) and building surfaces. The surface roughness length is set based on the results from the static driver.
A chemical mechanism based on the PHSTAT mechanism (abbreviation for photostationary), which is one of the mechanisms included in PALM-4U, is applied for the chemistry simulation. The PHSTAT mechanism is a simplified two-reaction mechanism describing the photostationary equilibrium between NO, NO2 and O3. We add CO as a new species into the mechanism and treat it as a passive gas, since it hardly takes part in chemical reactions in this setting. Particulate matter (PM) is not included in this study, because PM generally consists of primary and secondary components. Primary components hardly undergo chemical processes and therefore show a similar pattern to CO. For secondary components, the chemical mechanism in this study is not sufficient for simulation. As a result, we choose to focus on the chemical processes between NOx and O3.
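For reference, the photostationary equilibrium that such a two-reaction mechanism describes can be sketched numerically. The snippet below is not PALM's chemistry code. It evaluates the textbook photostationary-state (Leighton) relation, O3 = j·NO2/(k·NO); the function names, the photolysis frequency j and the Arrhenius parameters for k are assumed, illustrative values rather than values taken from this study.

```python
import math


def no_o3_rate_constant(temperature_k, a=3.0e-12, e_over_r=1500.0):
    """Rate constant k for NO + O3 -> NO2 + O2 (cm^3 molecule^-1 s^-1).

    The Arrhenius parameters are assumed, commonly tabulated values,
    not values taken from PALM or from this study.
    """
    return a * math.exp(-e_over_r / temperature_k)


def photostationary_o3(no2, no, j_no2, temperature_k=298.0):
    """O3 number density at photostationary state: O3 = j * NO2 / (k * NO).

    no2 and no are number densities (molecules cm^-3); j_no2 is the
    NO2 photolysis frequency (s^-1).
    """
    if no <= 0:
        raise ValueError("NO must be positive for the steady state to exist")
    return j_no2 * no2 / (no_o3_rate_constant(temperature_k) * no)


# Illustrative daytime inputs (assumptions, not measurements from the paper).
o3 = photostationary_o3(no2=5.0e11, no=1.0e11, j_no2=8.0e-3)
print(f"photostationary O3: {o3:.3e} molecules cm^-3")
```

In this relation, O3 scales linearly with the NO2/NO ratio, which is why fresh NO emissions near roads tend to depress O3 while raising NO2.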
The boundary conditions (including the wind fields, chemical species and a list of other profiles, such as θ and q) were obtained from WRF-Chem (version 4.2) simulation results. The emission inventory used for the WRF-Chem simulation is taken from the MEIC model. Due to the difference in spatial resolution between the two models, data from two national monitoring sites located to the northeast and southwest of the study area were used to calibrate the different boundaries. The data retrieved from the northeast site were applied to set the north and east boundaries of the area, while the data acquired from the southwest site were used to establish the south and west boundaries. The top boundary was set based on the mean of the data obtained from the two sites. The positions of the national sites and the study area are shown in Figure S3 in the supporting information (SI).
Previous studies have reported that, compared to traffic emissions, residential emissions are minor in urban areas [26][27][28][29][30][31]. Moreover, there are no industrial facilities in the study area. Therefore, only traffic emissions are included in the simulation. During the study period, we counted the number of vehicles on different roads via camera footage. The vehicles were categorized into four types: light- and heavy-duty passenger vehicles and light- and heavy-duty trucks. The number of each type of vehicle was multiplied by a corresponding emission factor to estimate traffic emissions. The emission factors were obtained from Chen et al. [32].
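The count-times-emission-factor step can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function name, the vehicle counts, the emission factors (which in the study come from Chen et al. [32]) and the road width are all hypothetical placeholders, and the conversion to an area flux assumes that emissions are spread uniformly across the road width.

```python
def traffic_emission_flux(counts_per_hour, emission_factors_g_per_km, road_width_m):
    """Area emission flux (µg m^-2 s^-1) for one road segment.

    counts_per_hour: vehicles per hour, keyed by vehicle type.
    emission_factors_g_per_km: grams emitted per vehicle-kilometre, same keys.
    road_width_m: width over which the line source is spread (assumption).
    """
    grams_per_km_per_hour = sum(
        counts_per_hour[v] * emission_factors_g_per_km[v] for v in counts_per_hour
    )
    # g km^-1 h^-1 -> µg m^-1 s^-1
    micrograms_per_m_per_s = grams_per_km_per_hour * 1.0e6 / 1000.0 / 3600.0
    return micrograms_per_m_per_s / road_width_m


# Hypothetical hourly counts and NOx emission factors for the four vehicle types.
counts = {"light_passenger": 1200, "heavy_passenger": 60,
          "light_truck": 90, "heavy_truck": 30}
factors = {"light_passenger": 0.1, "heavy_passenger": 5.0,
           "light_truck": 1.0, "heavy_truck": 6.0}
flux = traffic_emission_flux(counts, factors, road_width_m=20.0)
print(f"NOx flux: {flux:.2f} µg m^-2 s^-1")
```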

Model Evaluation
We validated the model results using the data from the two nearest monitoring sites (see Figure S3 for the position of the sites). Figure 2 shows the time series of the observed and modelled species, i.e., O3, NO2 and CO, from 22 July 2018 to 29 July 2018 (a scatter plot is provided in Figure S4). Generally, the simulation results fit the observations well during most of the study period, and the results from monitoring site 2 are closer to the model results than those from monitoring site 1, because monitoring site 2 is located closer to the study area. For O3, both the model and the observations show similar diurnal cycles, which are related to the photochemistry of O3. Some exceedance in the morning can be explained by the morning traffic peak, which contributes large amounts of NOx emissions. The modelling results for CO attain the best fit with the monitored data, because the lifetime of CO is much longer than that of NO2 and CO is less influenced by chemical reactions. In addition, we measured the CO and NO2 concentrations at a roadside site during the morning and afternoon of 27-29 July (Figures S5 and S6). The model results show relatively good fits with the roadside monitoring data: the modelled NO2 is slightly higher than the monitored data, while the modelled CO is generally in agreement. Another simulation was conducted on an area of the same size (the "test area"), where national site 2 is located (Figures S7 and S8). The results show that PALM can reach satisfactory simulation results. Overall, the model results show high consistency with the observational data and are reliable for further analysis.
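One way to quantify this kind of model-observation agreement is sketched below. The text does not state which statistics were used, so the choice of Pearson correlation, normalized mean bias and root-mean-square error is an assumption, as are the function name and the sample values.

```python
import numpy as np


def evaluation_stats(model, obs):
    """Return Pearson r, normalized mean bias and RMSE for paired series.

    Pairs in which either value is NaN (e.g., instrument gaps) are dropped.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = ~(np.isnan(model) | np.isnan(obs))
    m, o = model[valid], obs[valid]
    r = float(np.corrcoef(m, o)[0, 1])
    nmb = float((m - o).sum() / o.sum())
    rmse = float(np.sqrt(np.mean((m - o) ** 2)))
    return {"r": r, "nmb": nmb, "rmse": rmse}


# Hypothetical hourly NO2 values (µg m^-3), with one gap in the observations.
observed = [30.0, 42.0, np.nan, 55.0, 38.0, 25.0]
modelled = [33.0, 45.0, 50.0, 60.0, 36.0, 27.0]
print(evaluation_stats(modelled, observed))
```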

Data Analysis
Parameters Describing the Urban Form
The interpretation of urban form can differ considerably between studies. Previous studies regarding urban morphology are mainly based on qualitative descriptions of compactness and building heights [33]. Here, we adopted some parameters from Adolphe et al. [34] to describe the morphological characteristics of the urban block area more precisely, including the building coverage ratio (BCR), rugosity, porosity and occlusivity. In addition, two important parameters describing the street canyon form were taken into consideration: the asymmetry ratio (H1/H2) and the aspect ratio (H/W) [35]. The corresponding definitions of these parameters and their calculations are shown in Table 1 (for detailed definitions of the symbols in the equations, please refer to Section S2 in the SI). Some of these parameters are more intuitive than others. For example, distance from the main road (DR), BCR and Rugosity are generally clear indices, which describe, respectively, a grid's position and the density and height of its surrounding area. The less straightforward parameters are discussed briefly here. There are several different ways of interpreting the concept of porosity. Here, we adopt the definition given by Gál et al. [36], which is an index for measuring how penetrable the area is for the airflow. Therefore, areas with high porosity are generally associated with low building density (BCR), although some exceptions remain. Occlusivity is another important index, which, to some extent, provides 3-D information about the area. It is defined as the average of the ratios of the perimeter of built area to unbuilt area over all the layers. The layer height here is chosen to be 3 m, which is generally the height of a storey. Note that a building's margin is counted in the perimeter of both the built and unbuilt areas, so a high occupation rate of buildings can lead to a high ratio in its layer.
The difference between occlusivity and the BCR lies in that low buildings have relatively lower contributions to occlusivity, while they are completely calculated in the BCR. In high layers where few buildings exist, the low ratio of the perimeters can significantly lower the occlusivity. Therefore, to distinguish from BCR, occlusivity is defined as the "openness" of the urban areas. As for the street canyons, aspect ratio and asymmetry ratio are commonly used indices. The only thing to be noted is that the asymmetry ratio is between 0 and 1 in our study. A multicollinearity test is attached in the SI to confirm that no severe collinearity exists between the parameters (Table S1).

Multi-Linear Regression
The parameters described in Table 1, along with the emission values on the road areas, were used to build up a multi-linear regression model. Two parameters, i.e., asymmetry ratio and the aspect ratio, are related to street canyon characteristics and are calculated based on the topography data in the range of −50 to +50 m along the direction of the road. The other urban morphological parameters were calculated based on the building groups in an area of 100 m × 100 m, with the grid located at the centre. If the grid was not on the main street, the distance to the nearest main street was also adopted. In this way, each grid contained a group of parameters representing the form of its neighbouring buildings. The spatial patterns of these parameters are shown in Figure S2. These parameters serve as the independent variables in the regression models. Since all the parameters are from or calculated from measured data, the variables in the regression model are all continuous data.
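The neighbourhood calculation described above can be sketched for one parameter. The example below assumes the common definition of BCR as the built fraction of the plan area (the exact formula is given in Table 1) and a 2-D raster of building heights on the 5 m grid; the function name is hypothetical. A half-width of 10 cells gives a 21 × 21 cell window (105 m), which only approximates the 100 m × 100 m neighbourhood, and windows are truncated at the domain edge.

```python
import numpy as np


def building_coverage_ratio(heights, half_width=10):
    """Built fraction of the square neighbourhood centred on each grid cell.

    heights: 2-D array of building heights in metres (0 where unbuilt).
    half_width: number of cells on each side of the centre cell.
    """
    built = (np.asarray(heights, dtype=float) > 0).astype(float)
    ny, nx = built.shape
    bcr = np.empty_like(built)
    for j in range(ny):
        for i in range(nx):
            window = built[max(0, j - half_width): j + half_width + 1,
                           max(0, i - half_width): i + half_width + 1]
            bcr[j, i] = window.mean()
    return bcr


# Synthetic 200 x 200 domain with one 100 m x 100 m building block.
demo_heights = np.zeros((200, 200))
demo_heights[90:110, 90:110] = 30.0
demo_bcr = building_coverage_ratio(demo_heights)
print(demo_bcr.max(), demo_bcr.min())
```

The other area-based parameters would follow the same moving-window pattern, with only the statistic computed inside each window changing.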

Effects of Urban Morphological Factors on Dispersion of Traffic Pollutants
The spatial distribution of the air pollutants can be found in Section S4 in the SI. An EOF analysis was also conducted to determine the contribution of various factors over the studied area. The results show that the ground-level distribution of air pollution is determined not only by traffic-related emissions, but also by other factors, including background transport and urban morphology (please refer to Section S4 and Figures S10-S18 in the SI for a detailed discussion). To quantitatively understand the contribution from individual factors, we constructed a multi-linear regression model and used traffic emissions and the abovementioned seven urban morphological parameters as the independent variables to explore their association with the mean concentration distribution of individual air pollutants. For each grid at the ground level over the study domain, we have a group of data, including mean pollution concentration (Figure S9), emissions and seven building morphological parameters (Figure S2). The grids occupied by buildings are omitted, and we finally obtain 23,338 sets of data. The regression results are shown in Table 2 (see Table S5 for the standard error). The regression model in Table 2 expresses the concentration C as a linear function of the emission E and the seven morphological parameters; the abbreviations can be found in Table 1. The unit for emission is "µg·m−2·s−1", and the units for DR and Rugosity are meters (m). Other variables are ratios and do not have units. From Table 2, all pollutants except O3 have an R2 over 70%, showing that the selected eight parameters can explain most of the spatial distribution of traffic pollutants. O3 has an R2 of 47%, much lower than the other pollutants, indicating that O3, as a secondary pollutant, is only indirectly related to traffic emissions (through the NOx titration process). The traffic-related parameters, e.g., emissions and DR, generally have opposite effects on O3 compared to the other pollutants.
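A regression of this kind can be sketched with ordinary least squares. The snippet assumes a model of the form C = b0 + b1·E + b2·DR + ... with one row per non-building grid; it uses synthetic data and a hypothetical function name, and it is not the authors' analysis code.

```python
import numpy as np


def fit_mlr(predictors, response):
    """Ordinary least-squares fit with an intercept.

    predictors: array of shape (n_samples, n_predictors).
    response: array of shape (n_samples,).
    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    x = np.asarray(predictors, dtype=float)
    y = np.asarray(response, dtype=float)
    design = np.column_stack([np.ones(len(y)), x])
    coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coefficients
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coefficients, 1.0 - ss_res / ss_tot


# Synthetic stand-in for the per-grid table: 8 predictors, 1000 grids.
rng = np.random.default_rng(0)
demo_x = rng.random((1000, 8))
demo_y = 5.0 + demo_x @ np.arange(1.0, 9.0) + rng.normal(0.0, 1.0, 1000)
demo_coef, demo_r2 = fit_mlr(demo_x, demo_y)
print(np.round(demo_coef, 2), round(demo_r2, 3))
```

Dropping the emission column from the predictor array gives the emission-excluded variant of the same fit.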
For NO2 and CO, the correlation for all parameters is generally significant, except for the correlation between "Rugosity" and CO. Among the eight parameters, "DR", "Rugosity" and "Occlusivity" are negatively correlated with the spatial distribution of air pollution. The negative correlation of "DR" shows that the longer the distance from the road, the lower the ground-level pollution concentration, mainly reflecting the dispersion pattern of traffic emissions. The coefficient for NO2 is about twice that for CO, showing that NO2 changes more strongly with distance from the main roads. Increased building heights can lead to deeper canyons, where the ground-level wind speed becomes larger and contributes to quicker dispersion, resulting in lower pollution levels. This is why "Rugosity" has a negative correlation. As previously discussed, "Occlusivity" relates to the openness of the building groups, and the negative correlation with pollution distribution indicates that areas with higher openness have higher pollution levels. However, by common sense, higher "Occlusivity", i.e., lower openness, should give rise to worse ventilation and, finally, lead to higher pollution. Therefore, to explain this contradiction and reveal the true effect of "Occlusivity", we also conducted an additional correlation analysis to differentiate the effects between road and non-road areas (Table S2). The results show that "Occlusivity" is the only parameter with opposite correlation results between road and non-road areas. Over the non-road areas, "Occlusivity" has a positive correlation, just as expected. The negative correlation in road areas, however, arises because lower openness on the roads is related to the narrow parts of the streets, where traffic volumes and the related emissions are lower. Since traffic emissions are very important in the studied block, the road areas carry more weight in the overall correlation, and the combination of the different effects over road and non-road areas finally results in the negative correlation.
The remaining five parameters are all positively correlated with pollution distribution. "Emission" is the source of pollution, so this result is expected. As for "Aspect Ratio" and "Asymmetry", which describe the shape of street canyons, higher "Aspect Ratio" values (which imply a deeper street canyon) and higher "Asymmetry" values (in this case closer to 1, which implies a more asymmetric canyon) lead to a higher pollution concentration within the street canyon, which is consistent with previous studies. "BCR" is related to building density, and areas with higher "BCR" values usually have worse ventilation, resulting in ground-level pollution accumulation. "Porosity" reflects the open volume ratio of the building groups. Higher "Porosity" values mean there is less space occupied by buildings, which can enhance the air flow and lead to stronger ventilation. However, another effect, namely that traffic-emitted pollutants are carried by the air flow into the residential areas, also has a significant influence and overrides the effects of dispersion in this case, finally resulting in the positive correlation.
The correlation patterns for O3 are generally opposite to those for NO2 (Table S2), as titration by NOx consumes O3 and increases NO2 concentrations at the same time. Therefore, in areas where NO2 has a high concentration, O3 has a relatively low concentration. This also indicates that controlling traffic emissions over a small city block may lower local NOx and CO concentrations, but has little effect on O3. The most effective way to control O3 concentration over the urban area is to reduce the background contribution, which requires sophisticated control of both VOC and NOx emissions over a much broader area.
As previously discussed, traffic emissions are the most important factor in the area and therefore account for a large proportion of the pollution distribution. Nevertheless, urban form also has important effects. To further explore the effects of city form on pollution distribution, we also conducted a regression with the emission term excluded. The results are shown in Table 3 (see Table S6 for the standard error). The regression model in Table 3 expresses the concentration C as a linear function of the seven morphological parameters only; the abbreviations can be found in Table 1. The units for DR and Rugosity are meters (m). Other variables are ratios and do not have units. Without the effects of emission, the regression models for NO2 and CO still have an R2 over 50%, showing that urban form still has reasonably large effects on pollution distribution. Compared to the previous regression, the results for individual morphology parameters are similar, except that "Rugosity" becomes insignificant in this case. These results indicate that a better designed urban form may help to enhance air ventilation and reduce pollution accumulation, resulting in better urban air quality and lower human exposure for the same traffic emissions. We also calculated the association rates between the dependent variable and each independent variable in both road areas and non-road areas. All the independent and dependent variables are normalized into the range of 0 to 1. The simple correlation results are shown in Table 4 (see Table S7 for the standard error). The regression coefficient of each independent variable here gives the rate of change of the dependent variable associated with a one-unit change in the corresponding independent variable.
For example, the regression coefficient between the standardized NO2 concentration and the standardized NO2 emission over the road area is 0.46, which means that every additional percentage of emission leads to a 0.46% increase in NO2 concentration.
Table 4. Simple correlation coefficients between the standardized pollutants' concentrations (i.e., NO2, O3, CO) and individual standardized parameters over the road and non-road areas at the ground level.
The results show that the correlation coefficients over the road areas are generally larger than those in the non-road areas. This is mainly because the emissions in the road areas can lead to a larger range of variation in the concentration than in the non-road areas. From the results, "Emission", "Aspect Ratio", "Asymmetry" and "Rugosity" have major impacts, i.e., an additional increase/decrease of over 10% occurs when these factors have a 1% change. Therefore, the above indices should be given more attention in future urban planning. Another interesting finding is the correlation coefficient between O3 and "Rugosity" (0.73), which indicates that high buildings should be avoided alongside the streets, as they cause large O3 accumulation in the street canyons. As for the non-road areas, the coefficient for "DR" is much larger than those for the other parameters, which indicates that the distance to the pollution source is the dominant factor over the non-road areas. The other parameters, e.g., BCR, Rugosity, Occlusivity and Porosity, though associated with smaller coefficients than "DR", can also significantly affect the traffic pollution dispersion, and thus could serve as fine-tuning parameters in urban canopy design.

Urban Morphology Pollution Index (UMPI)
In order to collectively address how the building morphology characteristics affect pollution dispersion without the influence of variations in traffic emissions, we developed an Urban Morphology Pollution Index (UMPI, Equation (1)) based on the multiple regression coefficients between the standardized traffic-related pollution concentrations and urban morphological factors (Table S3). The final coefficients in the equation are obtained as the average of the coefficients for NO2 and CO in Table S3, because O3 has almost opposite correlation results to NO2 and is mainly affected by the background concentration instead of morphology. The coefficient for the emissions is excluded and only the coefficients for the morphological parameters remain, to make sure that this index is only affected by the shape of the urban canopy. The coefficients are then rescaled so that every unit change of UMPI is associated with a 1% change (increase/decrease) of the corresponding pollutant. The final equation of the UMPI is as follows:
$$\mathrm{UMPI} = -25\,p(\mathrm{DR}) + 14\,p(H/W) + 22\,p(H_1/H_2) + 18\,p(\mathrm{BCR}) - 5\,p(\mathrm{Rugosity}) - 7\,p(\mathrm{Occlusivity}) + 5\,p(\mathrm{Porosity}) + 30 \tag{3}$$
where p(X) represents each normalized urban morphological parameter (Equation (2)).
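The normalization equation itself does not appear in this text. Assuming the usual min-max scaling, which is consistent with the statement that all variables are normalized into the range of 0 to 1 and with the definitions of Xmin and Xmax, it would read:

```latex
p(X) = \frac{X - X_{\min}}{X_{\max} - X_{\min}}
```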
$$p(X) = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{2}$$

where X represents the original value of each parameter at a specific grid; X min is the minimum value of X over the whole domain; and X max is the maximum value of X over the whole domain. Figure 3 shows the spatial distribution of the UMPI of each grid over the 1 km × 1 km study domain. The correlations between the UMPI and the normalized concentrations are shown in Table 5. The results show that for the studied pollutants (i.e., NO 2, O 3, CO), an additional 1% increase/decrease occurs when the UMPI changes by 1 unit. This shows that the UMPI can represent the general effects of building morphology on air pollution and can be used for a quick pollution estimation.
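The calculation of the index can be sketched in a few lines of Python. The coefficients and the constant are taken from the UMPI equation above. The normalization is assumed to be a min-max rescaling over the whole domain, as suggested by the definitions of X min and X max. The dictionary keys, function names and the three sample grid cells are hypothetical and only illustrate the arithmetic.

```python
# Coefficients of the UMPI equation; the keys are illustrative labels.
UMPI_WEIGHTS = {
    "DR": -25.0,
    "H/W": 14.0,
    "H1/H2": 22.0,
    "BCR": 18.0,
    "Rugosity": -5.0,
    "Occlusivity": -7.0,
    "Porosity": 5.0,
}
UMPI_CONSTANT = 30.0


def min_max_normalize(values):
    """Rescale values to [0, 1] using the domain-wide minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption of this sketch: a constant field is mapped to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def compute_umpi(grid_params):
    """Return one UMPI value per grid cell.

    grid_params maps each parameter label to a list holding one raw
    value per grid cell; all lists must have the same length.
    """
    n_cells = len(next(iter(grid_params.values())))
    umpi = [UMPI_CONSTANT] * n_cells
    for name, weight in UMPI_WEIGHTS.items():
        normalized = min_max_normalize(grid_params[name])
        for i, p in enumerate(normalized):
            umpi[i] += weight * p
    return umpi


# Three hypothetical grid cells (illustration only, not data from the study):
# cell 0 holds the domain minimum of every parameter, cell 1 the maximum,
# and cell 2 the midpoint.
sample = {
    "DR": [0.0, 200.0, 100.0],
    "H/W": [0.2, 1.8, 1.0],
    "H1/H2": [1.0, 3.0, 2.0],
    "BCR": [0.1, 0.5, 0.3],
    "Rugosity": [5.0, 45.0, 25.0],
    "Occlusivity": [0.2, 0.8, 0.5],
    "Porosity": [0.3, 0.9, 0.6],
}
umpi_values = compute_umpi(sample)
print([round(v, 1) for v in umpi_values])
```

In this sketch, a cell in which every parameter takes its domain minimum receives the constant term alone, and every other cell is shifted from that baseline by the weighted sum of its normalized parameters.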
The UMPI can be used to evaluate how the size and shape of building clusters potentially enhance traffic-related air pollution. Figure 4 marks three typical areas with different UMPIs in the study area, and further decomposes the contribution of individual morphological parameters to the UMPI. From these two figures, the Aspect Ratio and the BCR are the main reasons for the high UMPIs, while the DR and the Rugosity mainly explain the lower values. Area 1 is typical of the low UMPIs in residential areas. These areas are generally a reasonable distance from the roads, and thus have lower UMPIs. Area 2 represents the high UMPIs in the road areas, which are mainly caused by the Aspect Ratio and the Porosity. In particular, the buildings next to Area 2 are higher (notice the difference in the grayscale of the buildings), and this leads to a higher Aspect Ratio and thus a higher UMPI.
As for Area 3, even though it is close to the road, the high buildings have higher Rugosity and contribute to a low UMPI.

We further collected data from the "test area" to test the performance of the UMPI in a different area. The results show that, except for NO 2, the pollutants still keep the 1% relationship with the UMPI (Table S4). The coefficient for NO 2 is about 0.7%, which is much smaller than the previous 1%. This is because NO 2 has the shortest lifetime and is mostly affected by local factors, e.g., traffic emissions and the dispersion conditions. Note that the UMPI is based entirely on the characteristics of the urban canopy and is not related to traffic emissions, meteorology or any other influencing factors. The similar relationship obtained in two different areas provides strong support for the effectiveness of the UMPI.
The results in two different areas show that the effects of the urban morphological parameters are commonly shared. However, as the difference in the NO 2 results shows, some adjustment may be necessary when using this index for quantitative estimation. Nevertheless, we believe that the UMPI can be widely used to estimate pollution conditions more efficiently. For example, there is no need for urban planners to carry out a thorough investigation over the whole city. Instead, they can use this index to locate the pollution hotspots in urban areas and apply specific improvements to these areas to achieve a better urban design.
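One possible way to use the index for screening is sketched below: grid cells are ranked by their UMPI and the highest-ranked fraction is flagged as candidate hotspots. The function name, the cell identifiers, the `top_fraction` threshold and the sample values are all hypothetical choices for this illustration, not part of the method described in this study.

```python
def find_hotspots(umpi_by_cell, top_fraction=0.1):
    """Return the cell ids with the highest UMPI values, highest first.

    umpi_by_cell maps a cell id (here a (row, col) tuple) to its UMPI.
    top_fraction is a hypothetical screening threshold chosen by the user.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    # Sort cell ids by their UMPI value in descending order.
    ranked = sorted(umpi_by_cell, key=umpi_by_cell.get, reverse=True)
    # Always keep at least one cell when the grid is not empty.
    n_keep = max(1, round(len(ranked) * top_fraction))
    return ranked[:n_keep]


# Hypothetical UMPI values on a 2 x 2 grid (illustration only).
cells = {(0, 0): 12.0, (0, 1): 48.5, (1, 0): 30.0, (1, 1): 41.0}
hotspots = find_hotspots(cells, top_fraction=0.5)
print(hotspots)
```

A fixed UMPI threshold could be used instead of a fraction; a relative ranking is chosen here only because the index is built from parameters normalized over the domain.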

Conclusions
In this study, we adopted an LES model, namely, the PALM, to simulate the air flow and pollution distribution in a city block of Baoding, China, and applied an MLR model to identify the factors influencing the pollution distribution in the city block. The results reveal that traffic emissions are the most important factor affecting the concentration in the studied city block, which is in line with previous studies. Traffic emissions mainly occur within street canyons, and greatly affect the ground-level concentration in nearby areas. Nevertheless, the effects of traffic emissions can decrease quickly with an increase in the distance from the road. Therefore, to mitigate human exposure to traffic exhaust, residential areas should be located beyond a certain distance from roads, and higher-storey buildings are preferred to enhance ventilation.
Urban morphological forms (i.e., the shapes of street canyons and building groups) also prove to be important factors influencing pollution distribution. In regard to street canyons, the "Aspect Ratio" is the most important factor. Street canyons with high "Aspect Ratios", i.e., deep and narrow canyons, tend to accumulate more pollution, whereas wider street canyons enhance air exchange and ventilation and thus reduce pollution concentrations. In future urban planning, deep and narrow street canyons should be avoided. As for the shape of building groups, our results show that the main influencing factors are the "BCR", "Rugosity", "Occlusivity" and "Porosity". Among these factors, "Rugosity" is the only one with a negative correlation. A higher BCR corresponds to high-density building groups, which are unfavorable for ventilation and air exchange, and hence cause pollution accumulation. "Occlusivity" represents the openness of building groups. Higher "Occlusivity" is also related to dense building groups, which are unfavorable for ventilation and pollution dispersion. "Porosity", the index for the open volume rate of building groups, is related to both the air exchange rate and the vulnerability of building groups (i.e., the chance that pollutants are carried into the building groups). In this case, the effects of background transport and traffic emissions are larger than the removal by dispersion, so that higher porosity is related to higher pollution concentrations. Meanwhile, higher "Rugosity", i.e., taller building groups, can lead to an increase in ground-level wind speed, thus enhancing ventilation and weakening pollution accumulation.
Based on these influencing factors, we developed a new index, i.e., the UMPI, to comprehensively describe the effects of urban morphology on pollution distribution. Our results show that the UMPI correlates well with the change in pollution concentration. A unit change in the UMPI is associated with approximately a 1% change in the pollution concentration. The characteristics of the UMPI are consistent with the effects discussed above: areas with low BCR and large DR tend to have a low UMPI. In addition, high buildings over the non-road areas can also help to reduce the UMPI, which is possibly related to the enhanced turbulence near high buildings.
In conclusion, the distribution of air pollutants within a given city area is jointly controlled by several factors. Within the urban area, traffic emissions are important sources, and lead to the road-centered distribution pattern. As a result, residential areas should be separated from main roads by a buffer area. Deep and narrow street canyons should also be avoided to prevent accumulation within the canyon. Furthermore, the form of building groups should also be considered. In most cases, high-density buildings are unfavorable for pollution dispersion. High buildings with low density can enhance ground-level ventilation by increasing wind speed, and thus help remove pollutants. Therefore, relatively separated high buildings may be a better choice for future cities.
We acknowledge that all results from this study are based on one particular urban area in Baoding, China. Due to the limitation of data and potential changes in urban forms or traffic patterns, the findings of this study may not be fully applicable to other places. Moreover, the study is based on data from summertime. In winter, due to various factors (e.g., less active photochemical reactions, heating, etc.), the pollution situation may be different. In the future, we will apply the framework of the UMPI proposed in this study to more cities to extensively test its effectiveness. We will also collect data on more pollutants in order to include them (e.g., benzopyrene, PM) in the model. Simulations in other seasons will also be carried out to explore how the patterns change between seasons.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijerph191610432/s1, Figure S1: Satellite map of the study area, a 1 km × 1 km × 200 m city block in the downtown area of Baoding, China; Figure S2: Spatial patterns of (a) emissions and these morphological parameters including: (b) rugosity; (c) BCR; (d) porosity; (e) occlusivity; (f) distance from the main road; (g) asymmetry ratio; (h) aspect ratio. The parameters of the porosity, asymmetry ratio and aspect ratio are related to the features of the street canyons, so they exhibit values only along the roads; Figure S3: The position of the study area (the orange rectangle), the national sites (the two red triangles) and the monitoring sites for model validation (the two blue circles). Note that data from the national sites are used only for a calibration of the WRF-Chem results; Figure S4: Scatter plot of O 3, NO 2 and CO concentration in the study area. The model data are resampled into a time resolution of 1 hour to match the observation data. (NMB: Normalized mean bias); Figure S5: The concentration of NO 2 from observation data and modelled data. The red line is the modelled data while the black line is the observation data. Time resolution is 5 min; Figure S6: Same as Figure S5, but for CO; Figure S7: Time series of the observed and modelled O 3, NO 2 and CO concentrations from 22 July to 29 July 2018 over the test area. The red line shows the model data at the grid where the national site is located, and the black solid line is the spatially averaged model data. The black dashed line is the data from the national site; Figure S8: Same as Figure S4, but between the model results of the test area and the national site data; Figure S9 Figure S10: The spatial modes of EOF1 to EOF3 on the ground level (z = 2.5 m) of (a) O 3; (b) NO 2; (c) CO. The eigenvalues of the corresponding EOF patterns are also shown; Figure S11: The time averaged pollution concentration for O 3, NO 2, CO and PM 10, at the height z = 2.5 m; Figure S12: Same as Figure S11, but for z = 12.5 m; Figure S13: Same as Figure S11, but for z = 27.5 m; Figure S14: EOF correlation result of O 3 at z = 2.5 m. The figures on the left are the spatial patterns while those on the right are corresponding time series; Figure S15: Same as Figure S14, but for NO 2; Figure S16: Same as Figure S14, but for CO; Figure S17: Same as Figure S14, but for PM 10; Figure S18: EOF correlation result of NO 2 at z = 12.5 m. The figures on the left are the spatial patterns while those on the right are corresponding time series; Figure S19: EOF correlation result of NO 2 at z = 27.5 m. The figures on the left are the spatial patterns while those on the right are corresponding time series; Table S1: Multicollinearity test of the selected urban morphological parameters; Table S2: Correlation coefficients between the pollutants (i.e., O 3, NO 2, CO) and eight influencing factors individually in road and non-road areas at the ground level; Table S3: Correlation coefficients between the standardized pollutants concentration (i.e., O 3, NO 2, CO, PM 10) and standardized influencing factors individually in road and non-road areas at the ground level; Table S4: Correlation between the UMPI and the standardized pollution concentration (i.e., NO 2, O 3, CO) at the ground level over the test area. Reference [37] is cited in Supplementary Materials.