The Context-Dependent Effect of Urban Form on Air Pollution: A Panel Data Analysis

There have been debates and a lack of understanding about the complex effects of urban-scale urban form on air pollution. Based on the remotely sensed data of 150 cities in the Beijing-Tianjin-Hebei agglomeration in China from 2000 to 2015, we studied the effects of urban form on fine particulate matter (PM2.5) and nitrogen dioxide (NO2) concentrations from multiple perspectives. The panel models show that the elastic coefficients of aggregation index and fractal dimension are the highest among all factors for the whole region. Population density, aggregation index, and fractal dimension have stronger influences on air pollution in small cities, while area size demonstrates the opposite effect. Population density has a stronger impact on medium/high-elevation cities, while night light intensity (NLI), fractal dimension, and area size show the opposite effect. Low road network density can enlarge the influence magnitude of NLI and population density. The results of the linear regression model with multiplicative interactions provide evidence of interactions between population density and NLI or aggregation index. The slope of the line that captures the relationship between NLI on PM2.5 is positive at low levels of population density, flat at medium levels of population density, and negative at high levels of population density. The study results also show that when increasing the population density, the air pollution in a city with low economic and low morphological aggregation degrees will be impacted more greatly.


Introduction
In recent years, in addition to tracing emissions from pollution sources and encouraging changes in energy structure, the linkage between urban form and air quality has become a hot topic in the context of global urbanization and urban air pollution, which are some of the largest environmental health threats [1,2]. Urban form, which is characterized by urban scale, density, and development level; layout concentration; border complexity; etc., has a significant impact on the physical ventilation environment, microclimate, and traffic patterns, which then affect the emission and diffusion of air pollutants [3][4][5][6]. Due to the high pressure of urbanization and the lock-in effect of urban form worldwide, it has become an urgent international issue to gain insight into the complex relationship between urban form and air pollution from multiple perspectives.
Remarkable research progress has been made on the relationship between urban form and air pollution, but it has not solved the long-standing debates [7]. Most studies have found that larger

Study Area
The Beijing-Tianjin-Hebei urban agglomeration ( Figure 1) is one of the most developed areas in China in terms of economy, transportation, and industry, as well as participation in global trade and competition. It is also one of the areas in China most affected by air pollution problems [24,25]. Most of the region is located in the northern part of the North China Plain. It has an area of 216,000 km 2 and a residential population of approximately 111 million, which was approximately 8.1% of China's total population in 2015. There are a total of 69.7 million people living in the urban area (AREA), with an urbanization rate of 62.5% [26]. This area comprises a very large number of cities due to intense urban expansion over the past 15 years.
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 21 competition. It is also one of the areas in China most affected by air pollution problems [24,25]. Most of the region is located in the northern part of the North China Plain. It has an area of 216,000 km 2 and a residential population of approximately 111 million, which was approximately 8.1% of China's total population in 2015. There are a total of 69.7 million people living in the urban area (AREA), with an urbanization rate of 62.5% [26]. This area comprises a very large number of cities due to intense urban expansion over the past 15 years.

Schematic Diagram of Concept and Mechanism and the Research Framework
We first drew a schematic diagram of the urban form and impact mechanism ( Figure 2a) and outlined the overall research framework to clarify the data processing and modeling process ( Figure  2b). In Figure 2a, the diagram first shows the principal indicators of urban form. The AI is calculated from an adjacency matrix and is used as a descriptor of desperation, interspersion, subdivision, or isolation [27]. FRAC values greater than 1 indicate a departure from Euclidean geometry and an increase in shape complexity. This metric can be used across a range of spatial scales and thus overcomes limitations of the straight perimeter/area ratio [28][29][30]. AREA, POPDEN, urban nighttime light (NTL), and nighttime light intensity (NLI) also have close relationships with urban air pollution [31,32]. These indicators of urban form can affect the emission of urban air pollution through the influence on transportation and other paths. They can also affect the diffusion of air pollutants through the impact of urban microclimate and other factors. This brief schematic diagram lays the foundation for our further empirical analysis.
As for the method flow (Figure 2b), we first quantified the urban form and air pollution indicators based on multisource data, including land use data, air pollution grid products, night light remote sensing images, POPDEN spatial grids, and meteorological monitored data. Based on these indicators, we built a variety of panel models. We used the Hausman test to determine the appropriate type of panel model, identify individual FEs, and evaluate the overall effect of each factor on air pollution indicators. After that, we built subsample models of all cities according to the natural and social economic conditions of cities, such as elevation and road network density, and we

Schematic Diagram of Concept and Mechanism and the Research Framework
We first drew a schematic diagram of the urban form and impact mechanism ( Figure 2a) and outlined the overall research framework to clarify the data processing and modeling process (Figure 2b). In Figure 2a, the diagram first shows the principal indicators of urban form. The AI is calculated from an adjacency matrix and is used as a descriptor of desperation, interspersion, subdivision, or isolation [27]. FRAC values greater than 1 indicate a departure from Euclidean geometry and an increase in shape complexity. This metric can be used across a range of spatial scales and thus overcomes limitations of the straight perimeter/area ratio [28][29][30]. AREA, POPDEN, urban nighttime light (NTL), and nighttime light intensity (NLI) also have close relationships with urban air pollution [31,32]. These indicators of urban form can affect the emission of urban air pollution through the influence on transportation and other paths. They can also affect the diffusion of air pollutants through the impact of urban microclimate and other factors. This brief schematic diagram lays the foundation for our further empirical analysis.
As for the method flow (Figure 2b), we first quantified the urban form and air pollution indicators based on multisource data, including land use data, air pollution grid products, night light remote sensing images, POPDEN spatial grids, and meteorological monitored data. Based on these indicators, we built a variety of panel models. We used the Hausman test to determine the appropriate type of panel model, identify individual FEs, and evaluate the overall effect of each factor on air pollution indicators. After that, we built subsample models of all cities according to the natural and social economic conditions of cities, such as elevation and road network density, and we compared the differences of coefficients across models. Finally, we used the FE model with multiple interactions to quantify the interactions between urban form factors. Brief descriptions of some variables involved are listed in Table 1.
Remote Sens. 2020, 12, x FOR PEER REVIEW 4 of 21 interactions to quantify the interactions between urban form factors. Brief descriptions of some variables involved are listed in Table 1.

Air Pollution
We used the satellite remote sensing inversion data (annual near-surface PM 2.5 and NO 2 concentration grids) from 2000, 2005, 2010, and 2015 as the basis to extract air pollution indicators (APIs). This method can overcome the problem of sparse spatial data recorded by air quality monitoring stations and help to obtain long time series and spatially continuous observations. The annual PM 2.5 grid data set was derived from a combination of aerosol optical depth (AOD) retrievals from the NASA Moderate Resolution Imaging Spectroradiometer (MODIS), Multiangle Imaging Spectroradiometer (MISR), and Sea-Viewing Wide Field-of-View Sensor (SeaWiFS). The GEOS-Chem chemical transport model and geographically weighted regression model were used to relate the AOD to the near-surface PM 2.5 concentration (https://sedac.ciesin.columbia.edu/data/set/sdei-global-annual-gwr-pm2-5-modismisr-seawifs-aod) [33,34]. The annual ground-level NO 2 grids were derived from the Global Ozone Monitoring Experiment (GOME), Scanning Imaging Absorption SpectroMeter for Atmospheric CHartographY (SCIAMACHY), and GOME-2 satellite retrievals. GEOS-Chem (https://sedac.ciesin. columbia.edu/data/set/sdei-global-3-year-running-mean-no2-gome-sciamachy-gome2) was also used to relate the tropospheric NO 2 column densities and the NO 2 concentrations at ground level [35]. PM 2.5 and NO 2 grids with original spatial resolutions of 0.01 and 0.1 degrees, respectively, were resampled to 1 km grids for subsequent analysis ( Figure 3). Due to the lack of data, we replaced the 2015 NO 2 concentration with the 3-year mean of 2010-2012.
Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 21 We used the satellite remote sensing inversion data (annual near-surface PM2.5 and NO2 concentration grids) from 2000, 2005, 2010, and 2015 as the basis to extract air pollution indicators (APIs). This method can overcome the problem of sparse spatial data recorded by air quality monitoring stations and help to obtain long time series and spatially continuous observations. The annual PM2.5 grid data set was derived from a combination of aerosol optical depth (AOD) retrievals from the NASA Moderate Resolution Imaging Spectroradiometer (MODIS), Multiangle Imaging Spectroradiometer (MISR), and Sea-Viewing Wide Field-of-View Sensor (SeaWiFS). The GEOS-Chem chemical transport model and geographically weighted regression model were used to relate the AOD to the near-surface PM2.5 concentration (https://sedac.ciesin.columbia.edu/data/set/sdei-globalannual-gwr-pm2-5-modis-misr-seawifs-aod) [33,34]. The annual ground-level NO2 grids were derived from the Global Ozone Monitoring Experiment (GOME), Scanning Imaging Absorption SpectroMeter for Atmospheric CHartographY (SCIAMACHY), and GOME-2 satellite retrievals. GEOS-Chem (https://sedac.ciesin.columbia.edu/data/set/sdei-global-3-year-running-mean-no2gome-sciamachy-gome2) was also used to relate the tropospheric NO2 column densities and the NO2 concentrations at ground level [35]. PM2.5 and NO2 grids with original spatial resolutions of 0.01 and 0.1 degrees, respectively, were resampled to 1 km grids for subsequent analysis ( Figure 3). Due to the lack of data, we replaced the 2015 NO2 concentration with the 3-year mean of 2010-2012.

Urban Form Indicators
We used land use data with a 30 m resolution to extract urban extent and calculate urban geometric indicators. The data were derived from the Resource and Environment Data Cloud Platform, Chinese Academy of Sciences (CAS) (http://www.resdc.cn/DataList.aspx). First, we took 2015 as a reference year for the identification of city samples and their boundaries. We selected the urban patches in 2015 according to the land use classification and merged the connected urban patches into one city sample since some cities may grow beyond their administrative boundaries. In this way, we obtained 150 independent urban units. Then, based upon the boundaries of the 150 urban units and the land use maps in different years, we calculated the landscape metrics AREA, AI, and FRAC for each urban unit in each year. The POPDEN dataset was obtained from the Center for International Earth Science Information Network at Columbia University (available at

Urban Form Indicators
We used land use data with a 30 m resolution to extract urban extent and calculate urban geometric indicators. The data were derived from the Resource and Environment Data Cloud Platform, Chinese Academy of Sciences (CAS) (http://www.resdc.cn/DataList.aspx). First, we took 2015 as a reference year for the identification of city samples and their boundaries. We selected the urban patches in 2015 according to the land use classification and merged the connected urban patches into one city sample since some cities may grow beyond their administrative boundaries. In this way, we obtained 150 independent urban units. Then, based upon the boundaries of the 150 urban units and the land use maps in different years, we calculated the landscape metrics AREA, AI, and FRAC for each urban unit in each year. The POPDEN dataset was obtained from the Center for International Earth Science Information Network at Columbia University (available at https://beta.sedac.ciesin.columbia.edu/data/set/gpw-v4population-count-rev10). The Defense Meteorological Satellite Program (DMSP) Operational Line-scan System (OLS) night time light (NLT) version 4 stable average visible data were obtained from the NOAA, National Geophysical Data Center (http://ngdc.noaa.gov/eog/dmsp/downloadV4composites.html), and then the NTL data were calibrated via the ridgeline sampling regression method to obtain a consistent NLI time series [36]. Due to the lack of NLI data in 2015, we used the data in 2013 as a replacement. The calculations of FRAC and AI were performed in FRAGSTATS 4.2 software developed and supported by Dr. McGarigal and Dr. Cushman.

Control Variables
We further took meteorological and vegetation factors as control variables. The annual average WIND and PREC data were based on the existing Princeton reanalysis data, GLDAS (Global Land Data Assimilation System) data, GEWEX-SRB (Global Energy and Water Cycle Experiment-Surface Radiation Budget) radiation data, and TRMM (Tropical Rainfall Measuring Mission) PREC data as the background field, which are integrated with the records from meteorological stations in China [37]. The meteorological data with a spatial resolution of 0.1 degrees were resampled to 1 km for the zonal statistics. The normalized difference vegetation index (NDVI) was obtained from MODIS datasets (MOD13Q1) with a spatial resolution of 250 m. We used the growing season data as the annual measurement of urban vegetation coverage.

Econometric Models
A panel data model for the period 2000-2015 was utilized in this study. The panel model has several major advantages over conventional cross-sectional or time series models [38,39]: It usually has high-power control of individual heterogeneity and can help reduce the effects of multicollinearity among the variables and increase the degrees of freedom [38]. We used two methods for estimating unobserved effect panel data models, namely, the FE estimator and the random effect estimator. Let API ij be the air pollution indicator of the tth year (t = 2000, 2005, 2010, 2015) in the ith city (I = 1, 2, . . . , 150). We modeled the relationships between APIs and UFIs: where API it is the air pollution indicator of the tth year at the ith urban unit; µ is a scalar coefficient; β is a vector of the parameters; U i denotes the individual effect of the ith urban unit, capturing the idiosyncratic characteristics of each urban unit; ε it denotes the random error of the tth year at the ith urban unit; and UFI it is a vector of the urban form factors. Before conducting the panel data model, we used the Hausman test to decide whether the FE model or random effect model should be used [40]. We applied the natural logarithm transformation to all the independent variables to avoid nonstationarity and heteroskedasticity phenomena in the time series [41]. A log transformation was applied for APIs, and then the elastic estimation coefficient was obtained.

Subsample Modeling for City in the Context of Various Natural and Social Conditions
To gain insight into the differences in the relationship between urban form and air pollution in the context of various natural and social conditions, we regrouped the cities from three perspectives and modeled them separately. First, we classified the cities according to their built-up areas as of 2015, using the criteria proposed in a previous study [42]. According to urban area size [42,43], these cities were divided into two categories of different sizes, including small-sized cites (≤50 km 2 ) and medium/large-sized cities (>50 km 2 ). Then, we divided all cities into two categories according to their elevation [44], including low-elevation cities (<500 m above sea level) and medium/high-elevation cities (≥500 m). Finally, we calculated the road density based on the road network obtained from the Baidu Map data of 2015, taking into account all levels from national roads to township roads. We divided the cities into low road density cities and high road density cities using the classification criteria determined according to the natural breakpoint of the road density of all cities.

Linear Regression Model with Multiplicative Interaction
Because POPDEN has led to one of the most important debates on compact city issues, we used the linear regression model with multiplicative interaction terms to estimate the conditional marginal effect of POPDEN on air pollution concentrations across different levels of NLI, AI, and FRAC. At the same time, the conditional marginal effect of NLI, AI, or FRAC on air pollution concentrations across different levels of POPDEN were also obtained.
The linear regression model with multiplicative interaction terms is a common model for examining whether the relationship between an outcome Y and a key independent variable D varies with the levels of a moderator X, which is often used to capture the differences in the context [22,45]. For example, the effect of D on Y may grow with higher levels of X. Such conditional hypotheses are ubiquitous in many social and natural sciences, and linear regression models with multiplicative interaction terms are the most widely used framework for testing these conditional hypotheses in applied work [45]. The corresponding formula of this model is as follows: where API it is the API of the tth year at the ith urban unit; µ is a scalar coefficient; β is a vector of the parameters; U i denotes the individual effect of the ith urban unit, capturing the idiosyncratic characteristics of each urban unit; ε it denotes the random error of the tth year at the ith urban unit; and UFI it is a vector of the urban form factors. D it and X it denote a key independent variable and a moderator variable of the tth year at the ith urban unit, respectively. To verify the robustness of the evaluation of the interaction, we also added the interaction item into the pooled sectional model as a comparison. Moreover, we used linear interaction diagnostic (LID) plots to illustrate the marginal effect of D on Y. In LID plots, the horizontal axis is X, and the vertical axis is the estimated regression coefficient of D and Y. We further calculated the confidence intervals and binning estimator for comparison and verification, that is, the estimation results based on the regression of subgroups with high (H), medium (M), and low (L) values of the indicators. The graph was drawn using the 'Interflex' package of Stata 16 [46].

Panel Data Model Estimations
The results of the F test and Hausman test showed that it was reasonable to add individual FEs to the panel model ( Table 2). The results showed that the combination of AREA, geometry, vegetation, and meteorological factors well fitted the PM2.5 and NO2 concentrations in the individual FE model, and the corresponding R 2 values were up to 0.63 and 0.69, respectively ( Table 2). The significant

Panel Data Model Estimations
The results of the F test and Hausman test showed that it was reasonable to add individual FEs to the panel model (Table 2). The results showed that the combination of AREA, geometry, vegetation, and meteorological factors well fitted the PM 2.5 and NO 2 concentrations in the individual FE model, and the corresponding R 2 values were up to 0.63 and 0.69, respectively ( Table 2). The significant estimation coefficients of the influence of all urban morphology indexes on the concentrations of the two pollutants were obtained by FE regression ( Table 3). The results showed that the increases in NLI, AREA, POPDEN, AI, and FRAC increased PM 2.5 and NO 2 concentrations. The increase in WIND significantly reduced the concentrations of the two pollutants. PREC showed a greater effect on PM 2.5 concentration but no significant effect on NO 2 . In contrast, the NDVI showed a greater impact on NO 2 than on PM 2.5 . More significantly, the coefficients (0.1-0.9 for PM 2.5 and 0.6-2.2 for NO 2 ) of the three indexes of NLI, AREA, and POPDEN were much lower than those of the two geometric shape indexes (1.8-5.1 for PM 2.5 and 5.2-9.7 for NO 2 ), which suggests that the change in AI or FRAC with the same proportion plays a much greater role in air pollution than other UFIs.

The Relationships between Air Pollution and Urban Form Factors in Cities with Different Conditions
The classification of cities showed that cities below 500 m were mainly located in the North China Plain, while those above 500 m were mainly located in the Bashang Plateau [47] (Figure 5a). Medium and large cities, i.e., cities with an area of more than 50 km 2 , had relatively scattered geographical distribution (Figure 5b), which was similar to the distribution of cities with higher road network density (Figure 5c). According to the violin charts of PM 2.5 and NO 2 concentrations in the cities with different scales, elevations and road network densities from 2000 to 2015 (Figure 5d), there were change patterns of the air pollution. The air pollution level in low-elevation and medium/large cities was higher and changed faster. In contrast, cities with sparse road networks had more serious air pollution levels and the most rapidly increased trends. The results of sub panel models showed that the coefficients of most factors were similar to the coefficients of the whole regional scale regression (Table A1). However, the effects of various factors on air pollution differed between cities with different elevations, city sizes, and road network densities. In general, the regression R-square of cities with medium/large size, medium/high elevation, and low road network density is higher than that of small cities, cities with low elevation, and high road network density, respectively. As Figure 6 shows, from the perspective of urban scale, in the PM2.5 model, the elasticity coefficients of POPDEN, AI, FRAC, and WIND are significant, and their absolute values are larger in small cities, whereas AREA has a greater impact on air pollution in larger cities. From the perspective of elevation, in the PM2.5 model, POPDEN and PREC have a greater impact on the PM2.5 of the medium/high-elevation cities, while NLI, FRAC, AREA, and WIND have the opposite effect. The difference in the regression coefficient of the NO2 model is consistent with that of the PM2.5 model except for NLI and POPDEN. From the perspective of road network density, NLI and POPDEN have a greater impact on cities with a low road network density, while AREA has a greater impact on cities with a high road network density.  The results of sub panel models showed that the coefficients of most factors were similar to the coefficients of the whole regional scale regression (Table A1). However, the effects of various factors on air pollution differed between cities with different elevations, city sizes, and road network densities. In general, the regression R-square of cities with medium/large size, medium/high elevation, and low road network density is higher than that of small cities, cities with low elevation, and high road network density, respectively. As Figure 6 shows, from the perspective of urban scale, in the PM 2.5 model, the elasticity coefficients of POPDEN, AI, FRAC, and WIND are significant, and their absolute values are larger in small cities, whereas AREA has a greater impact on air pollution in larger cities. From the perspective of elevation, in the PM 2.5 model, POPDEN and PREC have a greater impact on the PM 2.5 of the medium/high-elevation cities, while NLI, FRAC, AREA, and WIND have the opposite effect. The difference in the regression coefficient of the NO 2 model is consistent with that of the PM 2.5 model except for NLI and POPDEN. From the perspective of road network density, NLI and POPDEN have a greater impact on cities with a low road network density, while AREA has a greater impact on cities with a high road network density.

The Internal Interactions of Urban Form Factors
In the FE model and pooled sectional linear model, the multiplicative interaction terms including POPDEN were added, and the regression estimation results are shown in Table 4. In the FE model, only in models A1 and B1, the main effect items and interaction items are statistically significant synchronously, while in the pooled sectional linear model, the interaction items are mostly significant. However, adding the interaction terms of POPDEN and NLI/AI/FRAC significantly increased the Rsquare of the FE model by approximately 0.06, while the improvement of the pooled sectional model is relatively small or even opposite.
We focused on the analysis of models A1 and A2 with significant interaction terms and obtained the LID plots shown in Figure 7a-d. As a contrast, we also present the results derived from pooled sectional models, as shown in Figure 7e-h. The results of the binning estimator are also shown on all plots. Figure 7 shows that the positive or negative marginal effects of the binding estimator, the FE model, and the pooled sectional model are consistent, while the binning estimators have wider confidence intervals. There is also clear evidence of an interaction as the slope of the line that captures the relationship between NLI on PM2.5 is positive at low levels of POPDEN, flat at medium levels of POPDEN, and negative at high levels of POPDEN. Figure 7b and 7d show that when the POPDEN increases, the air pollution in the cities with a low economic development level and low morphological compactness will be impacted more greatly.

The Internal Interactions of Urban Form Factors
In the FE model and pooled sectional linear model, the multiplicative interaction terms including POPDEN were added, and the regression estimation results are shown in Table 4. In the FE model, only in models A1 and B1, the main effect items and interaction items are statistically significant synchronously, while in the pooled sectional linear model, the interaction items are mostly significant. However, adding the interaction terms of POPDEN and NLI/AI/FRAC significantly increased the R-square of the FE model by approximately 0.06, while the improvement of the pooled sectional model is relatively small or even opposite.
We focused on the analysis of models A1 and A2 with significant interaction terms and obtained the LID plots shown in Figure 7a-d. As a contrast, we also present the results derived from pooled sectional models, as shown in Figure 7e-h. The results of the binning estimator are also shown on all plots. Figure 7 shows that the positive or negative marginal effects of the binding estimator, the FE model, and the pooled sectional model are consistent, while the binning estimators have wider confidence intervals. There is also clear evidence of an interaction as the slope of the line that captures the relationship between NLI on PM 2.5 is positive at low levels of POPDEN, flat at medium levels of POPDEN, and negative at high levels of POPDEN. Figure 7b,d show that when the POPDEN increases, the air pollution in the cities with a low economic development level and low morphological compactness will be impacted more greatly.

The Influence of Urban Form on Air Pollution
The rise in AI and FRAC show a negative impact on air quality. Our results regarding AI are consistent with the findings of Yupeng Liu on Chinese cities at the national scale [48,49] and the findings of Qiannan She of the Yangtze River Delta in China [50]. However, these results are inconsistent with the findings of Bereitschaft and Debbage [51] in the United States and the findings of Shi et al. [42] in Chinese cities. Note that the latter used the urban continuity index and percentage of like adjacencies index to represent the urban agglomeration degree. In contrast, our FRAC results are consistent with most of the existing studies [51]. Considering that we set control variables in the regression and used a reliable FE panel model [21], our results provide a reliable reference for understanding the role of urban morphology at the regional scale.
AI represents the dispersion or aggregation distribution of multiple patches, which is closely related to the traffic flow inside and between cities, further affecting the air pollutant emissions caused by traffic [5,52]. On the one hand, the separated leapfrog urban form corresponding to low AI leads to an increase in long-distance travel flow between urban patches, which has been suggested to have a negative impact on urban air quality. On the other hand, the high AI corresponding with an aggregated and monocentric urban form has a negative impact on ventilation and convective efficiency in the city. In addition, the high concentration of corresponding social and economic resources will increase local traffic flow and cause traffic congestion, and the low-efficiency vehicle driving conditions will increase the emissions of NO 2 and PM 2.5 and accelerate the generation of secondary particles. The negative impact of high AI seems to be more significant in the Beijing-Tianjin-Hebei urban agglomeration than in other regions.
High FRAC can better represent disordered urban sprawl development, and the extra traffic burden it causes can be better correlated with the increase in air pollution emissions. Fragmented cities with highly convoluted, irregular boundaries are expected to have higher nonpoint emissions (primarily from automotive sources) [51], as well as higher secondary aerosol contributions (e.g., NO 2 and SO 2 ) to particulate matter (PM) pollution [53]. Although the confirmation of this positive or negative relationship is consistent with the existing research, the relatively larger coefficients of FRAC and AI that we obtained suggest the interesting potential impact of minor changes in urban geometry.

The Influences of Natural Conditions and Urban Characteristics and Their Implications for Urban Planning
The results of the sub panel models show that planning strategies should be adjusted for cities with different geographical environments and urbanization characteristics. This is consistent with or conflicting with the conclusions of the existing studies based on sub panel or cross-sectional models. For example, for small size cities, the population density and morphological sprawl need to be well managed. This is similar to Bechle et al.'s research results based on 1274 global cities [11], whose study showed that the impact of morphological compactness on the concentration of NO 2 in small-population sized cities is significantly higher than that in large-scale cities, and the value of the regression coefficient of the small-scale compactness index is more than twice the value of the global scale regression coefficient. However, our conclusion is inconsistent with that of one research based on 830 cities in East Asia [54]. Larkin et al. found that the impact of urban sprawl index on NO 2 concentration is greater in large sized cities, while the impact on PM 2.5 concentration is the largest in medium-sized cities. It is worth noting that these two studies used a cross-sectional rather than panel data research design, which might be one important reason for the inconsistent results.
For medium/large cities, it is more important to prevent excessive urban area expansion. This is consistent with the conclusion of another study based on sub panel models, whose results show that urban expansion had more significant positive impacts on PM 2.5 concentration in medium/large-sized cites (>50 km 2 ). For the plain cities at low elevation, the control of social economic intensity plays an important role in clean air, but for the cities at medium/high elevation, the control of POPDEN has a profound impact. The influence of elevation on the relationship between urban form and air pollution is closely related to meteorological conditions. The low-elevation areas of the Beijing-Tianjin-Hebei region are mainly located on the North China Plain, which is bordered on the north by the Yanshan Mountains and on the west by the Taihang Mountains' edge of the Shanxi Plateau [55]. This terrain causes the North China Plain to be dominated by a cold high pressure system with low surface wind speeds, sometimes also accompanied by surface temperature inversion [55]. Therefore, the air pollutants produced by socio-economic activities are favorable for the formation of haze or fog, and usually lead to high levels of pollutants concentration due to weak mixing and dispersion [55,56], and thus the larger regression coefficient corresponding to the variable of NLI. For cities with low road network density, NLI and POPDEN have a greater impact, implying that the improvement of traffic efficiency brought by the construction of adequate traffic infrastructure can effectively reduce the impact of POPDEN and economic growth on air pollution, and possibly a reduction of per capita air pollutant emissions.

Implications of Interactions among Urban Form Indicators for Urban Planning
In most previous studies, the effect of each urban form indicator was considered separately, without consideration of their interactions. Our research shows that the increase of POPDEN has an overall improvement on the air pollution of the cities in the Beijing-Tian-Hebei region. However, the results of interaction analysis reveal that this effect is context-dependent, which is regulated by social and economic conditions (represented by NLI) and morphological aggregation.
The analysis of the interaction between NLI and POPDEN shows that social and economic development will reduce the impact of population growth density. According to the Environmental Kuznets curve, the development of cities can reduce pollutant emissions through industrial upgrading and the strengthening of clean policies [10,57], which will further restrain the negative impact of POPDEN. Therefore, giving priority to areas with higher economic development levels for the increase of POPDEN, or giving priority to the economic development in areas with more dense populations, may be effective measures to reduce air pollution on the regional scale.
Previous studies suggested that more concentrated cities have great benefits to help control pollutant emissions, heat island effect, carbon emissions, etc. [5,10,41,58]. However, from the perspective of cleaner air, for cities with low population density in the Beijing-Tianjin-Hebei region, the aggregated urban form is not a good choice. What is more informative is that aggregated urban form can inhibit the effect of POPDEN on urban air quality deterioration. That is, more aggregated cities can greatly reduce the negative impact of POPDEN. This is because a more compact city can effectively reduce the average travel distance and travel mode choice of the population on the urban scale, thus affecting the overall pollution emissions. This role can have a greater impact in more densely populated areas. Therefore, in more densely populated areas, it is more urgent to maintain aggregated distribution. At the same time, excessive urban sprawl should be avoided, since it is difficult to reverse urban geometry once it has been built and has a lock-in effect on the city [59,60].
The existence of interactions means that the expansion and development of cities will have a greater joint effect with the evolution of urban forms, which cannot be measured only according to their respective independent functions. The estimated coefficients from traditional linear models should be considered more carefully because interactions will lead to the formation of a complex nonlinear relationship between urban form factors and air pollution indicators and thus not be robust.

Conclusions
In this paper, we studied the role of urban form in air pollution based on a panel data model from multiple perspectives, including elevation, area size, and road density. The results show that the expansion of AREA, the increase in NLI, high POPDEN, an aggregated layout, and a disordered sprawl aggravate air pollution. Among the urban form factors, the elastic coefficients of the urban AI and FRAC are highest at the regional scale. However, NLI, POPDEN, AI, and FRAC are all more influential in small cities, while AREA has the opposite trend. POPDEN has a greater impact on medium/high-elevation cities, while NLI and FRAC have a greater impact on low-elevation cities. NLI and POPDEN have a greater impact on low-density road network cities, while AREA does not.
The result of the linear regression model with multiplicative interaction provides evidence of interactions between POPDEN and NLI or AI as the slope of the line that captures the relationship between NLI on PM 2.5 is positive at low levels of POPDEN, flat at medium levels of POPDEN, and negative at high levels of POPDEN. When the POPDEN increases, the air pollution in a city with low economic development level and low aggregation degree will be impacted more greatly. The results imply that in the process of urban development or expansion, urban form optimization and context-dependent adjustment are urgent.
The first limitation of our study is that most of the urban samples used were small and mediumsized cities, and more samples are needed to confirm the patterns in medium/large cities. Second, the relationship between urban form and air pollution may change with season [49], but we did not consider seasonal variations. Third, this work focused on the regional scale. Although more accurate assessment can be obtained for these cities, larger-scale observations are necessary because of the strong heterogeneity of the background characteristics of global cities. Finally, our linear regression model with multiplicative interaction maintained an important assumption: that the interaction effect is linear. The linear interaction effect assumption often fails in empirical settings because many interaction effects are not linear, and some may not even be monotonic. In future studies, it will be meaningful if the complex interaction among more variables can be included based on a machine learning algorithm (such as random forest, Gradient Boosting model, or corresponding multi-objective learning model), which will further deepen the insight into the impact of multi-perspective urban form on air pollution.   Note: In parentheses are the t statistics corresponding to the coefficients. The t statistics in parentheses = "* p < 0.05, ** p < 0.01, *** p < 0.001".