Land-Use Regression Modelling of Intra-Urban Air Pollution Variation in China: Current Status and Future Needs

Abstract: Rapid urbanization in China is leading to substantial adverse air quality issues, particularly for NO 2 and particulate matter (PM). Land-use regression (LUR) models are now being applied to simulate pollutant concentrations with high spatial resolution in Chinese urban areas. However, Chinese urban areas differ from those in Europe and North America, for example in respect of population density, urban morphology and pollutant emissions densities, so it is timely to assess current LUR studies in China to highlight current challenges and identify future needs. Details of twenty-four recent LUR models for NO 2 and PM 2.5 /PM 10 (particles with aerodynamic diameters <2.5 µm and <10 µm) are tabulated and reviewed as the basis for discussion in this paper. We highlight that LUR modelling in China is currently constrained by a scarcity of input data, especially air pollution monitoring data. There is an urgent need for accessible archives of quality-assured measurement data and for higher spatial resolution proxy data for urban emissions, particularly in respect of traffic-related variables. The rapidly evolving nature of the Chinese urban landscape makes maintaining up-to-date land-use and urban morphology datasets a challenge. We also highlight the importance of subjecting Chinese LUR models to appropriate validation. Integration of LUR with portable monitor data, remote sensing, and dispersion modelling has the potential to enhance derivation of urban pollution maps.


Introduction
Air quality in China has suffered as a consequence of rapid economic growth and urbanization. In 2016, 81%, 62%, 46%, 38%, 4.1% and 1.4% of 74 main cities in China (including provincial capitals and prefectural and municipality-level cities) failed the Chinese air quality standards listed in Table 1 for PM 2.5, PM 10, NO 2, O 3, CO and SO 2, respectively [1]. (PM 2.5 and PM 10 refer to the mass concentration of particulate matter with aerodynamic diameters less than 2.5 µm and less than 10 µm, respectively.) Furthermore, while the Chinese air quality standard for NO 2 is the same as the World Health Organization (WHO) air quality guideline, those for PM 2.5, PM 10, and O 3 are currently less stringent than the WHO equivalents (Table 1). The air quality problem is particularly serious in regions of rapid population growth such as Beijing, Shanghai, the Pearl River Delta (PRD; comprising nine prefectures of Guangdong province (Guangzhou, Shenzhen, Zhuhai, Dongguan, Zhongshan, Foshan, Huizhou, Jiangmen, and Zhaoqing), Hong Kong, and Macau), and their surrounding areas [2]. Ample evidence that air pollution leads to adverse health effects and

Table 1. Chinese air quality standards for concentrations of SO 2, NO 2, CO, O 3, PM 10, and PM 2.5 [10]. Also shown for comparison are the World Health Organization (WHO) air quality guidelines, where they exist [11].

Since 2013, the China National Environmental Monitoring Center (CNEMC) has been implementing a nationwide monitoring network for the routine measurement of ambient air pollutant concentrations. However, the monitoring network only provides concentrations at a limited number of discrete points (for example, the 22 monitoring stations covering a 50 km × 50 km area over Beijing equate to an average of 113 km 2 per station [12]), which is inadequate to describe the spatial variability of urban air pollution.
The misclassification of human exposure can lead to a loss of power in epidemiological studies and the attenuation of health risk estimates [13].

A popular approach in Europe and North America for deriving intra-urban estimates of air pollutant concentrations is land-use regression (LUR) modelling. LUR is a stochastic technique that regresses monitored pollutant concentrations on spatially-explicit predictor variables (e.g., land cover, traffic, topography) within a geographic information system [14,15]. The fitted relationship enables prediction of pollutant concentrations at unmonitored locations. Potential predictor variables are selected using a priori knowledge that they may contribute to, or influence, emissions and concentrations of the modelled pollutants. Examples of input data commonly needed for a LUR model are shown in Table 2.
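The regression step described above can be sketched as follows. This is a minimal illustration using ordinary least squares in NumPy, not the procedure of any study reviewed here; the predictor names (road length in a buffer, green-space fraction, altitude), the function names `fit_lur` and `predict_lur`, and all numbers are synthetic assumptions.

```python
import numpy as np

def fit_lur(X, y):
    """Fit an ordinary least-squares LUR model; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_lur(coef, X):
    """Predict concentrations at (unmonitored) locations from predictor values."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    return coef[0] + X @ coef[1:]

# Synthetic monitoring network (all values are hypothetical).
rng = np.random.default_rng(0)
n_sites = 40
road_len = rng.uniform(0, 5, n_sites)      # km of road within a buffer
green = rng.uniform(0, 1, n_sites)         # green-space fraction within a buffer
altitude = rng.uniform(0, 200, n_sites)    # m above sea level
X = np.column_stack([road_len, green, altitude])

# Synthetic "measured" concentrations: assumed linear signal plus noise.
y = 30 + 6 * road_len - 10 * green - 0.02 * altitude + rng.normal(0, 2, n_sites)

coef = fit_lur(X, y)
# Prediction at one unmonitored location with known predictor values.
pred = predict_lur(coef, [[2.5, 0.3, 50.0]])
```

In practice the predictor columns would be extracted from GIS layers at each monitoring site, and the candidate predictors would go through a selection step before the final model is fitted.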
The expansion of the CNEMC (and other) monitoring networks is driving a growing number of LUR studies in China. As the nature of urban areas in China differs from that of urban areas in Europe and North America, as outlined in Section 2, it is timely to assess the present status of LUR modelling in China in order to identify what lessons can be learned, from the similarities and differences with LUR studies applied elsewhere, to advance this field. The objectives of this paper are therefore: (i) to briefly summarize differences between Chinese urban areas and those in Europe and North America in the context of LUR modelling of air pollution concentrations; (ii) to summarize the state-of-the-art of LUR air pollution modelling in China; and (iii) to highlight the current gaps in LUR modelling in China in comparison with LUR models in Europe and North America and make recommendations on future needs for China. The majority of applications of LUR have been for NO 2 and PM 2.5 /PM 10, since these pollutants are a priority for regulatory monitoring and interventions due to their public health effects. This paper therefore focuses on LUR studies for NO 2 and PM 2.5 /PM 10 only. Twenty-four such studies in China were identified from the peer-reviewed English-language literature and their details are used as the basis for discussion in this paper.

Table 2. Summary of example input data for the application of a land-use regression (LUR) model.

Input | Detailed Components
pollutant data | regulatory monitoring data; purpose-designed campaign
land-use classification | residential land; industrial land; urban green space; street morphology (aspect ratio)
traffic data | road network by road classification; numbers and types of vehicles; railways
census data | population density; household density
meteorology | wind field; temperature
topography | altitude; slope angle
emission data | emission inventory
remote sensing data | satellite data

Pace of Urbanization
China's rapid economic development has led to extremely rapid urban population growth. At the end of 2011, China's urban population exceeded its rural population for the first time [16,17], and the trend is continuing: the proportion of the population living in urban areas is expected to reach approximately 70% by 2025 [18]. The extent of built-up area in China has expanded correspondingly rapidly. The growth rate of built-up land from 2000 to 2010 was 2.14 times higher than in the previous decade [19]. Between 2010 and 2016 the area of land undergoing urban construction increased from 39,760 km 2 to 52,750 km 2 and the number of private vehicles increased from 59.4 to 163.3 million [20].

Magnitude and Density of Urbanization
Urban areas in China are characterized by populations of several million [21], with considerably higher population densities than the vast majority of urban areas in Europe and North America. The average urban population density in China across all built-up areas with a population >500,000 is 5100 per km 2 , compared with equivalent values of 2900 per km 2 for the European Union (3100 per km 2 for all of Europe including Russia) and 1600 per km 2 for North America (1200 and 2500 per km 2 for the USA and Canada individually) [22].

Air Pollutant Emissions
Greater population density leads to increased emissions of air pollutants in a given area. The sectorial contributions to NO x (sum of NO and NO 2 ), PM 2.5 and PM 10 emissions presented in Figure 1 for three illustrative urban areas (Beijing, Shanghai, and Guangdong province) show that power generation, industry and transportation all contribute substantially to NO x emissions. Industrial sources dominate the contributions to emissions of PM, although residential combustion (heating and cooking) also contributes substantially to emissions of fine PM (PM 2.5 ).
Greater emissions density in China can also derive from a lag in implementation [23] and/or in compliance [24] with industrial and vehicle emissions standards compared with that in Europe and North America. Poorer fuel quality has also been reported as contributing to greater per vehicle emissions in China [25].

Urban Topography
The higher urban population density in China is a consequence of the greater proportion of multi-story residential and commercial buildings. In 2016, China completed the most high-rise buildings (84) with heights exceeding 200 m of any country in the world. It was the ninth year in a row that China topped this list. Thirty-one cities in China had at least one 200-m-plus building completion in 2016 [27]. The greater urban 'roughness' and extent of street canyons impact the dispersion of air pollutants both directly and indirectly via differential changes in surface albedo and surface temperatures.

LUR Models for Chinese Urban Areas
Twenty-four LUR studies were identified from Web of Science and Google Scholar, using the following search phrases and keywords: land use regression; LUR; China; Hong Kong; Taiwan; air pollution; NO 2; PM 2.5; PM 10. A first search was undertaken on 11 November 2017, with a follow-up search for any new literature on 3 February 2018. Language was limited to English but there was no restriction on publication date. Characteristics of these studies are summarized in Table 3. LUR models for NO 2 and/or PM 2.5 /PM 10 have been developed for individual Chinese cities (Hong Kong, Nanjing, Tianjin, Shanghai, Beijing, Changsha, Wuhan, Jinan, and Kaohsiung) and regions (PRD, Taipei Metropolitan area, and Liaoning Province). The recent publication dates demonstrate the recent impetus for LUR modelling in China. Table 3 shows that 16 of the 24 Chinese LUR studies used fixed-site, regulatory air pollutant monitoring data from CNEMC. Based on previous studies in Europe and North America, Hoek et al. [28] suggested the use of 40-80 monitoring sites for LUR models, while Basagaña et al. [13] recommended no fewer than 80 monitoring sites for accurate health impact assessment. Apart from a study in Hong Kong that used portable sensors to collect PM 10 and PM 2.5 data at 222 ad hoc monitoring sites and a study in Changsha with 80 ad hoc sites for NO 2, Table 3 shows that the remaining Chinese LUR studies all had fewer than 80 monitoring sites.

Monitoring Data
The fixed-site monitoring data used in these studies were derived from hourly and daily concentrations published on the websites of local environmental agencies. Although real-time concentration data are published on the National Air Quality Publication Platform (http://106.37.208.233:20035/), archived historical data are available only on a few official websites [29-31]. Historical data are also collected and archived elsewhere by private companies without validation, for example https://data.epmap.org/. There is therefore an urgent need for formal archiving of historical data. The lack of such archives explains why some studies seeking to construct LUR models have collected their own campaign data [32-37]. While this approach provides freedom in the allocation of sites (both locations and heights), it costs both time and money, provides only short-term measurements of concentration at a given location, and yields non-contemporaneous measurements across the full set of locations.

Predictor Variables
The predictor variables in the final LUR models for the studies in Table 3 are presented in Table 4 for NO 2 LUR models and in Table 5 for PM 2.5 /PM 10 LUR models. The most-used variables in the final LUR models for NO 2 were traffic-related variables or proxies of traffic variables such as road lengths or distance to the nearest roads (Table 4). This is consistent with LUR models for NO 2 developed elsewhere and of course reflects traffic being a major emission source for NO x . However, it is important to note that only 4 of the 24 studies summarized in Table 3 were able to obtain traffic data, and three of these were based in Hong Kong, not mainland China [34,37,38]. A study in Jinan was able to obtain traffic data for major roads only [39]. For the rest of the studies, lack of traffic data meant that road lengths were used as a proxy for traffic counts. Some studies used bus stop density and gridded traffic emission estimates as predictor variables to represent the influence of road emissions [40,41]. Ideally, variables based on traffic intensity (motor vehicles per day) multiplied by road length and divided by distance to the road would be used, as recommended by the European Study of Cohorts for Air Pollution Effects (ESCAPE), since these incorporate both emission and dispersion effects [42].
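A traffic variable of the kind described above (traffic intensity multiplied by road length and divided by distance to the road) can be sketched as follows. This is only an illustration: summing the contributions of all road segments within a buffer, the 500 m buffer radius, the minimum-distance floor and the function name `traffic_load_over_distance` are assumptions made here, not the ESCAPE specification.

```python
import numpy as np

def traffic_load_over_distance(intensity, length_m, distance_m,
                               buffer_m=500.0, min_distance_m=1.0):
    """Sum of (vehicles/day x segment length / distance) over segments in a buffer.

    intensity   : traffic intensity on each road segment (motor vehicles per day)
    length_m    : length of each road segment (m)
    distance_m  : distance from the receptor site to each segment (m)
    buffer_m    : only segments within this distance are counted (assumed value)
    min_distance_m : floor on distance to avoid division by zero (assumed value)
    """
    intensity = np.asarray(intensity, dtype=float)
    length_m = np.asarray(length_m, dtype=float)
    distance_m = np.asarray(distance_m, dtype=float)
    inside = distance_m <= buffer_m
    d = np.maximum(distance_m[inside], min_distance_m)
    return float(np.sum(intensity[inside] * length_m[inside] / d))

# Three hypothetical road segments around one monitoring site;
# the third lies outside the 500 m buffer and is ignored.
value = traffic_load_over_distance(
    intensity=[20000, 5000, 12000],
    length_m=[150, 300, 200],
    distance_m=[25, 120, 800],
)
```

Where traffic counts are unavailable, the same function reduces to a distance-weighted road-length proxy if every segment is given an intensity of 1.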
The second most common category of variables for the LUR models in Table 4 was land-use variables, which indirectly represent the NO x emissions from power plants, industrial sites, or residential areas. Land-use variables for greenspace are also used, since these are negatively related to NO 2 concentrations. Most studies incorporating such variables derived them from 2010 Landsat TM5 data (www.globallandcover.com/home/Enbackground.aspx), which are classified into agricultural land, industrial land, commercial and residential land, green space and water area, at 30 m resolution. The data need to be converted into shapefile (.shp) format in ENVI5.0 (ESRI). The monthly global Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI), a satellite-based greenness index that measures and monitors plant growth and vegetation density, has also been used as a predictor.
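For readers unfamiliar with NDVI, the index is computed from near-infrared (NIR) and red surface reflectance as (NIR − Red)/(NIR + Red). The sketch below shows this calculation and one plausible way of turning it into a LUR predictor, namely the mean NDVI of the pixels falling within a buffer around a site; the buffer-averaging step and all reflectance values are illustrative assumptions.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index, (NIR - Red) / (NIR + Red).

    Pixels where NIR + Red == 0 are returned as NaN.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Hypothetical reflectances for the pixels inside one site's buffer.
nir = np.array([0.5, 0.4, 0.0])
red = np.array([0.1, 0.2, 0.0])

pixel_ndvi = ndvi(nir, red)
# Buffer-mean NDVI as a candidate predictor, ignoring no-data pixels.
site_ndvi = float(np.nanmean(pixel_ndvi))
```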
In general, the same set of variables used in NO 2 LUR models was used to develop the PM models (Table 5). However, while traffic or traffic proxy variables were selected in all LUR models for NO 2 except the study in Nanjing [43], this was not the case for PM models. Variables related to industrial emissions, artificial lands (such as residential, industrial, and public facilities lands), greenspace (such as forests, natural vegetation, and parks), and water areas were also selected in the PM models. This reflects the greater complexity of the sources of PM emissions compared with those of NO 2. LUR models in the Taipei metropolitan area and Kaohsiung City included variables related to sampling height to simulate vertical variation of PM 2.5 [32,44]. Sampling height had a larger predictive influence in Kaohsiung City than in the Taipei metropolitan area, which may be due to differences in the terrain between the two study areas. Other distinctive predictor variables incorporated into PM 2.5 LUR models were related to Chinese restaurants and the burning of joss paper and incense in temples [45].

Model Performance
Tables 4 and 5 also summarize the model performance statistics for the Chinese NO 2 and PM LUR models. Notably, not all studies reported formal validation, which is a shortcoming of those studies.
Hold-out validation and leave-one-out cross-validation (LOOCV) are the most commonly used validation tests for LUR models. In hold-out validation, the dataset is separated into a training set and a testing set. The training set is used to develop the model, and the model is then evaluated by comparing its predictions for the testing set with the observed values. In k-fold cross-validation, the dataset is divided into k subsets and the hold-out method is repeated k times, each time with a different subset as the testing set. LOOCV uses the variables in the final model to develop a regression model using n − 1 sites, where n is the total number of sites in the monitoring network. The predicted concentration is then compared with the measured concentration at the left-out site. The procedure is repeated n times. LOOCV can be used in studies in which the number of monitoring sites is too small to allow division into training and testing datasets. However, LOOCV does not sufficiently address the overestimation of the predictive ability of regression models, especially for smaller numbers of sites [13,46-48].
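The validation schemes described above can be sketched in a few lines. The example below is an illustration on synthetic data, assuming an ordinary least-squares model with a fixed set of predictors (as in the LOOCV procedure described, variable selection is not repeated within each fold); the function names and all numbers are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]

def kfold_predictions(X, y, k, seed=0):
    """Out-of-sample prediction for every site from k-fold cross-validation.

    Setting k equal to the number of sites gives LOOCV.
    """
    n = len(y)
    order = np.random.default_rng(seed).permutation(n)
    preds = np.empty(n)
    for test_idx in np.array_split(order, k):
        train_idx = np.setdiff1d(order, test_idx)
        coef = fit_ols(X[train_idx], y[train_idx])   # refit without the held-out sites
        preds[test_idx] = predict_ols(coef, X[test_idx])
    return preds

def r_squared(obs, pred):
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Synthetic network of 20 sites with three candidate predictors.
rng = np.random.default_rng(1)
n_sites = 20
X = rng.normal(size=(n_sites, 3))
y = 40 + X @ np.array([5.0, -3.0, 0.0]) + rng.normal(0, 4, n_sites)

r2_fit = r_squared(y, predict_ols(fit_ols(X, y), X))          # in-sample fit
r2_loocv = r_squared(y, kfold_predictions(X, y, k=n_sites))   # LOOCV
r2_5fold = r_squared(y, kfold_predictions(X, y, k=5))         # 5-fold CV
```

For a least-squares model the LOOCV R 2 computed this way can never exceed the in-sample R 2, so comparing the two gives a first indication of how optimistic the fitted model is.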
The agreement between predicted and observed concentrations is usually assessed using the adjusted R 2, which quantifies how well a linear regression explains the variance in the measurement data, and the root-mean-square error (RMSE), which quantifies the differences between the model-predicted and observed concentrations. The adjusted R 2 values for the 15 studies in Table 4 that provided quantitative validation results ranged from 0.42 to 0.87. For comparison, Hoek et al. (2008) suggested R 2 values for usable models are typically in the range 0.6-0.7. The variability in R 2 values in the Chinese studies is likely related to variable quality in the measured concentrations and predictor variables, and to the complexity of the city, for example in terms of differences in topography and emission sources. The highest R 2 value was for a study in Hong Kong [38]. The high R 2 in this model may be due to the inclusion of topographical and building-morphological variables. These predictors act as surrogates for the complex wind conditions in a mountainous high-density city like Hong Kong. An approximately 20% increase in prediction performance was observed in the LUR model that included these parameters compared with models without them [38]. However, this study only included 15 monitoring sites, so the high R 2 could also be a result of overfitting. Of the studies that utilized more than 40 monitoring sites, the model developed in Taipei had the highest explained variance for NO 2 [33] (Table 4). The authors of this study conducted 2-week campaigns of NO 2 measurement using Ogawa passive samplers during 3 seasons (Table 3).
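The two statistics named above can be computed as in the sketch below, which uses the standard definitions: adjusted R 2 = 1 − (1 − R 2)(n − 1)/(n − p − 1) for n sites and p predictors, and RMSE as the square root of the mean squared difference. The numerical example is hypothetical and is not taken from any study in Table 4; it only illustrates how strongly the adjustment penalizes a model with many predictors and few sites.

```python
import numpy as np

def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_sites, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_sites - 1) / (n_sites - n_predictors - 1)

def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

# Hypothetical comparison: the same R^2 of 0.80 with 6 predictors,
# obtained from 15 sites versus 80 sites.
adj_small_network = adjusted_r_squared(0.80, n_sites=15, n_predictors=6)
adj_large_network = adjusted_r_squared(0.80, n_sites=80, n_predictors=6)
```

With these assumed inputs the adjusted value for the 15-site network is noticeably lower than for the 80-site network, which is one reason a high R 2 from a very small network should be read with caution.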
The adjusted R 2 values for the 12 models for PM 2.5 and 11 models for PM 10 that reported model performance statistics ranged from 0.19 to 0.89 (Table 5). Compared with NO 2 studies, fewer PM 2.5 and PM 10 monitoring data were available from CNEMC for developing LUR models, and there were no regulatory monitoring data for PM 2.5 in China before 2012. The development of portable sensors makes collecting campaign data relatively easy for PM, albeit subject to the greater measurement uncertainties typically associated with portable monitors compared with fixed-site network analyzers [49]. In Hong Kong, transient measurements of PM 2.5 and PM 10 at 222 locations were collected non-contemporaneously by portable monitor. This made it possible to estimate concentrations of PM in deep street canyons formed by compact urban development [38]. As with NO 2, urban-morphology-related variables were important in LUR models for PM for capturing the variability of pollutants in complex urban areas like Hong Kong. Another study in Hong Kong, which did not use the mountainous topographical and building morphological parameters, had an R 2 value of only 0.46, based on 97 campaign measurements [34].
The nature of the monitoring locations also affected LUR model performance. The study in the PRD resulted in a high R 2 value (0.884) and low RMSE (2.754 µg/m 3) for PM 2.5, but the monitoring network included only 5 sites located within 500 m of a major freeway and 1 site within 200 m of a freeway [41]. The predictor variables included in the final models were latitude, longitude, and artificial land and water areas (Table 5). These variables could explain only the strong northwest-southeast trend in PM 2.5 arising from changes in wind direction, emissions from human activities and emissions from international and domestic ports; they failed to capture more detailed street-level variation. The final model was therefore not able to satisfactorily explain the intra-urban variability of PM 2.5.

road (500-1000), forest (500-1000), industry (300-500), park (500-1000), railroad (0-50), government institutions (100-300), park (300-500), public equipment (100-300), bus (0-50), public equipment (100-300), port (500-1000)
Tianjin PM 10 major roads (1000), residential area (500), wind speed 0.49 0.008 10
1, 3 The LUR models were for summer.
2, 4 The LUR models were for winter.
5 This is standard error of estimate.
6 This is standard deviation.
7 This LUR model was for the heating season.
8, 10 These are standard errors.
9 This LUR model was for non-heating season.

In some of the PM LUR studies, the intercepts of the resultant models are close to the mean observed/predicted PM concentrations [41,43,53,54]. The reason for these large intercepts is that longer-range transport and secondary formation within the atmosphere are more influential for PM 10 and PM 2.5 than for pollutants that are emitted locally and have shorter atmospheric lifetimes, such as NO x. LUR models for PM are therefore more likely to fit noise rather than capture the spatial variation. In contrast, LUR is more effective for modelling NO 2, which is derived mainly from traffic, domestic and other local combustion sources through the rapid reaction of NO, and which therefore has more pronounced spatial variability that the LUR model can capture.

Data Availability/Accessibility Challenges
Although an increasing number of LUR models have been developed at the intra-urban scale in China, they remain constrained by the availability of key input data needed for LUR models (Table 2), especially air quality monitoring data. Measurement networks are to date sparser than those in Europe, and the cities in China are larger, denser, more complex and often also expanding rapidly. The number of CNEMC monitoring sites often does not reach that recommended for developing LUR models for health studies [13,28], so the expansion of monitoring (regulatory and/or 'low cost' and/or passive) is needed. Quality-assured historical data are seldom available. Therefore, as well as the archiving of network data, there is also an urgent need for nationally-applied quality assurance/quality control (QA/QC) protocols to improve and document the quality of the data and to ensure that Chinese networks comply with international standards [61,62].
Aside from carrying out additional targeted ground-based measurement campaigns to support LUR model development, satellite data have recently been used in LUR modelling applications to compensate for the lack of measurement data. Example datasets are NO 2 vertical column density (VCD) from the Ozone Monitoring Instrument, and aerosol optical depth (AOD) for PM [63]. As with studies in Europe [64-68], studies in China have reported that incorporating satellite-based estimates improved model performance of regional LUR models for NO 2, PM 10, and PM 2.5 [41,56]. More temporally-resolved models have also been reported; for example, using satellite AOD and VCDs, linear mixed-effects LUR models have been used to estimate daily average concentration of PM 10 [53] and NO 2 [51].
Nevertheless, satellite data are not currently suitable for modelling the intra-urban variability of ground-level pollutants within a LUR modelling framework, for the following reasons. First, ground-level concentrations cannot be detected directly by satellite instruments but must be derived by removing satellite responses to the rest of the column using complex algorithms and modelling [63]. For example, high AOD values do not necessarily translate to high surface PM 2.5 levels because of the relationship between AOD, relative humidity, and PM 2.5 [69]. Second, the satellite-derived estimates are area-averaged concentrations, which are normally much too coarse to capture spatial contrast within cities: the pixel sizes of current instruments are around several hundred km 2 [70]. Third, the availability of some satellite data is subject to meteorological conditions; for example, cloud cover can influence the quality of the retrievals of many variables (e.g., AOD) that are sensitive to atmospheric optics. In addition, satellite data can have systematic seasonal errors [41,56]. For example, Yang et al. [41] reported that satellite remote sensing tended to overestimate concentrations of PM 2.5 in summer and underestimate them in winter.
Another approach to overcoming a lack of monitoring data is to integrate dispersion models with LUR models. Typically, dispersion models compute concentrations of air pollutants with high spatiotemporal resolution at specific background, roadside, and curbside receptors. The concentrations at these receptors can then be used as pseudo-observations in the LUR model, which in turn simulates the concentration variation over the city without the need to apply a computationally-intensive dispersion model over the whole area. The integrated modelling approach also has the flexibility to develop more temporally-resolved models. The approach has been exemplified in the USA [46] and in the UK [71]. However, it must be remembered that the pseudo-observations are not real monitored data. Importantly also, a fundamental drawback to the application of dispersion models in China is that they require very detailed input data for emissions (particularly traffic emissions, such as fleet composition, emission factors, road width, canyon height, and time factors), meteorological variables (such as wind field, cloud cover, and precipitation) and boundary conditions. As discussed above, these datasets are currently largely lacking in China.
As well as the challenges noted above in respect of pollutant data, another important challenge for Chinese LUR studies is capturing the rapidly changing urban landscape. The study by Xu et al. [52] interpreted Landsat series images from 2007 to 2014 to account for urban land-use change during that period.

Temporal Variability
LUR models are often used in epidemiological studies to estimate the long-term effects of air pollutants. To this end, LUR models for previous years have been developed [45,52]. Xu et al. [52] used Landsat series images from different years, and scaled traffic emissions and industrial emissions by the number of registered motor vehicles and the number of enterprises, respectively, to account for temporal variation over the years.
Greater temporal-resolution modelling data are needed for shorter-term-exposure epidemiology. Some studies in China have developed LUR models for different seasons [35,38,40,41,60]. Both NO 2 and PM concentrations during the winter tended to be higher than in the summer, particularly over urban areas, which may be caused by a change of boundary layer and increased emissions from heating and other activities, such as setting off fireworks in the winter [42]. Under the influence of the East Asian monsoon, wind and precipitation patterns substantially change seasonally in much of China. Localized wind direction and speed data are needed as predictors to help characterize these temporal changes [36,38,60].
Since PM 2.5 and NO 2 show high day-to-day and diurnal variation [72-74], finer temporal resolution would benefit short-term epidemiological studies. The temporal resolution of LUR models can be improved either by calibrating concentrations with observed measurements at a fixed continuous monitoring station or by building separate models for different time periods. The first approach generalizes the temporal pattern across the study area, which leads to exposure misclassification. The second requires manpower and material resources. Two studies in China achieved finer temporal resolution by using a mixed linear regression (MLR) approach including satellite data [51,53]. LUR models are typically fixed-effect models, in which the predictor variables are temporally invariant. MLR combines both fixed and random effects. In MLR, additional time-dependent variables are used to model daily concentrations of pollutants. Liu et al. [36] used meteorological factors to build neural network models to explain the nonlinear relationship between meteorological factors and concentrations of NO 2 and PM 10.
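The first approach mentioned above, calibration against a fixed continuous monitoring station, can be sketched as follows. The sketch assumes a simple ratio adjustment, in which the long-term LUR estimate at each location is multiplied by the ratio of the daily concentration to the long-term mean at the reference station; this particular form, the function name `temporally_adjust` and all numbers are assumptions for illustration only.

```python
import numpy as np

def temporally_adjust(lur_annual, reference_daily):
    """Scale long-term LUR estimates by the daily pattern at a reference station.

    lur_annual      : long-term mean predicted by the LUR model at each location
    reference_daily : daily concentrations at one continuous reference station
    Returns an array of shape (n_days, n_locations).
    """
    lur_annual = np.asarray(lur_annual, dtype=float)
    reference_daily = np.asarray(reference_daily, dtype=float)
    # Daily ratio relative to the reference station's own long-term mean.
    ratio = reference_daily / reference_daily.mean()
    # Every location receives the same temporal pattern (the key assumption).
    return np.outer(ratio, lur_annual)

# Hypothetical values: two locations, three days of reference data.
daily = temporally_adjust(lur_annual=[40.0, 60.0],
                          reference_daily=[30.0, 50.0, 70.0])
```

Because every location inherits the same day-to-day pattern, the spatial contrast between locations is identical on all days, which is the generalization of the temporal pattern that the text identifies as a source of exposure misclassification.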

Improvement of Data Quality/Accessibility
LUR models require monitored pollutant concentrations, road networks and traffic data, land-use classification, population density and meteorological data (Table 2). Foremost amongst these are the air pollutant monitoring data. The Ministry of Environmental Protection has attempted to improve the quality and the number of regulatory monitoring sites [75], but formally archived historical data available to the public, together with nationally-applied QA/QC approaches to improve data quality, are still urgently needed.
In terms of predictor variable data, as summarized in Section 2, urban land-use in China is rapidly changing so regularly updated land-use classification data are needed now and going forward. In order to model the effects of high-rise buildings, detailed building height and street width datasets are required as a model proxy for potential street canyon effects.
As discussed in Section 3.2, most LUR studies in China have used road lengths as a proxy for traffic counts because traffic data were unavailable. Traffic count and fleet data are therefore also needed.
At present, all the data mentioned in Table 2 are produced by different organizations and archived (if at all) on various platforms. It would save time and effort if these data were systematically collected and stored.

Integration of Modelling Approaches and Fusion of Sensor Data
As shown by studies in both the UK and the USA [46,71], the integration of dispersion and LUR modelling can provide an alternative means of simulating high spatial resolution pollutant concentration fields where monitoring data are insufficient. As discussed in Section 4.2, the integrated approach makes it possible to design an optimal network of monitoring sites within the dispersion modelling domain. Future research should use this approach to explore the number of sites required to develop spatial models in Chinese urban areas and how the design of monitoring networks affects the modelling results, as demonstrated elsewhere [48]. However, progress on this front is contingent on the availability of detailed, temporally-resolved emissions and meteorological data, which are lacking at present.
The integration of satellite and ground-based lower-cost sensing also has potential for overcoming the limitation of monitoring data scarcity in China. Data collected by sensors at different spatiotemporal scales may be used to develop models and also to calibrate and validate them. Example studies discussed in Sections 3.3 and 4.2 illustrate the benefits of incorporating sensor data, despite remaining challenges in relation to data QA/QC, metadata, access, spatiotemporal resolutions and data management [76].

LUR Model Standards
Rigorous standards and requirements specific to cities in China need to be set for LUR model development and validation to prevent substantial bias in estimated concentrations and to improve applicability in health burden research. Relevant standards could relate to a minimum number of monitoring sites, the ratio of roadside sites to residential sites, key predictor variables and published model-validation statistics.

Conclusions
Urban areas in China are developing rapidly and exhibit important differences from urban areas in Europe and North America with respect to, for example, population density and urban morphology. The quality of land-use regression modelling studies of intra-urban air pollution concentrations in China is currently constrained mainly by the scarcity of input data, especially pollution monitoring data. There is an urgent need for the continued expansion of monitoring (including via passive and miniaturized sensors), the application of minimum standards of measurement assurance, and accessible, long-term archiving of the data. There is a similarly urgent need for higher spatial resolution proxy data for urban emissions, particularly in respect of traffic-related variables. The rapidly evolving nature of the Chinese urban landscape makes generating and maintaining up-to-date land-use and urban morphology datasets a challenge, but one that must be met to support researchers, planners and policy decision-makers. It is important that Chinese LUR models are subject to appropriate validation. As is the case elsewhere, the integration of LUR modelling with portable air-pollution monitor data, satellite data, and dispersion modelling has the potential to enhance the derivation of spatially-resolved urban pollution maps.