PM2.5 Concentrations Variability in North China Explored with a Multi-Scale Spatial Random Effect Model

Compiling fine-resolution geospatial PM2.5 concentrations data is essential for precisely assessing the health risks of PM2.5 pollution exposure as well as for evaluating environmental policy effectiveness. In most previous studies, global and local spatial heterogeneity of PM2.5 is captured by the inclusion of multi-scale covariate effects, while the modelling of genuine scale-dependent variabilities pertaining to the spatial random process of PM2.5 has not yet been much studied. Consequently, this work proposed a multi-scale spatial random effect model (MSSREM), based a recently developed fixed-rank Kriging method, to capture both the scale-dependent variabilities and the spatial dependence effect simultaneously. Furthermore, a small-scale Monte Carlo simulation experiment was conducted to assess the performance of MSSREM against classic geospatial Kriging models. The key results indicated that when the multiple-scale property of local spatial variabilities were exhibited, the MSSREM had greater ability to recover local- or fine-scale variations hidden in a real spatial process. The methodology was applied to the PM2.5 concentrations modelling in North China, a region with the worst air quality in the country. The MSSREM provided high prediction accuracy, 0.917 R-squared, and 3.777 root mean square error (RMSE). In addition, the spatial correlations in PM2.5 concentrations were properly captured by the model as indicated by a statistically insignificant Moran’s I statistic (a value of 0.136 with p-value > 0.2). Overall, this study offers another spatial statistical model for investigating and predicting PM2.5 concentration, which would be beneficial for precise health risk assessment of PM2.5 pollution exposure.


Introduction
PM 2.5 refers to particulate matters with an aerodynamic diameter ≤ 2.5 microns, which is not only a major lethal health factor in addition to hypertension, smoking, hyperglycaemia, and high cholesterol [1], but also causes great social and economic loss [2]. Precise health risk assessment of PM 2.5 pollution exposure and environmental policy evaluation would require an accurate fine-resolution spatial data product and suitable modelling strategies [3,4]. However, this presents a major challenge.
From the formation mechanics perspective, PM 2.5 takes the particles in the pollutant gas as condensation nuclei, with water vapour and other substances condensing on it, and thus, the pollutant gas emission (i.e., primary PM 2.5 ) directly affects the PM 2.5 concentrations [5]. In addition, the secondary PM 2.5 formation process through complex photochemical reaction, condensation, and atmospheric processes tends to be highly variable across space and scales [6,7]. Thereby, a credible modelling approach is expected to capture such effects simultaneously and explicitly [8,9]. but exhibits great discontinuities (or even abrupt changes) at a local or small scale. The co-existence of smoothness and discontinuities at different scales was highlighted as a generic feature of the distribution of geographical variables [39].
From a statistical modelling perspective, modelling scale-dependent variabilities and spatial correlations in a unified statistical model is challenging. In a seminal paper by Cressie and Johannesson (2008), an innovative method, the fixed-rank Kriging (FRK) model, was proposed [40]. FRK defines a spatially correlated mean-zero and generally nonstationary random process, which is further decomposed by using a linear combination of flexible and multi-scale spatial basis functions with structured random coefficients. By doing so, it can reconstruct a complex, spatially dependent, nonstationary, and highdimensional spatial process. Moreover, this is scalable for large spatial datasets [18].
In line with the FRK model, we proposed a multi-scale spatial random effect model (MSSREM) to explore the spatial variability in ground PM 2.5 concentrations in North China. This area was chosen because of its relatively high levels of air pollution and the great variabilities that it exhibits with regard to natural and socioeconomic characteristics. Before our empirical investigation, we first conducted a Monte Carlo simulation experiment to assess the relative prediction performance of the MSSREM against a single-scale spatial statistical model (e.g., a classical ordinary Kriging). The simulation results indicated that when higher levels of local spatial variabilities were exhibited, the MSSREM had a greater potential to recover local-or fine-scale variations hidden in spatial processes. Furthermore, we found significant impacts of both meteorological, physical, and human activity factors on the distribution of PM 2.5 concentrations in North China.
The remainder of this paper is organized as follows: The statistical model is presented in Section 2. The description of a Monte Carlo simulation experiment is given in Section 3 to assess the relative prediction performance between MSSREM and single-scale spatial statistical models. In Section 4, we present our empirical study results. The conclusions are presented in Section 5.

Statistical Modelling
Our conceptual framework for PM 2.5 spatial process modelling is presented in Figure 1. Briefly, we assume that the geographical process of PM 2.5 concentrations is driven by regional factors, including nature and human factors, and a spatial random process. For a study region R, the hidden (or real) process of PM 2.5 , namely H(s), is defined as where s denotes the location of H(s). On the right-hand side of the equation, the first two terms capture global deterministic trend of PM 2.5 concentrations, in which N(s) T α measures the effect of nature factors, and M(s) T β measures the contribution of human factors. The third term, ω(s), is spatial Gaussian process capturing the spatially structured random effect underlying the outcome variable. The last term, ξ(s), is a random error term with mean zero and variance-covariance σ 2 ξ I, which is spatially uncorrelated. For the PM 2.5 spatial process, in the real world, boundary effect and scale effect are unavoidable. Consequently, the spatial random process is decomposed as multi-scale spatial basis function with random coefficients [40], where τ = (τ 1 , . . . , τ r ) T is an r-dimensional Gaussian vector with mean zero and r by r covariance matrix K, and τ k captures the average random effect governed by k th spatial basis function. Φ = (Φ 1 , . . . , Φ r ) is r-dimensional spatial basis functions (e.g., Gaussian basis function or exponential basis function) with a multi-scale nested structure (e.g., Figure 2). To cater for different observation supports (e.g., monitoring stations and remote sensing pixels), the region is discretized as n non-overlapping but compact, basic 4 of 14 areal units (BAU) [41]. If BAUs are small enough, compared to the study region, the error in the discrete process could be ignored. Then, the hidden process, H(s), is averaged over the BAUs, which can be written as where |B i | is the area of BAU-i. At the BAU level, the process model can be written as  For the PM2.5 spatial process, in the real world, boundary effect and scale effect are unavoidable. Consequently, the spatial random process is decomposed as multi-scale spatial basis function with random coefficients [40], where = ( , … , ) is an r-dimensional Gaussian vector with mean zero and r by r covariance matrix , and captures the average random effect governed by th spatial basis function.
= ( , … , ) is r-dimensional spatial basis functions (e.g., Gaussian basis function or exponential basis function) with a multi-scale nested structure (e.g., Figure 2). To cater for different observation supports (e.g., monitoring stations and remote sensing pixels), the region is discretized as non-overlapping but compact, basic areal units (BAU) [41]. If BAUs are small enough, compared to the study region, the error in the discrete process could be ignored. Then, the hidden process, ( ), is averaged over the BAUs, which can be written as where | | is the area of BAU-. At the BAU level, the process model can be written as ( ) = ( ) + ( ) + ( ) + ( ); = 1, … , .
A simple illustration of the idea is provided in Figure 2. The region, , is discretized as BAUs, and spatial basis functions at three scales (different bandwidth in the kernel functions) are constructed to capture the heterogeneous random effects of PM2.5 concentrations. For the BAU with observations in Figure 2, such as BAU-1, its value is governed by the three spatial basis functions ( , , and ) and calculated as = 0.4 × 0.55 + 0.6 × 0.25 + 0.3 × 0.2 = 0.43. For the BAU without observations in Figure 2, such as BAU-0, its random effect is calculated as = 0.4 × 0.6 + 0.6 × 0.23 + 0.3 × 0.17 = 0.429, with the same spatial basis functions ( , , and ) but with different weights. If a BAU was governed by a single spatial basis function, the variabilities on other scales would be ignored, such as = 0.3 × 1 = 0.3 . Consequently, this multi-scale When PM2.5 concentrations are measured either by monitoring stations or remote sensing instruments, measurement error is inevitable. Consequently, the measurement model is defined as the weighted average of hidden process plus an independent A simple illustration of the idea is provided in Figure 2. The region, R, is discretized as n BAUs, and spatial basis functions at three scales (different bandwidth in the kernel functions) are constructed to capture the heterogeneous random effects of PM 2.5 concentrations. For the BAU with observations in Figure 2, such as BAU-1, its value is governed by the three spatial basis functions (Φ 1 , Φ 2 , and Φ 3 ) and calculated as ω 1 = 0.4 × 0.55 + 0.6 × 0.25 + 0.3 × 0.2 = 0.43. For the BAU without observations in Figure 2, such as BAU-0, its random effect is calculated as ω 0 = 0.4 × 0.6 + 0.6 × 0.23 + 0.3 × 0.17 = 0.429, with the same spatial basis functions (Φ 1 , Φ 2 and Φ 3 ) but with different weights. If a BAU was governed by a single spatial basis function, the variabilities on other scales would be ignored, such as ω 8 = 0.3 × 1 = 0.3. Consequently, this multi-scale decomposition runs through the whole process of parameter estimation and prediction, leading to high flexibility to deal with complex variabilities and high computational efficiency.
When PM 2.5 concentrations are measured either by monitoring stations or remote sensing instruments, measurement error is inevitable. Consequently, the measurement model is defined as the weighted average of hidden process plus an independent measurement error term, ε j , as in Equation (1), where O P j denotes the footprint of observed PM 2.5 concentration, P j . w ij is the spatial weight between observation-i and BAU-j. For monitoring station data, O P j is the location of P j , and w ij is a set of 0-1 weights. For remote sensing data, O P j is the area of P j , and w ij is the overlapped area between pixel area-j and BAU-i. It is assumed that ε has a Gaussian distribution with mean-zero and variance-covariance σ 2 ε I. Here, σ 2 ε is estimated using variogram techniques ahead of parameter estimation [42]. Eventually, if we define the MSSREM can be written as The unknown parameters are included in a set ϑ = α, β, σ 2 ξ , K . The MSSREM are estimated by the expectation-maximization (EM) algorithm. The complete-data likelihood is defined as L(ϑ) = [τ, P | ϑ]. After initialization, the EM algorithm for L(ϑ) is an iterative optimization procedure including E-step, which computes conditional distribution of τ based on Gaussian prior distribution at current parameter estimates (ϑ), and M-step, which updates ϑ based the conditional distribution of τ and finds the max-likelihood estimates.

A Monte Carlo Simulation Experiment
In this section, we conducted a small-scale Monte Carlo simulation study to assess the relative prediction performance between multi-scale spatial random effect model (MSS-REM) and classic ordinary Kriging models (a single-scale spatial statistical model). The purpose was to demonstrate that MSSREM could serve as a useful methodology for modelling and predicting and to provide a tentative assessment on conditions under which MSSREM would be useful.
For simplicity, following Kang and Cressie (2011) and Sengupta and Cressie (2013), we chose a stable exponential spatial covariance function to generate a spatially correlated random field [43,44]: where C(d) is the covariance function related to distance d; σ 2 is the variance of the field; and α is the power of distance. Under this specification, larger values of α indicate higher levels of stability or smoothness of spatial processes, as illustrated by Figure 3, where nine processes were generated with discrete values of α ranging from 0 to 2.
which MSSREM would be useful. For simplicity, following Kang and Cressie (2011) and Sengupta and Cressie (2013), we chose a stable exponential spatial covariance function to generate a spatially correlated random field [43,44]: where ( ) is the covariance function related to distance d; is the variance of the field; and is the power of distance. Under this specification, larger values of indicate higher levels of stability or smoothness of spatial processes, as illustrated by Figure 3, where nine processes were generated with discrete values of ranging from 0 to 2. To assess the relative performance between MSSREM and the classic ordinary Kriging method, spatial point data and areal data commonly used in the studies of ground PM2.5 concentrations were chosen as experimental data. Kriging methods usually operate with point-level data, whereas MSSREM could process point-level data, areal data, or both at the same time. To mimic the real-world PM2.5 monitoring station data, under each simulation scenario (i.e., 40 varied values of with an equal interval of 0.05), we randomly draw 500 points (grid centroids) from each simulated real random process as point-level sample data. With respect to areal sample data, we simply aggregated a real random process generated to a resolution of 0.1°. Two sample data are depicted in Figure  4. For a regular 200-by-200 grid topology with a resolution of 0.01 • , 100 simulation experiments (random fields) were generated under each spatial covariance function scenario (i.e., 40 varied values of α with an equal interval of 0.05), leading to 4000 experiments for the 4000 grids on a two-dimensional lattice. We treated each simulated random field as a realisation (population in the statistics terminology) of the real PM 2.5 concentrations process in region R o , H α i : i ∈ [1, 100]. To assess the relative performance between MSSREM and the classic ordinary Kriging method, spatial point data and areal data commonly used in the studies of ground PM 2.5 concentrations were chosen as experimental data. Kriging methods usually operate with point-level data, whereas MSSREM could process point-level data, areal data, or both at the same time. To mimic the real-world PM 2.5 monitoring station data, under each simulation scenario (i.e., 40 varied values of α with an equal interval of 0.05), we randomly draw 500 points (grid centroids) from each simulated real random process as point-level sample data. With respect to areal sample data, we simply aggregated a real random process generated to a resolution of 0.1 • . Two sample data are depicted in Figure 4.  For each of the 4000 experiments, the MSSREM and classic ordinary Kriging models were implemented, both running with an exponential spatial covariance function. Simple R-squared statistic was calculated to assess model fit (e.g., Cressie, 1993; Banerjee, Carlin and Gelfand, 2015). Results were presented in Figure 5. In line with common sense, when globally structured spatial variability (stronger spatial dependence) is exhibited, both methods could reasonably reconstruct the underlying real process with an acceptable error range, as indicated by high values of R-squared statistic (≥0.96) with values of ≥ For each of the 4000 experiments, the MSSREM and classic ordinary Kriging models were implemented, both running with an exponential spatial covariance function. Simple R-squared statistic was calculated to assess model fit (e.g., Cressie, 1993; Banerjee, Carlin and Gelfand, 2015). Results were presented in Figure 5. In line with common sense, when globally structured spatial variability (stronger spatial dependence) is exhibited, both methods could reasonably reconstruct the underlying real process with an acceptable error range, as indicated by high values of R-squared statistic (≥0.96) with values of α ≥ 1.5. This observation holds for both point and areal sample data.
When higher levels of local spatial variabilities exhibited, the MSSREM produced better model fit than the classic ordinary Kriging model did for both spatial point ( ∈ (0,0.675)) and areal data ( ∈ (0,0.525)), indicating that MSSREM had greater chances to recover local-or fine-scale variations hidden in spatial processes. When medium levels of local spatial variabilities exhibited, for instance, ∈ (0.675,2) of point-level sample and ∈ (0.525,2) of area-level sample, model fits produced by both methodologies were not really distinguishable. Overall, this small-scale simulation experiment suggested that the MSSREM model, due to the use of multi-scale spatial basis functions with random coefficients, performed relatively better than the classic ordinary Kriging model. This could present a real advantage of MSSREM in real-world empirical examinations of ground PM2.5 concentrations, where global or large-scale spatial variabilities were usually captured by covariate effects.  When higher levels of local spatial variabilities exhibited, the MSSREM produced better model fit than the classic ordinary Kriging model did for both spatial point (α ∈ (0, 0.675)) and areal data (α ∈ (0, 0.525)), indicating that MSSREM had greater chances to recover local-or fine-scale variations hidden in spatial processes. When medium levels of local spatial variabilities exhibited, for instance, α ∈ (0.675, 2) of point-level sample and α ∈ (0.525, 2) of area-level sample, model fits produced by both methodologies were not really distinguishable. Overall, this small-scale simulation experiment suggested that the MSSREM model, due to the use of multi-scale spatial basis functions with random coefficients, performed relatively better than the classic ordinary Kriging model. This could present a real advantage of MSSREM in real-world empirical examinations of ground PM 2.5 concentrations, where global or large-scale spatial variabilities were usually captured by covariate effects.

Study Area
North China is one of the five meteorological geographic zones, covering the regions of Beijing, Tianjin, Hebei, Shanxi, Shandong, and Henan. It sits to the north of the Qinling Mountains-Huaihe River line and the south of the Great Wall and has a significant topographic variability, being high in the West and low in the East ( Figure 6). The region locates in the transition from subtropical to temperate zones, thus exhibiting great climatic differences between its north and south areas. Spatial disparities in socioeconomic and population distributions are also evident. The north region is one of areas with the worst air pollution levels in China and the world. Whether the combined differences in both natural and human factors lead to prominent variability in the PM 2.5 concentrations, and if so, to what extent, are the key inquiries of our empirical study. topographic variability, being high in the West and low in the East (Figure 6). The region locates in the transition from subtropical to temperate zones, thus exhibiting great climatic differences between its north and south areas. Spatial disparities in socioeconomic and population distributions are also evident. The north region is one of areas with the worst air pollution levels in China and the world. Whether the combined differences in both natural and human factors lead to prominent variability in the PM2.5 concentrations, and if so, to what extent, are the key inquiries of our empirical study.

Ground PM2.5 Concentrations
We crowded sourcing ground PM2.5 concentrations data by using web crawler technology (with python language) from the World Air Quality Project (http://aqicn.org (accessed on 10 July 2020)), a project providing historical and real-time air-quality data. To ensure model estimation robustness, we excluded stations with missing data for more than 65 days or 15 consecutive days and calculated annual ground PM2.5 concentrations averages for 1287 stations, as shown in Figure 6. The station data were part of the Nowcast system of The U.S. Environmental Protection Agency (EPA), which converted raw pollutant readings into air-quality index values (on a scale ranging from 0 to 500), referred to as the PM2.5 air quality index ( 2.5 ) [45]. According to the US-EPA 2016 standard, we converted 2.5 back into PM2.5 concentrations ( 2.5 ) based on the formula where and ℎ ℎ are, respectively, the left and right boundaries of the subinterval that 2.5 falls into and belongs to the range with breakpoints (0, 12, 35, 55, 150, 250, 350, 500). and ℎ ℎ are, respectively, the breakpoints (0, 50, 100, 150, 200, 300, 400, 500) corresponding to and ℎ ℎ .  [37,46] and the conceptual framework mentioned earlier, this study constructed nature and human factors to explain

Ground PM 2.5 Concentrations
We crowded sourcing ground PM 2.5 concentrations data by using web crawler technology (with python language) from the World Air Quality Project (http://aqicn.org (accessed on 10 July 2020)), a project providing historical and real-time air-quality data. To ensure model estimation robustness, we excluded stations with missing data for more than 65 days or 15 consecutive days and calculated annual ground PM 2.5 concentrations averages for 1287 stations, as shown in Figure 6. The station data were part of the Nowcast system of The U.S. Environmental Protection Agency (EPA), which converted raw pollutant readings into air-quality index values (on a scale ranging from 0 to 500), referred to as the PM 2.5 air quality index (AQI pm 2.5 ) [45]. According to the US-EPA 2016 standard, we converted AQI pm 2.5 back into PM 2.5 concentrations (C PM 2.5 ) based on the formula where C low and C high are, respectively, the left and right boundaries of the subinterval that C PM 2.5 falls into and belongs to the range with breakpoints (0, 12,35,55,150,250,350,500). AQI low and AQI high are, respectively, the breakpoints (0, 50, 100, 150, 200, 300, 400, 500) corresponding to C low and C high .

Independent Variables
Following Zhou et al. (2021) and Wei et al. (2020) [37,46] and the conceptual framework mentioned earlier, this study constructed nature and human factors to explain the deterministic trend in PM 2.5 concentrations. Detailed sources and descriptions of covariates are presented in Table 1.

Empirical Model Specification
The empirical model specification follows Equation (7). Regular grids with a 0.02 • × 0.02 • resolution were chosen as the basic areal units, yielding 48,403 BAUs. To capture the potential scale-dependent variabilities, spatial basis functions at three scales (a large scale with 5.4 • radius, a medium scale with 1.6 • radius, and a small scale with 0.5 • radius) were specified, as depicted in Figure 7. It is useful to note that there has not been a consensus on the optimal scale number of spatial basis functions [47]. However, in this study, the spatial basis functions with various spatial scales number were constructed, and the found model with a three-scale spatial basis function yielded the highest model fit.

Covariate Effects
Results on regression coefficients and the associated statistical significance of covariates are presented in Table 2. With respect to meteorological factors, relative humidity, cumulative precipitation, and wind speed were statistically negatively correlated with PM2.5 concentration, with everything else being equal. It is understandable

Covariate Effects
Results on regression coefficients and the associated statistical significance of covariates are presented in Table 2. With respect to meteorological factors, relative humidity, cumulative precipitation, and wind speed were statistically negatively correlated with PM 2.5 concentration, with everything else being equal. It is understandable that precipitation could clean the air by shooting down particles. Wind could accelerate PM 2.5 escape speed, thus decreasing PM 2.5 concentration, ceteris paribus. Higher temperature was associated with higher levels of PM 2.5 concentration. In addition, the greenhouse effect of aerosols (PM 2.5 ) could lead to warming [48], which could be a vicious cycle of air pollution and climate change in the study area and globally. , σ A = (X T X) −1 σ 2 ξ +σ 2 ε Where A is regression coefficients, andσ A is standard error of regression coefficient.
With respect to land-use characteristics, only unused land density and woodlandgrassland density were statistically negatively associated with PM 2.5 concentration. In the human activity domain, there were no consistent evidences on significant relationships between industry concentration and PM 2.5 concentration and between local urbanization and PM 2.5 concentration, as indicated by the insignificant regression coefficients of covariates IED and NTL. The significant correlation between road network density and PM 2.5 concentration might highlight the importance of transportation emission in air pollution.

Prediction Accuracy
This study used a tenfold cross-validation procedure to assess model fit and prediction accuracy. We randomly selected 90% of the data as the training group and the remaining 10% as validation group or out-of-sample validation. This whole procedure was repeated for 100 times, and results are presented in Figure 8. Following Cressie and Johannesson (2008) and Zammit-Mangion and Cressie (2021), the R-squared statistics and root mean squared error (RMSE) were used to assess prediction accuracy [40,47]. We noted that the MSSREM in Section 4.3 was fitted with full data, in which the validation group is the same as the training group. However, the sampling method of out-of-sample validation, a more robust verification method for predict accuracy in which the validation group is different from the training group, had a small probability to assign outliers into the validation group. This resulted in R 2 in the full-data model being less than that in tenfold cross-validation.

Conclusions
Producing high-accuracy and PM2.5 concentrations data at a fine spatial resolution is essential for health risk assessment and environment regulation evaluation. Primarily, PM2.5 concentrations is the key variable that links to various health outcome variables, and a fine spatial resolution pollution measure could yield a more accurate estimation of the relationships between pollution and health. This study presented a multi-scale spatial random effect model (MSSREM) for investigating PM2.5 concentrations' variability. Besides the spatial correlation effects often observed for geographical data, it has the capacity to model the potential scale-dependent effect, as it is flexibly specified by a linear combination of multi-scale spatial basis functions. Beyond the conceptual modelling advantages, it substantively improves computational efficiency by estimating a much smaller set of spatial basis function coefficients rather than a full set of spatial random effects, thus offering great potential to cater for large spatial data.
The small-scale simulation experiment indicates that when higher levels of local spatial variabilities are exhibited in a Gaussian random file, the MSSREM had greater chances to recover local-or fine-scale variations hidden in spatial processes, especially in real-world empirical examinations where global or large-scale spatial variabilities were usually captured by covariate effects. This was confirmed by the empirical study on North China based on MSSREM, in which we obtained more reliable covariate effects than non- As clearly presented in Figure 8a, the regression slope, obtained by regressing the predicted values on observed values of PM 2.5 concentrations, was close to one on average, indicating a good model fit. In addition, the averaged R-squared value was as high as 0.917 with an interval of (0.914, 0.923) (mean ± 1.96 × standard error), whilst the averaged RMSE was 3.777 with an interval of (3.665, 3.889) (mean ± 1.96 × standard error). With respect to the spatial distribution of estimation errors, only 1.5% of the stations exhibited absolute estimation errors ≥15 and 75% of the stations with absolute estimation errors less than 5. More importantly, the distribution of model estimation errors appeared to be spatially random, which was confirmed by a statistically insignificant Moran's I statistic of 0.136 (a p-value > 0.2). This highlighted that the spatial dependency effects were well-captured by the MSSREM model.
Among existing studies, Wei et al. (2020) reconstructed the PM 2.5 pattern in North China based on machine learning method and derived fitting results (R 2 = 0.92 and RMSE = 11.52) [37]. Compared with this, our results with close R 2 = 0.917 and evidently smaller RMSE = 3.777 show a higher precision. This is mainly because through the multi-scale local modelling of residual scale-dependent variabilities and spatial dependence effect outside the global trends, spatial basis functions with random coefficients well-recovered local variations hidden in spatial processes of secondary PM 2.5 and ensured smaller local errors on a fine scale.

Conclusions
Producing high-accuracy and PM 2.5 concentrations data at a fine spatial resolution is essential for health risk assessment and environment regulation evaluation. Primarily, PM 2.5 concentrations is the key variable that links to various health outcome variables, and a fine spatial resolution pollution measure could yield a more accurate estimation of the relationships between pollution and health. This study presented a multi-scale spatial random effect model (MSSREM) for investigating PM 2.5 concentrations' variability. Besides the spatial correlation effects often observed for geographical data, it has the capacity to model the potential scale-dependent effect, as it is flexibly specified by a linear combination of multi-scale spatial basis functions. Beyond the conceptual modelling advantages, it substantively improves computational efficiency by estimating a much smaller set of spatial basis function coefficients rather than a full set of spatial random effects, thus offering great potential to cater for large spatial data.
The small-scale simulation experiment indicates that when higher levels of local spatial variabilities are exhibited in a Gaussian random file, the MSSREM had greater chances to recover local-or fine-scale variations hidden in spatial processes, especially in realworld empirical examinations where global or large-scale spatial variabilities were usually captured by covariate effects. This was confirmed by the empirical study on North China based on MSSREM, in which we obtained more reliable covariate effects than non-spatial statistics and more precise prediction results with smaller local errors than previous studies.
In terms of methodological significance, the multi-scale modelling strategy developed in this study could, to some extent, alleviate the modifiable areal unit problem. As it captures the multiple-scale variabilities in the spatial random effect, the potential confounding effects between covariates and geographical scales could be substantially reduced. With respect to policy significance, compiling local-and fine-resolution PM 2.5 concentrations data would be beneficial for precise health risk assessment of PM 2.5 pollution exposure because a PM 2.5 concentration data with smaller local errors offer opportunities to understand the nuanced relationships between air pollution and health. In addition, with medium effects, it is intuitive to extend our methodology to a spatio-temporal modelling context, thus offering a practical solution to obtain fine spatio-temporal-scale PM 2.5 concentration estimates, contributing real-time monitoring of regional air pollution.
Despite a careful design for investigating the annual PM 2.5 concentrations variability in the North China, some limitations remain. Firstly, remote-sensing-based data were not simultaneously modelled along with the monitoring station data although the multiscale spatial random effect model, in principle, can model multiple data sources with different spatial supports. Secondly, the annual average left the temporal variabilities unmodelled. However, a further methodological extension to a simultaneously modelling monitoring station and remote sensing-based PM 2.5 concentrations data as well as the temporal dependency is on top of our future research priorities.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.