Interpolation-Based Fusion of Sentinel-5P, SRTM, and Regulatory-Grade Ground Stations Data for Producing Spatially Continuous Maps of PM 2.5 Concentrations Nationwide over Thailand

: Atmospheric pollution has recently drawn signiﬁcant attention due to its proven adverse effects on public health and the environment. This concern has been aggravated speciﬁcally in Southeast Asia due to increasing vehicular use, industrial activity, and agricultural burning practices. Consequently, elevated PM 2.5 concentrations have become a matter of intervention for national authorities who have addressed the needs of monitoring air pollution by operating ground stations. However, their spatial coverage is limited and the installation and maintenance are costly. Therefore, alternative approaches are necessary at national and regional scales. In the current paper, we investigated interpolation models to fuse PM 2.5 measurements from ground stations and satellite data in an attempt to produce spatially continuous maps of PM 2.5 nationwide over Thailand. Four approaches are compared, namely the inverse distance weighted (IDW), ordinary kriging (OK), random forest (RF), and random forest combined with OK (RFK) leveraging on the NO 2 , SO 2 , CO, HCHO, AI, and O 3 products from the Sentinel-5P satellite, regulatory-grade ground PM 2.5 measurements, and topographic parameters. The results suggest that RFK is the most robust, especially when the pollution levels are moderate or extreme, achieving an RMSE value of 7.11 µ g/m 3 and an R 2 value of 0.77 during a 10-day long period in February, and an RMSE of 10.77 µ g/m 3 and R 2 and 0.91 during the entire month of March. The proposed approach can be adopted operationally and expanded by leveraging regulatory-grade stations, low-cost sensors, as well as upcoming satellite missions such as the GEMS and the Sentinel-5.


Introduction
Air quality has become a major global social concern as it is associated with health problems such as stroke, heart disease, lung cancer, and respiratory infection [1,2]. The World Health Organization (WHO) has listed six major related species, namely particulate matter, ground-level ozone, carbon monoxide (CO), sulfur oxides, nitrogen oxides, and lead. Among them, the fine particles with an aerodynamic diameter of less than 2.5 µm (PM 2.5 ) have become the center of interest due to their influence on visibility, human health, and global climate [3]. Several studies have even reported that PM 2.5 contributes to the spread of viruses such as SARS-CoV-2, which is responsible for the COVID-19 disease [4][5][6].
In this respect, air quality monitoring and prediction has become a major field of interest in an effort to prevent the consequences caused by air pollution [1,7]. Therefore, countries affected heavily by pollution-such as Thailand which has undergone severe air pollution events due to rapid economic development and urbanization [8]-have developed monitoring systems to address the need for air pollution monitoring and exposure assessments. Three main types of related technology have been normally established. First, the regulatory-grade systems provide the most accurate and robust in-situ observations; nevertheless, they are financially expensive to install and maintain [9]. In order to overcome the consequent low coverage of the aforementioned monitoring stations, low-cost sensors have been deployed in various countries [10,11], of which the quality is still not competitive compared to standard air quality monitoring stations [12]. However, the low cost allows for the deployment of a number of such sensors. Lastly, satellite technology provides broad geographical coverage and consistent data acquisition, hampered only by certain weather conditions [13].
Notwithstanding, since surface PM concentrations measured by regulatory-grade systems are point-based and sparsely distributed, especially in rural areas [10,13], spatial interpolation and remote-sensing-based models to generate estimates of PM concentration in areas where ground stations are not found have been proposed. This approach has gathered attention considering that the data availability could be enhanced and the related costs for establishing additional monitoring stations can be avoided [14,15]. Moreover, interpolationbased estimations in areas with unavailable ground data is important as a support for hot spot and source identification, which is essential for efficient policy planning. The interpolation of ground-level PM 2.5 has been demonstrated widely with basic techniques such as the inverse distance weighted (IDW) and the ordinary kriging (OK), which are two commonly compared methods [16][17][18][19]. Many studies reckon that IDW generally showed poor performance as a non-geostatistical method, which simply defines spatial relationship by distance [20,21]. However, it is well-functioned as a control method to be compared with other approaches in various studies with recent attempts to estimate PM 2.5 concentration from ground monitoring stations [18,22,23]. On the other hand, the kriging method is a geostatistical method that provides the best linear unbiased estimate by incorporating spatial correlation defined as empirical variogram [19,24,25]. Although OK has the disadvantages of being computationally demanding [26] and not being suitable for generating sufficient covariate information [24], it has recently been synthesized with RF or support vector machine (SVM) as an approach for interpolating residuals to achieve less sensitivity to data variance and provide geo-referenced features [27][28][29][30].
Random forest (RF) is one of the most frequently applied [22] tree-based machine learning (ML) models due to its advantages to achieve high accuracy [1] while handling non-linear relationships [31]. It also enables unbiased estimation of errors by constructing a multitude of trees with randomly chosen parameters and preventing learning from specific features [5]. Moreover, RF blended with OK (RFK) is considered as regression kriging (RK) and is often referred to as RFK or RFOK. The difference between RK and RFK is that RK uses a multiple linear regression (MLR) [29], while RFK is a hybrid method using RF for regression and OK for geostatistical modeling of residuals [22,[27][28][29]. It can overcome the limitation of RF in that it does not account for geo-referenced data [25]. Past studies which have compared the accuracy of RF and RFK [22,30], or RK and RFK [29], concluded that the RFK presented a better ability to map non-linear complex relations.
The current study presents the comparative evaluation of four interpolation methods in an attempt to produce spatially continuous PM 2.5 concentrations maps. Four interpolation algorithms (IDW, OK, RF, and RFK) are applied and compared, out of which IDW and OK derive estimates solely using regulatory-grade ground data while RF and RFK fuse regulatory-grade ground data, satellite observations, and topographic factors. In particular, we investigate the development of PM 2.5 concentrations in two test sites with different sources and patterns of air pollution, namely Bangkok and Chiang Mai. The performance is judged using the metrics of the root-mean-squared error (RMSE), scatter index (SI), and the coefficient of determination (R 2 ). We showcase the key features of RFK and advocate it as a suitable spatial interpolating technique for Thailand, which has a limited size and distribution of in-situ monitoring stations.

Study Area
Thailand is a country located at the center of the Indochinese Peninsula at the coordinates of 15.87 • N latitude and 100.99 • E longitude. Most regions in Thailand generally exhibit a tropical climate, especially in the south, but present distinct patterns based on seasonal variation and climatic zones. The seasons could be subdivided into the rainy season lasting from mid-May to mid-October, the winter season running from mid-May to mid-October, and the pre-monsoon season lasting from mid-February until mid-May.
From a topographical perspective, Thailand is divided into the northern, northeastern, central, eastern, and southern regions (Figure 1), which exhibit different seasonal and climatic patterns. As shown in Table 1, the northern part includes 15 provinces and exhibits mountainous topography and similar climatic features to the central region but receives lower precipitation. The northeastern region is a plateau with high elevations including 20 provinces. It represents the driest conditions among the regions, with the lowest precipitation and relative humidity. The central region is mainly a low-level plain encompassing 18 provinces, while the eastern part consists of plains and small valleys. These regions usually experience rainy, dry, and cool seasons, but are mostly warm because they are located in tropical inland latitudes. Meanwhile, the southern part has low variations in temperature and high relative humidity due to its maritime characteristics [32]. These seasonal variations have an influence on PM 2.5 concentrations, such as dilution of air pollutants with heavy rain or excessive air pollution due to temperature inversion or anthropogenic activities.

Data
Data associated with PM2.5 concentrations from different sensors, ground and remote alike, were identified and used synergistically in the current study to derive a countrywide representation of spatial continuous maps. More specifically, ground data from the Pollution Control Department (PCD) (http://air4thai.pcd.go.th/webV2/region.php?re-gion=0, accessed on 11 March 2021) in Bangkok, Thailand, remotely sensed images from  Due to the complexity and large size of the country, in the current study, we concentrate on two regions that exhibit heavy pollution events during different times of the year and are attributed to different sources. Bangkok, the capital and most populous city of Thailand, has been developing rapidly as a metropolitan city, and vehicular usage-which is known as an important source of PM 2.5 -has become excessive. Correspondingly, the concentration of PM 2.5 over Bangkok tends to follow the daily pattern of vehicular use, which reaches a maximum during ordinary rush hours. Conversely, the region around Chiang Mai province is a predominantly agricultural area where the combustion of agricultural biomass as a form of farm management takes place in the first months of each year. Consequently, this major source contributes to excessive pollution levels in Northern Thailand during winter.

Data
Data associated with PM 2.5 concentrations from different sensors, ground and remote alike, were identified and used synergistically in the current study to derive a countrywide representation of spatial continuous maps. More specifically, ground data from the Pollution Control Department (PCD) (http://air4thai.pcd.go.th/webV2/region.php? region=0, accessed on 11 March 2021) in Bangkok, Thailand, remotely sensed images from the Sentinel-5 Precursor (Sentinel-5P) satellite, and the digital elevation model (DEM) product from the Shuttle Radar Topography Mission (SRTM) were the main data sources.
The PCD has established a country-wide monitoring network for air quality management consisting of 68 stations as of 2020. After the elimination of corrupted data, a total of 63 out of 68 stations were taken into consideration ( Figure 1). The monitoring stations of PCD have provided hourly data of PM 2.5 (µg/m 3 ), PM 10 (µg/m 3 ), O 3 (parts per billion (ppb)), CO (parts per million (ppm)), NO 2 (ppb), and SO 2 (ppb) at 2 m above the ground at hourly intervals; PM 2.5 was the main attribute used in the current study.
Lastly, a secondary source of remotely sensed data was used to retrieve elevation. The DEM data obtained from the SRTM mission (https://www2.jpl.nasa.gov/srtm/, (accessed on 18 August 2021) were used for this purpose as in similar studies [42][43][44]. The SRTM was a short 11-day mission during which near-global high-resolution topography data was collected and organized by the National Geospatial-Intelligence Agency (NGA) and the National Aeronautics and Space Administration (NASA). The spatial resolution of the product used in the current study was 90 m.

Methodology
The overall procedure for estimating ground-level PM 2.5 is described in Figure 2. The steps consist of (1) data collection and reprojection in order to bring the data sources in the same coordinate system and time window, (2) implementation of spatial interpolation models on defined scenarios, and (3) model validation and mapping of PM 2.5 estimates.
SRTM was a short 11-day mission during which near-global high-resolution topography data was collected and organized by the National Geospatial-Intelligence Agency (NGA) and the National Aeronautics and Space Administration (NASA). The spatial resolution of the product used in the current study was 90 m.

Methodology
The overall procedure for estimating ground-level PM2.5 is described in Figure 2. The steps consist of (1) data collection and reprojection in order to bring the data sources in the same coordinate system and time window, (2) implementation of spatial interpolation models on defined scenarios, and (3) model validation and mapping of PM2.5 estimates.

Data Collection and Reprojection
The current study integrates the tropospheric NO2 column, tropospheric O3 column, total vertical SO2 column, tropospheric formaldehyde (HCHO) column, total CO column, and aerosol index (AI) layers since several studies have addressed that the TROPOMI atmospheric products are advantageous due to the lower missing proportion in data compared to other AOD products, especially to MODIS which is widely used in estimating ground-level PM2.5 concentrations [45][46][47]. The p-values of the correlation of each layer with the ground PM2.5 are provided in Table 2. This investigation identifies the most suitable layers based on the correlation coefficient to input into the machine learning models.

Data Collection and Reprojection
The current study integrates the tropospheric NO 2 column, tropospheric O 3 column, total vertical SO 2 column, tropospheric formaldehyde (HCHO) column, total CO column, and aerosol index (AI) layers since several studies have addressed that the TROPOMI atmospheric products are advantageous due to the lower missing proportion in data compared to other AOD products, especially to MODIS which is widely used in estimating ground-level PM 2.5 concentrations [45][46][47]. The p-values of the correlation of each layer with the ground PM 2.5 are provided in Table 2. This investigation identifies the most suitable layers based on the correlation coefficient to input into the machine learning models.  Thereafter, based on the defined time series, the six parameters of NO 2 , SO 2 , CO, HCHO, O 3 , and AI from Sentinel-5P, and the DEM product from SRTM were extracted and averaged for the model training. Regarding Sentinel-5P data, we selected CO and HCHO due to their similar emission source to that of PM 2.5 [48][49][50][51], SO 2 and NO 2 as precursors for the PM 2.5 [51][52][53], and O 3 due to its varying interaction with PM 2.5 by seasons and regions [54][55][56]. Lastly, we also included the product of AI as an informative parameter about PM 2.5 [57], since it captures signals of aerosols emitted from biomass burning or fire events [58], which is another source of particulate matter [59,60]. For the retrieval of data, we accessed the Copernicus Open Access Hub website (https://scihub.copernicus.eu/, accessed on 24 June 2021) to retrieve TROPOMI Level 2 products and compared the products with the corresponding pre-processed Google Earth Engine products. After confirming their consistencies, we utilized Google Earth Engine for fetching data with a spatial resolution of 3.5 × 5.5 km and averaging them based on the described time ranges. In order to remove cloud-contaminated scenes, the quality assurance (QA) value of 0.75 was applied as recommended by Zweers [58].
Additionally, with regard to the SRTM processing, the tiles covering the whole Thailand were downloaded from the Earth Explorer website, stitched together, and reprojected to the same coordinate system as the other data (i.e., the World Geodetic System (WGS) 1984 (EPSG 4326)). Finally, the mosaiced image was subset to the area of study (defined as the entire administrative area of Thailand) and spatially resampled to a 3.5 × 5.5 km grid in order to approximate the spatial resolution of the Sentinel-5P data and allow for the synergistic use of the two remotely sensed satellite products.
The whole dataset was split according to the settled time schemes, and the analysis of pollution patterns in Thailand was adopted to define the time window as shown in Figure 3. To compare the pollution trend between Bangkok and Chiang Mai, the hourly data from 63 ground stations were averaged to daily PM 2.5 concentrations while grouped by districts. The different pollution sources and patterns for each region led to differing patterns observed; in February, a time window with moderate pollution is observed in both regions, in March and for most of the month there is excessive air pollution in Chiang Mai, while this is not the case for Bangkok. June to September is a long period with the lowest pollution for both regions, and lastly in December, moderate pollution existed in Bangkok while Chiang Mai had relatively lowered pollution. Thereafter, data from 8 February to 18 February,19 March to 29 March, 7 July to 17 July, and 5 December to 15 December of the year 2020 were extracted and considered as four scenarios with the aforementioned 10-day and monthly averages as the time periods for the scenarios based on which the analysis took place. The total number of data used in this study was 182,952 from 63 stations and the number of layers utilized for the model training was seven for each scenario, which were six satellite images of air pollutants from Sentinel-5P, and the DEM.   ) for the year 2020. The light and dark grayed areas correspond to the monthly and 10-day time windows respectively selected in this study as representative of the four unique scenarios for the PM2.5 temporal distribution.

Spatial Interpolation Modeling
The current experiment considered two types of interpolation methods to produce spatially continuous data of PM2.5, two basic interpolation techniques and two ML techniques. Among the four models, IDW is widely used for spatial interpolation, and simply calculates estimates using distance-based weights based on the formula below [61] (1)

Spatial Interpolation Modeling
The current experiment considered two types of interpolation methods to produce spatially continuous data of PM 2.5 , two basic interpolation techniques and two ML techniques. Among the four models, IDW is widely used for spatial interpolation, and simply calculates estimates using distance-based weights based on the formula below [61] Z where Z (x0) is the estimated PM 2.5 concentration at the target position x 0 , Z (xi) is the PCD's ground-level PM 2.5 data at sample location x i , and i indicates the weighting value, which is calculated by the equation [61] where d is the Euclidean distance between the location of the target and sample data, while p indicates a power parameter, which is set at 2 in this study. Unlike the IDW interpolation method, the OK technique is based on statistical information from the spherical, exponential, Gaussian, and Matern covariance models. These options provide additional features in calculating weights, which could be derived from the semi-variogram. The variogram calculates values from all pairs of the sample locations, x i and x j , using the formula below [62] γ where Var indicates the variance. Since the variogram is established on the basis of the assumption that the closer data tend to have higher correlations, and the four approximation expressions embedded in the R project could provide the correlation formula, we needed to select the formula that best represents the variogram. Finally, the chosen formula, which was a spherical model in this study, was utilized for estimating unknown PM 2.5 concentrations. Judging the major difference between OK and RFK, the RFK relies on the RF method, which belongs to a supervised machine learning algorithm based on decision trees [63]. Conversely, IDW and OK only utilize PM 2.5 data from PCD, RFK and RF employed the whole dataset to apply the model for the spatial interpolation. The Sentinel-5P data and the DEM product at the sample location were extracted from each raster data to calculate the Pearson correlation coefficient as presented in Table 2, and the RF approach was implemented to appropriately produce interpolation results with the existence of unrelated covariables. Despite the fact that both approaches incorporated the same RF model (Ntree = 500 and mtry = 2, nodesize = 5, maxnodes = null; limited by nodesize) and used the whole stacked raster data as the input data, RFK required the additional process to correct estimates by applying the OK method on residuals between actual and approximate value.

Model Validation
The processed dataset was used as the input data for the four spatial prediction models considered in this study, namely the IDW, OK, RF, and RFK techniques. The dataset was split into two parts, out of which approximately 10% (6 out of 63 stations) was used as a testing set, by randomly extracting one station from northern, northeastern, eastern, and southern regions, and two stations from central regions based on their uneven spatial distribution. The remaining data corresponding to 90% (57 stations) was used as a training set, while the split was randomly implemented 10 times for 10-fold cross-validation [64][65][66] to exclude the impact of extreme values in each division, especially in the case of adopting data with large variance.
Therefore, the interpolation results and their validation metrics based on test sets were averaged in the evaluation step. The metrics used for the assessment were the R 2 , the Pearson's correlation coefficient and associated p-values, and the SI, which is derived from the division of RMSE with the mean of PM 2.5 data multiplied by 100 to diminish the influence of the variance. The R statistical programming language [67] was used for the data analysis.   The dataset in July produced a rather uniform distribution among the predefined scenarios, which can be attributed to the small variance of the data ranging from 3.65 to 21.64 μg/m 3 . The pollution pattern is stable as the country enters the monsoon season, which resulted in a uniform color map, and the eastern and central regions showed relatively higher concentrations of PM2.5 than other regions. The dataset for 10 days ranged from 3.55 μg/m 3 to 21.62 μg/m 3 , with a similar distribution for the monthly intervals. The missing data in the TROPOMI atmospheric products resulted in a few blank spaces in the interpolated maps produced from the RF and RFK model, which incorporated satellite images as covariates.
Moreover, in December the pollution was indicated with PM2.5 concentrations over 37 μg/m 3 occurring around the central area, spreading out to the northern part. The data ranged from 5.04 μg/m 3 to 47.98 μg/m 3 for a month and from 5.74 μg/m 3 to 58.77 μg/m 3 for 10 days.

Ten-Fold Cross-Validation
We conducted 10-fold cross-validation to evaluate the model performance for each scenario and presented the results in Figure 8. In February, OK exhibited the lowest RMSE of 4.267 μg/m 3 and 5.362 μg/m 3 for 10 days and a month, and relatively high R 2 as 0.90 and 0.84 for 10 days and a month respectively, followed by RF. RF attained RMSE as 5.57 μg/m 3 and R 2 as 0.88 for the 10-day dataset, and RMSE as 6.34 μg/m 3 and R 2 as 0.85 for a monthly dataset. Moreover, in March, all models showed degraded performance in terms of RMSE due to the high variance of the data, while R 2 became the best performing among the four scenarios. A comparison between the models indicates that OK had the lowest RMSE of 9.27 μg/m 3 for 10 days and 6.91 μg/m 3 for a month, while RF and RFK showed relatively high R 2 of 0.91 and 0.9 for 10 days, and 0.93 and 0.91 for a month respectively.
The dataset in July, which had very low PM2.5 concentration and limited variation, demonstrated good performance in terms of RMSE which ranged from 3.16 μg/m 3 to 3.65 μg/m 3 , and very poor performance in terms of R 2 which ranged from 0.15 to 0.25. However, RMSE could be highly influenced by the mean observation. Therefore, we integrated the SI to further validate the performance. With the objective of evaluating the validity of RMSE with respect to mean observation, we calculated the SI by dividing RMSE with the average of PM2.5 concentrations. The SI excluded the misleading influence of the data average in July, demonstrating values higher than other scenarios, which indicates poor performance.
In December, even though R 2 was reduced to an average of around 0.6, the estimates generally matched the observed concentrations with RMSEs being 8.36, 8.51, 7.93, and 7.84 In February, a monthly dataset of which PM 2.5 concentration ranged from 11.06 to 86.02 µg/m 3 , presented high concentration (>50 µg/m 3 ) around Mae Hong Son, and southern regions showed the lowest pollution under 25 µg/m 3 , which is considered as very good air quality according to the criteria set by PCD. Additionally, the central and northeastern regions were moderately polluted according to Figure 4, demonstrating concentration from 26 to 37 µg/m 3 , while RF and RFK estimated PM 2.5 concentration in northeastern areas over 38 µg/m 3 . RF and RFK tended to illustrate a spatial variation of PM concentration over the regions, and IDW and OK showed zoning distributions around the monitoring stations. The 10-day dataset, which had minimum and maximum values of 8.21 and 79.69 µg/m 3 respectively, also accompanied similar spatial distribution as shown with the 10-day dataset, but with overall lower concentrations.
The dataset in March showed drastic concentration variability over Thailand, by having maximum and minimum values of 12.13 and 193.54 µg/m 3 respectively. Similar to the results in February, the southern region indicated the best air quality, quantified at under 25 µg/m 3 , and northern regions were represented as hotspots of pollution, while their coverage became enlarged. The zoning distribution of IDW became excessive, and OK showed an oversimplified gradual increase toward the north. RF and RFK showed detailed spatial variation along with extreme hotspots in the southern and northern areas. Moreover, the pollution that occurred in the northeastern part of the country, as presented in Figure 5, is depicted with a more realistic representation by RF and RFK. The dataset for 10 days displayed a similar pattern to that of the monthly dataset, while the data ranges from 10.90 to 214.05 µg/m 3 .
The dataset in July produced a rather uniform distribution among the predefined scenarios, which can be attributed to the small variance of the data ranging from 3.65 to 21.64 µg/m 3 . The pollution pattern is stable as the country enters the monsoon season, which resulted in a uniform color map, and the eastern and central regions showed relatively higher concentrations of PM 2.5 than other regions. The dataset for 10 days ranged from 3.55 µg/m 3 to 21.62 µg/m 3 , with a similar distribution for the monthly intervals. The missing data in the TROPOMI atmospheric products resulted in a few blank spaces in the interpolated maps produced from the RF and RFK model, which incorporated satellite images as covariates.
Moreover, in December the pollution was indicated with PM 2.5 concentrations over 37 µg/m 3 occurring around the central area, spreading out to the northern part. The data ranged from 5.04 µg/m 3 to 47.98 µg/m 3 for a month and from 5.74 µg/m 3 to 58.77 µg/m 3 for 10 days. Figure 7 showed relatively higher concentrations with Figure 7e-g. The same pattern was observed, with IDW and OK showing a distribution largely depending on the values of the monitoring stations, while RF and RFK presented a larger and heterogeneous spatial variability.

Ten-Fold Cross-Validation
We conducted 10-fold cross-validation to evaluate the model performance for each scenario and presented the results in Figure 8. In February, OK exhibited the lowest RMSE of 4.267 µg/m 3 and 5.362 µg/m 3 for 10 days and a month, and relatively high R 2 as 0.90 and 0.84 for 10 days and a month respectively, followed by RF. RF attained RMSE as 5.57 µg/m 3 and R 2 as 0.88 for the 10-day dataset, and RMSE as 6.34 µg/m 3 and R 2 as 0.85 for a monthly dataset. Moreover, in March, all models showed degraded performance in terms of RMSE due to the high variance of the data, while R 2 became the best performing among the four scenarios. A comparison between the models indicates that OK had the lowest RMSE of 9.27 µg/m 3 for 10 days and 6.91 µg/m 3 for a month, while RF and RFK showed relatively high R 2 of 0.91 and 0.9 for 10 days, and 0.93 and 0.91 for a month respectively.
Atmosphere 2022, 13, x FOR PEER REVIEW 13 of 23 μg/m 3 for IDW, OK, RF, and RFK respectively with a 10-day dataset. On the other hand, they were improved with a monthly dataset, reaching 6.90, 7.07, 6.93, and 6.91 μg/m 3 for each model for the latter case. As a comparison between the models, RFK showed the highest R 2 and lowest RMSE, followed by RF.   The dataset in July, which had very low PM 2.5 concentration and limited variation, demonstrated good performance in terms of RMSE which ranged from 3.16 µg/m 3 to 3.65 µg/m 3 , and very poor performance in terms of R 2 which ranged from 0.15 to 0.25.

Data Range of Interpolated Estimates
However, RMSE could be highly influenced by the mean observation. Therefore, we integrated the SI to further validate the performance. With the objective of evaluating the validity of RMSE with respect to mean observation, we calculated the SI by dividing RMSE with the average of PM 2.5 concentrations. The SI excluded the misleading influence of the data average in July, demonstrating values higher than other scenarios, which indicates poor performance.
In December, even though R 2 was reduced to an average of around 0.6, the estimates generally matched the observed concentrations with RMSEs being 8. 36, 8.51, 7.93, and 7.84 µg/m 3 for IDW, OK, RF, and RFK respectively with a 10-day dataset. On the other hand, they were improved with a monthly dataset, reaching 6.90, 7.07, 6.93, and 6.91 µg/m 3 for each model for the latter case. As a comparison between the models, RFK showed the highest R 2 and lowest RMSE, followed by RF.  Table 2 presents the results of the correlation analysis between PM 2.5 and covariates. The correlation with PM 2.5 was relatively high for the variables of HCHO and CO, while NO 2 and SO 2 showed lower correlations, which conformed with the fact that PM 2.5 [48,49], HCHO [50], and CO [51] are highly associated with fuel combustion. On the other hand, SO 2 and NO 2 , which are important precursors in the process of PM 2.5 formation [51][52][53], presented low or negative correlations with PM 2.5 . O 3 manifested high coefficient values in February and March, and negative coefficients in December due to the reversed influence in different seasons [54][55][56]. Lastly, AI had a higher correlation with PM 2.5 especially in March, when agricultural biomass burning is prevalent since AI and PM 2.5 are influenced by the aerosol formation due to the agricultural practices prevailing in SE Asia during this season [44,[57][58][59][60].

Station-Based Comparison between Observations and Estimates from Test Sets Observing PM 2.5 Concentration Exceeding 50 µg/m 3
Since the RF and RFK are primarily based on the RF model, we derived significant features used for the models as shown in Figure 10. The importance is represented by the value of the mean decrease in MSE, which indicates the total decrease in model performance when each variable is neglected from the model. According to the results, during the most polluted period in March, overall feature importance was very high compared to other scenarios, with CO, AI, and HCHO being the most important factors to estimate PM 2.5 with the monthly dataset, whereas DEM surpassed HCHO with the 10-day dataset. SO 2 and NO 2 were the least influential throughout the scenarios for both datasets. In February, CO, O 3 , and NO 2 were influential parameters for a month, whereas CO, DEM, and NO 2 became more significant for the 10-day dataset. The feature importance in July and December indicated that most covariates were insignificant for the RF model. These results matched the correlation of covariates with PM 2.5 overall except NO 2 being one of the influential parameters in March. In comparison with the results in Table 2, Figure 10 demonstrates that highly correlated parameters tend to be essential factors for the RF model. Atmosphere 2022, 13, x FOR PEER REVIEW 14 of 23 Figure 9. Box plots of data distribution of observations and estimates from four models with (a) monthly and (b) 10-day datasets. IDW, OK, RF, and RFK indicates inverse distance weighted, ordinary kriging, random forest, and random forest blended with ordinary kriging, respectively. Table 2 presents the results of the correlation analysis between PM2.5 and covariates. The correlation with PM2.5 was relatively high for the variables of HCHO and CO, while NO2 and SO2 showed lower correlations, which conformed with the fact that PM2.5 [48,49], HCHO [50], and CO [51] are highly associated with fuel combustion. On the other hand, SO2 and NO2, which are important precursors in the process of PM2.5 formation [51][52][53], presented low or negative correlations with PM2.5. O3 manifested high coefficient values in February and March, and negative coefficients in December due to the reversed influence in different seasons [54][55][56]. Lastly, AI had a higher correlation with PM2.5 especially in March, when agricultural biomass burning is prevalent since AI and PM2.5 are influenced by the aerosol formation due to the agricultural practices prevailing in SE Asia during this season [44,[57][58][59][60]. . Box plots of data distribution of observations and estimates from four models with (a) monthly and (b) 10-day datasets. IDW, OK, RF, and RFK indicates inverse distance weighted, ordinary kriging, random forest, and random forest blended with ordinary kriging, respectively.

Analysis of Important Features and Correlation of Covariates
Next, we analyzed the difference between observed and interpolated values to evaluate the model performance at the stations which recorded PM 2.5 concentrations over 50 µg/m 3 , which coincides with the threshold set by the PCD. The analysis was run for the test sets in February and March, when atmospheric pollution is at the highest levels. As a result, the selected test stations were distributed mainly in the northern and northeastern regions as shown in Figure 11. According to the results presented in Tables 3-6, IDW and OK had a strong tendency to underestimate values, with the error rate of IDW ranging from −23 to +5% for the monthly dataset, and −23 to +6% for the 10-days dataset, while that of OK ranged from −13 to +8% for the monthly dataset, and −22 to +7% for the 10-days dataset. Despite the fact that both IDW and OK achieved a low error rate and RMSE, they still presented generally negative rates from −2 to −14% and −10 to +3% for the monthly and 10-day datasets, and −14 to +2% respectively. SO2 and NO2 were the least influential throughout the scenarios for both datasets. In February, CO, O3, and NO2 were influential parameters for a month, whereas CO, DEM, and NO2 became more significant for the 10-day dataset. The feature importance in July and December indicated that most covariates were insignificant for the RF model. These results matched the correlation of covariates with PM2.5 overall except NO2 being one of the influential parameters in March. In comparison with the results in Table 2, Figure 10 demonstrates that highly correlated parameters tend to be essential factors for the RF model.

Station-Based Comparison between Observations and Estimates from Test Sets Observing PM2.5 Concentration Exceeding 50 μg/m 3
Next, we analyzed the difference between observed and interpolated values to evaluate the model performance at the stations which recorded PM2.5 concentrations over 50 μg/m 3 , which coincides with the threshold set by the PCD. The analysis was run for the test sets in February and March, when atmospheric pollution is at the highest levels. As a       The results from the application of RF indicated relatively high error rates in March and low in February, with the lowest RMSE in February and the highest RMSE in March, which led to high variability in model performance. RFK, on the other hand, showed a reversed estimation tendency to IDW and OK, demonstrating positively biased ranges of −12 to +28% for the month dataset, and −4 to 19% for the 10-days datasets, with the lowest and the second-lowest RMSE occurring in March. Furthermore, RFK presented improvements in error rates in February, ranging from −3 to 13% and 1 to 11% for the monthly and 10-day datasets respectively.

Discussion
The current study analyzed four spatial interpolation methods, two of which are simple traditional interpolation approaches and the other two are machine learning based methods. According to the results from the 10-fold cross-validation, OK demonstrated the best performance, especially when pollution occurred at critical levels to affect human health (>50 µg/m 3 ), while RF and RFK demonstrated overall high performance throughout the given scenarios and IDW exhibited the worst performance from all. Overall, their performance presented variability depending on differing temporal and regional conditions as also discussed in several other studies e.g., [68][69][70]. Therefore, the period undergoing monsoon season, which is best represented by the month of July, resulted in low accuracy in estimates, while months with considerably higher actual variance of PM 2.5 exhibited high RMSE as well as the highest R 2 , which in return achieved the best accuracy in February, followed by March.
The models also represented varied results depending on regions, which was reflected in the point-based comparative analysis of outputs. Since the extremely high and low concentrations would influence the performance, and we aimed to achieve appropriate estimation even with high contamination, we compared estimates with the original ground-level data as shown in Tables 3-6. According to the latter, IDW and OK tended to underestimate PM 2.5 concentration, which would be problematic in apprehending the severity of pollution. Meanwhile, RFK was inclined to produce higher estimations, which we considered as a less significant problem than an underestimation, in accordance with its lower RMSE compared to the rest of the models.
The data range of interpolated estimates also indicated different explanations of model performance. IDW and OK best matched the observation data range since the models are highly dependent on close monitoring sites, which adversely caused excessive zoning dis-tribution around the ground-level stations. As also demonstrated by Sajjadi et al. [71], IDW produced worse estimates when the sample range cannot be represented by sample points of data, by having lower performance with RMSE of 7.19 µg/m 3 and 7.01 µg/m 3 with 10-day and monthly dataset in March respectively. RF produced oversimplified variability in PM 2.5 concentrations by having a narrower range with the 10-day dataset in February and March, which consequently had higher variance compared to other datasets. RFK, which is a combination of OK and RF, showed the most appropriate range covering the data not represented by the ground observations. This is a very distinctive and important feature in deriving estimates out of the data range of original observations. The different conclusions from cross-validation statistics and data range comparison suggest that traditional cross-validation results cannot solely determine model performance as also advocated by Zou et al. [72].
The spatial distribution map of PM 2.5 advocates the advantage of implementing RFK as a spatial interpolation method. IDW and OK, which are solely based on regulatory-grade monitoring stations, were unlikely to illustrate variances resulting from geographical and anthropogenic factors, especially in the northern and southern regions where monitoring stations are scarce. On the other hand, RF and RFK produced more realistic PM 2.5 spatial distributions with accumulated auxiliary data over larger geographical areas. This implies that these methods, and especially RFK, could be useful for countries that have limited in-situ monitoring stations but plenty of auxiliary data, such as Thailand, while becoming more reliable as more datasets are integrated, especially when extreme contamination has occurred.
However, there are a few attempts to be made to constructively reinforce the conclusion that RFK could be the most suitable method among the four models discussed in the current paper. Firstly, the results from our work could be further improved by integrating additional geostationary satellite data such as the anticipated Global Environmental Monitoring System (GEMS), which has a higher temporal resolution. The TROPOMI satellite system has a daily revisit, therefore, incorporation of additional satellite data with a more frequent scanning cycle could enhance the correlation analysis and modelling of non-linear relations. Secondly, despite the fact that this study excluded meteorological data due to their relatively low spatial resolution, a potential high-spatial-resolution weather data set representing the dynamic climatic conditions and the interaction with PM 2.5 could improve the performance of the models. This could be materialized by integrating data from numerical weather prediction (NWP) models or satellite images. Lastly, the model accuracy could be enhanced by applying the gap-filling method on atmospheric products of TROPOMI. Even though TROPOMI products contain more abundant information on atmospheric constituents compared to MODIS AOD products, which have been traditionally used in PM 2.5 estimation, several studies have also shown that filling the gaps improved the model accuracy [73][74][75]. Therefore, the model performance of RFK would be improved by integrating more auxiliary data and implementing advanced preprocessing techniques. In the current study, we demonstrated the key features of RFK and suggest it to be the appropriate model for PM 2.5 estimation for the cases of scarce distribution and limited number of the in-situ monitoring systems. This novel approach was demonstrated and materialized by integrating satellite-and ground-based information of predominantly chemical composites as covariates.

Summary
The current study presented the application and inter-comparison of four spatial interpolation models in an attempt to estimate ground-level PM 2.5 over Thailand. Overall, the basic interpolation models of IDW and OK could simply calculate the estimates from observations of PCD ground stations, which resulted in a smooth and unrealistic representation of PM 2.5 concentration. On the other hand, machine learning approaches, namely RF and RFK, could successfully derive maps of the spatial distribution of PM 2.5 with a finer spatial resolution of 3.5 × 5.5 km.
This research demonstrated the appropriateness of RFK as a spatial interpolation method for PM 2.5 by conducting 10-fold cross-validation of test sets, and analyzing differences in data range of estimates, station-based estimation, and spatial distribution depicted in maps as additional approaches to evaluate the models. Therefore, we argue that the synergistic use of ground stations and satellite data leads to improvement of accuracy in spatially estimating the ground-level PM 2.5 concentrations. These fusion techniques are anticipated to provide more robust results if more data are integrated and our research will focus in the future on fusing additional information into the workflow, such as the anticipated GEMS, Sentinel-4, and Sentinel-5 satellites as well as low-cost sensors. Last but not least, the presented approach can be reproduced for large geographical regions with low computational demands and therefore assist authorities seeking to set up highspatial-resolution nowcasting monitoring systems for PM 2.5 concentrations. In particular, our findings can act as a reference for comparing traditional and newly adopted spatial interpolation methods, and primarily RFK. We suggest that the latter can be used as a robust technique to derive spatially continuous maps of PM 2.5 concentrations, even in the event of high atmospheric contamination, despite the limited atmospheric monitoring networks. Funding: This research received no external funding.

Data Availability Statement:
All data used in this study, including satellite and ground data, are from sources providing the data freely available through the internet.