An Assessment of the Multi-Input Spatiotemporal RF–XGBoost Hybrid Framework for PM10 Estimation in Lithuania

Fahim, Mina Adel Shokry; Sužiedelytė Visockienė, Jūratė

doi:10.3390/su18042022

by Mina Adel Shokry Fahim and Jūratė Sužiedelytė Visockienė *

Department of Geodesy and Cadastre, Environmental Engineering Faculty, Vilnius Gediminas Technical University, LT-10223 Vilnius, Lithuania

* Author to whom correspondence should be addressed.

Sustainability 2026, 18(4), 2022; https://doi.org/10.3390/su18042022

Submission received: 6 January 2026 / Revised: 8 February 2026 / Accepted: 13 February 2026 / Published: 16 February 2026

(This article belongs to the Section Air, Climate Change and Sustainability)


Abstract

Air pollution remains a major public-health concern, and exposure to particulate matter (PM), particularly PM₁₀ (particles with a diameter ≤ 10 µm), is associated with adverse respiratory and cardiovascular outcomes. Most research relies on a single model for PM₁₀ surface estimation. This study assesses a national-scale, daily PM₁₀ estimation framework for Lithuania (2019–2024) that uses a hybrid machine-learning method combining Random Forest (RF) and extreme gradient boosting (XGBoost) algorithms. Hourly PM₁₀ observations from 18 monitoring stations were aggregated to obtain daily means and temporal means. The predictors integrated meteorological factors, such as temperature, wind, humidity, and precipitation, with satellite-based atmospheric composition from the Sentinel-5P Tropospheric Monitoring Instrument (TROPOMI), including nitrogen dioxide (NO₂), carbon monoxide (CO), sulfur dioxide (SO₂), ozone (O₃), formaldehyde (HCHO), and the absorbing aerosol index (AI). Land-surface temperature from the Moderate-Resolution Imaging Spectroradiometer (MODIS) and static spatial descriptors, such as elevation, land cover, Normalized Difference Vegetation Index (NDVI), population, and road proximity, were also included. The dataset was partitioned temporally into training (70%), validation (20%), and testing (10%) subsets. The hybrid model achieved improved accuracy compared with single-model baselines, reaching a coefficient of determination (R²) of 0.739 in validation and 0.75 on the test dataset, where the mean absolute error (MAE) was 3.15 µg/m³ and the root mean square error (RMSE) was 3.98 µg/m³. The results indicate a slight tendency to overestimate PM₁₀ at lower concentration levels. Feature-importance analysis revealed that short-term temporal persistence is the key driver of daily PM₁₀ prediction, while meteorological variables provide secondary contributions. Temporal evaluation, using consecutive two-year windows, revealed a consistent improvement in predictive performance from 2019–2020 to 2023–2024, while station-level analysis showed moderate-to-strong agreement between the predicted and observed PM₁₀ concentrations across monitoring stations, with R² ranging from 0.455 to 0.760. The framework provides decision-support capabilities for air-quality management, the evaluation of mitigation measures, and the integration of air-pollution considerations into sustainable urban planning strategies aimed at public-health protection.

Keywords:

PM₁₀ prediction; air-pollution control; hybrid machine learning; pollution monitoring; random forest; XGBoost

1. Introduction

Air pollution continues to be a major environmental and public-health concern around the world. Particulate matter (PM) is among the pollutants of concern because of its well-documented effects on the respiratory and cardiovascular systems. In particular, PM₁₀ airborne particles, with diameters of 10 µm or less, pose a notable risk, as these particles can reach the lower regions of the respiratory tract and have been linked to higher rates of illness and premature death [1,2]. The reliable estimation of ground-level PM₁₀ is crucial for supporting decisions in environmental management and sustainable urban planning.

Factories, vehicles, and other human activities continuously release substantial amounts of PM into the air, and the problem is expected to intensify over time [3]. In Lithuania, a network of air-quality monitoring stations measures a wide range of pollutants, including PM₁₀, PM₂.₅, O₃, NO₂, and others. These stations operate under the National Air Monitoring Network and the Integrated Monitoring Network, providing valuable data across the country. Although these stations provide highly reliable data, their geographical reach is limited. As a result, many areas, particularly rural regions, smaller settlements, and some suburban districts, remain without direct air-quality observations, making it more difficult to assess population-wide exposure or to identify hotspot areas with elevated pollution levels.

Most existing studies have focused on developed regions, where ground monitoring stations are abundant. In contrast, research into underdeveloped areas, where monitoring sites tend to be sparse and unevenly distributed, remains limited. Such conditions often lead to small sample sizes, reduced statistical stability, and lower prediction accuracy [4]. Recent advances in remote sensing, geospatial datasets, and machine-learning (ML) methods provide promising alternatives for addressing these challenges. Satellite-based observations offer broad spatial coverage and can capture atmospheric and land-cover characteristics that are strongly correlated with pollutant concentrations [5].

Researchers have explored satellite-derived atmospheric variables as proxies for air pollution. Their initial focus was on Aerosol Optical Depth (AOD) from instruments such as MODIS and the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) [6,7,8]. Studies have demonstrated moderate-to-strong correlations between AOD and ground-level concentrations, particularly in regions with stable meteorological conditions. This early work established the foundations for integrating satellite observations with statistical and physical models to estimate air quality. Over the past decade, advances in satellite-based atmospheric monitoring, particularly via Sentinel-5P’s TROPOMI instrument (which provides vertical column densities of key pollutants), have enabled the retrieval of trace gases, such as NO₂, SO₂, CO, and O₃, at spatial resolutions of the order of 3.5 × 5.5 km² for most data products and 7.0 × 5.5 km² for shortwave infrared data products [9,10]. Numerous empirical approaches have been proposed for estimating PM₁₀, starting with linear and nonlinear regression and land-use regression (LUR) models. However, these early models generally achieved moderate predictive performance, as indicated by the reported coefficients of determination (R²) [11]. In Cusco, Peru, a study using Multiple Linear Regression (MLR) pointed to nonlinear relationships between particulate concentrations and their atmospheric drivers, shaped by meteorological, seasonal, and temporal variables. However, predictive skill remained moderate for PM₁₀, with R² = 0.44 in the best annual model [12]. In Tehran, researchers showed that nonlinear multi-regression models integrating meteorological parameters with satellite-derived MISR AOD performed best, explaining up to 55% of PM₁₀ variability (R² of up to 0.55), while MODIS-based and linear models showed weaker correlations [13].
Substantial improvements were obtained when examining PM_2.5, PM₁₀, and NO₂ concentrations in China between 2014 and 2016, using a LUR framework integrated within a spatiotemporal semi-parametric modeling approach. The researchers employed a generalized mixed additive model, which explained approximately 71% of the time variance in monthly pollutant concentrations. For PM₁₀ specifically, the model achieved a hold-out 10-fold cross-validated R² of 0.62, indicating moderately strong predictive performance [14]. More recent studies have moved towards data-driven and machine-learning models, particularly tree-based ensemble methods, and have shown considerably improved predictive performance when estimating PM₁₀.

In the Caribbean basin, researchers applied six ML models to daily PM₁₀: support vector regression (SVR), k-nearest neighbor regression (KNN), RF regression (RFR), gradient boosting regression (GBR), Tweedie regression, and Bayesian ridge regression. GBR achieved the strongest performance, with R² = 0.61, MAE = 6.85 µg m⁻³, and RMSE = 10.44 µg m⁻³ [15]. Another study showed that a multi-stage RF can generate reliable daily 1-km gridded estimates of PM₁₀, PM₂.₅, and PM₂.₅–₁₀ across Sweden (2005–2016) by combining routine monitoring with satellite/atmospheric composition products, meteorology, and land-use indicators, producing largely unbiased predictions that capture a substantial share of variability. PM₂.₅ performed best, followed by PM₂.₅–₁₀ and PM₁₀, with “out-of-bag” R² values of 0.69, 0.65, and 0.64, and “held-out” monitor R² values of approximately 0.59, 0.45, and 0.50, respectively [16]. In China, a study compared XGBoost with other machine-learning models for PM₂.₅ and O₃ prediction in the Beijing–Tianjin–Hebei region during 2019–2023 and confirmed that advanced machine-learning models can effectively capture both the spatiotemporal variability of air pollutants and their controlling factors, with XGBoost showing superior performance [17]. Another recent Chinese study focused on a hybrid machine-learning framework of RF and XGBoost models, which achieved high prediction accuracy for PM. The model was developed for high-resolution daily PM₂.₅ prediction in Jiangxi Province and achieved a test R² of 0.82, highlighting the potential of an ensemble-learning hybrid model to better capture complex nonlinear relationships between satellite observations, meteorology, and surface pollution [4].

Several key challenges remain unresolved. Many PM₁₀ and PM_2.5 estimation studies continue to heavily rely on satellite data products, which often suffer from extensive missing values due to cloud cover and algorithmic limitations, reducing spatial and temporal completeness [18]. Single-model approaches, even advanced machine-learning models, can struggle to generalize in heterogeneous emission regimes and extreme pollution conditions, particularly in regions with sparse or unevenly distributed monitoring stations, where short historical records and small sample sizes lead to model instability. Recent advances in PM prediction have increasingly adopted hybrid machine-learning frameworks, particularly combinations of RF and XGBoost, which have demonstrated improved predictive accuracy compared with single-model approaches. However, much of the existing literature has primarily emphasized model performance metrics, with limited attention being paid to the broader spatiotemporal structure of PM time series, comprehensive feature integration, and multi-level evaluation strategies. The applications within Lithuania remain relatively limited. Existing studies in the Lithuanian context have primarily focused on single-model frameworks, such as artificial neural networks, or conventional statistical and ML regression approaches applied at urban or site-specific scales.

To address these limitations, this study employs a daily time-series hybrid ML framework, integrating RF and XGBoost, to predict PM₁₀ concentrations across Lithuania. The hybrid model is designed to enhance predictive robustness and generalization capability, and it is compared with the corresponding single-model implementations to quantify performance gains. To mitigate the impact of missing observations in ground-based monitoring records, Inverse Distance Weighting (IDW) interpolation was applied to fill spatial and temporal gaps using information from neighboring stations, ensuring continuity of the PM₁₀ time series. The analysis covers a six-year period (2019–2024) and was conducted at the national scale, integrating spatially and temporally resolved data from multiple sources. Model inputs include meteorological variables, such as hourly air temperature, wind speed and gusts, cloud cover, sea-level pressure, relative humidity, and precipitation. Satellite-derived atmospheric composition indicators from the Sentinel-5P TROPOMI instrument include NO₂, CO, SO₂, O₃, HCHO, and the aerosol index. Land-surface and land-use characteristics include MODIS daytime and nighttime land-surface temperature (LST), elevation from USGS-SRTM, land-cover classes, NDVI derived from COPERNICUS Sentinel-2 surface reflectance products, population density, and distance to main road networks.

All predictor variables, derived from ground monitoring stations, satellite products, meteorological reanalysis, land-surface characteristics, and temporal descriptors, were spatially matched to the monitoring stations, harmonized, and assembled into a unified spatiotemporal dataset. The complete dataset was partitioned chronologically into training (70%), validation (20%), and testing (10%) subsets to preserve the temporal structure of the daily PM₁₀ time series. Three modeling configurations were implemented: RF, XGBoost, and a hybrid RF–XGBoost ensemble. In the hybrid configuration, both RF and XGBoost models were trained independently on the same training dataset, and their predictions were subsequently combined using weights to generate the final PM₁₀ estimates. All data preprocessing, modeling, and statistical analyses were implemented using Python 3.12.3, while QGIS was employed for spatial data processing and map-based analysis. Model performance was evaluated on both the validation and testing datasets using R², RMSE, and MAE. In addition, feature importance was analyzed to assess the relative contribution of different predictors, including meteorological variables, satellite-derived atmospheric composition indicators, land-use and population metrics, and temporal features.

The paper is organized as follows: Section 2 describes the study area, datasets, preprocessing steps, and the methods used for model creation and accuracy assessment. Section 3 presents the model results and performance evaluation, including temporal and spatial comparisons. Section 4 discusses the findings and methodological limitations. Finally, Section 5 summarizes the main conclusions.

The structure of the proposed PM₁₀ prediction framework is illustrated in Figure 1.

2. Materials and Methods

2.1. Air Quality in Lithuania

This study covers Lithuania at the national scale. Lithuania is a Nordic–Baltic country located in northeastern Europe, covering an area of approximately 65,300 km². The country extends between 53°54′–56°27′ N latitude and 20°56′–26°51′ E longitude [19]. The climate is described as a maritime–continental transition, with mean daytime temperatures of approximately −5 °C in January and 20 °C in July [20]. The spatial–temporal variability of annual mean PM₁₀ concentrations across 15 PM₁₀ monitoring stations in Lithuania is illustrated in Figure 2 for the study period. Each cell represents the yearly average PM₁₀ concentration (µg/m³) for a given station, derived from daily observations after a cleaning phase; color intensity indicates relative pollution levels. Many stations exhibit a gradual shift towards lighter colors over time, suggesting a decreasing tendency in PM₁₀ concentrations at the national scale. Nevertheless, marked inter-station differences persist, reflecting pronounced spatial heterogeneity within the monitoring network. The World Health Organization’s (WHO) annual guideline value of 15 µg/m³ is indicated as a reference threshold to facilitate comparisons with health-based air-quality standards, while the European Union annual limit value of 40 µg/m³ is included for regulatory comparison [20,21,22].

2.2. Datasets

This study integrates ground-based air-quality observations, meteorological measurements, satellite-derived atmospheric composition, land-surface temperature, static environmental descriptors, and engineered temporal features; all of the datasets are documented in Table 1. The selection of these variables was motivated by the well-established physical, chemical, and temporal drivers of PM₁₀ variability. All datasets were harmonized to enable consistent spatiotemporal integration prior to detailed preprocessing, which is described in subsequent subsections.

2.2.1. Ground Monitoring Stations

Ground-based air-quality observations constitute the primary reference for PM assessment and remain essential for the development and validation of PM₁₀ prediction models. Hourly air-quality data were obtained from the European Environment Agency (EEA) Air-Quality Reporting database [23], maintained nationally by the Lithuanian Environmental Protection Agency (EPA). The dataset comprises observations from 18 monitoring stations across Lithuania, including 14 urban stations from the National Air Monitoring Network and four rural background stations belonging to the Integrated Monitoring Network (i.e., Aukštaitija, Dzūkija, Žemaitija, and Preila). These stations provide standardized measurements of PM₁₀, PM₂.₅, NO₂, SO₂, O₃, CO, and other pollutants [24].

Prior to integration, the raw hourly measurements underwent a unified preprocessing procedure to ensure temporal consistency. All timestamps were converted to Coordinated Universal Time (UTC), after which invalid and unverified records were removed, based on EEA quality flags. Outlying values were filtered using the Interquartile Range (IQR) method [25]. The cleaned hourly observations were then aggregated to daily mean concentrations for each station and pollutant, and continuous daily time series were constructed to ensure temporal completeness. As PM₁₀ sensor data were not available at all stations for all time steps, the IDW method of spatial interpolation was applied to estimate missing values at selected locations, thereby maintaining spatial continuity and consistency. This method takes a weighted average of the values measured at the nearest 12 stations, with each weight given by the inverse of the distance raised to a power parameter (p), as in Equation (1). The estimated value $\hat{Z}(x_{0})$ at an unknown location is given by Equation (2) [26,27,28,29]:

$$w_{i} = \frac{1}{d_{i}^{p}} \tag{1}$$

where $w_{i}$ is the weight of the $i$-th point, $d_{i}$ is the distance between the $i$-th point and the unknown point, and $p$ is the power parameter.

$$\hat{Z}(x_{0}) = \frac{\sum_{i=1}^{n} w_{i} Z(x_{i})}{\sum_{i=1}^{n} w_{i}} \tag{2}$$

where $\hat{Z}(x_{0})$ is the estimated value at the unknown location $x_{0}$, $Z(x_{i})$ is the measured value at the $i$-th sample point, and $n$ is the number of sample points used in the interpolation.
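Equations (1) and (2) can be illustrated with a minimal sketch. The function name `idw_estimate`, the tuple layout of the neighbors, and the use of planar (projected) coordinates are assumptions made for illustration; the paper does not describe its implementation.

```python
import math


def idw_estimate(target, neighbors, p=2.0, k=12):
    """Inverse Distance Weighting estimate at `target`.

    target    -- (x, y) in a projected coordinate system (assumed, e.g. metres)
    neighbors -- iterable of (x, y, value); entries with value None are skipped
    p         -- power parameter of Equation (1)
    k         -- number of nearest stations used (12 in the text)
    """
    candidates = []
    for x, y, value in neighbors:
        if value is None:
            continue
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # a station at the target location: use it directly
        candidates.append((d, value))
    if not candidates:
        return None  # nothing to interpolate from
    nearest = sorted(candidates, key=lambda c: c[0])[:k]
    weights = [1.0 / d ** p for d, _ in nearest]           # Equation (1)
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, nearest))
    return weighted_sum / sum(weights)                      # Equation (2)


# Two stations at distances 1 and 2 with values 10 and 20, p = 2:
# weights are 1 and 0.25, so the estimate is (10 + 5) / 1.25 = 12.
print(idw_estimate((0.0, 0.0), [(1.0, 0.0, 10.0), (2.0, 0.0, 20.0)]))
```

With p = 2 the nearer station dominates: doubling the distance cuts the weight to a quarter, which is why the estimate sits much closer to 10 than to 20.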

2.2.2. Meteorological Data

Hourly meteorological observations were collected from in situ automatic meteorological stations (AMS) operated by the Lithuanian Hydrometeorological Service and distributed through the Meteo API. These observations comprised air temperature, ‘feels-like’ temperature, wind speed, wind gust, cloud cover, sea-level pressure, relative humidity, and precipitation, as defined by the API [30]. Data were available from 52 AMS across the country; Figure 3 shows the spatial distribution of the AMS network and the ground air-quality monitoring stations across Lithuania. Missing values were removed, and IDW interpolation with p = 2 was used to derive an hourly meteorological record for each air-quality station location, followed by aggregation to daily means.

2.2.3. Satellite Atmospheric Data

Satellite-derived atmospheric composition indicators were obtained from the Sentinel-5P TROPOMI offline Level-3 products accessed via the Google Earth Engine (GEE) API, including column densities of NO₂, CO, SO₂, O₃, and HCHO, and the absorbing aerosol index (AAI) [9,31,32]. Basic screening was applied by removing unverified data. Prior to gap filling, a Hampel filter was applied to the daily time series of each variable to identify and remove anomalous retrieval outliers. Filtering was performed separately for each monitoring station and for each variable after sorting by date. A 7-day sliding window was selected to capture short-term variability while providing robust detection of spurious day-level anomalies. Observations with a z-score exceeding 3, computed relative to the local median and the median absolute deviation (MAD), were flagged as outliers and excluded from subsequent processing [33]. The remaining missing observations, primarily caused by retrieval screening and incomplete satellite coverage, were imputed per station using Gaussian Process Regression (GPR). A Radial Basis Function (RBF) kernel was employed and combined with a WhiteKernel noise component. The RBF kernel was chosen because it is appropriate for modeling smoothly varying atmospheric concentration fields and temporal correlations. A characteristic length-scale of 60 days was used as an initial setting to represent seasonal-scale variability in air pollutant dynamics, while the WhiteKernel noise term, with a noise level of 10⁻³, was included to account for the measurement uncertainty inherent in satellite retrievals [34].
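The outlier screening described above can be sketched as follows. This is an illustrative reading, not the authors' code: the window is assumed to be centered on each day, the MAD is scaled by the conventional factor 1.4826 so that it is comparable to a standard deviation, and a zero MAD is treated as "no outlier". The function name `hampel_flags` is hypothetical.

```python
from statistics import median


def hampel_flags(values, window=7, threshold=3.0, scale=1.4826):
    """Flag outliers in a date-ordered daily series with a Hampel filter.

    values    -- list of floats (None marks a missing day)
    window    -- window length in days (7 in the text), centered here by assumption
    threshold -- robust z-score above which a value is flagged (3 in the text)
    scale     -- MAD-to-sigma factor (assumed; the text does not state it)
    """
    half = window // 2
    flags = []
    for i, x in enumerate(values):
        if x is None:
            flags.append(False)
            continue
        local = [v for v in values[max(0, i - half): i + half + 1] if v is not None]
        med = median(local)
        mad = median(abs(v - med) for v in local)
        if mad == 0:
            flags.append(False)  # flat window: no robust scale to compare against
            continue
        flags.append(abs(x - med) / (scale * mad) > threshold)
    return flags


series = [1.0, 1.1, 0.9, 1.0, 9.0, 1.05, 0.95, 1.0, 1.1]
flags = hampel_flags(series)
cleaned = [None if f else v for v, f in zip(series, flags)]
print(flags.index(True))  # 4: only the spike is flagged
```

Flagged days become missing values, which is what makes a subsequent imputation step (GPR in the text) necessary.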

2.2.4. Static Station and Environment Data

Static land-use and environmental covariates are widely used in both LUR and machine-learning air-quality models. Multi-city LUR frameworks explicitly employ population density, altitude, and land-use variables to explain spatial differences in measured concentrations [35,36]. To characterize the relatively time-invariant drivers of PM₁₀ spatial variability, such as topography, land cover, greenness, population density, and road proximity, a set of static predictors was compiled for each air-quality monitoring station and then joined with the station metadata (network, type, coordinates, and altitude), which were retrieved from the EEA. All geospatial predictors were extracted at each station location using GEE, with statistics summarized within a 1-km buffer around the station. Elevation was computed as the mean of the Shuttle Radar Topography Mission (SRTM) 30 m DEM within the buffer [37]. Land-cover class was assigned as the modal class of the ESA WorldCover 10 m v200 product for 2021 [38]. Vegetation activity was represented by a long-term mean NDVI, computed from COPERNICUS Sentinel-2 surface reflectance imagery across the study period after basic cloud screening. NDVI was calculated using Equation (3) [39].

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \tag{3}$$

where NIR and RED correspond to the near-infrared and red reflectance bands, respectively.
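As a small worked example of Equation (3), the sketch below computes NDVI for individual reflectance pairs and averages it over several scenes, mirroring the long-term mean described above. The function names and the handling of a zero denominator are assumptions; cloud screening is represented simply by omitting scenes from the input.

```python
def ndvi(nir, red):
    """NDVI from near-infrared and red surface reflectance (Equation 3)."""
    total = nir + red
    if total == 0:
        return None  # undefined when both bands are zero
    return (nir - red) / total


def mean_ndvi(scenes):
    """Long-term mean NDVI over (nir, red) pairs from cloud-free scenes."""
    values = [ndvi(nir, red) for nir, red in scenes]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None


# Dense vegetation reflects strongly in NIR: (0.5 - 0.1) / (0.5 + 0.1) = 0.667
print(round(ndvi(0.5, 0.1), 3))
```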

Population exposure pressure was approximated using the WorldPop 100 m population dataset for 2020 [40,41]. Proximity to traffic was represented by the Euclidean distance from each station point to the nearest segment of the main road network, computed from a vector layer of the transport network obtained from the Lithuanian National Spatial Data Infrastructure via Geoportal [42]. This dataset corresponds to the INSPIRE Transport Networks theme. All predictors were stored per station and then merged with the EEA station data.

2.2.5. MODIS LST Data

Daytime and nighttime LST were retrieved from the MODIS Terra MOD11A1 product (LST Day and Night layers, 1-km spatial resolution), provided by NASA LP DAAC/USGS [43,44], and converted to degrees Celsius. An outlier-filtering procedure analogous to that applied to the satellite atmospheric variables was used to suppress short-term anomalies prior to subsequent processing. Missing LST days were filled in using GPR, which explicitly models temporal autocorrelation. A composite kernel, combining a periodic component (Exponentiated Sine Squared), an RBF component for short-term smooth variations, and a white-noise term, was employed to generate continuous daily LST time series [34,45].

2.2.6. Temporal Features

A set of temporal descriptors was derived from the daily time index, including year, month, day of month, day of week, and season. In addition, short-term persistence and delayed effects were represented using lagged PM₁₀ values for 13 days and a one-week moving average, both computed from the days prior to the prediction day. Lagged and rolling temporal features are widely used in air-pollution modeling to reflect atmospheric accumulation processes, emission persistence, and meteorological carryover effects. Similar temporal feature engineering strategies have been adopted in statistical, LUR, and machine-learning-based air-quality studies to improve predictive performance and temporal consistency [46,47].
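A minimal sketch of the lagged and rolling features follows. The feature names, the default lag set (1 to 13 days, one reading of the text), and the rule of leaving a feature empty until enough history exists are all assumptions. The point it illustrates is that every feature for day t is built only from days strictly before t, so the target never leaks into its own predictors.

```python
def temporal_features(series, lags=tuple(range(1, 14)), window=7):
    """Build lag and rolling-mean features for one station.

    series -- list of daily PM10 values in date order, without gaps
    lags   -- lag offsets in days (assumed default: 1..13)
    window -- length of the moving average in days (one week)
    """
    rows = []
    for t, target in enumerate(series):
        row = {"target": target}
        for lag in lags:
            row[f"lag_{lag}"] = series[t - lag] if t - lag >= 0 else None
        past = series[max(0, t - window):t]  # excludes day t itself
        row[f"roll_mean_{window}"] = (
            sum(past) / window if len(past) == window else None
        )
        rows.append(row)
    return rows


daily = list(range(10, 50, 2))  # 20 synthetic days: 10, 12, ..., 48
rows = temporal_features(daily)
print(rows[13]["lag_1"], rows[13]["lag_13"], rows[13]["roll_mean_7"])
```

Rows with empty features at the start of each station's record would typically be dropped before training.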

2.3. Method

We implemented a hybrid method combining RF and XGBoost models. Ensemble learning is widely used because combining multiple learners can improve generalization by reducing error components, such as variance and/or bias, especially when the base models make partially uncorrelated errors [48]. RF is a bagging-based tree ensemble that typically reduces variance by averaging many decorrelated trees [49]. XGBoost is a regularized gradient-boosting method, designed to produce strong predictive performance by sequentially correcting residual errors [50]. The combination can, therefore, be beneficial because the two methods often exhibit complementary strengths across different regimes of nonlinear feature interactions and noise. The weighted-average integration method was chosen for its simplicity and interpretability, as it enables a clear and direct combination of RF and XGBoost predictions. The hybrid method was evaluated using multiple weighting configurations to assess the relative contribution of the RF and XGBoost models. Through repeated experiments, the combination that yielded the best predictive performance was obtained by assigning a weight of α to the XGBoost component and β to the RF component, according to Equation (4) [4].

$$y_{\mathrm{hybrid}}(x) = \alpha\, y_{\mathrm{XGB}}(x) + \beta\, y_{\mathrm{RF}}(x) \tag{4}$$

where $y_{\mathrm{hybrid}}(x)$ is the final ensemble prediction, $y_{\mathrm{XGB}}(x)$ and $y_{\mathrm{RF}}(x)$ are the individual model outputs, and $\alpha$ and $\beta$ denote their respective weights.

In this study, an RF model was trained using 500 trees with unrestricted tree depth, while the XGBoost model was configured with 1000 boosting iterations and a learning rate of 0.05. Tree complexity was controlled using a maximum depth of 8.
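Equation (4) amounts to an element-wise weighted average of the two prediction vectors. The sketch below assumes β = 1 − α, as used in the weight search reported in Section 3.1; the function name `hybrid_predict` is hypothetical, and the inputs stand for predictions already produced by the two trained models.

```python
def hybrid_predict(y_xgb, y_rf, alpha):
    """Blend XGBoost and RF predictions as in Equation (4), with beta = 1 - alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(y_xgb) != len(y_rf):
        raise ValueError("prediction vectors must have the same length")
    beta = 1.0 - alpha
    return [alpha * a + beta * b for a, b in zip(y_xgb, y_rf)]


# alpha = 0.6: 0.6 * 10 + 0.4 * 20 = 14 and 0.6 * 20 + 0.4 * 10 = 16
print(hybrid_predict([10.0, 20.0], [20.0, 10.0], alpha=0.6))
```

The text does not name the libraries used for the base models. If scikit-learn and xgboost were used, the stated settings would plausibly correspond to `RandomForestRegressor(n_estimators=500, max_depth=None)` and `XGBRegressor(n_estimators=1000, learning_rate=0.05, max_depth=8)`, but that mapping is an assumption.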

2.4. Accuracy Assessment

The accuracy of the proposed modeling framework was evaluated using standard statistical performance metrics commonly adopted in air-quality research. The model assessment was conducted in two stages. First, 20% of the dataset was allocated for validation and used to evaluate model performance and support model selection. During this phase, R², RMSE, and MAE were computed, as defined in Equations (5)–(7) [51,52].

Following the training and validation procedures, a test set comprising 10% of the total data was reserved to provide an objective evaluation of the model’s predictive capability under unseen conditions, and predicted–observed scatter plots were also analyzed for both validation and testing phases, to visually assess model accuracy and consistency.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\mathrm{PM}_{10}^{\mathrm{obs}}(i) - \mathrm{PM}_{10}^{\mathrm{pred}}(i)\right)^{2}} \tag{5}$$

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} \left(\mathrm{PM}_{10}^{\mathrm{obs}}(i) - \mathrm{PM}_{10}^{\mathrm{pred}}(i)\right)^{2}}{\sum_{i=1}^{n} \left(\mathrm{PM}_{10}^{\mathrm{obs}}(i) - \overline{\mathrm{PM}_{10}^{\mathrm{obs}}}\right)^{2}} \tag{6}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left|\mathrm{PM}_{10}^{\mathrm{obs}}(i) - \mathrm{PM}_{10}^{\mathrm{pred}}(i)\right| \tag{7}$$

where $n$ is the number of observations, $\mathrm{PM}_{10}^{\mathrm{obs}}(i)$ and $\mathrm{PM}_{10}^{\mathrm{pred}}(i)$ are the observed and predicted values, respectively, and $\overline{\mathrm{PM}_{10}^{\mathrm{obs}}}$ is the mean of the observations.
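Equations (5)–(7) translate directly into code. The following sketch uses only the standard library; the function names are illustrative, and R² is returned as undefined (None) when the observations have zero variance, which is an assumption about edge-case handling.

```python
import math


def rmse(obs, pred):
    """Root mean square error, Equation (5)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination, Equation (6)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return None if ss_tot == 0 else 1.0 - ss_res / ss_tot


def mae(obs, pred):
    """Mean absolute error, Equation (7)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


obs = [10.0, 20.0, 30.0]
pred = [12.0, 18.0, 33.0]
# Squared errors 4 + 4 + 9 = 17; total sum of squares = 200
print(round(rmse(obs, pred), 3), round(r_squared(obs, pred), 3), round(mae(obs, pred), 3))
```

Because RMSE squares the errors before averaging, it is always at least as large as MAE, and the gap widens when a few large errors dominate.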

3. Results

3.1. Model Performance Comparison

A comparative analysis was conducted using four modeling methods (MLR, RF, XGBoost, and a hybrid RF–XGBoost) to evaluate whether the proposed hybrid strategy yielded improved PM₁₀ prediction performance. MLR was included as a classical statistical baseline, while RF and XGBoost represented state-of-the-art tree-based methods. The analysis was based on a dataset comprising 39,456 daily observations, which was partitioned in strict chronological order to preserve the temporal dependency of the daily PM₁₀ series and prevent data leakage. The split was performed separately for each monitoring station by ordering observations by date and applying a forward-chaining partition: the earliest 70% for training (27,612 samples), the next 20% for validation (7884 samples), and the most recent 10% for testing (3960 samples).
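The per-station forward-chaining partition can be sketched as below. The record layout (a dict with a `date` key) and the rule of truncating the split sizes to whole days are assumptions; with a complete 2019–2024 daily series (2192 days per station) this rule reproduces the sample counts reported above.

```python
from datetime import date, timedelta


def chronological_split(records, train_frac=0.7, val_frac=0.2):
    """Split one station's records into train/validation/test by date order."""
    ordered = sorted(records, key=lambda r: r["date"])
    n = len(ordered)
    n_train = int(n * train_frac)  # truncation is an assumed rounding rule
    n_val = int(n * val_frac)
    return (
        ordered[:n_train],
        ordered[n_train:n_train + n_val],
        ordered[n_train + n_val:],
    )


# One station with a complete daily record for 2019-2024 (2192 days).
start = date(2019, 1, 1)
station = [{"date": start + timedelta(days=k), "pm10": None} for k in range(2192)]
train, val, test = chronological_split(station)
print(len(train), len(val), len(test))  # 1534 438 220
```

Multiplying by 18 stations gives 27,612, 7884, and 3960 samples, matching the totals in the text. Every training date precedes every validation date, which in turn precedes every test date, so no future information reaches the model during fitting.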

As a baseline, the MLR model achieved moderate predictive performance, with an RMSE of 4.22 µg/m³ and an R² of 0.67 on the validation dataset and an RMSE of 4.67 µg/m³ and an R² of 0.66 on the test dataset. During the initial stage, the optimal ensemble weights were determined using a grid-search procedure performed on the validation dataset. The weight α assigned to the XGBoost component varied from 0.0 to 1.0 in increments of 0.1, while β was set as 1 − α for the RF. For each configuration, predictive performance was evaluated, as summarized in Table 2. The combination α = 0.6 and β = 0.4 achieved the best overall validation performance with the lowest RMSE and MAE values and the highest R².
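The weight search described above can be sketched as a one-dimensional grid search over α with β = 1 − α. For brevity this sketch selects on validation RMSE alone, whereas the text also considers MAE and R²; the function names and the synthetic inputs are illustrative assumptions.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def search_alpha(obs, y_xgb, y_rf, step=0.1):
    """Return (alpha, rmse) minimizing validation RMSE over alpha in [0, 1]."""
    n_steps = round(1.0 / step)
    best = None
    for k in range(n_steps + 1):
        alpha = round(k * step, 10)  # avoid drift such as 0.30000000000000004
        blended = [alpha * a + (1.0 - alpha) * b for a, b in zip(y_xgb, y_rf)]
        score = rmse(obs, blended)
        if best is None or score < best[1]:
            best = (alpha, score)
    return best


# Synthetic validation data built so that the 0.6 / 0.4 blend is exact.
y_xgb = [10.0, 20.0, 30.0, 40.0]
y_rf = [20.0, 10.0, 40.0, 30.0]
obs = [14.0, 16.0, 34.0, 36.0]
alpha, score = search_alpha(obs, y_xgb, y_rf)
print(alpha)  # 0.6
```

Searching on the validation subset, rather than the test subset, keeps the reported test metrics free of any influence from the weight selection.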

Figure 4 presents scatter plots of predicted versus observed PM₁₀ concentrations for the validation phase across MLR and the tree-based models, as well as for the test phase of the hybrid model. The results indicate that the hybrid model was superior during validation, achieving an R² of 0.739, outperforming the MLR baseline (R² = 0.67) and both RF (R² = 0.719) and XGB (R² = 0.728), while also yielding lower MAE and RMSE values. This performance advantage persisted in the independent test set, where the hybrid model attained the highest predictive accuracy (R² = 0.75, MAE = 3.15 µg/m³, RMSE = 3.984 µg/m³), demonstrating improved generalization compared to the individual models. Overall, these results confirmed that the hybrid RF–XGB framework provides a more reliable and accurate representation of PM₁₀ variability than either of the standalone models.

An interval-based error analysis was conducted for the hybrid RF–XGBoost model on the validation dataset by stratifying observations into three PM₁₀ concentration ranges (<15, 15–40, and >40 µg/m³), as shown in Table 3. At low concentrations (<15 µg/m³), the model shows a tendency towards overestimation, as indicated by a positive mean residual of +2.64 µg/m³. The strongest predictive performance occurred in the intermediate concentration range (15–40 µg/m³), which contains the largest number of samples (N = 4309) and achieves the lowest error metrics (RMSE = 3.70 µg/m³, MAE = 2.89 µg/m³), with residuals being centered near zero. In contrast, predictions at higher concentrations (>40 µg/m³) exhibited increased uncertainty and underestimation (a mean residual of −6.62 µg/m³), primarily due to the very limited number of samples (N = 31), leading to an unstable performance metric.
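The interval-based analysis can be reproduced with a short stratification routine. The sketch assumes residuals are defined as predicted minus observed (so that a positive mean indicates overestimation, consistent with the text) and that the boundary values 15 and 40 µg/m³ fall in the middle bin; both conventions, and the function name, are assumptions.

```python
import math


def interval_errors(obs, pred, low=15.0, high=40.0):
    """Per-interval sample count, mean residual, RMSE, and MAE."""
    bins = {"<15": [], "15-40": [], ">40": []}
    for o, p in zip(obs, pred):
        if o < low:
            key = "<15"
        elif o <= high:
            key = "15-40"
        else:
            key = ">40"
        bins[key].append(p - o)  # residual = predicted - observed (assumed)
    summary = {}
    for key, res in bins.items():
        if not res:
            summary[key] = None  # no samples in this concentration range
            continue
        n = len(res)
        summary[key] = {
            "n": n,
            "mean_residual": sum(res) / n,
            "rmse": math.sqrt(sum(r * r for r in res) / n),
            "mae": sum(abs(r) for r in res) / n,
        }
    return summary


obs = [10.0, 12.0, 20.0, 30.0, 50.0]
pred = [13.0, 14.0, 21.0, 29.0, 44.0]
for key, stats in interval_errors(obs, pred).items():
    print(key, stats)
```

In this toy input the low bin is overestimated (mean residual +2.5) and the high bin is underestimated (−6.0), the same qualitative pattern as in Table 3. Metrics for a bin with very few samples should be read with caution.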

3.2. Temporal Performance Evaluation

To examine temporal changes in the performance of the proposed hybrid RF–XGBoost framework, the complete dataset (spanning 2019–2024) was stratified into three consecutive two-year periods: 2019–2020, 2021–2022, and 2023–2024. For each period, the model was retrained on 70% of the data with the same parameters and evaluated on the remaining 30% as a test subset, while preserving the temporal structure of the daily PM₁₀ time series. The test subset comprised 3960 samples in each window. For 2019–2020, the model achieved a test R² of 0.7047, an RMSE of 4.21 µg/m³, and an MAE of 3.15 µg/m³, as shown in Figure 5a. This period represents the lowest predictive accuracy among the three temporal windows. Model performance improved noticeably during the 2021–2022 window. The hybrid model achieved a test R² of 0.7425, an increase of nearly 0.04 (4 percentage points) compared to the 2019–2020 period. Error metrics also decreased, with an RMSE of 3.69 µg/m³ and an MAE of 2.81 µg/m³ (see the scatter plot in Figure 5b). The strongest predictive performance was observed for the most recent period (2023–2024), for which the hybrid RF–XGBoost model attained a test R² of 0.7931. The RMSE reached its lowest value across all temporal windows (3.62 µg/m³), while the MAE remained at its lowest value of 2.81 µg/m³, as presented in the scatter plot in Figure 5c.

Overall, the results reveal a monotonic improvement in R² over time, from 0.70 (2019–2020) to 0.74 (2021–2022) and 0.79 (2023–2024), alongside lower RMSE and MAE values than in the first window. This temporal enhancement suggests that the hybrid RF–XGBoost framework may benefit from increased data availability, improved satellite retrieval quality, and more stable relationships between meteorological and remote-sensing predictors in recent years.
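
One way to set up such a windowed evaluation is sketched below in Python. It assumes that "preserving the temporal structure" means a chronological split without shuffling, with the first 70% of each window used for training and the last 30% for testing. This reading, the record layout (ISO date strings and a `pm10` field), and all function names are assumptions made for illustration, and the model-fitting step is omitted.

```python
import math

WINDOWS = {
    "2019-2020": (2019, 2020),
    "2021-2022": (2021, 2022),
    "2023-2024": (2023, 2024),
}


def chronological_split(records, train_percent=70):
    """Split time-ordered records without shuffling: earliest part is train."""
    ordered = sorted(records, key=lambda r: r["date"])  # ISO dates sort correctly
    cut = len(ordered) * train_percent // 100           # integer arithmetic
    return ordered[:cut], ordered[cut:]


def split_by_window(records):
    """Apply the chronological split separately inside each two-year window."""
    out = {}
    for name, (first, last) in WINDOWS.items():
        subset = [r for r in records if first <= int(r["date"][:4]) <= last]
        out[name] = chronological_split(subset)
    return out


def regression_metrics(observed, predicted):
    """R2, RMSE and MAE as commonly defined (observed values must not be constant)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
    }


# Toy records for illustration only: ten days per year, synthetic values.
records = [
    {"date": f"{year}-01-{day:02d}", "pm10": 10.0 + day}
    for year in range(2019, 2025)
    for day in range(1, 11)
]
splits = split_by_window(records)
```

Because every test sample in a window is later than every training sample in that window, lagged predictors in the test subset never draw on information from the future of the training period.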

3.3. Station-Level Performance

To evaluate spatial heterogeneity in predictive skill, we conducted a station-level assessment, but only for monitoring stations equipped with ground PM₁₀ sensors; this yielded 15 stations. For each station, the hybrid RF–XGBoost model was trained and tested using the same temporal split, with 1534 training samples and 657 test samples per station. Plots for the test phase of all stations are presented in Figure 6a–o and they highlight the influence of station-specific emission environments and local atmospheric conditions, which may not be fully resolved by the available predictors, leading to varying levels of predictive accuracy across monitoring sites.

Station-level performance showed moderate-to-strong agreement between the predicted and observed PM₁₀, with R² ranging from 0.455 to 0.762, RMSE from 3.604 to 5.249 µg/m³, and MAE from 2.784 to 4.362 µg/m³. Averaged across all the stations, the model achieved a mean R² of 0.665, a mean RMSE of 4.166 µg/m³, and a mean MAE of 3.330 µg/m³. The strongest agreement was obtained at lt00042 and lt00041 (R² up to 0.762). For these stations, the predicted–observed scatter shows a compact distribution around the 1:1 line, indicating strong agreement across the concentration range. The residuals are centered near zero, and the residual spread remains relatively stable across low-to-moderate concentrations, suggesting limited heteroscedasticity. The lowest performance occurred at lt00031, which exhibits a visibly wider scatter cloud and weaker adherence to the 1:1 line; this is consistent with its lower station-level R² (0.455). Plots of the residuals against concentration indicate increased error variance with regression-to-the-mean behavior, a pattern that is typical when extreme events are rare and difficult to learn. To compare model performance across sites, Figure 7 summarizes the evaluation metrics (R², RMSE, and MAE) for all monitoring stations and provides a clear overview of the spatial variability in predictive skill, complementing the station-level scatter plots.
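
The per-station ranges and means reported above amount to grouping test predictions by station, scoring each group, and then summarizing the scores. The Python sketch below illustrates that bookkeeping; the station identifiers, sample values, and function names are hypothetical and do not come from the study.

```python
import math
from collections import defaultdict


def station_metrics(rows):
    """rows: iterable of (station_id, observed, predicted) tuples."""
    by_station = defaultdict(list)
    for station, obs, pred in rows:
        by_station[station].append((obs, pred))

    result = {}
    for station, pairs in by_station.items():
        n = len(pairs)
        mean_obs = sum(o for o, _ in pairs) / n
        ss_res = sum((o - p) ** 2 for o, p in pairs)
        # ss_tot is zero for a constant series; real data would need a guard.
        ss_tot = sum((o - mean_obs) ** 2 for o, _ in pairs)
        result[station] = {
            "r2": 1.0 - ss_res / ss_tot,
            "rmse": math.sqrt(ss_res / n),
            "mae": sum(abs(o - p) for o, p in pairs) / n,
        }
    return result


def summarize(result, key):
    """Return (min, max, unweighted mean) of one metric across stations."""
    values = [metrics[key] for metrics in result.values()]
    return min(values), max(values), sum(values) / len(values)


# Made-up values for two hypothetical stations.
rows = [
    ("station_a", 10.0, 11.0), ("station_a", 20.0, 19.0), ("station_a", 30.0, 30.0),
    ("station_b", 10.0, 14.0), ("station_b", 20.0, 20.0), ("station_b", 30.0, 26.0),
]
per_station = station_metrics(rows)
r2_min, r2_max, r2_mean = summarize(per_station, "r2")
```

A mean computed this way is an unweighted average of per-station scores, which is generally not equal to the score obtained by pooling all stations into one sample.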

4. Discussion

This study demonstrates that an ML hybrid RF–XGBoost estimation method provides daily PM₁₀ estimates across Lithuania, improving predictive accuracy relative to classical MLR and standalone tree-based models, while maintaining stable generalization in the test set. The hybrid ensemble benefits from the complementary error characteristics of RF and XGBoost. RF effectively reduces prediction variance and improves robustness against noise and outliers, while XGBoost primarily targets bias reduction by iteratively correcting residual errors through sequential boosting. This behavior is consistent with ensemble-learning theory and has been widely documented in machine-learning applications, where combining bagging- and boosting-based models enhances predictive stability and accuracy compared to single estimators. Such hybrid methods have been shown to outperform linear and standalone machine-learning baselines in air-quality modeling, particularly under complex nonlinear interactions between meteorological, land-surface, and emission-related predictors. Similar performance gains have been reported in recent PM modeling studies employing RF–XGBoost integration [4,50,53,54].

The spatial and temporal variability of annual mean PM₁₀ concentrations across Lithuanian monitoring stations, over the study period, is illustrated in Figure 2, revealing a gradual shift towards lower PM₁₀ levels at many stations, indicating an overall national decline in particulate pollution. Nevertheless, substantial inter-station differences persist, reflecting pronounced spatial heterogeneity driven by local emission sources, urbanization, and regional atmospheric conditions. For national-scale PM₁₀ modeling, the study integrates a set of ground-based observations, meteorological measurements, satellite-derived atmospheric indicators, LST products, static environmental descriptors, and engineered temporal features. Ground PM₁₀ measurements were taken from 18 monitoring stations with different urban traffic, industrial, and rural backgrounds. All of the datasets were integrated into a unified spatiotemporal framework. Predicted–observed scatter plots exhibit a minor tendency towards over-prediction, particularly at lower PM₁₀ concentrations. The improvement in predictive performance across successive two-year windows suggests increasing stability in the learned pollutant–predictor relationships, rather than cumulative learning effects. That may reflect more consistent relationships between meteorological conditions and surface PM₁₀ concentrations in recent years.

Despite the reliable performance of the proposed hybrid framework, several limitations must be recognized. The limited number of PM₁₀ monitoring stations restricts the spatial generalization of the model, especially in areas with diverse emission conditions and inadequate observational coverage. Satellite-derived atmospheric composition products also represent column-integrated or area-averaged conditions, rather than near-surface concentrations, and are characterized by non-daily availability and data gaps due to cloud cover, retrieval filtering, and algorithmic constraints. These constraints reduce temporal completeness and require gap-filling techniques, which may dampen short-term variability [55]. Some of the ground-based PM₁₀ observations were identified as being unverified or invalid and were, therefore, eliminated during quality control, decreasing the effective sample size. ML methods rely on spatially representative data, and generalization to new or under-represented areas remains a key challenge in practical applications. The model is trained and evaluated using station-based observations, and its reliability is, therefore, strongest in regions and conditions that are well represented by the monitoring network. In locations with limited observational support or with local characteristics that are not captured in the training data, predictions should be interpreted with caution.

The feature-importance analysis indicates that temporal predictors are the dominant drivers of daily PM₁₀ variability in the hybrid model, as shown in Figure 8. The one-day lagged PM₁₀ concentration (importance score of 0.32) and the weekly moving average (0.038) capture the short-term persistence in pollutant concentrations driven by atmospheric accumulation, emission continuity, and serial dependence; this is consistent with the autocorrelation structure of air pollution reported in time-series and machine-learning studies [56,57,58]. Beyond the temporal effects, meteorological conditions provide a secondary contribution to model performance, with relative humidity and precipitation emerging as the most influential meteorological predictors. Co-pollutant signals also play important roles: ground-level PM₂.₅ concentrations provide complementary information, suggesting shared sources and coupled formation processes between fine and coarse PM, while NO₂ acts as a proxy for shared emission sources and atmospheric chemical environments that influence particulate formation and fate. Static spatial descriptors, such as location, population density, distance to major roads, and land use, contribute approximately 0.12 in total, indicating that, while spatial context is valuable for explaining persistent differences between urban and rural or traffic-influenced settings, it is less decisive for daily fluctuations than dynamic factors. Satellite-derived data exhibit lower relative contributions in the feature-importance analysis. This outcome can be primarily attributed to the intrinsic characteristics of satellite observations, which are characterized by non-daily availability and data gaps arising from cloud cover and quality filtering.
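
The persistence features listed in Table 1 (lag1, lag2, lag3, roll7) can be derived from a station's daily series as sketched below. The Python code is a minimal illustration under stated assumptions: roll7 is taken to be the mean of the seven preceding days and excludes the current day, so that the target does not leak into its own predictor. This section does not specify that detail, and the function name is hypothetical.

```python
def add_temporal_features(series, lags=(1, 2, 3), window=7):
    """Build lag and rolling-mean features for ONE station.

    series: daily PM10 values in chronological order, without missing days.
    Returns one dict per day; features without enough history are None.
    """
    rows = []
    for t, value in enumerate(series):
        row = {"pm10": value}
        for lag in lags:
            # Value observed `lag` days before day t.
            row[f"lag{lag}"] = series[t - lag] if t >= lag else None
        # Mean of the previous `window` days, excluding day t itself
        # (assumed convention, chosen to avoid target leakage).
        if t >= window:
            row[f"roll{window}"] = sum(series[t - window:t]) / window
        else:
            row[f"roll{window}"] = None
        rows.append(row)
    return rows


# Synthetic series for illustration: 10, 11, ..., 19 µg/m³.
series = [float(v) for v in range(10, 20)]
rows = add_temporal_features(series)
```

Features of this kind would need to be computed separately for each station, so that a lag never reaches across from one station's series into another's.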

In this study, satellite time series were temporally completed using interpolation techniques to ensure continuity, which is likely to smooth short-term variability and reduce the marginal contribution of satellite predictors, once strong temporal persistence, meteorological conditions, and local emission proxies are accounted for. As satellite retrievals represent column-integrated or regional-scale atmospheric conditions and, therefore, remain indirect indicators of near-surface PM₁₀, this further explains their comparatively limited influence on daily station-level predictions [55,59].

5. Conclusions

This study employed and evaluated a multi-input spatiotemporal hybrid RF–XGBoost framework for daily PM₁₀ estimation across Lithuania in the period 2019–2024. It integrated ground observations with meteorology, Sentinel-5P TROPOMI atmospheric products, MODIS land-surface temperature, static environmental descriptors, and temporal features. The hybrid model consistently outperformed both classical MLR baselines and standalone tree-based models, achieving R² = 0.75, RMSE = 3.98 µg/m³, and MAE = 3.15 µg/m³ on the test set. Predicted–observed scatter plots revealed a slight tendency towards overestimation at lower PM₁₀ concentration levels. Temporal evaluation, using consecutive two-year windows, showed a progressive improvement in predictive accuracy, with test R² increasing from 0.70 (2019–2020) to 0.74 (2021–2022) and reaching its highest value of 0.79 (2023–2024), accompanied by decreasing RMSE (from 4.21 to 3.62 µg/m³) and MAE (from 3.15 to 2.81 µg/m³). Station-level analysis across air-quality monitoring sites showed moderate-to-strong agreement between the predicted and observed concentrations, with R² values ranging from 0.46 to 0.76, and the best performance was observed at stations lt00042 and lt00041 (R² = 0.76, RMSE ~ 3.8–4.2 µg/m³). Feature-importance analysis highlighted the dominant role of temporal persistence features, followed by meteorological conditions and co-pollutants. Overall, the suggested framework provides a practical basis for national-scale PM₁₀ time-series assessment and may support air-quality and environmental management, more sustainable urban planning strategies, and public-health protection.

Author Contributions

Conceptualization, M.A.S.F. and J.S.V.; methodology, M.A.S.F.; software, M.A.S.F.; validation, M.A.S.F. and J.S.V.; formal analysis, M.A.S.F. and J.S.V.; investigation, J.S.V.; resources, M.A.S.F. and J.S.V.; data curation, M.A.S.F.; writing—original draft preparation, M.A.S.F.; writing—review and editing, M.A.S.F. and J.S.V.; visualization, M.A.S.F.; supervision, J.S.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further enquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

PM	Particulate Matter
RF	Random Forest
XGBoost	Extreme Gradient Boosting
TROPOMI	Tropospheric Monitoring Instrument
NO₂	Nitrogen Dioxide
CO	Carbon Monoxide
SO₂	Sulfur Dioxide
O₃	Ozone
HCHO	Formaldehyde
AI	Aerosol Index
MODIS	Moderate-Resolution Imaging Spectroradiometer
NDVI	Normalized Difference Vegetation Index
R²	Coefficient of determination
MAE	Mean Absolute Error
RMSE	Root Mean Square Error
ML	Machine Learning
AOD	Aerosol Optical Depth
SeaWiFS	Sea-viewing Wide Field-of-view Sensor
MLR	Multiple Linear Regression
SVR	Support Vector Regression
KNN	K-nearest neighbor regression
IDW	Inverse Distance Weighting
WHO	World Health Organization
EEA	European Environment Agency
EPA	Environmental Protection Agency
IQR	Interquartile Range
AMS	Automatic Meteorological Stations
GEE	Google Earth Engine
MAD	Median Absolute Deviation
GPR	Gaussian Process Regression
RBF	Radial Basis Function
LUR	Land-use regression
SRTM	Shuttle Radar Topography Mission
LST	Land-Surface Temperature
DEM	Digital Elevation Model

References

1. Bodor, K.; Micheu, M.M.; Keresztesi, Á.; Birsan, M.V.; Nita, I.A.; Bodor, Z.; Petres, S.; Korodi, A.; Szép, R. Effects of PM10 and Weather on Respiratory and Cardiovascular Diseases in the Ciuc Basin (Romanian Carpathians). Atmosphere 2021, 12, 289.
2. Duarte, R.M.B.O.; Duarte, A.C. Health Effects of Urban Atmospheric Aerosols. Atmosphere 2023, 14, 309.
3. Particulate Matter (PM) Basics | US EPA. Available online: https://www.epa.gov/pm-pollution/particulate-matter-pm-basics (accessed on 5 December 2025).
4. Gencarelli, N.; Tang, Y.; Deng, J.; Cui, X.; Liu, Z.; Yang, L.; Zhang, S.; Liang, Y. High-Resolution Spatial Prediction of Daily Average PM₂.₅ Concentrations in Jiangxi Province via a Hybrid Model Integrating Random Forest and XGBoost. Atmosphere 2025, 16, 1317.
5. Serio, C.; Jiang, Z.; Yu, H.; Alvarez, C.I.; Andrés Ulloa Vaca, C.; Armando, N.; Llumipanta, E. Machine Learning for Urban Air Quality Prediction Using Google AlphaEarth Foundations Satellite Embeddings: A Case Study of Quito, Ecuador. Remote Sens. 2025, 17, 3472.
6. Vidot, J.; Santer, R.; Ramon, D. Remote Sensing of Particle Matter Using SeaWiFs. In Remote Sensing of Clouds and the Atmosphere VIII; SPIE: Bellingham, WA, USA, 2004; Volume 5235, pp. 619–626.
7. Gupta, P.; Christopher, S.A.; Wang, J.; Gehrig, R.; Lee, Y.; Kumar, N. Satellite Remote Sensing of Particulate Matter and Air Quality Assessment over Global Cities. Atmos. Environ. 2006, 40, 5880–5892.
8. van Donkelaar, A.; Martin, R.V.; Park, R.J. Estimating Ground-Level PM₂.₅ Using Aerosol Optical Depth Determined from Satellite Remote Sensing. J. Geophys. Res. Atmos. 2006, 111, D21201.
9. TROPOMI Observing Our Future | TROPOMI: TROPOspheric Monitoring Instrument. Available online: https://www.tropomi.eu/ (accessed on 8 December 2025).
10. Grzybowski, P.T.; Markowicz, K.M.; Musiał, J.P. Estimations of the Ground-Level NO₂ Concentrations Based on the Sentinel-5P NO₂ Tropospheric Column Number Density Product. Remote Sens. 2023, 15, 378.
11. Vienneau, D.; De Hoogh, K.; Bechle, M.J.; Beelen, R.; Van Donkelaar, A.; Martin, R.V.; Millet, D.B.; Hoek, G.; Marshall, J.D. Western European Land Use Regression Incorporating Satellite and Ground-Based Measurements of NO₂ and PM₁₀. Environ. Sci. Technol. 2013, 47, 13555–13564.
12. Warthon, J.; Zamalloa, A.; Olarte, A.; Warthon, B.; Miranda, I.; Zamalloa-Puma, M.M.; Ccollatupa, V.; Ormachea, J.; Quispe, Y.; Jalixto, V.; et al. A Comprehensive Assessment of PM₂.₅ and PM₁₀ Pollution in Cusco, Peru: Spatiotemporal Analysis and Development of the First Predictive Model (2017–2020). Sustainability 2025, 17, 394.
13. Sotoudeheian, S.; Arhami, M. Estimating Ground-Level PM₁₀ Using Satellite Remote Sensing and Ground-Based Meteorological Measurements over Tehran. Environ. Health Sci. Eng. 2014, 12, 122.
14. Zhang, Z.; Wang, J.; Hart, J.E.; Laden, F.; Zhao, C.; Li, T.; Zheng, P.; Li, D.; Ye, Z.; Chen, K. National Scale Spatiotemporal Land-Use Regression Model for PM₂.₅, PM₁₀ and NO₂ Concentration in China. Atmos. Environ. 2018, 192, 48–54.
15. Plocoste, T.; Laventure, S. Forecasting PM₁₀ Concentrations in the Caribbean Area Using Machine Learning Models. Atmosphere 2023, 14, 134.
16. Stafoggia, M.; Johansson, C.; Glantz, P.; Renzi, M.; Shtein, A.; de Hoogh, K.; Kloog, I.; Davoli, M.; Michelozzi, P.; Bellander, T. A Random Forest Approach to Estimate Daily Particulate Matter, Nitrogen Dioxide, and Ozone at Fine Spatial Resolution in Sweden. Atmosphere 2020, 11, 239.
17. Wei, C.; Zhao, C.; Hu, Y.; Tian, Y. Predicting the Concentration Levels of PM₂.₅ and O₃ for Highly Urbanised Areas Based on Machine Learning Models. Sustainability 2025, 17, 9211.
18. Li, T.; Zhang, C.; Shen, H.; Yuan, Q.; Zhang, L. Real-Time and Seamless Monitoring of Ground-Level PM₂.₅ Using Satellite Remote Sensing. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. 2018, IV-3, 143–147.
19. Šilingas, M.; Suchockas, V.; Varnagirytė-Kabašinskienė, I. Evaluation of Undergrowth under the Canopy of Deciduous Forests on Very Fertile Soils in the Lithuanian Hemiboreal Forest. Forests 2022, 13, 2172.
20. Byčenkienė, S.; Khan, A.; Bimbaitė, V. Impact of PM₂.₅ and PM₁₀ Emissions on Changes of Their Concentration Levels in Lithuania: A Case Study. Atmosphere 2022, 13, 1793.
21. WHO. Global Air Quality Guidelines: Particulate Matter (PM₂.₅ and PM₁₀), Ozone, Nitrogen Dioxide, Sulphur Dioxide and Carbon Monoxide; WHO: Geneva, Switzerland, 2021; pp. 1–360.
22. Directive–2008/50–EN–EUR-Lex. Available online: https://eur-lex.europa.eu/eli/dir/2008/50/oj/eng (accessed on 13 December 2025).
23. Air Quality E-Reporting (AQ e-Reporting). Available online: https://www.eea.europa.eu/en/datahub/datahubitem-view/3b390c9c-f321-490a-b25a-ae93b2ed80c1 (accessed on 14 December 2025).
24. INSPIRE Geoportal. Available online: https://inspire-geoportal.ec.europa.eu/srv/eng/catalog.search#/extenddetails?country=lt&view=priorityOverview&theme=none&resourceId=ccd714ff-d1ee-4400-9822-aaff4e97bee7 (accessed on 14 December 2025).
25. McDonnell, W.F.; Nishino-Ishikawa, N.; Petersen, F.F.; Chen, L.H.; Abbey, D.E. Relationships of Mortality with the Fine and Coarse Fractions of Long-Term Ambient PM10 Concentrations in Non-smokers. J. Expo. Anal. Environ. Epidemiol. 2000, 10, 427–436.
26. Abdulkareem, S.K.; Alhadithi, M.; Amer, W. Evaluating Spatial Interpolation Techniques for Accurate Air Quality Prediction: An Overview. E3S Web Conf. 2025, 633, 07008.
27. Neumann, C. Habitat Sampler—A Sampling Algorithm for Habitat Type Delineation in Remote Sensing Imagery. Divers. Distrib. 2020, 26, 1752–1766.
28. Lu, G.Y.; Wong, D.W. An Adaptive Inverse-Distance Weighting Spatial Interpolation Technique. Comput. Geosci. 2008, 34, 1044–1055.
29. Shepard, D. A Two-Dimensional Interpolation Function for Irregularly-Spaced Data. In Proceedings of the 1968 23rd ACM National Conference; ACM: New York, NY, USA, 1968; pp. 517–524.
30. Meteo.Lt API. Available online: https://api.meteo.lt/ (accessed on 21 December 2025).
31. Sentinel-5P - Sentinel Online. Available online: https://sentinels.copernicus.eu/copernicus/sentinel-5p (accessed on 21 December 2025).
32. Earth Engine Data Catalog | Google for Developers. Available online: https://developers.google.com/earth-engine/datasets (accessed on 21 December 2025).
33. Hampel - Outlier Removal Using Hampel Identifier - MATLAB. Available online: https://se.mathworks.com/help/signal/ref/hampel.html (accessed on 21 December 2025).
34. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2006; p. 248.
35. Wang, M.; Beelen, R.; Bellander, T.; Birk, M.; Cesaroni, G.; Cirach, M.; Cyrys, J.; de Hoogh, K.; Declercq, C.; Dimakopoulou, K.; et al. Performance of Multi-City Land Use Regression Models for Nitrogen Dioxide and Fine Particles. Environ. Health Perspect. 2014, 122, 843.
36. Stafoggia, M.; Bellander, T.; Bucci, S.; Davoli, M.; de Hoogh, K.; de’ Donato, F.; Gariazzo, C.; Lyapustin, A.; Michelozzi, P.; Renzi, M.; et al. Estimation of Daily PM10 and PM2.5 Concentrations in Italy, 2013–2015, Using a Spatiotemporal Land-Use Random-Forest Model. Environ. Int. 2019, 124, 170–179.
37. Farr, T.G.; Rosen, P.A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S.; Kobrick, M.; Paller, M.; Rodriguez, E.; Roth, L.; et al. The Shuttle Radar Topography Mission. Rev. Geophys. 2007, 45, RG2004.
38. Zanaga, D.; Van De Kerchove, R.; Daems, D.; De Keersmaecker, W.; Brockmann, C.; Kirches, G.; Wevers, J.; Cartus, O.; Santoro, M.; Fritz, S.; et al. ESA WorldCover 10 m 2021 V200 [Data set]. Zenodo 2022.
39. Rouse, J.W., Jr.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring Vegetation Systems in the Great Plains with ERTS; NASA: Washington, DC, USA, 1974; Volume 1.
40. Sorichetta, A.; Hornby, G.M.; Stevens, F.R.; Gaughan, A.E.; Linard, C.; Tatem, A.J. High-Resolution Gridded Population Datasets for Latin America and the Caribbean in 2010, 2015, and 2020. Sci. Data 2015, 2, 150045.
41. Open Spatial Demographic Data and Research - WorldPop. Available online: https://www.worldpop.org/ (accessed on 21 December 2025).
42. Home Page - Geoportal.Lt. Available online: https://www.geoportal.lt/geoportal/ (accessed on 21 December 2025).
43. Wan, Z.; Hook, S.; Hulley, G. MODIS/Terra Land Surface Temperature/Emissivity Daily L3 Global 1km SIN Grid V061; NASA: Washington, DC, USA, 2021.
44. Land Processes Distributed Active Archive Center | NASA Earthdata. Available online: https://www.earthdata.nasa.gov/centers/lp-daac (accessed on 22 December 2025).
45. Dudek, A.; Baranowski, J.; Liu, H.; Li, J.-B.; Li, M.; Chen, S.-H.; Dudek, A.; Baranowski, J. Gaussian Processes for Signal Processing and Representation in Control Engineering. Appl. Sci. 2022, 12, 4946.
46. Liu, C.; Chen, R.; Sera, F.; Vicedo-Cabrera, A.M.; Guo, Y.; Tong, S.; Coelho, M.S.Z.S.; Saldiva, P.H.N.; Lavigne, E.; Matus, P.; et al. Ambient Particulate Air Pollution and Daily Mortality in 652 Cities. N. Engl. J. Med. 2019, 381, 705–715.
47. Chen, G.; Li, S.; Knibbs, L.D.; Hamm, N.A.S.; Cao, W.; Li, T.; Guo, J.; Ren, H.; Abramson, M.J.; Guo, Y. A Machine Learning Method to Estimate PM₂.₅ Concentrations across China with Remote Sensing, Meteorological and Land Use Information. Sci. Total Environ. 2018, 636, 52–60.
48. Dietterich, T.G. Ensemble Methods in Machine Learning. In International Workshop on Multiple Classifier Systems; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin, Germany, 2000; pp. 1–15.
49. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
50. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
51. Chicco, D.; Warrens, M.J.; Jurman, G. The Coefficient of Determination R-Squared Is More Informative than SMAPE, MAE, MAPE, MSE and RMSE in Regression Analysis Evaluation. PeerJ Comput. Sci. 2021, 7, e623.
52. Willmott, C.J.; Matsuura, K. Advantages of the Mean Absolute Error (MAE) over the Root Mean Square Error (RMSE) in Assessing Average Model Performance. Clim. Res. 2005, 30, 79–82.
53. Lin, L.; Liang, Y.; Liu, L.; Zhang, Y.; Xie, D.; Yin, F.; Ashraf, T.; Lin, L.; Liang, Y.; Liu, L.; et al. Estimating PM₂.₅ Concentrations Using the Machine Learning RF-XGBoost Model in Guanzhong Urban Agglomeration, China. Remote Sens. 2022, 14, 5239.
54. Faye, D.; Lguensat, R.; Kaly, F.; Sudmant, A.; Gaye, A.T.; Kalisa, E. Machine Learning for Air Quality Forecasting: Insights from Five Provinces of Rwanda. Sci. Afr. 2025, 30, e02959.
55. Sorek-Hamer, M.; Chatfield, R.; Liu, Y. Review: Strategies for Using Satellite-Based Products in Modelling PM₂.₅ and Short-Term Pollution Episodes. Environ. Int. 2020, 144, 106057.
56. Lee, J.; Barquilla, C.A.M.; Park, K.; Hong, A. Urban Form and Seasonal PM₂.₅ Dynamics: Enhancing Air Quality Prediction Using Interpretable Machine Learning and IoT Sensor Data. Sustain. Cities Soc. 2024, 117, 105976.
57. Zhang, Z.; Johansson, C.; Engardt, M.; Stafoggia, M.; Ma, X. Improving 3-Day Deterministic Air Pollution Forecasts Using Machine Learning Algorithms. Atmos. Chem. Phys. 2024, 24, 807–851.
58. Wallek, S.; Langner, M.; Schubert, S.; Franke, R.; Sauter, T. Hourly Particulate Matter (PM₁₀) Concentration Forecast in Germany Using Extreme Gradient Boosting. Atmosphere 2024, 15, 525.
59. Li, B.; Liu, C.; Hu, Q.; Sun, M.; Zhang, C.; Zhu, Y.; Liu, T.; Guo, Y.; Carmichael, G.R.; Gao, M. A Deep Learning Approach to Increase the Value of Satellite Data for PM₂.₅ Monitoring in China. Remote Sens. 2023, 15, 3724.

Figure 1. Overview of the data integration, preprocessing, and GIS-based prediction framework used for PM₁₀ estimation.

Figure 2. PM₁₀ Annual Mean Concentrations by Station and Year.

Figure 3. Spatial Distribution of Meteorological and Air-Quality Monitoring Stations.

Figure 4. Validation results for the (a) MLR, (b) RF, and (c) XGBoost models; (d) validation results for the hybrid RF–XGB model; (e) corresponding test performance of the hybrid model.

Figure 6. Station-level predicted–observed performance of the hybrid RF–XGBoost model at PM₁₀ sensor stations: (a) lt00001; (b) lt00002; (c) lt00003; (d) lt00004; (e) lt00012; (f) lt00021; (g) lt00022; (h) lt00023; (i) lt00031; (j) lt00033; (k) lt00041; (l) lt00042; (m) lt00043; (n) lt00044; (o) lt00053.

Figure 7. Station-level performance metrics of the hybrid RF–XGBoost model for PM₁₀.

Figure 8. Feature-importance ranking for the hybrid RF–XGBoost model.

Table 1. Dataset catalogue.

Data Group	Variables	Unit	Spatial/Temporal Resolution	Source
Ground air-quality data	PM₁₀, PM₂.₅, NO₂, SO₂, O₃, CO	µg/m³, mg/m³	Hourly	EEA
Meteorological data	Air temperature, feels-like temperature, wind speed, wind gust, relative humidity, cloud cover, sea-level pressure, precipitation	°C, m/s, %, hPa, mm	Hourly	Meteo
Satellite atmospheric data	NO₂, CO, SO₂, O₃, HCHO columns, Absorbing Aerosol Index (AAI)	mol/m², unitless	~3.5–7 km	TROPOMI
Static station and environmental data	Coordinates (lat, lon), altitude, elevation (DEM), land cover, NDVI, population density, proximity to main roads	mixed	static derived	EEA, ESA, USGS, WorldPop, INSPIRE
MODIS LST data	day, night	°C	1 km	MODIS
Temporal features	year, month, day, weekday, season, lag1, lag2, lag3, roll7	unitless, µg/m³	Daily	Derived

Table 2. Validation performance of the hybrid RF–XGB model for different weighting schemes.

α	β	MAE (µg/m³)	RMSE (µg/m³)	R²
0.0	1.0	3.0179	3.8912	0.7194
0.1	0.9	2.9881	3.8475	0.7257
0.2	0.8	2.9638	3.8117	0.7308
0.3	0.7	2.9458	3.7842	0.7346
0.4	0.6	2.9339	3.7650	0.7373
0.5	0.5	2.9281	3.7544	0.7388
0.6	0.4	2.9271	3.7523	0.7391
0.7	0.3	2.9323	3.7589	0.7382
0.8	0.2	2.9439	3.7739	0.7361
0.9	0.1	2.9613	3.7975	0.7328
1.0	0.0	2.9846	3.8293	0.7283
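
The weighting schemes in Table 2 correspond to a convex combination of two model outputs, with α + β = 1 in every row. A minimal Python sketch of such a grid search follows. Which base model each weight multiplies is not restated in this table, so the code uses neutral names (`pred_a`, `pred_b`); the selection criterion (lowest validation RMSE), the function names, and the sample values are assumptions made for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def blend(pred_a, pred_b, alpha):
    """Convex combination alpha * pred_a + beta * pred_b with beta = 1 - alpha."""
    beta = 1.0 - alpha
    return [alpha * a + beta * b for a, b in zip(pred_a, pred_b)]


def best_alpha(observed, pred_a, pred_b, steps=10):
    """Scan alpha = 0.0, 0.1, ..., 1.0 and return the value with the lowest RMSE."""
    candidates = [i / steps for i in range(steps + 1)]
    scored = [(rmse(observed, blend(pred_a, pred_b, a)), a) for a in candidates]
    return min(scored)[1]


# Made-up validation values: one model is biased high, the other biased low.
obs = [10.0, 20.0, 30.0]
pred_a = [12.0, 22.0, 32.0]
pred_b = [8.0, 18.0, 28.0]
alpha_star = best_alpha(obs, pred_a, pred_b)
```

In Table 2, the lowest MAE and RMSE and the highest R² all occur at α = 0.6 and β = 0.4, which is the kind of optimum such a search would return. Weights of this kind would normally be selected on the validation subset only, so that the test metrics remain an independent check.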

Table 3. Statistics of the hybrid RF–XGBoost PM₁₀ predictions across concentration ranges.

Interval (µg/m³)	N	MAE (µg/m³)	RMSE (µg/m³)	Mean Residual, Pred−Obs (µg/m³)
<15	3544	2.94	3.76	2.64
15–40	4309	2.89	3.70	−0.02
>40	31	6.62	7.48	−6.62

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
