PM2.5: Air Quality Index Prediction Using Machine Learning: Evidence from Kuwait’s Air Quality Monitoring Stations
Abstract
1. Introduction
2. Literature Review
3. Research Goals and Methodology
- (a) Identify temporal, spatial, and environmental conditions associated with poor PM2.5 air quality in Kuwait.
- (b) Determine the minimal and most relevant set of features needed for accurate PM2.5 AQI prediction.
- (c) Evaluate and optimize multiple machine learning models to identify those that best predict PM2.5 AQI.
- (d) Apply insights from the best-performing models to propose public policies for improving Kuwait’s air quality.
- Mean Square Error (MSE) measures the average squared difference between predicted and actual PM2.5_AQI values, heavily penalizing large prediction errors. This metric is particularly useful for identifying models that produce significant deviations; lower values indicate better model performance.
- Root Mean Square Error (RMSE), the square root of MSE, expresses the prediction error magnitude in the same units as PM2.5_AQI, which makes it easier to interpret. Lower RMSE values indicate better model accuracy and offer practical insight into typical prediction deviations.
- Mean Absolute Error (MAE) represents the average absolute difference between predicted and actual values, weighting all errors equally rather than disproportionately penalizing large ones. This metric gives a direct reading of the prediction error on the original PM2.5_AQI scale.
- Coefficient of Determination (R2) indicates the proportion of variance in PM2.5_AQI that is explained by the model and typically ranges between 0 and 1. R2 = 1 indicates perfect prediction, and R2 = 0 means the model predicts no better than the mean value; a model that fits worse than the mean yields a negative R2. Higher R2 values represent better model fit and predictive power.
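The four metrics above can be written out directly from their definitions. The following is a minimal sketch in plain Python; the function name `regression_metrics` and the sample values are illustrative assumptions and are not taken from the study's dataset or tooling.

```python
import math


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE and R2 for two equal-length lists of numbers."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # MSE: mean of squared errors; RMSE: its square root (same units as the target).
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # MAE: mean of absolute errors.
    mae = sum(abs(e) for e in errors) / n

    # R2: 1 - (residual sum of squares / total sum of squares around the mean).
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}


# Hypothetical PM2.5_AQI values, for illustration only.
actual = [50.0, 100.0, 150.0, 200.0]
predicted = [60.0, 90.0, 150.0, 220.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Because the errors are squared before averaging, the single 20-point miss in this example contributes more to MSE than the two 10-point misses combined, whereas MAE weights every point of error equally.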
4. Kuwait’s Environmental Landscape: A Study in Air Quality Management
- The historic Shuwaikh Power Station in the capital region (North).
- The modern Saad Alabdullah facility serving the western districts (West).
- The industrial hub of Shuaiba (South).
- Critical urban monitoring points at Rumaithya and Road 50 Station.
- Suburban stations at Qurain and Mutla (West).
- Residential area monitors in Mansouria and Jahra (West).
- Industrial zone stations near Ahmadi and Fahaheel (South, near Oil and Gas facilities).
- Population centers at Al Salam (Center) and Ali Sabah Al Salem (South).
4.1. Dataset Structure and Components
- Temporal and Spatial Parameters:
  - High-resolution timestamps for each measurement
  - Precise geographical locations of monitoring stations
  - Seasonal classification of data points
- Environmental Parameters: The dataset tracks multiple air quality indicators, measured with high precision:
  - Particulate Matter (PM10): Measured in micrograms per cubic meter (µg/m3); the 24 h average and maximum are also recorded.
  - Particulate Matter (PM2.5): Measured in micrograms per cubic meter (µg/m3); the 24 h average and maximum are also recorded.
  - Sulfur Dioxide (SO2): Measured in parts per billion (ppb); the 24 h average is also recorded.
  - Hydrogen Sulfide (H2S): Measured in ppb.
  - Nitrogen Oxides (NO, NO2, NOx): Measured in ppb; the NOx 24 h average and the NO2 1 h average are also recorded.
  - Ammonia (NH3): Measured in ppb.
  - Ozone (O3): Measured in ppb; the 8 h maximum is also recorded.
  - Carbon Monoxide (CO): Measured in parts per million (ppm); the 8 h maximum is also recorded.
  - Carbon Dioxide (CO2): Measured in ppm.
  - Wind Direction (WD): Measured in degrees.
  - Wind Speed (WS): Measured in m/s; the 24 h average and maximum are also recorded.
  - Temperature (TEMPERAT): Measured in degrees Celsius.
  - Relative Humidity (RH): Measured in percent.
- AQI Metrics: The dataset also includes the following AQI indices:
  - PM2.5_AQI.
  - PM10_AQI.
  - SO2_AQI, NOX_AQI, NO2_AQI, CO_AQI, O3_AQI, Combined_AQI.
4.2. WHO and Environmental Agency Air Quality Standards: A Comprehensive Health Risk Assessment Framework
- Optimal Air Quality (0–12 µg/m3): Air quality in this range represents ideal conditions with minimal health risks. The concentration of fine particulate matter is sufficiently low that both the general population and sensitive individuals can safely engage in outdoor activities without concern for respiratory impacts.
- Moderate Conditions (13–35 µg/m3): While generally safe for most individuals, these levels warrant awareness for sensitive populations. The air quality remains within acceptable parameters, though individuals with known sensitivities may need to monitor their exposure during extended outdoor activities.
- Sensitive Population Advisory (36–55 µg/m3): At this level, vulnerable populations require increased vigilance. This includes individuals with pre-existing respiratory conditions, those with cardiovascular sensitivities, young children and elderly populations, and pregnant women. The general population typically maintains normal resilience at these levels.
- Public Health Notice (56–150 µg/m3): These concentrations signal broader public health concerns, with potential impacts across all population segments. Sensitive groups face elevated risks, and the general public may begin experiencing mild respiratory symptoms or discomfort.
- Severe Health Alert (151–250 µg/m3): This range triggers public health warnings due to significant health risks. All population segments face an increased likelihood of adverse health effects, requiring protective measures and reduced outdoor exposure.
- Critical Health Emergency (251+ µg/m3): These extreme levels constitute a severe public health emergency, demanding immediate protective action for all population segments.
4.3. PM2.5 Air Quality Index (AQI) Standards and Pollutant Thresholds
- Good (AQI: 0–50): At this optimal level, air quality is ideal for all populations. The concentrations of pollutants are minimal, allowing for unrestricted outdoor activities and posing no health risks. These conditions represent the gold standard for urban air quality management.
- Moderate (AQI: 51–100): While still generally safe, these levels indicate a slight deterioration in air quality. Sensitive individuals might want to monitor their outdoor exposure, though the general population can continue normal activities without concern.
- Unhealthy Level 1 (AQI: 101–150): At this level, air quality begins to pose health risks to sensitive groups. People with respiratory conditions, the elderly, and young children should limit prolonged outdoor exposure. The general public might begin to notice mild respiratory discomfort.
- Unhealthy Level 2 (AQI: 151–200): These conditions represent a significant deterioration in air quality that affects all population groups. Outdoor activities should be limited, and sensitive groups should remain indoors whenever possible. Health effects may become more pronounced and widespread.
- Very Unhealthy (AQI: 201–300): This range signals a serious health risk for all populations. Emergency measures may be necessary, and all outdoor activities should be curtailed. Health effects can be immediate and severe, particularly for vulnerable populations.
- Hazardous (AQI: 301–500): These extreme levels constitute a health emergency. The entire population is at risk of experiencing serious health effects. Immediate action is required to protect public health, and authorities should consider implementing emergency measures such as closing schools and restricting outdoor activities.
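The six AQI bands listed above amount to a simple threshold lookup. The sketch below shows one way to express it; the function name `aqi_category` is hypothetical, and treating values above 500 as Hazardous is an assumption (the list stops at 500, while the station summary in this article reports maximum PM2.5 AQI values above that).

```python
# (upper bound of band, category name), in ascending order, as listed above.
AQI_BANDS = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy Level 1"),
    (200, "Unhealthy Level 2"),
    (300, "Very Unhealthy"),
    (500, "Hazardous"),
]


def aqi_category(aqi):
    """Map an AQI value to its category name."""
    if aqi < 0:
        raise ValueError("AQI must be non-negative")
    for upper, name in AQI_BANDS:
        if aqi <= upper:
            return name
    # Assumption: anything beyond the last listed band is still Hazardous.
    return "Hazardous"


print(aqi_category(102))  # Unhealthy Level 1
print(aqi_category(869))  # Hazardous
```

Fractional values, such as station means, fall into the first band whose upper bound they do not exceed, so a mean of 163.4 is reported as Unhealthy Level 2.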
4.4. Meteorological Parameters and Their Impact on Air Quality
- Temperature (TEMPERAT °C): Temperature variations significantly influence air pollution dynamics through vertical air movement and pollutant dispersion, chemical reaction rates of atmospheric pollutants, formation of ground-level ozone in warmer conditions, thermal inversion effects that can trap pollutants, and the impact on local wind patterns and pollution transport.
- Relative Humidity (RH %RH): Humidity levels affect air quality through several mechanisms: particle size and composition modifications, formation of secondary aerosols, visibility reduction in high humidity conditions, impact on photochemical reaction rates, and the influence on pollutant deposition rates.
- Wind Speed (WS m/s): Wind speed is a critical factor in pollutant dispersion. Higher speeds generally improve air quality through increased mixing, whereas low speeds can lead to pollutant accumulation. Wind speed also affects dust suspension and transport, pollutant concentration patterns, and atmospheric stability conditions.
- Wind Direction (WD Deg): Wind direction determines the pollutant transport patterns from source regions, the impact areas of industrial emissions, the urban pollution distribution patterns, the regional air quality influences, and the seasonal pollution patterns based on prevailing winds.
5. Visual Analysis of PM2.5 Distribution and Environmental Correlations
5.1. Environmental Feature Correlations with the AQI
5.2. Weather Patterns and Air Quality: Wind Conditions and Air Stagnation
5.3. Temporal and Seasonal Patterns
5.4. Insights from Visual Analysis: Implications for Model Optimization and Environmental Management
5.5. Industrial Proximity and Environmental Drivers of Air Quality in Kuwaiti Regions: A Multivariate Visual Analysis
5.5.1. Industrial Impact on Regional Air Quality
5.5.2. PM2.5 Concentration and Temperature Effects
5.5.3. Wind Patterns and Air Quality Correlation
5.5.4. PM2.5 Temporal Analysis and Seasonal Variations
5.5.5. Critical Air Pollutants Contributing to Severe Air Quality Degradation
5.6. Correlations and Other Statistics
- Correlation of +0.916 with PM2.5_24hr_avg.
- Correlation of +0.634 with PM2.5_24hr_max.
- Correlation of +0.489 with PM2.5 g/m3(L).
- Correlation of +0.410 with PM10_24hr_avg.
- Correlation of +0.380 with Combined_AQI.
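For readers who want to reproduce coefficients of this kind, the Pearson correlation between a feature column and the PM2.5_AQI target can be computed as follows. This is a minimal sketch: the function name `pearson_r` and the sample values are illustrative assumptions, not data from the monitoring stations.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Co-deviation of the two series and the squared deviations of each.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)

    return cov / math.sqrt(var_x * var_y)


# Hypothetical feature and target values, for illustration only.
feature = [10.0, 20.0, 30.0, 40.0]
target_aqi = [42.0, 68.0, 89.0, 112.0]
print(round(pearson_r(feature, target_aqi), 3))
```

A value near +1 means the feature rises almost linearly with the target, while a value near 0 means a linear model would gain little from that feature alone.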
5.7. Implications for Feature Selection and Model Development
6. ML Parameter Optimization and Experimental Results
6.1. Experiment 1: With All the Original Dataset Features
6.2. Experiment 2: Refined Model Analysis with Key Features
6.3. Experiment 3: Advanced Model Evaluation with Aggressive Pruning
6.4. Experiment 4: Model Performance Under Challenging Data Conditions
- The significantly reduced feature set led to poor performance across all models.
- Random Forest’s relative superiority suggests better resilience to limited features.
- The generally lower performance across all models underscores the importance of feature selection.
6.5. Experiment 5: Final Model Evaluation with Optimized Features
7. Research Result Analysis
- The strong correlation between PM2.5 and PM10 is partially explained by the inherent inclusion of PM2.5 particles within PM10 measurements. Given the high temperatures and relative humidity in Kuwait, atmospheric chemical reactions accelerate and generate secondary PM. This explains why the ML regression models in experiments with PM2.5-related and PM10-related features performed better than other experiments which did not contain them.
- The weak correlations between the gas features and the PM2.5 AQI target explain why the performance metrics in experiment 2 were not strong, while those in experiment 5 were better.
- (a)
- (b) Gradient Boosting, AdaBoost, and Tree were superior in modeling the air quality data, as the data fits the simple tree patterns of these ML models. Gradient Boosting and AdaBoost, which are ensembles of weak decision tree models, fall into the general ML group of Decision Trees.
- (c) Pruning the features with careful feature selection enhanced the performance of Gradient Boosting, AdaBoost, and Tree; in particular, their MSEs in experiment 5 (with only five features) were lower than in experiment 1.
- (d) The good performance obtained in experiments 1, 3, and 5 is attributed mainly to the presence of PM2.5 and PM10 features, which correlate well with PM2.5_AQI. However, experiment 4 shows poor performance by all eight ML models even though PM2.5 g/m3L was among the selected features, as that feature’s Pearson-R correlation with the target was only 0.489.
- (e) Comparing the selected features of experiment 4 (where the ML models performed poorly) and experiment 5 (where the ML models performed well), experiment 5 added PM2.5_24-h_average as a feature (refer to Figure 5), which was critical in boosting performance. Although the PM2.5 g/m3L readings and weather factors such as WS and temperature were important, given that we trimmed the dataset to only 13.4% of its original size, the 24 h average of the PM2.5 g/m3L samples was crucial in compensating for the aggressive pruning of data instances (rows).
- (f) Deep learning models such as LSTM were not needed, given the goodness of the obtained results.
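Item (e) credits the 24 h average of the PM2.5 readings with most of the gain in experiment 5. The article does not state how that column was derived, so the following is only a sketch under stated assumptions: readings are hourly with no gaps, the window is trailing (the current hour and the 23 before it), and the first hours average over however many samples exist so far. The function name `trailing_mean` is hypothetical.

```python
def trailing_mean(values, window=24):
    """Trailing mean over the most recent `window` samples.

    Assumes evenly spaced (e.g., hourly) samples without gaps. For the
    first `window - 1` positions the mean covers only the samples seen so far.
    """
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the sample that just left the window.
            running -= values[i - window]
        count = min(i + 1, window)
        out.append(running / count)
    return out


# Hypothetical hourly PM2.5 readings, shortened window for readability.
readings = [1.0, 2.0, 3.0, 4.0]
print(trailing_mean(readings, window=2))
```

A smoothed feature of this kind changes slowly from one hour to the next, which is one plausible reason it tracks a daily index more closely than a single raw reading does.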
8. Supporting the Formulation of Public Policies
8.1. Reducing Industrial Gas Flaring
8.2. Transitioning Away from Hydrocarbon-Based Electricity Generation
8.3. Expanding Adoption of Electric Vehicles (EVs)
8.4. Continue Electricity Imports from Neighboring Countries
9. Conclusions and Future Research
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
AQI Category | AQI Level | PM2.5 (µg/m3) | Ozone (O3) (ppm), 8-h avg | PM10 (µg/m3) | CO (ppm) | SO2 (ppm) | NO2 (ppm) |
---|---|---|---|---|---|---|---|
Good | 0–50 | 0–9 | 0.0–0.03 | 0.0–90 | 0.0–4.0 | 0.0–0.03 | 0.0–0.03 |
Moderate | 51–100 | 9.1–35.4 | 0.031–0.06 | 90.1–350.0 | 4.1–8.0 | 0.031–0.06 | 0.04–0.05 |
Unhealthy Level 1 | 101–150 | 35.5–55.4 | 0.061–0.092 | 350.1–431.1 | 8.1–11.7 | 0.061–0.182 | 0.06–0.30 |
Unhealthy Level 2 | 151–200 | 55.5–125.4 | 0.093–0.124 | 431.4–512.5 | 11.8–15.4 | 0.183–0.304 | 0.31–0.55 |
Very Unhealthy | 201–300 | 125.5–225.4 | 0.125–0.374 | 512.6–675.0 | 15.5–30.4 | 0.305–0.604 | 0.56–1.04 |
Hazardous | 301–500 | 225.5+ | 0.375–0.504 | 675.1–1000 | 30.5–50.4 | 0.605–1.004 | 1.05–2.04 |
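The table above pairs each AQI range with a PM2.5 concentration range. The article does not state how a concentration is converted to an index within a band, so the sketch below assumes the piecewise-linear interpolation used in the US EPA AQI convention, AQI = (I_high − I_low) / (C_high − C_low) × (C − C_low) + I_low, applied to the PM2.5 column of this table. The function name `pm25_to_aqi` and the rounding of the concentration to one decimal are assumptions. The Hazardous band is listed without an upper concentration ("225.5+"), so the sketch does not interpolate there.

```python
# (C_low, C_high, I_low, I_high) from the PM2.5 column of the table above.
# The open-ended Hazardous band (225.5+) is omitted: no upper bound is given.
PM25_BREAKPOINTS = [
    (0.0, 9.0, 0, 50),
    (9.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 125.4, 151, 200),
    (125.5, 225.4, 201, 300),
]


def pm25_to_aqi(concentration):
    """Convert a PM2.5 concentration to an AQI value by linear interpolation."""
    # Assumption: round to one decimal so values cannot fall between bands.
    c = round(concentration, 1)
    if c < 0:
        raise ValueError("concentration must be non-negative")
    for c_low, c_high, i_low, i_high in PM25_BREAKPOINTS:
        if c_low <= c <= c_high:
            slope = (i_high - i_low) / (c_high - c_low)
            return round(slope * (c - c_low) + i_low)
    raise ValueError("concentration falls in the open-ended Hazardous band")


print(pm25_to_aqi(45.0))  # lands in the 101-150 band
```

Under this convention the index is continuous within each band and jumps by one point at each band boundary.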
Air Quality Station | Mean PM2.5 AQI | Max PM2.5 AQI | Mean PM2.5 (µg/m3) | Max PM2.5 (µg/m3) | Mean PM10 (µg/m3) | Max PM10 (µg/m3) | Mean SO2 (ppb) | Mean H2S (ppb) | Mean NO (ppb) | Mean NO2 (ppb) | Mean CO (ppm) | Mean CO2 (ppm) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
All | 102 | 869 | 39 | 2096 | 134 | 16,461 | 7.3 | 4.3 | 13.7 | 28.4 | 0.75 | 435.2 |
Ahmadi | 122 | 666 | 52 | 1398 | 106 | 3515 | 11.4 | 7.0 | 9.4 | 38.5 | 1.00 | 423.4 |
AlSalam | 94 | 521 | 33.7 | 1491 | 111 | 8448 | 10.0 | 5.5 | 13.3 | 23.8 | 0.91 | 426.7 |
Ali Sabah AlSalem | 115 | 553 | 44 | 1345 | 193 | 6731 | 5.0 | 3.4 | 9.6 | 30.3 | 0.78 | 430.9 |
Cerex2 | 90 | 197 | 30.9 | 163 | 126.3 | 1451 | 13.6 | - | 23.3 | 42.3 | - | - |
Fahaheel | 105 | 869 | 40 | 1315 | 114 | 10,881 | 9.2 | 3.2 | 20.0 | 35.0 | 0.75 | 409.3 |
Jahra | 78.3 | 197 | 26 | 423 | 137 | 4071 | 6.0 | 1.7 | 12.5 | 15.9 | 0.31 | 429.3 |
Mansouria | 96.6 | 416 | 34.4 | 978 | 95 | 3968 | 4.3 | 3.5 | 9.3 | 27.3 | 0.43 | 425.0 |
Mutla | 110.4 | 815 | 45.1 | 2096 | 254 | 16,461 | 3.4 | 3.9 | 15.7 | 23.3 | 1.37 | 432.3 |
Qurain | 98 | 496 | 36.2 | 1759 | - | - | 4.8 | 3.4 | 12.7 | 31.1 | 0.36 | - |
Rumaithya | 93 | 469 | 32.4 | 1200 | 132 | 2928 | 7.4 | 2.7 | 9.6 | 17.3 | 0.50 | 440.8 |
Saad AlAbdulla | 163.4 | 504 | 82.2 | 1198 | 106 | 1474 | 6.2 | 5.4 | 18.8 | 30.9 | 0.47 | 408.6 |
Shuaiba | 105 | 216 | 37.9 | 581 | 81.3 | 2571 | 10.0 | 4.0 | 27.7 | 38.8 | 1.28 | 558.7 |
Shuwaikh | 76 | 179 | 25.7 | 221 | 123 | 2190 | 10.0 | 9.5 | 14.4 | 36.5 | 0.80 | 425.3 |
ML Method | Parameter Settings |
---|---|
Random Forest | Number of trees = 800; Limit depth of individual trees = 3 |
k-Nearest Neighbors (kNN) | Number of neighbors = 2; Metric = Euclidean; Weight = Distance |
Neural Networks | Neurons in hidden layers = 40, 70, 50, 10; Activation = Logistic; Solver = Adam; Regularization parameter ; Max iterations = 200 |
Decision Tree | Minimum instances in leaves = 3; Maximum tree depth = 10 |
Linear Regression | Fit intercept; Elastic Net regularization L1:L2 = 0.37:0.63 |
AdaBoost | Number of estimators = 5; Learning rate = 0.51; Classification algorithm = SAMME; Regression loss = Linear |
Gradient Boosting | Number of trees = 8; Learning rate = 1; Limit depth of individual trees = 9; Do not split subsets smaller than 2 |
Stochastic Gradient Descent (SGD) | Classification: ε-insensitive, ε = …; Regression: Squared loss; Regularization = Ridge (L2) |
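For readers working in scikit-learn, the settings in the table above could be approximated with the keyword arguments below. This is a hedged translation rather than the configuration used in the study: the mapping from each table entry to a scikit-learn estimator and parameter name is an assumption, settings whose values are missing from the table (the neural network regularization parameter and the SGD row) are left out, and the dictionary name `SKLEARN_SETTINGS` is hypothetical.

```python
# Assumed scikit-learn equivalents of the tabulated settings.
# Keys are scikit-learn regressor class names; values are constructor kwargs.
SKLEARN_SETTINGS = {
    "RandomForestRegressor": {"n_estimators": 800, "max_depth": 3},
    "KNeighborsRegressor": {
        "n_neighbors": 2,
        "metric": "euclidean",
        "weights": "distance",
    },
    "MLPRegressor": {
        "hidden_layer_sizes": (40, 70, 50, 10),
        "activation": "logistic",
        "solver": "adam",
        "max_iter": 200,
    },
    "DecisionTreeRegressor": {"min_samples_leaf": 3, "max_depth": 10},
    # Elastic Net with an L1:L2 mix of 0.37:0.63 corresponds to l1_ratio=0.37.
    "ElasticNet": {"fit_intercept": True, "l1_ratio": 0.37},
    "AdaBoostRegressor": {
        "n_estimators": 5,
        "learning_rate": 0.51,
        "loss": "linear",
    },
    "GradientBoostingRegressor": {
        "n_estimators": 8,
        "learning_rate": 1.0,
        "max_depth": 9,
        "min_samples_split": 2,
    },
}

for name, kwargs in SKLEARN_SETTINGS.items():
    print(name, kwargs)
```

Each entry could then be passed to the matching estimator, for example `RandomForestRegressor(**SKLEARN_SETTINGS["RandomForestRegressor"])`, with unspecified parameters left at the library defaults.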
ML Method | MSE | RMSE | MAE | R2 |
---|---|---|---|---|
kNN | 1493.768 | 38.649 | 25.132 | 0.177 |
Tree | 3.526 | 1.878 | 0.116 | 0.998 |
SVM | 1840.104 | 42.896 | 31.323 | 0.014 |
Random Forest | 116.603 | 10.798 | 6.935 | 0.936 |
Neural Network | 1814.913 | 42.602 | 30.303 | 0.000 |
Linear Regression | 773.726 | 27.816 | 21.515 | 0.574 |
Gradient Boosting | 0.981 | 0.990 | 0.141 | 0.999 |
AdaBoost | 1.510 | 1.229 | 0.044 | 0.999 |
ML Method | MSE | RMSE | MAE | R2 |
---|---|---|---|---|
kNN | 1671.76 | 40.89 | 25.901 | 0.100 |
Tree | 695.77 | 26.38 | 15.172 | 0.626 |
SVM | 1880.46 | 43.364 | 31.291 | 0.012 |
Random Forest | 984.086 | 31.370 | 21.078 | 0.470 |
Neural Network | 1858.395 | 43.109 | 30.284 | 0.000 |
Linear Regression | 1371.489 | 37.034 | 26.593 | 0.262 |
Gradient Boosting | 754.689 | 27.472 | 15.783 | 0.594 |
AdaBoost | 612.396 | 24.747 | 14.071 | 0.670 |
ML Method | MSE | RMSE | MAE | R2 |
---|---|---|---|---|
kNN | 43.630 | 6.605 | 4.005 | 0.976 |
Tree | 1.379 | 1.174 | 0.102 | 0.999 |
SVM | 488.034 | 22.091 | 16.214 | 0.731 |
Random Forest | 116.963 | 10.815 | 6.940 | 0.936 |
Neural Network | 324.129 | 18.004 | 1.873 | 0.821 |
Linear Regression | 790.053 | 28.108 | 21.624 | 0.565 |
Gradient Boosting | 1.182 | 1.087 | 0.125 | 0.999 |
AdaBoost | 1.120 | 1.058 | 0.032 | 0.999 |
ML Method | MSE | RMSE | MAE | R2 |
---|---|---|---|---|
kNN | 1322.122 | 36.361 | 23.293 | 0.271 |
Tree | 1296.360 | 36.005 | 21.981 | 0.286 |
SVM | 1677.427 | 40.956 | 30.034 | 0.076 |
Random Forest | 1131.302 | 33.635 | 21.647 | 0.377 |
Neural Network | 1814.903 | 42.602 | 30.302 | −0.000 |
Linear Regression | 1493.905 | 38.651 | 27.403 | 0.177 |
Gradient Boosting | 1524.831 | 39.049 | 23.076 | 0.160 |
AdaBoost | 1395.816 | 37.361 | 22.960 | 0.231 |
ML Method | MSE | RMSE | MAE | R2 |
---|---|---|---|---|
kNN | 56.554 | 7.520 | 2.993 | 0.969 |
Tree | 1.383 | 1.176 | 0.102 | 0.999 |
SVM | 765.157 | 27.661 | 20.138 | 0.578 |
Random Forest | 117.367 | 10.834 | 6.953 | 0.935 |
Neural Network | 870.577 | 29.506 | 11.932 | 0.520 |
Linear Regression | 697.105 | 26.403 | 21.002 | 0.616 |
Gradient Boosting | 0.775 | 0.880 | 0.118 | 1.000 |
AdaBoost | 0.713 | 0.844 | 0.030 | 1.000 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Alrashidi, H.; Sibai, F.N.; Abonamah, A.; Alrashidi, M.; Alsaber, A. PM2.5: Air Quality Index Prediction Using Machine Learning: Evidence from Kuwait’s Air Quality Monitoring Stations. Sustainability 2025, 17, 9136. https://doi.org/10.3390/su17209136