Machine Learning for Urban Air Quality Prediction Using Google AlphaEarth Foundations Satellite Embeddings: A Case Study of Quito, Ecuador
Abstract
Highlights
- Machine learning using Google AlphaEarth Foundations satellite embeddings in Google Earth Engine accurately predicted NO2 and SO2 concentrations in Quito (R2 = 0.71), capturing fine-scale pollution patterns at 10 m resolution.
- SHAP analysis revealed that only a small subset of embedding bands drives accurate predictions, demonstrating that compact, globally consistent features can explain urban air quality dynamics without handcrafted indices or auxiliary datasets.
- Embedding-based remote sensing models provide a scalable solution for urban air quality monitoring in the Global South, overcoming sparse ground stations and persistent cloud cover.
- The approach supports policy-relevant applications such as hotspot detection, trend analysis, and sustainable urban planning, offering transferable methods for data-scarce cities worldwide.
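
The SHAP highlight above rests on Shapley attributions: each embedding band receives a share of the difference between a prediction and a reference prediction. The sketch below is a minimal illustration under stated assumptions, not the authors' pipeline. It uses a made-up linear model over four hypothetical bands, a single all-zero baseline, and exact enumeration of feature subsets, whereas the SHAP library relies on faster approximations and background datasets. The names `shapley_values`, `toy_model`, `WEIGHTS`, and `pixel` are invented for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value; the rest stay at baseline.
        mixed = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy model: four made-up "bands", only two carry signal.
WEIGHTS = [2.0, 0.0, -1.5, 0.0]
BIAS = 20.0


def toy_model(features):
    return BIAS + sum(w * f for w, f in zip(WEIGHTS, features))


pixel = [0.4, 0.9, -0.2, 0.1]      # made-up embedding values for one pixel
baseline = [0.0, 0.0, 0.0, 0.0]    # assumed reference input
phi = shapley_values(toy_model, pixel, baseline)
print([round(p, 3) for p in phi])
```

In this toy case the two bands with zero weight receive zero attribution, and the attributions sum to the gap between the prediction for `pixel` and the prediction for `baseline`, which is the property that lets a small subset of bands be identified as the drivers of a prediction.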
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Ground-Based Air Quality Data (REEMAQ)
2.3. Satellite Embeddings (A00–A63)
2.4. Machine Learning Models and Evaluation
2.5. Feature Importance Analysis (SHAP)
3. Results
3.1. Analysis of Ground-Based REEMAQ Data
3.2. Machine Learning Model Performance
4. Discussion
4.1. Main Findings
4.2. Comparison with Existing Studies
4.3. SHAP Interpretability
4.4. Strengths, Novelty, Policy Relevance and Transferability
4.5. Integrated Limitations
- Sparse monitoring network—Quito has only nine stations, constraining representativeness and increasing spatial-interpolation uncertainty.
- Annual aggregation—the use of annual embeddings smooths short-term meteorological and photochemical variability, lowering predictive skill for pollutants such as O3 that depend on day-to-day processes.
- Topographic complexity—Quito’s setting in a high Andean valley (~2850 m) surrounded by steep mountains promotes thermal inversions and weak circulation that trap pollutants [49]. Model smoothing across steep terrain and limited station density resulted in some apparent NO2 spill-over into mountain slopes, an artefact also observed in other mountainous regions where satellite-based NO2 retrievals often correlate poorly with surface concentrations due to vertical-profile uncertainties and representation errors [50,51,52].
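
The annual-aggregation limitation can be illustrated with a short sketch. It assumes hypothetical station readings stored as `(station, date, value)` tuples and groups them into station-year means, the granularity at which an annual embedding could be paired with a ground observation. The station name, dates, and concentrations are invented, and the function `annual_means` is not taken from the study.

```python
from collections import defaultdict
from statistics import mean, pstdev


def annual_means(readings):
    """Group (station, 'YYYY-MM-DD', value) readings into station-year summaries.

    Returns {(station, year): (mean, population std dev, count)}.
    """
    groups = defaultdict(list)
    for station, date, value in readings:
        year = int(date[:4])
        groups[(station, year)].append(value)
    return {key: (mean(vals), pstdev(vals), len(vals)) for key, vals in groups.items()}


# Made-up O3 readings (µg/m3) for a hypothetical station "S1".
readings = [
    ("S1", "2022-01-10", 18.0),
    ("S1", "2022-06-15", 42.0),
    ("S1", "2022-09-20", 30.0),
    ("S1", "2023-02-05", 25.0),
    ("S1", "2023-08-12", 35.0),
]
summary = annual_means(readings)
for key, (avg, spread, count) in sorted(summary.items()):
    print(key, round(avg, 2), round(spread, 2), count)
```

Both invented years share the same annual mean while their within-year spread differs, so a model trained only on annual targets cannot see that difference. This is the kind of short-term variability that matters for pollutants such as O3.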
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| REEMAQ | Red Metropolitana de Monitoreo Atmosférico de Quito |
| PM2.5 | Particulate Matter with aerodynamic diameter ≤2.5 μm |
| NO2 | Nitrogen Dioxide |
| SO2 | Sulfur Dioxide |
| O3 | Ozone |
| CO | Carbon Monoxide |
| AEF | AlphaEarth Foundations |
| ERA5 | ECMWF Reanalysis v5 |
| SVR | Support Vector Regression |
| SHAP | Shapley Additive Explanations |
References
1. Kim, S.Y.; Kerr, G.H.; van Donkelaar, A.; Martin, R.V.; West, J.J.; Anenberg, S.C. Tracking air pollution and CO2 emissions in 13,189 urban areas worldwide using large geospatial datasets. Commun. Earth Environ. 2025, 6, 311.
2. Zalakeviciute, R.; Lopez-Villada, J.; Ochoa, A.; Moreno, V.; Byun, A.; Proaño, E.; Mejía, D.; Bonilla-Bedoya, S.; Rybarczyk, Y.; Vallejo, F. Urban Air Pollution in the Global South: A Never-Ending Crisis? Atmosphere 2025, 16, 487.
3. Kushwaha, M.; Mehta, S.; Arora, P.; Dye, T.; Matte, T. Integrated Use of Low-Cost Sensors to Strengthen Air Quality Management; Vital Strategies: New York, NY, USA, 2022; Available online: https://www.vitalstrategies.org/resources/integrated-use-of-low-cost-sensors-to-strengthen-air-quality-management-in-indian-cities/ (accessed on 16 August 2025).
4. Castell, N.; Dauge, F.R.; Schneider, P.; Vogt, M.; Lerner, U.; Fishbain, B.; Broday, D.; Bartonova, A. Can commercial low-cost sensor platforms contribute to air quality monitoring and exposure estimates? Environ. Int. 2017, 99, 293–302.
5. World Health Organization. Types of Pollutants. In Air Quality, Energy and Health; World Health Organization; Available online: https://www.who.int/teams/environment-climate-change-and-health/air-quality-and-health/health-impacts/types-of-pollutants (accessed on 16 August 2025).
6. Institute of Environmental Science and Research (ESR). Health Effects of Air Pollution; ESR: Porirua, New Zealand, 2022; Available online: https://www.phfscience.nz/media/cofl2ahi/esr-environmental-health-report-health-effects-pollution.pdf (accessed on 16 August 2025).
7. Vallejo, F.; Villacrés, P.; Yánez, D.; Espinoza, L.; Bodero-Poveda, E.; Díaz-Robles, L.A.; Oyaneder, M.; Campos, V.; Palmay, P.; Cordovilla-Pérez, A.; et al. Prolonged Power Outages and Air Quality: Insights from Quito’s 2023–2024 Energy Crisis. Atmosphere 2025, 16, 274.
8. Secretaría de Ambiente del Distrito Metropolitano de Quito. Red Metropolitana de Monitoreo de la Calidad del Aire (REMMAQ). This Platform Provides Real-Time Monitoring and Analysis of Air Quality (e.g., PM10, PM2.5, NO2, SO2, O2, CO, and VOCs) Across Quito. Available online: https://ambiente.quito.gob.ec/red-metropolitana-de-monitoreo-de-la-calidad-del-aire/ (accessed on 16 August 2025).
9. Alvarez-Mendoza, C.I.; Teodoro, A.; Ramirez-Cando, L. Improving NDVI by removing cirrus clouds with optical remote sensing data from Landsat-8—A case study in Quito, Ecuador. Remote Sens. Appl. Soc. Environ. 2019, 13, 257–274.
10. Alvarez-Mendoza, C.I.; Teodoro, A.; Ramirez-Cando, L. Spatial estimation of surface ozone concentrations in Quito Ecuador with remote sensing data, air pollution measurements and meteorological variables. Environ. Monit. Assess. 2019, 191, 155.
11. Alvarez-Mendoza, C.I.; Teodoro, A.C.; Torres, N.; Vivanco, V. Assessment of Remote Sensing Data to Model PM10 Estimation in Cities with a Low Number of Air Quality Stations: A Case of Study in Quito, Ecuador. Environments 2019, 6, 85.
12. Rolf, E.; Proctor, J.; Carleton, T.; Bolliger, I.; Shankar, V.; Ishihara, M.; Recht, B.; Hsiang, S. A generalizable and accessible approach to machine learning with global satellite imagery. Nat. Commun. 2021, 12, 4392.
13. Brown, C.F.; Kazmierski, M.R.; Pasquarella, V.J.; Rucklidge, W.J.; Samsikova, M.; Zhang, C.; Shelhamer, E.; Lahera, E.; Wiles, O.; Ilyushchenko, S.; et al. AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data. arXiv 2025, arXiv:2507.22291.
14. Kozlov, M. Google AI model creates maps of Earth ‘at any place and time’. Nature 2025, 644, 313–314.
15. Tang, D.; Zhan, Y.; Yang, F. A review of machine learning for modeling air quality: Overlooked but important issues. Atmos. Res. 2024, 300, 107261.
16. Agbehadji, I.E.; Obagbuwa, I.C. Systematic Review of Machine Learning and Deep Learning Techniques for Spatiotemporal Air Quality Prediction. Atmosphere 2024, 15, 1352.
17. Méndez, M.; Merayo, M.G.; Núñez, M. Machine learning algorithms to forecast air quality: A survey. Artif. Intell. Rev. 2023, 56, 10031–10066.
18. Xu, Z.; Zhang, H.; Zhai, A.; Kong, C.; Zhang, J. Stacking Ensemble Learning and SHAP-Based Insights for Urban Air Quality Forecasting: Evidence from Shenyang and Global Implications. Atmosphere 2025, 16, 776.
19. Tao, C.; Zhang, Q.; Huo, S.; Ren, Y.; Han, S.; Wang, Q.; Wang, W. PM2.5 pollution modulates the response of ozone formation to VOC emitted from various sources: Insights from machine learning. Sci. Total Environ. 2024, 916, 170009.
20. Alvarez, C.I.; López, S.; Vásquez, D.; Gualotuña, D. Assessing Air Quality Dynamics during Short-Period Social Upheaval Events in Quito, Ecuador, Using a Remote Sensing Framework. Remote Sens. 2024, 16, 3436.
21. Secretaria de Ambiente de Quito. DATOS HISTÓRICOS REMMAQ (2004–2024)—Historic Air Quality Data. Available online: https://datosambiente.quito.gob.ec/ (accessed on 16 August 2025).
22. Google DeepMind. AlphaEarth Foundations Helps Map Our Planet in Unprecedented Detail. Discover (DeepMind Blog), 2025. Available online: https://deepmind.google/discover/blog/alphaearth-foundations-helps-map-our-planet-in-unprecedented-detail/ (accessed on 16 August 2025).
23. Stamou, A.; Karachaliou, E.; Tavantzis, I.; Bakousi, A.; Dosiou, A.; Tsifodimou, Z.-E.; Stylianidis, E. Satellite Imagery for Comprehensive Urban Morphology and Surface Roughness Analysis: Leveraging GIS Tools and Google Earth Engine for Sustainable Urban Planning. Urban Sci. 2025, 9, 213.
24. Mohamadi, B.; Abu, G.; Mohamed, O.; Li, H.; Al-Sabbagh, T.A.; Younes, A. Integrating InSAR coherence and air pollution detection satellites to study the impact of war on air quality. Int. J. Appl. Earth Obs. Geoinf. 2025, 142, 104687.
25. Chatterjee, K.; Kumar, S.S.; Kumar, R.P.; Bandyopadhyay, A.; Swain, S.; Mallik, S.; Al-Rasheed, A.; Abbas, M.; Soufiene, B.O. Future Air Quality Prediction Using Long Short-Term Memory Based on Hyper Heuristic Multi-Chain Model. IEEE Access 2024, 12, 123678–123693.
26. Khattab, I.G.; Ali, M.C.; Abonazel, M.R.; Elshamy, H.M.; Azazy, A.R. Air Quality Forecasting Based on Socio-Economic Environmental Indicators: Combining Statistical Machine Learning Techniques. Int. J. Anal. Appl. 2025, 23, 183.
27. Chen, J.; Zhu, S.; Wang, P.; Zheng, Z.; Shi, S.; Li, X.; Xu, C.; Yu, K.; Chen, R.; Kan, H.; et al. Predicting particulate matter, nitrogen dioxide, and ozone across Great Britain with high spatiotemporal resolution based on random forest models. Sci. Total Environ. 2024, 926, 171831.
28. Alfasanah, Z.; Niam, M.Z.H.; Wardiani, S.; Ahsan, M.; Lee, M.H. Monitoring air quality index with EWMA and individual charts using XGBoost and SVR residuals. MethodsX 2025, 14, 103107.
29. Alhathloul, S.H.; Mishra, A.K.; Khan, A.A. Low visibility event prediction using random forest and K-nearest neighbor methods. Theor. Appl. Climatol. 2024, 155, 1289–1300.
30. Singh, S.; Kumar, M.; Verma, B.K.; Kumar, S. Optimizing Air Pollution Prediction with Random Forest Algorithm. Aerosol Sci. Eng. 2025.
31. Sawah, M.S.; Elmannai, H.; El-Bary, A.A.; Lotfy, K.; Sheta, O.E. Improving air quality prediction using hybrid BPSO with BWAO for feature selection and hyperparameters optimization. Sci. Rep. 2025, 15, 13176.
32. Yao, T.; Lu, S.; Wang, Y.; Li, X.; Ye, H.; Duan, Y.; Fu, Q.; Li, J. Revealing the drivers of surface ozone pollution by explainable machine learning and satellite observations in Hangzhou Bay, China. J. Clean. Prod. 2024, 440, 140938.
33. World Health Organization. WHO Global Air Quality Guidelines: Particulate Matter (PM2.5 and PM10), Ozone, Nitrogen Dioxide, Sulfur Dioxide and Carbon Monoxide; World Health Organization: Geneva, Switzerland, 2021. Available online: https://www.ncbi.nlm.nih.gov/books/NBK574594/?utm_source=chatgpt.com (accessed on 10 August 2025).
34. Parra, R. Modeling PM2.5 Levels Due to Combustion Activities and Fireworks in Quito (Ecuador) for Forecasting Using WRF-Chem. Atmosphere 2025, 16, 495.
35. Cazorla, M.; Trujillo, M.; Seguel, R.; Gallardo, L. Comparative ozone production sensitivity to NOX and VOCs in Quito, Ecuador, and Santiago, Chile. Atmos. Chem. Phys. 2025, 25, 7087–7109.
36. Rowley, A.; Karakuş, O. Predicting air quality via multimodal AI and satellite imagery. Remote Sens. Environ. 2023, 293, 113609.
37. Mejía, C.D.; Faican, G.; Zalakeviciute, R.; Matovelle, C.; Bonilla, S.; Sobrino, J.A. Spatio-temporal evaluation of air pollution using ground-based and satellite data during COVID-19 in Ecuador. Heliyon 2024, 10, e28152.
38. Chau, P.N.; Zalakeviciute, R.; Thomas, I.; Rybarczyk, Y. Deep Learning Approach for Assessing Air Quality During COVID-19 Lockdown in Quito. Front. Big Data 2022, 5, 842455.
39. Tavella, R.A.; das Neves, D.F.; Silveira, G.d.O.; Vieira de Azevedo, G.M.G.; Brum, R.d.L.; Bonifácio, A.d.S.; Machado, R.A.; Brum, L.W.; Buffarini, R.; Adamatti, D.F.; et al. The Relationship Between Surface Meteorological Variables and Air Pollutants in Simulated Temperature Increase Scenarios in a Medium-Sized Industrial City. Atmosphere 2025, 16, 363.
40. Lakra, K.; Avishek, K. Influence of meteorological variables and air pollutants on fog/smog formation in seven major cities of Indo-Gangetic Plain. Environ. Monit. Assess. 2024, 196, 533.
41. Kassem, H.; El Hajjar, S.; Abdallah, F.; Omrani, H. Multi-view deep embedded clustering: Exploring a new dimension of air pollution. Eng. Appl. Artif. Intell. 2025, 139, 109509.
42. Jiménez-Navarro, M.J.; Martínez-Ballesteros, M.; Martínez-Álvarez, F.; Asencio-Cortés, G. Explaining deep learning models for ozone pollution prediction via embedded feature selection. Appl. Soft Comput. 2024, 157, 111504.
43. Morillas, C.; Alvarez, S.; Serio, C.; Masiello, G.; Martinez, S. TROPOMI NO2 Sentinel-5P data in the Community of Madrid: A detailed consistency analysis with in situ surface observations. Remote Sens. Appl. Soc. Environ. 2024, 33, 101083.
44. Alvarez-Mendoza, C.I. The Use of Remote Sensing in Air Pollution Control and Public Health. In Socio-Environmental Research in Latin America; López, S., Ed.; The Latin American Studies Book Series; Springer: Cham, Switzerland, 2023.
45. Alvarez-Mendoza, C.I.; Teodoro, A.; Freitas, A.; Fonseca, J. Spatial estimation of chronic respiratory diseases based on machine learning procedures—An approach using remote sensing data and environmental variables in Quito, Ecuador. Appl. Geogr. 2020, 123, 102273.
46. Fraunhofer Institute for Applied Optics and Precision Engineering IOF. ESA Mission Sentinel 5 Launches with Optics from Jena; Fraunhofer Institute for Applied Optics and Precision Engineering IOF: Jena, Germany, 2025.
47. De Vito, S.; Del Giudice, A.; D’Elia, G.; Esposito, E.; Fattoruso, G.; Ferlito, S.; Formisano, F.; Loffredo, G.; Massera, E.; D’Auria, P.; et al. Future Low-Cost Urban Air Quality Monitoring Networks: Insights from the EU’s AirHeritage Project. Atmosphere 2024, 15, 1351.
48. Connolly, R.E.; Yu, Q.; Wang, Z.; Chen, Y.-H.; Liu, J.Z.; Collier-Oxandale, A.; Papapostolou, V.; Polidori, A.; Zhu, Y. Long-term evaluation of a low-cost air sensor network for monitoring indoor and outdoor air quality at the community scale. Sci. Total Environ. 2022, 807, 150797.
49. Mancheno, G.; Jorquera, H. High spatial resolution WRF-Chem modeling in Quito, Ecuador. Environ. Sci. Adv. 2025, 4, 1310–1332.
50. Chang, B.; Liu, H.; Zhang, C.; Xing, C.; Tan, W.; Liu, C. Relating satellite NO2 tropospheric columns to near-surface concentrations: Implications from ground-based MAX-DOAS NO2 vertical profile observations. npj Clim. Atmos. Sci. 2025, 8, 1.
51. Cazorla, M.; Gallardo, L.; Jimenez, R. The complex Andes region needs improved efforts to face climate extremes. Elem. Sci. Anth. 2022, 10, 92.
52. Rijsdijk, P.; Eskes, H.; Dingemans, A.; Boersma, K.F.; Sekiya, T.; Miyazaki, K.; Houweling, S. Quantifying uncertainties in satellite NO2 superobservations for data assimilation and model evaluation. Geosci. Model Dev. 2025, 18, 483–509.

| Pollutant | Model | No. Train | No. Test | MAE (µg/m3) | RMSE (µg/m3) | R2 |
|---|---|---|---|---|---|---|
| CO | SVR | 42 | 18 | 0.06 | 0.07 | 0.61 |
| CO | Gradient Boosting | 42 | 18 | 0.07 | 0.08 | 0.48 |
| NO2 | SVR | 42 | 18 | 2.53 | 2.91 | 0.71 |
| NO2 | KNN | 42 | 18 | 2.33 | 2.92 | 0.71 |
| O3 | Random Forest | 48 | 21 | 3.78 | 4.56 | −0.02 |
| O3 | Ridge | 48 | 21 | 3.67 | 4.60 | −0.04 |
| PM2.5 | Ridge | 49 | 22 | 1.20 | 1.57 | 0.55 |
| PM2.5 | Elastic Net | 49 | 22 | 1.21 | 1.57 | 0.55 |
| SO2 | SVR | 44 | 19 | 0.28 | 0.39 | 0.71 |
| SO2 | Random Forest | 44 | 19 | 0.36 | 0.43 | 0.66 |

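
The three metrics in the performance table can be reproduced with a few lines of code. The sketch below assumes the standard definitions: MAE as the mean absolute error, RMSE as the root of the mean squared error, and R2 as one minus the ratio of residual to total sum of squares. The source does not state which implementation was used, and the observed and predicted values here are invented.

```python
from math import sqrt


def regression_metrics(observed, predicted):
    """Return (MAE, RMSE, R2) for paired observed and predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = sqrt(sum(e * e for e in errors) / n)
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Made-up concentrations (µg/m3) for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
good = [11.0, 12.0, 13.0, 16.0]         # tracks the observations closely
poor = [14.0, 14.0, 12.0, 12.0]         # worse than predicting the mean
mean_only = [13.0, 13.0, 13.0, 13.0]    # always predicts the observed mean

for name, pred in [("good", good), ("poor", poor), ("mean_only", mean_only)]:
    mae, rmse, r2 = regression_metrics(observed, pred)
    print(name, round(mae, 3), round(rmse, 3), round(r2, 3))
```

Under this definition, a model that always predicts the mean of the test observations scores R2 = 0, so the slightly negative O3 values in the table indicate predictions marginally worse than that constant baseline.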
| Study | Location | Data/Method | Pollutant(s) | Reported R2 | Resolution | 
|---|---|---|---|---|---|
| Alvarez-Mendoza et al. (2019) [10] | Quito | Landsat + meteorological regression | O3 | 0.55 | 30 m | 
| Mejía et al. (2024) [37] | Quito | Sentinel-5P + land-use regression | NO2 | 0.40–0.60 | 1 km | 
| Chau et al. (2022) [38] | Quito | Deep learning + Sentinel-5P | PM2.5, NO2 | 0.45–0.65 | 1 km | 
| Chen et al. (2024) [27] | Great Britain | Random Forest + multiple predictors | NO2, O3 | 0.75–0.80 | 1 km | 
| This study | Quito | AlphaEarth embeddings + SVR | NO2, SO2 | 0.71 | 10 m | 

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Alvarez, C.I.; Ulloa Vaca, C.A.; Echeverria Llumipanta, N.A. Machine Learning for Urban Air Quality Prediction Using Google AlphaEarth Foundations Satellite Embeddings: A Case Study of Quito, Ecuador. Remote Sens. 2025, 17, 3472. https://doi.org/10.3390/rs17203472