A Multi-Source Sensor Dataset for Spain: Integrating Air Quality, Meteorological, Mobility and Calendar Records
Abstract
1. Introduction
2. Related Work
2.1. Open Air Quality Data and the Integration Gap
2.2. Multi-Source Data Fusion for Air Quality
2.3. Mobile-Network Data as a Sensor for Human Activity
3. Materials and Methods
3.1. Overview of the Integration Pipeline
1. Metadata consolidation. The yearly station inventory files published by MITECO are read and deduplicated by station code, producing a single station registry that contains coordinates, province and municipality codes, the area type classification, and the dominant emission source label. The same operation is applied to the AEMET station inventory obtained from the OpenData API. The remaining two sources need no consolidation step at this stage. The MITMA mobility records are keyed directly by the five-digit INE municipality code, which is constructed from the province and municipality codes already present in the MITECO registry, so no additional inventory has to be acquired. The public holiday calendar is keyed by date and Autonomous Community (CCAA) code, and the CCAA code is derived from the same MITECO registry through the official INE province-to-CCAA mapping.
2. Per-source acquisition and cleaning. Each of the four sources is acquired and processed independently through its own data interface. MITECO air quality data are downloaded as bulk annual CSV files from the agency’s open data portal. AEMET meteorological data are obtained from the OpenData REST API, with paginated calls that respect the maximum six-month window and the rate limit of approximately 50 requests per minute. MITMA mobility data are downloaded as monthly archives of compressed CSV files from the agency’s open mobility portal. The public holiday calendar is constructed with the Python holidays package, which encodes both the national list and the Autonomous Community lists for the study period. The cleaning operations performed on each source are described in detail in Sections 3.2.1 to 3.2.4. The output of this stage is a set of cleaned per-station and per-municipality time series with continuous datetime indices, one row per native time unit and one column per published variable.
3. Spatial association. Each MITECO air quality station is linked to the relevant record in every other source through identifiers derived from its inventory metadata. For meteorology, the MITECO and AEMET networks use independent station codes with no shared identifier, so each MITECO station is paired with an AEMET station through a three-tier rule based on great-circle distance, computed with the Haversine formula [40]. The first tier accepts the nearest AEMET station if it lies within a 50 km threshold; the second tier restricts the search to the same INE province if the first tier fails; and the third tier accepts the nationwide nearest station as an unconditional fallback. The rule and the rationale for the 50 km cut-off are described in detail in Section 3.3.1. The resulting empirical distribution of pair distances is reported in Section 5, which shows that every MITECO station in the network is matched under the first tier and that the median pair distance is below 5 km. The output of this stage is a lookup table that records, for each MITECO station, the assigned AEMET station, the Haversine distance between the two, and the tier under which the match was made. For mobility, the five-digit INE municipality code of each MITECO station is constructed by concatenating its province and municipality codes, and the MITMA records of that municipality are then attached to the station; all stations within the same municipality share the same mobility features. For holidays, the Autonomous Community code derived from the station’s province (through the official INE province-to-CCAA mapping) is used to retrieve the applicable regional holiday calendar, which is applied in addition to the national calendar shared by every station.
4. Per-station integration. For each MITECO air quality station, the pipeline assembles the integrated table by joining the air quality series with the meteorological, mobility and holiday records. The procedure differs slightly between the two temporal variants of the released dataset. In the hourly variant, the air quality observations remain at their native hourly resolution, and the daily meteorological and mobility records are first lagged by one day and then expanded to the 24 hourly rows of the corresponding day (the lag-and-expand procedure described in Section 3.3.2). In the daily variant, the air quality observations are first aggregated to daily mean, minimum and maximum values, and the daily meteorological and mobility records are merged on the same calendar day without temporal shift. The holiday flags are assigned at time t without lag in both variants. The result is one file per station, at either hourly or daily resolution, with a uniform schema across the network.
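The lag-and-expand step used for the hourly variant can be illustrated with a minimal sketch. It is not the released pipeline code: the function name `lag_and_expand`, the dictionary layout and the sample values are hypothetical, and the sketch assumes that "lagged by one day" means that every hour of day D receives the daily record of day D − 1.

```python
from datetime import date, datetime, timedelta


def lag_and_expand(daily, lag_days=1):
    """Attach the daily record of day D - lag_days to all 24 hours of day D.

    `daily` maps a calendar date to a dict of daily features (assumed layout).
    Returns a dict keyed by hourly timestamps.
    """
    hourly = {}
    for day, features in daily.items():
        # The record observed on `day` is used on the following day(s).
        target_day = day + timedelta(days=lag_days)
        for hour in range(24):
            stamp = datetime(target_day.year, target_day.month, target_day.day, hour)
            hourly[stamp] = features
    return hourly


# Hypothetical daily meteorological records for one paired station.
daily_meteo = {
    date(2023, 3, 1): {"temp_mean": 11.2, "precip": 0.0},
    date(2023, 3, 2): {"temp_mean": 13.5, "precip": 2.4},
}

hourly_meteo = lag_and_expand(daily_meteo)
```

With these two input days, the hourly rows of 2 March carry the values observed on 1 March, and the rows of 3 March carry those of 2 March. The first day of the series has no lagged record and would stay empty in a real join.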
3.2. Data Sources
3.2.1. Air Quality Data: The MITECO Sensor Network
3.2.2. Meteorological Data: The AEMET Sensor Network
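Section 3.1 states that the AEMET API is queried in windows of at most six months and at roughly 50 requests per minute. The sketch below shows one way to split a study period into such windows and to pace the calls. The helper names `add_months`, `split_windows` and `paced` are hypothetical, the 2022 to 2024 period is only an example, and no actual AEMET endpoint is called.

```python
import calendar
import time
from datetime import date, timedelta


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the month length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def split_windows(start, end, max_months=6):
    """Split [start, end] into consecutive windows of at most `max_months`."""
    windows = []
    cursor = start
    while cursor <= end:
        window_end = min(add_months(cursor, max_months) - timedelta(days=1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


def paced(items, per_minute=50, sleep=time.sleep):
    """Yield items while waiting 60 / per_minute seconds between them."""
    interval = 60.0 / per_minute
    for i, item in enumerate(items):
        if i:
            sleep(interval)
        yield item


# Example period; each window would become one request per station.
windows = split_windows(date(2022, 1, 1), date(2024, 12, 31))
```

A caller would iterate over `paced(windows)` and issue one request per window. Passing a custom `sleep` makes the pacing testable without waiting.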
3.2.3. Mobility Data: The MITMA Sensor-Derived Network
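The join key between the MITECO registry and the MITMA records is the five-digit INE municipality code, which concatenates a two-digit province code and a three-digit municipality code. The following sketch shows the construction; the function name is hypothetical, and it assumes that the registry may store both parts as integers, in which case leading zeros have to be restored.

```python
def ine_municipality_code(province_code, municipality_code):
    """Build the five-digit INE code: 2-digit province + 3-digit municipality."""
    province = int(province_code)
    municipality = int(municipality_code)
    if not (0 < province <= 99 and 0 < municipality <= 999):
        raise ValueError("province or municipality code out of range")
    # Zero-padding restores leading zeros lost when codes are parsed as integers.
    return f"{province:02d}{municipality:03d}"


madrid = ine_municipality_code(28, 79)        # "28079"
alicante = ine_municipality_code("03", "014")  # "03014"
```

Keeping the result as a string, rather than an integer, avoids losing the leading zero of provinces 01 to 09 when the key is written to CSV and read back.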
3.2.4. Public Holiday Calendar
3.2.5. Source-Level Summary
3.3. Integration Procedure
3.3.1. Distance-Based Pairing of Air Quality and Meteorological Stations
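The three-tier rule summarised in Section 3.1 can be sketched as follows. This is one reading of the rule, not the authors' implementation: the dictionary keys (`code`, `lat`, `lon`, `province`), the function names and the mean Earth radius constant are assumptions, and the second tier is interpreted here as the nearest station in the same province with no distance cap.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def pair_station(aq, meteo_stations, threshold_km=50.0):
    """Return (meteo_code, distance_km, tier) for one air quality station."""

    def nearest(candidates):
        scored = (
            (haversine_km(aq["lat"], aq["lon"], m["lat"], m["lon"]), m)
            for m in candidates
        )
        return min(scored, key=lambda pair: pair[0])

    dist, best = nearest(meteo_stations)
    if dist <= threshold_km:                      # tier 1: within threshold
        return best["code"], dist, 1
    same_province = [m for m in meteo_stations if m["province"] == aq["province"]]
    if same_province:                             # tier 2: same INE province
        dist_p, best_p = nearest(same_province)
        return best_p["code"], dist_p, 2
    return best["code"], dist, 3                  # tier 3: nationwide nearest


# Hypothetical stations used only to exercise the three tiers.
meteo = [
    {"code": "X", "lat": 0.0, "lon": 1.0, "province": "B"},
    {"code": "Y", "lat": 0.0, "lon": 2.0, "province": "A"},
]
close = pair_station({"lat": 0.0, "lon": 0.9, "province": "A"}, meteo)
same_prov = pair_station({"lat": 0.0, "lon": 0.0, "province": "A"}, meteo)
fallback = pair_station({"lat": 0.0, "lon": 0.0, "province": "C"}, meteo)
```

Returning the tier alongside the distance mirrors the lookup table described in Section 3.1, so that users can filter out pairs matched under the fallback tiers if needed.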
3.3.2. Temporal Alignment of Daily Exogenous Features
3.3.3. Per-Station Enrichment Join and Feature Schema
4. Data Records
4.1. Repository Structure
4.2. Integrated Per-Station Files
4.3. Segregated Per-Source Files
4.4. Supporting Files and Source Code
5. Technical Validation
5.1. Spatial Pairing Quality
5.2. Internal Consistency of Source Data
5.3. Cross-Source Consistency
6. Usage Notes
6.1. Loading and Suggested Applications
6.2. Recommended Filtering and Preprocessing
6.3. Caveats and Known Limitations
6.4. Preliminary Modelling Experiments
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Glossary
| AI | Artificial Intelligence |
| DL | Deep Learning |
| ML | Machine Learning |
| CNN | Convolutional Neural Network |
| LSTM | Long Short-Term Memory |
| RNN | Recurrent Neural Network |
| GNN | Graph Neural Network |
| MAE | Mean Absolute Error |
| RMSE | Root Mean Squared Error |
| MITECO | Spanish Ministry for the Ecological Transition |
| AEMET | Spanish State Meteorological Agency |
| MITMA | Spanish Ministry of Transport and Sustainable Mobility |
| CDR | Call Detail Record |
| INE | Spanish National Statistics Institute |
| API | Application Programming Interface |
| WHO | World Health Organization |
Appendix A. Repository Structure

References
- World Health Organization. WHO Global Air Quality Guidelines: Particulate Matter (PM2.5 and PM10), Ozone, Nitrogen Dioxide, Sulfur Dioxide and Carbon Monoxide. 2021. Available online: https://www.who.int/publications/i/item/9789240034228 (accessed on 16 May 2026).
- European Environment Agency. Europe’s Air Quality Status 2024; Briefing 06/2024; European Environment Agency: Copenhagen, Denmark, 2024.
- Copernicus Atmosphere Monitoring Service. CAMS National Collaboration Programme Spain. 2024. Available online: https://atmosphere.copernicus.eu/spain (accessed on 16 May 2026).
- Huang, C.J.; Kuo, P.H. A deep CNN-LSTM model for particulate matter (PM2.5) forecasting in smart cities. Sensors 2018, 18, 2220.
- Du, S.; Li, T.; Yang, Y.; Horng, S.J. Deep air quality forecasting using hybrid deep learning framework. IEEE Trans. Knowl. Data Eng. 2019, 33, 2412–2424.
- González-Enrique, J.; Ruiz-Aguilar, J.J.; Moscoso-López, J.A.; Urda, D.; Deka, L.; Turias, I.J. Artificial neural networks, sequence-to-sequence LSTMs, and exogenous variables as analytical tools for NO2 (air pollution) forecasting: A case study in the Bay of Algeciras (Spain). Sensors 2021, 21, 1770.
- Morales-García, J.; Ramos-Sorroche, E.; Balderas-Díaz, S.; Guerrero-Contreras, G.; Muñoz, A.; Santa, J.; Terroso-Sáenz, F. Reducing pollution health impact with air quality prediction assisted by mobility data. IEEE J. Biomed. Health Inform. 2024, 29, 9210–9220.
- Hameed, S.; Islam, A.; Ahmad, K.; Belhaouari, S.B.; Qadir, J.; Al-Fuqaha, A. Deep learning based multimodal urban air quality prediction and traffic analytics. Sci. Rep. 2023, 13, 22181.
- Ibrahim, M.R.; Lyons, T. Transforming CCTV cameras into NO2 sensors at city scale for adaptive policymaking. Sci. Rep. 2025, 15, 3640.
- Cordero, J.M.; Narros, A.; Borge, R. True reduction in the air pollution levels in the Community of Madrid during the COVID-19 lockdown. Front. Sustain. Cities 2022, 4, 869000.
- Donzelli, G.; Cioni, L.; Cancellieri, M.; Llopis-Morales, A.; Morales-Suárez-Varela, M. Relations between air quality and COVID-19 lockdown measures in Valencia, Spain. Int. J. Environ. Res. Public Health 2021, 18, 2296.
- Baldasano, J.M. COVID-19 lockdown effects on air quality by NO2 in the cities of Barcelona and Madrid (Spain). Sci. Total Environ. 2020, 741, 140353.
- Gangoiti, G.; de Blas, M.; Gómez, M.C.; Rodríguez-García, A.; Torre-Pascual, E.; García-Ruiz, E.; Saez de Camara, E.; Zuazo, I.; García, J.A.; Valdenebro, V. Impact of the COVID-19 lockdown in a European regional monitoring network (Spain): Are we free from pollution episodes? Int. J. Environ. Res. Public Health 2021, 18, 11042.
- Xu, L.; Xu, B.; Sun, Y.; Cao, Y. Interpretable and scalable deep learning for urban NO2 prediction via multisource data. Transp. Res. Part D Transp. Environ. 2025, 149, 105039.
- Yi, X.; Zhang, J.; Wang, Z.; Li, T.; Zheng, Y. Deep distributed fusion network for air quality prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; Association for Computing Machinery: New York, NY, USA, 2018; pp. 965–973.
- Yin, P.; Deng, S.Y.; Zhu, Y.; Quan, G. Multi-source CPS data fusion model for air pollution prediction based on spatio-temporal Transformer. Cyber-Phys. Syst. 2026, 12, 203–224.
- Bonastre-Egea, J.; Bueno-Crespo, A.; Morales-García, J. A Multi-Source Sensor Dataset for Spain: Integrating Air Quality, Meteorological, Mobility and Calendar Records (2022–2024). 2026. Available online: https://zenodo.org/records/20196221 (accessed on 16 May 2026).
- Bonastre-Egea, J. airq_enriched_dataset: A Multi-Source Air Quality Data Integration Pipeline for Spain. GitHub Repository. 2026. Available online: https://github.com/juanbonastre/airq_enriched_dataset (accessed on 16 May 2026).
- OpenAQ. Aggregates Real-Time and Historical Air Quality Measurements from Monitoring Locations Worldwide Through a Uniform API. OpenAQ: Open Air Quality Data Platform. 2024. Available online: https://openaq.org (accessed on 16 May 2026).
- Rosales, C.M.; Bratburd, J.R.; Diez, S.; Duncan, S.; Malings, C.; Pant, P. Open air quality data platforms for environmental health research and action. Curr. Environ. Health Rep. 2025, 12, 27.
- European Environment Agency. Centralised Repository of Validated Air Quality Data Reported by EEA Member Countries at Station Level. Air Quality e-Reporting (AQ e-Reporting). 2024. Available online: https://www.eea.europa.eu/data-and-maps/data/aqereporting-9 (accessed on 16 May 2026).
- Ministerio para la Transición Ecológica y el Reto Demográfico (MITECO). Red Nacional de Vigilancia de la Calidad del Aire. Datos Horarios Oficiales Publicados en Formato CSV. Datos de Calidad del Aire. 2024. Available online: https://www.miteco.gob.es/es/calidad-y-evaluacion-ambiental/temas/atmosfera-y-calidad-del-aire/evaluacion-y-datos-de-calidad-del-aire/datos.html (accessed on 16 May 2026).
- Felici-Castell, S.; Segura-Garcia, J.; Perez-Solano, J.J.; Fayos-Jordan, R.; Soriano-Asensi, A.; Alcaraz-Calero, J.M. AI-IoT Low-Cost Pollution-Monitoring Sensor Network to Assist Citizens with Respiratory Problems. Sensors 2023, 23, 9585.
- Copernicus Atmosphere Monitoring Service. Distribution of Modelled Atmospheric Composition Concentration and Forecast Fields. Copernicus Atmosphere Monitoring Service (CAMS): Atmosphere Data Store. 2024. Available online: https://ads.atmosphere.copernicus.eu/ (accessed on 16 May 2026).
- Yin, H.; Zhang, Y.M.; Xu, J.; Chang, J.L.; Li, Y.; Liu, C.L. Air quality prediction with a meteorology-guided modality-decoupled spatio-temporal network. IEEE Trans. Artif. Intell. 2025, 7, 2059–2072.
- Xu, M.; Hu, W.; Han, K.; Ji, W. A multi-source data fusion model for air quality inference with hybrid supervised and self-supervised learning and adaptive feature importance estimation. Inf. Process. Manag. 2026, 63, 104474.
- Geng, Z.; Fan, X.; Lu, X.; Zhang, Y.; Yu, G.; Huang, C.; Wang, Q.; Li, Y.; Ma, W.; Yu, Q.; et al. FuXi-Air: Urban Air Quality Forecasting Based on Emission-Meteorology-Pollutant multimodal Machine Learning. arXiv 2025, arXiv:2506.07616.
- Kalaiselvi, S.; Anitha, V.; Manimaran, V.; Lawrence, T.S. Air quality prediction using multi-source remote sensing data integration with hybrid deep learning framework. Sci. Rep. 2025, 16, 2688.
- Arsov, M.; Zdravevski, E.; Lameski, P.; Corizzo, R.; Koteli, N.; Gramatikov, S.; Mitreski, K.; Trajkovik, V. Multi-Horizon Air Pollution Forecasting with Deep Neural Networks. Sensors 2021, 21, 1235.
- Graça, D.; Reis, J.; Gama, C.; Monteiro, A.; Rodrigues, V.; Rebelo, M.; Borrego, C.; Lopes, M.; Miranda, A.I. Sensors Network as an Added Value for the Characterization of Spatial and Temporal Air Quality Patterns at the Urban Scale. Sensors 2023, 23, 1859.
- Cukjati, J.; Mongus, D.; Rizman Žalik, K.; Žalik, B. IoT and Satellite Sensor Data Integration for Assessment of Environmental Variables: A Case Study on NO2. Sensors 2022, 22, 5660.
- Folgado, M.G.; Sanz, V.; Hirn, J.; Lorenzo-Sáez, E.; Urchueguia, J. Deep learning for urban air quality: A traffic-based prediction and alert system for Valencia. Neural Comput. Appl. 2025, 37, 15837–15854.
- Terroso-Saenz, F.; Morales-García, J.; Munoz, A. Nationwide air pollution forecasting with heterogeneous graph neural networks. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–19.
- Calabrese, F.; Ferrari, L.; Blondel, V.D. Urban sensing using mobile phone network data: A survey of research. ACM Comput. Surv. (CSUR) 2014, 47, 25.
- Mamei, M.; Bicocchi, N.; Lippi, M.; Mariani, S.; Zambonelli, F. Evaluating Origin–Destination Matrices Obtained from CDR Data. Sensors 2019, 19, 4470.
- Ministerio de Transportes y Movilidad Sostenible. Estudio de Movilidad de Viajeros de Ámbito Nacional Aplicando la Tecnología Big Data; Informe Metodológico (Versión 8); Ministerio de Transportes y Movilidad Sostenible: Madrid, Spain, 2024.
- Morales-García, J.; Ramos-Sorroche, E.; Balderas-Díaz, S.; Guerrero-Contreras, G.; Muñoz, A.; Santa, J.; Terroso-Sáenz, F. Exploiting synthetic data generation to enhance pollution prediction. Appl. Soft Comput. 2025, 175, 113076.
- Terroso-Saenz, F.; Morales-García, J.; Puig-Cabrera, M.; Martinez-del Vas, G.; Muñoz, A. Tourist Mobility Forecasting with Region-Based Flows and Hierarchical Spatial Tessellation. Int. J. Inf. Technol. Decis. Mak. 2025, 1–33.
- Melgarejo-Hernández, J.; García-Tapia-Mateo, P.; Morales-García, J.; Mazón, J.N. Near-Real-Time Integration of Multi-Source Seismic Data. Sensors 2026, 26, 451.
- Robusto, C.C. The Cosine-Haversine Formula. Am. Math. Mon. 1957, 64, 38–40.
- European Parliament and Council. Directive 2008/50/EC of the European Parliament and of the Council of 21 May 2008 on Ambient Air Quality and Cleaner Air for Europe. Off. J. Eur. Union 2008, L 152/1, 1–44.
- Agencia Estatal de Meteorología. Sistema de Acceso a Datos Meteorológicos y Climatológicos Mediante API REST. AEMET OpenData. 2024. Available online: https://opendata.aemet.es/ (accessed on 16 May 2026).
- Vacanza. Python Library for Generating Holidays. Python-Holidays: Definitive List of Public Holidays per Country. 2024. Available online: https://github.com/vacanza/holidays (accessed on 16 May 2026).













| Variable | Unit | N | Missing | Mean | Std. | Median | Min | Max |
|---|---|---|---|---|---|---|---|---|
| NO2 | µg/m³ | 14,087,160 | 6.4% | 12.84 | 14.29 | 8.0 | −1.0 | 403 |
| NOx | µg/m³ | 13,886,592 | 33.9% | 18.83 | 29.68 | 10.5 | −2.0 | 2785 |
| O3 | µg/m³ | 11,364,600 | 4.3% | 59.21 | 27.86 | 59.9 | 0.0 | 339 |
| SO2 | µg/m³ | 11,016,048 | 6.4% | 3.44 | 4.46 | 3.0 | 0.0 | 583 |
| PM10 | µg/m³ | 10,433,952 | 9.9% | 20.30 | 28.27 | 15.0 | 0.0 | 3607 |
| PM2.5 | µg/m³ | 6,318,768 | 11.7% | 9.02 | 9.80 | 7.0 | −0.3 | 1024 |
| CO | mg/m³ | 6,148,536 | 11.8% | 0.31 | 0.23 | 0.23 | 0.0 | 13.1 |
| C6H6 | µg/m³ | 2,561,712 | 21.9% | 0.46 | 1.01 | 0.2 | 0.0 | 200.9 |
| Variable | Unit | Missing | Mean | Std. | Median | Min | Max |
|---|---|---|---|---|---|---|---|
| temp_mean | °C | 4.8% | 16.11 | 7.06 | 15.8 | −15.0 | 40.4 |
| temp_min | °C | 4.8% | 10.60 | 6.86 | 10.4 | −18.7 | 50.0 |
| temp_max | °C | 4.8% | 21.63 | 8.03 | 21.3 | −50.0 | 46.8 |
| precip | mm | 4.7% | 1.66 | 6.16 | 0.0 | 0.0 | 710.8 |
| wind_speed | m/s | 5.1% | 2.79 | 1.88 | 2.5 | 0.0 | 33.1 |
| wind_gust | m/s | 5.3% | 9.64 | 4.08 | 8.9 | 0.0 | 65.6 |
| humidity_mean | % | 5.6% | 64.25 | 18.38 | 64.0 | 1.0 | 100.0 |
| humidity_min | % | 5.2% | 46.36 | 19.48 | 43.0 | 0.0 | 100.0 |
| humidity_max | % | 5.2% | 86.45 | 13.96 | 91.0 | 1.0 | 111.0 |
| Variable | Unit | Mean | Std. | Median | Max |
|---|---|---|---|---|---|
| Pernoctaciones (overnight presence) | | | | | |
| stays_total | persons | 18,534 | 79,878 | 6201 | 3.4 M |
| stays_residents | persons | 6553 | 8776 | 4625 | 128,276 |
| stays_visitors | persons | 11,981 | 80,244 | 640 | 3.4 M |
| stays_visitor_ratio | ratio (0–1) | 0.258 | 0.341 | 0.096 | 1.000 |
| Personas (daytime presence) | | | | | |
| pop_age_0_25 | persons | 4478 | 18,279 | 1458 | 763,209 |
| pop_age_25_45 | persons | 4804 | 22,554 | 1518 | 961,954 |
| pop_age_45_65 | persons | 5563 | 23,049 | 1908 | 980,847 |
| pop_age_65_100 | persons | 3681 | 16,171 | 1461 | 696,233 |
| pop_trips_0 | persons | 5078 | 18,619 | 2521 | 1.2 M |
| pop_trips_1 | persons | 819 | 3482 | 252 | 371,659 |
| pop_trips_2 | persons | 3803 | 16,367 | 1309 | 759,272 |
| pop_trips_2plus | persons | 8827 | 42,286 | 2425 | 2.0 M |
| pop_mobile_ratio | ratio (0–1) | 0.640 | 0.138 | 0.684 | 1.000 |
| Viajes (aggregate trip counts and kilometres) | | | | | |
| trips_inbound | trips | 48,507 | 238,943 | 12,590 | 11.5 M |
| trips_outbound | trips | 48,507 | 238,933 | 12,588 | 11.5 M |
| trips_internal | trips | 27,728 | 194,811 | 2358 | 9.7 M |
| trips_inbound_km | km | 463,566 | 1,971,094 | 188,015 | 163.4 M |
| trips_outbound_km | km | 463,566 | 2,004,148 | 188,051 | 154.4 M |
| Inbound trips by destination activity | | | | | |
| trips_inbound_act_home | trips | 18,345 | 83,060 | 5152 | 3.9 M |
| trips_inbound_act_frequent | trips | 17,606 | 88,784 | 3971 | 5.0 M |
| trips_inbound_act_infrequent | trips | 7230 | 43,030 | 1855 | 2.9 M |
| trips_inbound_act_work_study | trips | 5325 | 27,225 | 1371 | 1.6 M |
| Inbound trips by household income | | | | | |
| trips_inbound_income_high | trips | 9308 | 111,662 | 444 | 6.1 M |
| trips_inbound_income_mid | trips | 30,918 | 130,580 | 8002 | 5.3 M |
| trips_inbound_income_low | trips | 8281 | 31,438 | 544 | 1.9 M |
| Inbound trips by traveller age band | | | | | |
| trips_inbound_age_0_25 | trips | 8924 | 52,224 | 1026 | 2.5 M |
| trips_inbound_age_25_45 | trips | 10,388 | 69,358 | 989 | 3.4 M |
| trips_inbound_age_45_65 | trips | 11,886 | 72,336 | 1245 | 3.6 M |
| trips_inbound_age_65_100 | trips | 5500 | 35,834 | 575 | 1.8 M |
| trips_inbound_age_na | trips | 11,810 | 17,061 | 7316 | 339,867 |
| Inbound trips by traveller residency | | | | | |
| trips_inbound_residents | trips | 46,023 | 230,571 | 11,480 | 11.1 M |
| trips_inbound_nonresidents | trips | 2484 | 9263 | 853 | 691,003 |
| Outbound trips by origin activity | | | | | |
| trips_outbound_act_home | trips | 17,961 | 81,724 | 5089 | 4.0 M |
| trips_outbound_act_frequent | trips | 18,093 | 90,741 | 4100 | 5.1 M |
| trips_outbound_act_infrequent | trips | 7173 | 42,759 | 1799 | 2.9 M |
| trips_outbound_act_work_study | trips | 5279 | 26,804 | 1361 | 1.5 M |
| Dataset | Provider | Resolution | Spatial Units | Records | Missing | Download |
|---|---|---|---|---|---|---|
| Air quality (AIRQ) | MITECO | hourly | 588 stations | 14,944,488 | 4.3–33.9% | CSV |
| Meteorology (AEMET) | AEMET | daily | 904 stations | 990,784 | 4.7–5.6% | Bulk API |
| Mobility (MITMA) | MITMA | daily | 2687 zones | 2,824,332 | 0% | CSV |
| Public holidays | Gov. ES | daily | 19 CCAAs | 230 | — | Library |
| Source/Block | Cols | Variables |
|---|---|---|
| Air quality (MITECO) | 8 | NO2, NOx, O3, SO2, CO, PM10, PM2.5, C6H6. Hourly concentrations in µg/m³ (CO in mg/m³). |
| Meteorology (AEMET) | 9 | temp_mean, temp_min, temp_max, precip, wind_speed, wind_gust, humidity_mean, humidity_min, humidity_max. Daily values, lagged by one day, expanded to hourly. |
| Mobility, pernoctaciones (MITMA) | 4 | Total overnight stays, residents, visitors, visitor ratio. Daily, lagged by one day, expanded to hourly. |
| Mobility, personas (MITMA) | 9 | Population counts by age band (0–25, 25–45, 45–65, 65–100), by number of trips per day (0, 1, 2, 2+) and mobile-population ratio. Daily, lagged by one day, expanded to hourly. |
| Mobility, viajes (MITMA) | 23 | Inbound and outbound trip counts and kilometre aggregates, segmented by trip purpose, household income, traveller age band (including unknown) and residency status; plus internal trips. Daily, lagged by one day, expanded to hourly. |
| Holiday flags | 2 | Binary indicators for national and Autonomous Community holidays, assigned at time t without lag. |
| Station metadata | 1 | Area-type categorical label (area_type): urbana de tráfico, urbana de fondo, suburbana, rural de fondo, etc. Constant per station. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Bonastre-Egea, J.; Bueno-Crespo, A.; Morales-García, J. A Multi-Source Sensor Dataset for Spain: Integrating Air Quality, Meteorological, Mobility and Calendar Records. Sensors 2026, 26, 3883. https://doi.org/10.3390/s26123883

