Noise Pollution Prediction in a Densely Populated City Using a Spatio-Temporal Deep Learning Approach
Abstract
1. Introduction
2. Related Work
3. Methodology
3.1. Formal Problem Definition
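- $V$ is the set of nodes, defined as $V = \{v_1, v_2, \dots, v_N\}$, where each node $v_i$ is a measurement station.
- $E_t$ is the set of edges at time $t$, which connects the nodes based on their spatial proximity. Specifically, each node is connected by edges to its $k$ nearest neighbors according to the geodesic distance calculated using the Haversine formula. Thus, it is formally defined as $E_t = \{(v_i, v_j) : v_j \in \mathcal{N}_k(v_i)\}$, where $\mathcal{N}_k(v_i)$ denotes the $k$ nearest neighbors of $v_i$. The distance associated with each edge is defined by the Haversine distance as $d_{ij} = 2R \arcsin\left(\sqrt{\sin^2\left(\frac{\varphi_j - \varphi_i}{2}\right) + \cos\varphi_i \cos\varphi_j \sin^2\left(\frac{\lambda_j - \lambda_i}{2}\right)}\right)$, where $\varphi$ and $\lambda$ are the latitude and longitude in radians and $R$ is the Earth's radius.
- $X_t$ is the feature matrix of the graph at time $t$, where each row represents a node (station) and each column an acoustic or temporal feature associated with that station. Therefore, $X_t \in \mathbb{R}^{N \times F}$, where $F$ is the number of features per station.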
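The edge construction described above can be illustrated with a minimal sketch. It assumes station coordinates are given as (latitude, longitude) pairs in degrees; the function names, the Earth radius constant, and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp against rounding so asin never receives a value above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def knn_edges(coords, k):
    """Connect every node to its k nearest neighbours.

    coords: list of (lat, lon) tuples, one per station.
    Returns a list of directed edges (i, j, distance_km).
    """
    edges = []
    for i, (lat_i, lon_i) in enumerate(coords):
        dists = [
            (haversine_km(lat_i, lon_i, lat_j, lon_j), j)
            for j, (lat_j, lon_j) in enumerate(coords)
            if j != i
        ]
        dists.sort()
        for d, j in dists[:k]:
            edges.append((i, j, d))
    return edges

# Hypothetical station coordinates, for illustration only.
stations = [(40.4168, -3.7038), (40.4200, -3.7000), (40.4500, -3.6900), (40.3900, -3.7200)]
edges = knn_edges(stations, k=2)
```

Because each node selects its own neighbours, the resulting edge list is directed: station A may list B as a neighbour without B listing A. Whether the original graph is symmetrised is not stated in the text.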
3.2. Data Modeling
3.3. Models Used
- LSTM [18]: Long Short-Term Memory (LSTM) networks are a type of Recurrent Neural Network (RNN) designed to capture long-term temporal dependencies in sequential data.
- CNN+LSTM [19]: This hybrid architecture combines Convolutional Neural Networks (CNNs) for extracting local spatial features with LSTM networks for modeling temporal dependencies, making it suitable for multivariate time series prediction.
- GAT [20]: Graph Attention Networks (GATs) apply attention mechanisms to assign adaptive weights to neighboring nodes in a graph, enhancing predictive performance in structured data scenarios.
- GraphSAGE [21]: GraphSAGE is a graph learning method that generates node embeddings by aggregating features from neighboring nodes, suitable for inductive learning.
- GraphSAGE+LSTM: This combined model integrates GraphSAGE for capturing spatial relationships within a graph and LSTM networks for modeling temporal dynamics, offering a spatiotemporal solution for sequential graph-structured data.
- Transformer [22]: The Transformer architecture leverages self-attention mechanisms to model long-range dependencies without relying on recurrence. It allows for the parallel processing of sequences and has demonstrated strong performance in tasks such as natural language processing, machine translation, and time series forecasting.
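As an illustration of the neighbor aggregation that GraphSAGE relies on, the following sketch computes only the mean-aggregation step: each node's features are concatenated with the average of its neighbors' features. The learned linear map and nonlinearity that normally follow are deliberately omitted, and the function name and data layout are assumptions made for this example.

```python
def mean_aggregate(features, neighbors):
    """Concatenate each node's features with the mean of its neighbours' features.

    features:  list of equal-length feature vectors, indexed by node id.
    neighbors: dict mapping node id -> list of neighbour node ids.
    Nodes without neighbours receive a zero vector as the aggregated part.
    """
    out = []
    for i, x in enumerate(features):
        nbrs = neighbors.get(i, [])
        if nbrs:
            mean = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(len(x))]
        else:
            mean = [0.0] * len(x)
        out.append(list(x) + mean)
    return out

# Hypothetical toy graph with three nodes and two features per node.
toy_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
toy_neighbors = {0: [1, 2], 1: [0], 2: []}
aggregated = mean_aggregate(toy_features, toy_neighbors)
```

In a full GraphSAGE layer, the concatenated vector would then pass through a trainable weight matrix and an activation function.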
3.4. Our Model, Spatio-Temporal Graph Convolution—Transformer Conv (STGC-TC)
3.5. Evaluation Metrics
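- RMSE (Root Mean Squared Error): Square root of the average of the squared errors: $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$, where $y_i$ is the actual value, $\hat{y}_i$ the predicted value, and $n$ the number of predictions.
- MAE (Mean Absolute Error): Average of the absolute errors: $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i \rvert$.
- Accuracy: A measure of model precision; in this case, a prediction is considered correct if the difference between the predicted and actual value is less than 10% of the actual value: $\mathrm{Accuracy} = \frac{1}{n}\sum_{i=1}^{n}\mathbb{1}\left(\lvert y_i - \hat{y}_i \rvert < 0.1\,\lvert y_i \rvert\right)$. The 10% threshold is justified considering that the analyzed ambient noise levels have average values around 60 dB. Thus, a relative error of less than 10% corresponds approximately to an absolute error of less than 6 dB, which is within the acceptable range for urban noise prediction and operational monitoring tasks.
- Pearson Correlation: Measures the strength and direction of the linear relationship between two variables: $r = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}\,\sqrt{\sum_{i=1}^{n}(\hat{y}_i - \bar{\hat{y}})^2}}$, where $\bar{y}$ and $\bar{\hat{y}}$ are the means of the actual and predicted values.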
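The four metrics listed above can be sketched in plain Python as follows. The function names and the sample values are hypothetical; the 10% tolerance follows the accuracy definition in the text, and `pearson` assumes neither series is constant.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def threshold_accuracy(y_true, y_pred, tol=0.10):
    """Share of predictions whose absolute error is below tol * |actual value|."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) < tol * abs(t))
    return hits / len(y_true)

def pearson(y_true, y_pred):
    """Pearson correlation coefficient (undefined if either series is constant)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Hypothetical noise levels in dB, for illustration only.
actual = [60.0, 50.0, 70.0]
predicted = [63.0, 58.0, 70.0]
scores = {
    "rmse": rmse(actual, predicted),
    "mae": mae(actual, predicted),
    "accuracy": threshold_accuracy(actual, predicted),
    "pearson": pearson(actual, predicted),
}
```

In this toy case the 3 dB error on the 60 dB sample counts as correct (below 6 dB), while the 8 dB error on the 50 dB sample does not (above 5 dB), so two of the three predictions are counted as accurate.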
4. Experiments
4.1. Dataset
- NMT (Measurement Station Number): A unique numerical identifier for each station where noise measurements are taken. This code allows for the specific identification of each station’s location.
- Year: The year in which the measurements were taken, represented as a number.
- Month: The month when the noise data were recorded, indicated by a number from 1 to 12.
- Day: The day of the month when the measurement was taken, represented by a numerical value between 1 and 31.
- Type: Time of day corresponding to the measurement, expressed using the following options:
  - “D”: Daytime.
  - “N”: Nighttime.
- LAeq (Equivalent Continuous Sound Level): The average sound pressure level during the measurement period, expressed in decibels (dB). This value reflects overall exposure to noise over the observed period.
- L1: Sound pressure level exceeded during 1% of the observation time. Used to identify noise peaks that occur rarely.
- L10: Sound pressure level exceeded during 10% of the observation time. Typically associated with frequent loud noises.
- L50: Sound pressure level exceeded during 50% of the observation time. Often used as a reference for “typical” or “median” noise levels in an area.
- L90: Sound pressure level exceeded during 90% of the observation time. Usually represents background or constant noise levels.
- L99: Sound pressure level exceeded during 99% of the observation time. Indicates the most persistent noise levels, reflecting minimum values present most of the time.
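To make the relationship between these indicators concrete, the sketch below derives LAeq and the percentile levels from a series of equally spaced sound level samples. The dataset already provides these values per station, so this only illustrates the definitions; the function names, the nearest-rank convention, and the sample data are assumptions.

```python
import math

def laeq(levels_db):
    """Equivalent continuous level: energy average of equally spaced dB samples."""
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)

def exceedance_level(levels_db, percent):
    """Level exceeded during `percent` % of the samples (nearest-rank convention).

    percent must be an integer between 1 and 100.
    """
    ordered = sorted(levels_db, reverse=True)
    # Integer ceiling of percent * n / 100, at least 1.
    rank = max(1, -(-percent * len(ordered) // 100))
    return ordered[rank - 1]

# Hypothetical samples: 100 readings from 1 dB to 100 dB, for illustration only.
samples = [float(v) for v in range(1, 101)]
percentile_levels = {f"L{p}": exceedance_level(samples, p) for p in (1, 10, 50, 90, 99)}
```

Because LAeq averages acoustic energy rather than decibel values, loud events dominate it: two samples of 50 dB and 60 dB give roughly 57.4 dB, not 55 dB.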
- Environmental noise during the day (outdoor): 55 dB(A)—maximum recommended to avoid annoyance.
- Environmental noise at night (outdoor): 40 dB(A)—recommended limit to avoid adverse health effects related to sleep.
- Indoor noise during sleep: 30 dB(A)—ideal limit for indoor environments such as bedrooms at night.
- Maximum noise in short events (peaks) during the night: 45 dB(A)—limit for isolated noise events like passing cars or sirens.
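As a small illustration of how the outdoor reference values above could be applied to the dataset, the following sketch flags a measurement whose LAeq exceeds the limit for its period. The mapping onto the “D” and “N” codes of the Type field and the function name are assumptions made for this example.

```python
# Outdoor reference values in dB(A), taken from the list above.
OUTDOOR_LIMITS_DB = {"D": 55.0, "N": 40.0}

def exceeds_outdoor_limit(laeq_db, period):
    """Return True if the LAeq value is above the outdoor limit for 'D' or 'N'."""
    if period not in OUTDOOR_LIMITS_DB:
        raise ValueError(f"unknown period code: {period!r}")
    return laeq_db > OUTDOOR_LIMITS_DB[period]

# Hypothetical readings, for illustration only.
checks = [exceeds_outdoor_limit(60.0, "D"), exceeds_outdoor_limit(38.5, "N")]
```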
- Six past graphs: $G_{t-5}, \dots, G_{t}$.
- Fourteen future graphs: $G_{t+1}, \dots, G_{t+14}$.
- Vertex connections to the 15 closest nodes based on Haversine distance.
- Edge attributes, including distance.
- Geographic coordinates used for future model evaluation.
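The sample structure listed above, with six past graphs as input and fourteen future graphs as target, can be sketched as a sliding window over a chronologically ordered sequence of graph snapshots. The function name and the use of plain list items as stand-ins for graphs are assumptions; the stride of one step is also assumed, since the text does not state it.

```python
def make_windows(snapshots, n_past=6, n_future=14):
    """Split an ordered sequence of snapshots into (past, future) training samples.

    Each sample holds n_past consecutive snapshots as input and the
    n_future snapshots that immediately follow as the prediction target.
    """
    samples = []
    last_start = len(snapshots) - n_past - n_future
    for start in range(last_start + 1):
        past = snapshots[start:start + n_past]
        future = snapshots[start + n_past:start + n_past + n_future]
        samples.append((past, future))
    return samples

# Hypothetical sequence of 25 snapshots, represented by their time index.
timeline = list(range(25))
windows = make_windows(timeline)
```

A sequence shorter than twenty snapshots yields no samples, because a complete window needs six inputs and fourteen targets.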
4.2. Model Configuration
4.3. Results
5. Discussion
5.1. Analysis of Individual Components (Ablation Study)
5.2. Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Feldscher, K. Noise Can Harm Your Health. Harvard T.H. Chan School of Public Health. 2023. Available online: https://hsph.harvard.edu/news/noise-can-harm-your-health-even-if-you-sleep-through-it/ (accessed on 4 May 2025).
2. Hahad, O.; Kuntic, M.; Al-Kindi, S.; Kuntic, I.; Gilan, D.; Petrowski, K.; Daiber, A.; Münzel, T. Noise and mental health: Evidence, mechanisms, and consequences. J. Expo. Sci. Environ. Epidemiol. 2025, 35, 16–23.
3. Zaman, M.; Muslim, M.; Jehangir, A. Environmental noise-induced cardiovascular, metabolic and mental health disorders: A brief review. Environ. Sci. Pollut. Res. 2022, 29, 76485–76500.
4. James, P. Noise Pollution Can Lead to Sleep Issues, Chronic Health Problems. Harvard T.H. Chan School of Public Health. 2023. Available online: https://hsph.harvard.edu/news/noise-pollution-can-lead-to-sleep-issues-chronic-health-problems/ (accessed on 4 May 2025).
5. Capone, V.; Iannuzzo, G.; Camastra, F. Deep Learning for Time Series Forecasting: Advances and Open Problems. Information 2023, 14, 598.
6. Briki, S.; Khabou, N.; Rodriguez, I.B. A Comparative Analysis of Time Series Prediction Techniques: A Systematic Literature Review. In Model and Data Engineering; Springer: Berlin/Heidelberg, Germany, 2023.
7. Navarro, J.M.; Martínez-España, R.; Bueno-Crespo, A.; Martínez, R.; Cecilia, J.M. Sound Levels Forecasting in an Acoustic Sensor Network Using a Deep Neural Network. Sensors 2020, 20, 903.
8. Tsai, W.-C.; Tu, C.-S.; Hong, C.-M.; Lin, W.-M. A Review of State-of-the-Art and Short-Term Forecasting Models for Solar PV Power Generation. Energies 2023, 16, 5436.
9. Zhang, X.; Zhao, M.; Dong, R. Time-Series Prediction of Environmental Noise for Urban IoT Based on Long Short-Term Memory Recurrent Neural Network. Appl. Sci. 2020, 10, 1144.
10. Tashakor, S.; Chamani, A.; Moshtaghie, M. Noise pollution prediction and seasonal comparison in urban parks using a coupled GIS-Artificial Neural Network model. Environ. Monit. Assess. 2023, 195, 303.
11. Tashakor, S.; Chamani, A. Temporal variability of noise pollution attenuation by vegetation in urban parks. Environ. Sci. Pollut. Res. 2021, 28, 23143–23151.
12. Xiao, H.; Zou, B.; Xiao, J. Graph convolution networks based on adaptive spatiotemporal attention for traffic flow forecasting. Sci. Rep. 2025, 15, 8935.
13. Zhang, T.; Guo, G. Graph Attention LSTM: A Spatiotemporal Approach for Traffic Flow Forecasting. IEEE Intell. Transp. Syst. Mag. 2022, 14, 190–199.
14. Wang, C.; Zhu, Y.; Zang, T.; Liu, H.; Yu, J. Modeling inter-station relationships with attentive temporal graph convolutional networks for air quality prediction. In Proceedings of the ACM WSDM Conference, Virtual Event, 8–12 March 2021; pp. 616–634.
15. Liu, H.; Han, Q.; Sun, H.; Sheng, J.; Yang, Z. Spatiotemporal adaptive attention graph convolution network for city-level air quality prediction. Sci. Rep. 2023, 13, 13335.
16. Chen, Q.; Ding, R.; Mo, X.; Li, H.; Xie, L.; Yang, J. Adaptive adjacency matrix-based graph convolutional recurrent network for air quality prediction. Sci. Rep. 2024, 14, 4408.
17. Yu, W.; Jang, J.-C.; Zhu, Y.; Peng, J.; Yang, W.; Li, K. Enhanced Estimation of Traffic Noise Levels Using Minute-Level Traffic Flow Data through CNN. Sustainability 2023, 16, 6088.
18. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
19. van Dyck, L.E.; Kwitt, R.; Denzler, S.J.; Gruber, W.R. Comparing Object Recognition in Humans and Deep Convolutional Neural Networks—An Eye Tracking Study. Front. Neurosci. 2021, 15, 750639.
20. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph Attention Networks. arXiv 2018, arXiv:1710.10903.
21. Hamilton, W.L.; Ying, R.; Leskovec, J. Inductive Representation Learning on Large Graphs. arXiv 2017, arXiv:1706.02216.
22. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30.
23. Shi, Y.; Huang, Z.; Feng, S.; Zhong, H.; Wang, W.; Sun, Y. Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification. arXiv 2020, arXiv:2009.03509.
24. Ayuntamiento de Madrid. Portal de Datos Abiertos del Ayuntamiento de Madrid, 2024. Available online: https://datos.madrid.es/portal/site/egob/ (accessed on 10 May 2025).
25. World Health Organization. Environmental Noise Guidelines for the European Region; World Health Organization: Copenhagen, Denmark, 2019.
26. World Health Organization. Guidelines for Community Noise; World Health Organization: Geneva, Switzerland, 1999.
Model | Structure |
---|---|
LSTM | LSTM: 12 inputs, 14 units, 10% dropout |
GAT | GATConv: 12 to 128, 8 attention heads; GATConv: 1024 to 128, 1 attention head; Linear layer: 128 to 14 |
CNN3D+LSTM | Conv3D: 1 to 32 channels, 3 × 3 × 3 kernel, padding 1; LSTM: 32 to 64 units; Dropout: 10%; Linear layer: 64 to 14 |
GraphSAGE+LSTM | GraphSAGE: 12 to 12, mean aggregation; LSTM: 12 to 32 units; Dropout: 10%; Linear layer: 32 to 14 |
TransformerConv | TransformerConv: 12 to 128, 1 attention head; Linear layer: 128 to 14 |
GraphSAGE | GraphSAGE: 12 to 128, mean aggregation; Linear layer: 128 to 14 |
CNN1D+LSTM+TransformerConv | Conv1D: 12 to 16, kernel 3, padding 1; Conv1D: 16 to 32, kernel 3, padding 1; LSTM: 32 to 64 units; TransformerConv: 64 to 64, 2 attention heads, no concatenation; Dropout: 40%; Linear layer: 64 to 14 |
Transformer | Transformer encoder: dimension 12, 2 attention heads; Dropout: 40%; Linear layer: 12 to 14 |
Model | RMSE | MAE | R² | Correlation | Accuracy |
---|---|---|---|---|---|
GraphSAGE+LSTM | 0.0297 | 0.0274 | 0.6985 | 0.8982 | 0.8013 |
CNN3D+LSTM | 0.0340 | 0.0340 | 0.6827 | 0.8803 | 0.6763 |
GAT | 0.0361 | 0.0365 | 0.6382 | 0.8586 | 0.6485 |
LSTM | 0.0305 | 0.0290 | 0.7188 | 0.8939 | 0.7725 |
GraphSAGE | 0.0294 | 0.0268 | 0.7234 | 0.8997 | 0.8188 |
Transformer | 0.0178 | 0.0171 | 0.6684 | 0.8807 | 0.9090 |
TransformerConv | 0.0238 | 0.0205 | 0.6333 | 0.9330 | 0.7157 |
CNN1D+LSTM+TransformerConv | 0.0169 | 0.0174 | 0.8927 | 0.9601 | 0.9158 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Semper, M.; Curado, M.; Oliver, J.L.; Vicent, J.F. Noise Pollution Prediction in a Densely Populated City Using a Spatio-Temporal Deep Learning Approach. Appl. Sci. 2025, 15, 5576. https://doi.org/10.3390/app15105576