Leveraging Climate Data Through Intelligent Systems for the Prediction of Arbovirus Transmission by Aedes aegypti
Highlights
- Arboviruses transmitted by Aedes aegypti represent a persistent and climate-sensitive public health threat in tropical urban settings such as Recife, Brazil.
- This study integrates climate, entomological, and epidemiological surveillance data to improve early prediction of arbovirus transmission risk.
- The study demonstrates that intelligent systems, particularly single-layer extreme learning machines, can accurately and efficiently forecast mosquito breeding sites at fine spatial scales.
- High-resolution, climate-driven predictions enable earlier identification of priority areas for intervention, improving the effectiveness of arbovirus control strategies.
- Municipal health authorities can use these models to optimize vector control actions, targeting high-risk neighborhoods before outbreaks occur.
- The open-source, reproducible framework can be adapted to other cities facing climate-related arbovirus risks, supporting scalable and data-driven public health planning.
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Type and Area
2.2. Study Population and Data Source
- Climate data—APAC and INMET: Climate variables were obtained from two sources: (i) the APAC Geographic Information System (SIGH-PE), which provides real-time rainfall and fluviometric measurements, and (ii) the INMET Meteorological Database for Teaching and Research (BDMEP), from which we extracted monthly averages of temperature, relative humidity, and wind speed from the Recife conventional station (A301). These variables represent known climatic drivers of mosquito proliferation and arbovirus transmission. Climate data were available for 2009–2021.
- Entomological data—LIRAa Recife: LIRAa datasets, published by the Recife Open Data Portal, report infested locations as well as Building and Breteau indices collected through standardized municipal surveillance procedures. LIRAa data were available for 2009–2017 and were paired with climate variables from the same period to train models predicting breeding site indices. Because LIRAa is aggregated at the stratum level rather than by neighborhood, spatial disaggregation was addressed during preprocessing.
- Epidemiological data—Arbovirus cases: Confirmed cases of dengue, Zika, and chikungunya were retrieved from the Recife Open Data Portal, which provides records of each reported case including neighborhood of residence. Data were available for 2013–2021, the earliest period with consistent neighborhood-level reporting. These records were combined with climate variables from the same period to develop prediction models for arbovirus case numbers.
- Data access and availability: We complied with open-data policies and obtained permission from the responsible institutions. INMET and APAC publish their data on public portals, from which we exported the required datasets. The Recife Open Data Portal likewise offers public access to health datasets and supports their download. The experiments were carried out in Weka version 3.8.6 [25], and we used the PyRCN library [26] for the experiments with reservoir computing methods (extreme learning machines). Each experiment was repeated 30 times under cross-validation, so that the reported means and standard deviations do not depend on a single data split and the risk of rewarding an overfitted model is reduced.
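The repeated cross-validation protocol described in the last item can be sketched as follows. This is a minimal illustration, not the Weka workflow used in the study: the number of folds (10), the ordinary least-squares stand-in model, the synthetic data, and the helper names `kfold_indices` and `repeated_cv_mae` are all assumptions made for the example.

```python
import numpy as np


def kfold_indices(n_samples, n_folds, rng):
    """Shuffle the sample indices and split them into n_folds disjoint test sets."""
    return np.array_split(rng.permutation(n_samples), n_folds)


def repeated_cv_mae(X, y, n_runs=30, n_folds=10, seed=0):
    """Mean and std of the cross-validated MAE over n_runs reshuffled runs."""
    rng = np.random.default_rng(seed)
    run_mae = []
    for _ in range(n_runs):
        fold_mae = []
        for test_idx in kfold_indices(len(y), n_folds, rng):
            train_mask = np.ones(len(y), dtype=bool)
            train_mask[test_idx] = False
            # Stand-in model (assumption): least squares with an intercept column.
            A_train = np.column_stack([X[train_mask], np.ones(train_mask.sum())])
            coef, *_ = np.linalg.lstsq(A_train, y[train_mask], rcond=None)
            A_test = np.column_stack([X[test_idx], np.ones(len(test_idx))])
            fold_mae.append(np.mean(np.abs(A_test @ coef - y[test_idx])))
        # One score per run: the average over its folds.
        run_mae.append(np.mean(fold_mae))
    return float(np.mean(run_mae)), float(np.std(run_mae))


# Synthetic stand-in for four monthly climate features and one target index.
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.normal(size=(120, 4))
y_demo = X_demo @ np.array([1.0, 0.5, -0.3, 0.2]) + demo_rng.normal(scale=0.1, size=120)

mean_mae, std_mae = repeated_cv_mae(X_demo, y_demo)
print(f"MAE over 30 runs: {mean_mae:.3f} +/- {std_mae:.3f}")
```

Reshuffling the folds on every run is what produces a spread of scores, which is why each metric can be summarized as a mean with a standard deviation.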
2.3. Preprocessing
2.3.1. Arboviruses Dataset
2.3.2. Breeding Sites Dataset
2.3.3. Climatic Variables
2.3.4. Prediction Datasets
2.4. Experiments and Evaluation Metrics
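The evaluation metrics abbreviated in this paper as MAE, RMSE, and RRSE, together with the correlation coefficient, can be written out as below. The text does not give the formulas, so these are the standard definitions and should be read as an assumption; here $a_i$ is the observed value, $p_i$ the predicted value, $\bar{a}$ and $\bar{p}$ their means, and $n$ the number of evaluated instances.

```latex
\begin{align}
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n} \lvert p_i - a_i \rvert \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n} (p_i - a_i)^2} \\
\mathrm{RRSE} &= 100\% \times \sqrt{\frac{\sum_{i=1}^{n} (p_i - a_i)^2}{\sum_{i=1}^{n} (a_i - \bar{a})^2}} \\
r &= \frac{\sum_{i=1}^{n} (a_i - \bar{a})(p_i - \bar{p})}
          {\sqrt{\sum_{i=1}^{n} (a_i - \bar{a})^2}\,\sqrt{\sum_{i=1}^{n} (p_i - \bar{p})^2}}
\end{align}
```

Under this definition, an RRSE of 100% corresponds to always predicting the mean of the observed values, so values well below 100% indicate a model that improves on that baseline.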
- RF: 10 to 80 trees in steps of 10, and 100 trees;
- MLP: learning rate 0.3, momentum 0.2, and 10, 20, or 30 neurons in the hidden layer;
- Support vector regressor:
  - Parameter C = 0.1;
  - Polynomial kernel of degree 1 (linear), 2, and 3;
  - Radial Basis Function (RBF) kernel: gamma = 0.01.
- ELM:
  - 1, 2, 5, and 10 layers, with 500, 700, 900, and 1000 neurons in each layer.
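To make the ELM configuration concrete, the sketch below implements a single-hidden-layer extreme learning machine in plain NumPy and scores it with MAE, RMSE, and RRSE. It is an illustration under stated assumptions, not the PyRCN code used in the experiments: the class `SingleLayerELM`, the tanh activation, the small ridge term, the synthetic data, and the reduced hidden-layer size of 50 neurons (chosen to keep the demo fast) are all hypothetical.

```python
import numpy as np


class SingleLayerELM:
    """Single-hidden-layer extreme learning machine for regression (sketch)."""

    def __init__(self, n_hidden=500, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Fixed random projection followed by a tanh non-linearity.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        # Input weights and biases are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Only the output weights are learned, via ridge-regularized least squares.
        gram = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(gram, H.T @ y)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def mae(actual, predicted):
    return float(np.mean(np.abs(predicted - actual)))


def rmse(actual, predicted):
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


def rrse_percent(actual, predicted):
    # Squared error relative to always predicting the mean of the actual values.
    num = np.sum((predicted - actual) ** 2)
    den = np.sum((actual - actual.mean()) ** 2)
    return float(100.0 * np.sqrt(num / den))


# Synthetic data standing in for climate features and a breeding-site index.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(400, 2))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1]
X_train, X_test, y_train, y_test = X[:300], X[300:], y[:300], y[300:]

model = SingleLayerELM(n_hidden=50).fit(X_train, y_train)
pred = model.predict(X_test)
scores = {
    "mae": mae(y_test, pred),
    "rmse": rmse(y_test, pred),
    "rrse_percent": rrse_percent(y_test, pred),
}
print(scores)
```

Because the hidden weights stay fixed and the output weights come from a single linear solve, training involves no iterative optimization, which is the usual explanation for the short training times of this family of models.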
2.5. Reproducibility Statement
2.6. Code and Platform Availability
3. Results
Arboviruses Cases Prediction
4. Discussion
4.1. Implications of Model Performance for Public Health Forecasting
4.2. Comparative Analysis of Top-Performing Model Configurations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| AI | Artificial Intelligence |
| APAC | Pernambuco Water and Climate Agency |
| ELM | Extreme Learning Machines |
| INMET | National Institute of Meteorology |
| LIRAa | Rapid Survey of Indices for Aedes aegypti |
| MAE | Mean Absolute Error |
| MLP | Multilayer Perceptron |
| PyRCN | Python Reservoir Computing Networks |
| RF | Random Forest |
| RBF | Radial Basis Function |
| RMSE | Root Mean Square Error |
| RRSE | Root Relative Squared Error |
| SIGH-PE | Pernambuco Geographic Information System |
| SVM | Support Vector Machine |
References
- World Health Organization. Global Arbovirus Initiative. 2022. Available online: https://www.who.int/initiatives/global-arbovirus-initiative (accessed on 11 April 2025).
- World Health Organization. Dengue and Severe Dengue. Fact Sheet. 23 April 2024. Available online: https://www.who.int/news-room/fact-sheets/detail/dengue-and-severe-dengue (accessed on 5 April 2025).
- World Health Organization. Disease Outbreak News: Geographical Expansion of Dengue and Chikungunya in the Region of the Americas. 23 March 2023. Available online: https://www.who.int/emergencies/disease-outbreak-news/item/2023-DON448 (accessed on 2 May 2025).
- Santos, L.L.M.; de Aquino, E.C.; Fernandes, S.M.; Ternes, Y.M.F.; Feres, V.C.R. Dengue, chikungunya, and Zika virus infections in Latin America and the Caribbean: A systematic review. Rev. Panam. Salud Pública 2023, 47, e34.
- Kaye, A.R.; Obolski, U.; Sun, L.; Hart, W.S.; Hurrell, J.W.; Tildesley, M.J.; Thompson, R.N. The impact of natural climate variability on the global distribution of Aedes aegypti: A mathematical modelling study. Lancet Planet. Health 2024, 8, e1079–e1087.
- Abud, D.A.; Santos, C.Y.; Neto, A.A.L.; Senra, J.T.; Tuboi, S. Real-world data study of prevalence and direct costs related to dengue management in Brazil’s private healthcare from 2015 to 2020. Braz. J. Infect. Dis. 2022, 26, 102718.
- Azil, A.H.; Li, M.; Williams, C.R. Dengue vector surveillance programs: A review of methodological diversity in endemic and epidemic countries. Asia Pac. J. Public Health 2011, 23, 827–842.
- Ribeiro, M.S.; Ferreira, D.F.; Azevedo, R.C.; Santos, G.B.G.D.; Medronho, R.A. Aedes aegypti larval indices and dengue incidence: An ecological study in the state of Rio de Janeiro, Brazil. Cad. Saúde Pública 2021, 37, e00263320.
- Brady, O.J.; Hay, S.I. The global expansion of dengue: How Aedes aegypti enabled the first arbovirus pandemic. Annu. Rev. Entomol. 2020, 65, 191–208.
- Flaibani, N.; Pérez, A.A.; Barbero, I.M.; Burroni, N.E. Different approaches to characterize artificial breeding sites of Aedes aegypti using generalized linear mixed models. Infect. Dis. Poverty 2020, 9, 107.
- Gurgel-Gonçalves, R.; Oliveira, W.K.; Croda, J. The greatest dengue epidemic in Brazil: Surveillance, prevention, and control. Rev. Soc. Bras. Med. Trop. 2024, 57, e002032024.
- Lima, Y.; Pinheiro, W.; Barbosa, C.E.; Magalhães, M.; Chaves, M.; de Souza, J.M.; Rodrigues, S.; Xexéo, G. Development of an index for the inspection of Aedes aegypti breeding sites in Brazil: Multi-criteria analysis. JMIR Public Health Surveill. 2021, 7, e19502.
- Rodrigues, G.O.; Pereira, B.G.V.; Pereira, M.A.F.; Trindade-Bezerra, J.M.; Guimarães-e-Silva, A.S.; Soares-Pinheiro, V.C.; Soares-da-Silva, J. Potential breeding containers of Aedes aegypti and Aedes albopictus at strategic points in Maranhão, Brazil. Braz. J. Biol. 2023, 83, e275582.
- Schultes, O.L.; Morais, M.H.F.; Cunha, M.D.C.M.; Sobral, A.; Caiaffa, W.T. Spatial analysis of dengue incidence and Aedes aegypti ovitrap surveillance in Belo Horizonte, Brazil. Trop. Med. Int. Health 2021, 26, 237–255.
- Soares, A.P.M.; Rosário, I.N.G.; Silva, I.M. Distribution and oviposition preferences of Aedes albopictus in Belém, Brazilian Amazon. J. Vector Ecol. 2020, 45, 312–320.
- da Silva, C.C.; de Lima, C.L.; da Silva, A.C.G.; de Almeida, G.M.; Albuquerque, J.O.; de Lima, M.M. Spatiotemporal forecasting for dengue, chikungunya fever, and Zika using machine learning and artificial expert committees. Res. Biomed. Eng. 2022, 38, 499–537.
- Teillet, C.; Devillers, R.; Tran, A.; Valle, D.; Samba-Louaka, A.; Mathieu, A.; Degenne, P. Exploring fine-scale urban landscapes using satellite data to predict Aedes breeding sites. Int. J. Health Geogr. 2024, 23, 18.
- Rahman, M.S.; Pientong, C.; Zafar, S.; Ekalaksananan, T.; Paul, R.E.; Haque, U.; Rocklöv, J.; Overgaard, H.J. Mapping the spatial distribution of Aedes aegypti and predicting its abundance in Northeastern Thailand using a machine-learning approach. One Health 2021, 13, 100358.
- Javaid, M.; Sarfraz, M.S.; Aftab, M.U.; Zaman, Q.U.; Rauf, H.T.; Alnowibet, K.A. WebGIS-based real-time surveillance and response system for vector-borne diseases. Int. J. Environ. Res. Public Health 2023, 20, 3740.
- da Silva, C.C.; Lima, C.L.; Silva, A.C.G.; Moreno, G.M.M.; Musah, A.; Aldosery, A.; Dutra, L.; Ambrizzi, T.; Borges, I.V.G.; Tunali, M.; et al. Forecasting dengue, chikungunya, and Zika cases in Recife using climate and machine learning. Res. Soc. Dev. 2021, 10, e452101220804.
- Steiner, A.; Hennequin, R.; Massaroli, S.; Kröger, T.; Scherer, P.; Sobieczky, F.; Zimmer, C.; Kipf, T. PyRCN: A toolbox for reservoir computing networks. Eng. Appl. Artif. Intell. 2022, 113, 104964.
- Souza, W.V.; Albuquerque, M.F.P.M.; Vazquez, E.; Bezerra, L.C.A.; Mendes, A.C.G.; Lyra, T.M.; Albuquerque, L.L.; Oliveira, W.K.; Ximenes, R.A.A. Microcephaly epidemic related to Zika virus and living conditions in Recife, Brazil. BMC Public Health 2018, 18, 130.
- Municipal de Saúde. City of Recife. In Plano Municipal de Saúde 2022–2025; Recife Municipal Health Secretariat: Recife, Brazil, 2022. Available online: https://www2.recife.pe.gov.br/servico/plano-municipal-de-saude-pms-2022-2025 (accessed on 9 September 2025).
- IBGE—Instituto Brasileiro de Geografia e Estatística. Panorama: Recife, Pernambuco. 2022. Available online: https://cidades.ibge.gov.br/brasil/pe/recife/panorama (accessed on 3 August 2023).
- Frank, E.; Hall, M.A.; Witten, I.H. The WEKA Workbench. In Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Morgan Kaufmann: Burlington, MA, USA, 2016.
- Python Software Foundation. Python, Version 3.13. Available online: https://www.python.org (accessed on 10 October 2025).
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
- Hijmans, R.J. Terra: Spatial Data Analysis; R Package Version 1.8-64. 2025. Available online: https://github.com/rspatial/terra (accessed on 10 October 2025).
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2025. Available online: https://www.R-project.org (accessed on 15 January 2025).
- Huang, G.; Huang, G.B.; Song, S.; You, K. Trends in extreme learning machines: A review. Neural Netw. 2015, 61, 32–48.
- Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)? Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 2014, 7, 1247–1250.


| Regression Method | MAE Mean | MAE Std | RMSE Mean | RMSE Std | RRSE (%) Mean | RRSE (%) Std | Correlation Coefficient Mean | Correlation Coefficient Std | Training Time (s) Mean | Training Time (s) Std |
|---|---|---|---|---|---|---|---|---|---|---|
| Linear Regression | 0.646 | 0.24 | 1.12 | 0.416 | 36.857 | 6.518 | 0.928 | 0.026 | 0.044 | 0.022 |


| Regression Method | Setup | MAE Mean | MAE Std | RMSE Mean | RMSE Std | RRSE (%) Mean | RRSE (%) Std | Correlation Coefficient Mean | Correlation Coefficient Std | Training Time (s) Mean | Training Time (s) Std |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Random Forest | 10 trees | 0.227 | 0.078 | 0.530 | 0.215 | 18.005 | 4.109 | 0.984 | 0.008 | 0.223 | 0.085 |
| Random Forest | 20 trees | 0.203 | 0.070 | 0.480 | 0.199 | 16.313 | 3.944 | 0.988 | 0.007 | 0.382 | 0.122 |
| Random Forest | 30 trees | 0.191 | 0.064 | 0.459 | 0.192 | 15.619 | 3.937 | 0.989 | 0.007 | 0.867 | 0.699 |
| Random Forest | 40 trees | 0.118 | 0.065 | 0.454 | 0.192 | 15.421 | 3.925 | 0.989 | 0.007 | 0.852 | 0.170 |
| Random Forest | 50 trees | 0.216 | 0.067 | 0.521 | 0.205 | 15.627 | 3.703 | 0.989 | 0.006 | 1.007 | 0.140 |
| Random Forest | 60 trees | 0.213 | 0.066 | 0.517 | 0.204 | 15.490 | 3.703 | 0.990 | 0.006 | 0.956 | 0.146 |
| Random Forest | 70 trees | 0.211 | 0.066 | 0.514 | 0.204 | 15.405 | 3.711 | 0.990 | 0.006 | 1.250 | 0.311 |
| Random Forest | 80 trees | 0.209 | 0.065 | 0.511 | 0.203 | 15.333 | 3.713 | 0.990 | 0.006 | 1.621 | 0.238 |
| Random Forest | 100 trees | 0.207 | 0.065 | 0.508 | 0.203 | 15.226 | 3.704 | 0.990 | 0.006 | 1.761 | 0.257 |


| Regression Method | Setup | MAE Mean | MAE Std | RMSE Mean | RMSE Std | RRSE (%) Mean | RRSE (%) Std | Correlation Coefficient Mean | Correlation Coefficient Std | Training Time (s) Mean | Training Time (s) Std |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | polynomial kernel, p = 1 | 0.514 | 0.211 | 1.231 | 0.560 | 41.243 | 10.357 | 0.907 | 0.044 | 38.135 | 6.673 |
| SVM | polynomial kernel, p = 2 | 0.057 | 0.023 | 0.123 | 0.068 | 4.171 | 2.052 | 0.999 | 0.002 | 392.845 | 197.922 |
| SVM | polynomial kernel, p = 3 | 0.033 | 0.013 | 0.055 | 0.046 | 1.834 | 1.377 | 1.000 | 0.001 | 1423.400 | 838.156 |
| SVM | RBF kernel | 0.715 | 0.271 | 1.454 | 0.625 | 48.951 | 8.715 | 0.882 | 0.045 | 50.354 | 16.332 |


| Regression Method | Setup | MAE Mean | MAE Std | RMSE Mean | RMSE Std | RRSE (%) Mean | RRSE (%) Std | Correlation Coefficient Mean | Correlation Coefficient Std | Training Time (s) Mean | Training Time (s) Std |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MLP | 10 neurons | 0.114 | 0.065 | 0.181 | 0.104 | 6.084 | 2.648 | 0.998 | 0.002 | 69.651 | 31.176 |
| MLP | 20 neurons | 0.106 | 0.062 | 0.168 | 0.101 | 5.630 | 2.561 | 0.999 | 0.002 | 111.492 | 36.160 |
| MLP | 30 neurons | 0.097 | 0.056 | 0.156 | 0.097 | 5.220 | 2.533 | 0.999 | 0.002 | 124.706 | 34.655 |


| Regression Method | Setup | Neurons | MAE Mean | MAE Std | RMSE Mean | RMSE Std | RRSE (%) Mean | RRSE (%) Std | Correlation Coefficient Mean | Correlation Coefficient Std | Training Time (s) Mean | Training Time (s) Std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ELM | 1 layer | 500 | 0.002 | 0.001 | 0.007 | 0.006 | 4.375 | 3.463 | 0.998 | 0.004 | 0.268 | 0.168 |
| ELM | 1 layer | 700 | 0.001 | 0.001 | 0.006 | 0.007 | 3.366 | 3.789 | 0.999 | 0.005 | 0.338 | 0.200 |
| ELM | 1 layer | 900 | 0.001 | 0.000 | 0.004 | 0.005 | 2.575 | 2.846 | 0.999 | 0.002 | 0.405 | 0.215 |
| ELM | 1 layer | 1000 | 0.001 | 0.000 | 0.004 | 0.005 | 2.386 | 2.717 | 0.999 | 0.002 | 0.339 | 0.135 |
| ELM | 2 layers | 500 | 0.021 | 0.004 | 0.037 | 0.015 | 22.570 | 9.005 | 0.973 | 0.024 | 0.401 | 0.162 |
| ELM | 2 layers | 700 | 0.017 | 0.005 | 0.036 | 0.071 | 22.510 | 46.623 | 0.975 | 0.059 | 0.557 | 0.305 |
| ELM | 2 layers | 900 | 0.015 | 0.003 | 0.030 | 0.030 | 18.237 | 18.225 | 0.982 | 0.044 | 0.700 | 0.259 |
| ELM | 2 layers | 1000 | 0.014 | 0.003 | 0.032 | 0.037 | 19.810 | 22.472 | 0.977 | 0.055 | 0.903 | 1.867 |
| ELM | 5 layers | 500 | 0.073 | 0.012 | 0.104 | 0.017 | 63.867 | 5.068 | 0.773 | 0.039 | 0.887 | 0.254 |
| ELM | 5 layers | 700 | 0.073 | 0.011 | 0.101 | 0.016 | 62.498 | 5.181 | 0.785 | 0.037 | 2.180 | 64.080 |
| ELM | 5 layers | 900 | 0.071 | 0.011 | 0.098 | 0.015 | 60.249 | 5.031 | 0.804 | 0.034 | 2.760 | 77.216 |
| ELM | 5 layers | 1000 | 0.070 | 0.011 | 0.096 | 0.016 | 59.154 | 5.060 | 0.812 | 0.033 | 2.183 | 11.042 |
| ELM | 10 layers | 500 | 0.111 | 0.020 | 0.160 | 0.027 | 97.911 | 3.102 | 0.299 | 0.061 | 1.711 | 0.293 |
| ELM | 10 layers | 700 | 0.115 | 0.020 | 0.162 | 0.027 | 99.447 | 3.684 | 0.298 | 0.066 | 3.855 | 2.446 |
| ELM | 10 layers | 900 | 0.117 | 0.020 | 0.162 | 0.027 | 99.753 | 4.122 | 0.328 | 0.062 | 4.097 | 0.574 |
| ELM | 10 layers | 1000 | 0.119 | 0.021 | 0.165 | 0.027 | 101.475 | 4.287 | 0.312 | 0.062 | 4.599 | 0.712 |

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Lima, C.L.d.; Sancho, K.A.; Silva, A.C.G.d.; Vital, R.; Silva, C.C.d.; Mendonça, M.F.S.d.; Borges, F.T.; Siqueira, C.E.G.; Santos, W.P.d. Leveraging Climate Data Through Intelligent Systems for the Prediction of Arbovirus Transmission by Aedes aegypti. Int. J. Environ. Res. Public Health 2026, 23, 12. https://doi.org/10.3390/ijerph23010012

