Assessing Flood and Landslide Susceptibility Using XGBoost: Case Study of the Basento River in Southern Italy
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Flood Inventory Map
- March 2011 flood event: RGB images of the flooded areas, acquired by COSMO-SkyMed and provided by the Italian Space Agency (ASI) to the Italian Civil Protection Department (DPC), were georeferenced. A semi-automated supervised classification procedure was then applied to identify the pixels affected by flooding.
- October 2013 flood event: Flood extent maps were made available by the Copernicus Emergency Management Service (EMS), based on SPOT-6 satellite acquisitions from October 17th, 2013, at 11:40 AM.
- December 2013 flood event: Flood extent maps were provided for the Ionian coast of Basilicata, based on acquisitions from December 2nd and 3rd at 4:31 AM. The Copernicus EMS portal also made it possible to integrate this information with satellite acquisitions from December 4th and 5th at 10:55 AM and 12:55 PM, respectively. The result is a map showing how the flood evolved across the affected area.
2.3. Landslide Inventory Map
2.4. Conditioning Factors
2.5. Susceptibility Modeling
- In the initial phase, the raster data for the 10 conditioning factors were pre-processed and prepared: (1) raster cells were aligned to ensure spatial consistency among data from different sources; (2) raster data were converted into 1D vectors to simplify processing; (3) the raster resolution was reduced by 80% to ease data management; (4) the rasters were plotted to identify patterns or anomalies visually.
- In the data conversion and cleaning phase, (1) NoData values were replaced with NaN so that missing data could be handled properly; (2) categorical variables such as land use and lithology were numerically encoded; (3) raster data were transformed from 2D matrices into 1D vectors to simplify data handling; (4) a DataFrame containing the raster variables and the target was created; (5) rows containing NaN values were removed to obtain a complete dataset ready for analysis.
- Subsequently, a correlation analysis was performed between the conditioning factors and the target variables, i.e., the flooded areas and the landslide areas [27]. The analysis used Pearson’s correlation coefficient, which measures the linear relationship between two variables: a value close to 1 indicates a strong positive correlation (both variables increase together), a value close to −1 indicates a strong negative correlation (one variable decreases as the other increases), and a value close to 0 indicates no linear correlation. In this study, no substantial correlation was observed among the conditioning factors for either the flood or the landslide dataset. Consequently, no factor was excluded from the analysis.
- Then, the XGBoost model was trained on 70% of the dataset, and the remaining 30% was held out for testing to validate the model’s performance, for both flood and landslide susceptibility mapping. The flooded and landslide areas served as the target variables, while the 10 conditioning factors were used as predictors.
- Lastly, the contribution of each variable to the model’s predictions was assessed by determining the extent to which each independent variable influences the target variable. This was done with the SHAP method [32,33,34], a popular approach in Explainable Artificial Intelligence (XAI) [35] that is rooted in cooperative game theory. The SHAP method calculates the Shapley value for each predictor in the model, i.e., its average marginal contribution to the prediction over all possible combinations of the other predictors.
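For reference, the standard definition of the Shapley value of predictor i can be written as follows (the notation is assumed here, not taken from the original):

\[
\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!}\,\bigl[v(S \cup \{i\}) - v(S)\bigr]
\]

where F is the set of predictors, S is any subset that excludes i, and v(S) is the model output when only the predictors in S are known. The sketch below evaluates this sum by brute force on a toy value function. The function `toy_value`, the predictor names, and the numbers are hypothetical and chosen only for illustration; SHAP libraries use much faster algorithms for tree ensembles.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values by enumerating every coalition of features."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_value(coalition):
    """Hypothetical model score when only these predictors are known."""
    score = 0.0
    if "elevation" in coalition:
        score += 0.4
    if "drainage_density" in coalition:
        score += 0.2
    if {"elevation", "drainage_density"} <= coalition:
        score += 0.1  # interaction, shared equally by the two predictors
    return score


features = ["elevation", "drainage_density", "aspect"]
phi = shapley_values(features, toy_value)
```

In this toy case the values sum to the difference between the score with all predictors and the score with none, which is the additivity property that makes SHAP attributions comparable across predictors.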
3. Results
3.1. Preliminary Analysis and Application of the XGBoost Classifier
3.2. Model Performance Analysis
4. Discussion
- A unified multi-hazard modeling approach, using the same framework to assess both flood and landslide susceptibility, overcoming the traditional separation between these two analyses.
- Integration of multi-source data, including high-resolution satellite imagery (Copernicus EMS, COSMO-SkyMed) and geospatial datasets, enhancing the reliability of predictions.
- The use of SHAP analysis to interpret the model’s decisions and understand the impact of different predisposing factors, ensuring transparency and interpretability.
- A scalable and transferable model, which can be adapted to other geographical areas with similar hazards, supporting land-use planning and risk management.
5. Conclusions
- XGBoost showed excellent predictive performance, achieving a classification accuracy of 99.7% for flood susceptibility and 92% for landslide susceptibility.
- SHAP analysis enabled interpretability, revealing that low elevation, relative elevation, and high drainage density were the most influential predictors for flood events, whereas elevation, drainage density, relative elevation, distance to network, and lithology were the key contributors to landslide susceptibility.
- The unified modeling framework allowed for the simultaneous evaluation of multiple hazard types using a single set of predictors, providing a scalable and transferable methodology for other regions facing similar risks.
- The integration of remote sensing products (e.g., Copernicus EMS, COSMO-SkyMed) and GIS-based conditioning factors improved the spatial resolution and reliability of susceptibility maps, supporting more informed decision making in land-use planning, infrastructure design, and emergency response.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Cremen, G.; Galasso, C.; McCloskey, J. Modelling and quantifying tomorrow’s risks from natural hazards. Sci. Total Environ. 2022, 817, 152552.
2. Sujatha, E.R.; Sudharsan, J.S. Landslide Susceptibility Mapping Methods—A Review. In Landslide: Susceptibility, Risk Assessment and Sustainability Advances in Natural and Technological Hazards Research; Springer Nature: Berlin, Germany, 2024; pp. 87–102.
3. Tosun, J.; Howlett, M. Managing slow onset events related to climate change: The role of public bureaucracy. Curr. Opin. Environ. Sustain. 2021, 50, 43–53.
4. Modrick, T.M.; Georgakakos, K.P. The character and causes of flash flood occurrence changes in mountainous small basins of Southern California under projected climatic change. J. Hydrol. Reg. 2015, 3, 312–336.
5. Johnstone, W.M.; Lence, B.J. Assessing the value of mitigation strategies in reducing the impacts of rapid-onset, catastrophic floods. J. Flood Risk Manag. 2009, 2, 209–221.
6. Albano, R.; Limongi, C.; Dal Sasso, S.F.; Mancusi, L.; Adamowski, J. Flood scenario spatio-temporal mapping via hydrological and hydrodynamic modelling and a remote sensing dataset: A case study of the Basento river (Southern Italy). Int. J. Disaster Risk Reduct. 2024, 111, 104758.
7. Akinci, H.; Ozalp, A.Y. Investigating the Effects of Different Data Classification Methods on Landslide Susceptibility Mapping. Adv. Space Res. 2025, 75, 3427–3450.
8. Guzzetti, F. Landslide fatalities and the evaluation of landslide risk in Italy. Eng. Geol. 2000, 58, 89–107.
9. Guzzetti, F.; Stark, C.P.; Salvati, P. Evaluation of flood and landslide risk to the population of Italy. Environ. Manag. 2005, 36, 15–36.
10. Mudashiru, R.B.; Sabtu, N.; Abustan, I. Quantitative and semi-quantitative methods in flood hazard/susceptibility mapping: A review. Arab. J. Geosci. 2021, 14, 941.
11. Albano, R.; Adamowski, J. Use of digital elevation models for flood susceptibility assessment via a hydrogeomorphic approach: A case study of the Basento River in Italy. Nat. Hazards 2025.
12. Azarafza, M.; Akgun, H.; Atkinson, P.M.; Derakhshani, R. Deep learning-based landslide susceptibility mapping. Sci. Rep. 2021, 11, 24112.
13. Akinci, H. Assessment of rainfall-induced landslide susceptibility in Artvin, Turkey using machine learning techniques. J. Afr. Earth. Sci. 2022, 191, 104535.
14. Liu, S.; Wang, L.; Zhang, W.; He, Y.; Pijush, S. A comprehensive review of machine learning-based methods in landslide susceptibility mapping. Geol. J. 2023, 58, 2283–2301.
15. Kaya, C.M.; Derin, L. Parameters and methods used in flood susceptibility mapping: A review. J. Water Clim. Change 2023, 14, 1935–1960.
16. Albertini, C.; Gioia, A.; Iacobellis, V.; Petropoulos, G.P.; Manfreda, S. Assessing multi-source random forest classification and robustness of predictor variables in flooded areas mapping. Remote Sens. Appl. Soc. Environ. 2024, 35, 101239.
17. Can, R.; Kocaman, S.; Gokceoglu, C. A Comprehensive Assessment of XGBoost Algorithm for Landslide Susceptibility Mapping in the Upper Basin of Ataturk Dam, Turkey. Appl. Sci. 2021, 11, 4993.
18. Hitouri, S.; Mohajane, M.; Lahsaini, M.; Ali, S.A.; Setargie, T.A.; Tripathi, G.; D’Antonio, P.; Singh, S.K.; Varasano, A. Flood Susceptibility Mapping Using SAR Data and Machine Learning Algorithms in a Small Watershed in Northwestern Morocco. Remote Sens. 2024, 16, 858.
19. Ma, M.; Zhao, G.; He, B.; Li, Q.; Dong, H.; Wang, S.; Wang, Z. XGBoost-Based Method for Flash Flood Risk Assessment. J. Hydrol. 2021, 598, 126382.
20. Dawson, G.; Butt, J.; Jones, A.; Fraccaro, P. Flood susceptibility mapping at the country scale using machine learning approaches. Appl. AI Lett. 2023, 4, e88.
21. Schiattarella, M.; Giano, S.I.; Gioia, D. Long-term geomorphological evolution of the axial zone of the Campania-Lucania Apennine, Southern Italy: A review. Geol. Carpathica 2017, 68, 57–67.
22. Dal Sasso, S.F.; Manfreda, S.; Capparelli, G.; Versace, P.; Samela, C.; Spilotro, G.; Fiorentino, M. Hydrological and geological hazards in Basilicata. L’Acqua 2017, 3, 77–85.
23. De Musso, N.M.; Capolongo, D.; Refice, A.; Lovergine, F.P.; D’Addabbo, A.; Pennetta, L. Spatial evolution of the December 2013 Metaponto plain (Basilicata, Italy) flood event using multi-source and high-resolution remotely sensed data. J. Maps 2018, 14, 219–229.
24. Lacava, T.; Ciancia, E.; Faruolo, M.; Pergola, N.; Satriano, V.; Tramutoli, V. Analyzing the December 2013 Metaponto Plain (Southern Italy) Flood Event by Integrating Optical Sensors Satellite Data. Hydrology 2018, 5, 43.
25. Scarpino, S.; Albano, R.; Cantisani, A.; Mancusi, L.; Sole, A.; Milillo, G. Multitemporal SAR data and 2D hydrodynamic model flood scenario dynamics assessment. ISPRS Int. J. Geo-Inf. 2018, 7, 105.
26. Schiattarella, M.; Giannandrea, P.; Corrado, G.; Gioia, D. Landscape planning-addressed regional-scale mapping of geolithological units: The example of Basilicata, southern Italy. J. Maps 2024, 20, 2303335.
27. Gogtay, N.J.; Thatte, U.M. Principles of Correlation Analysis. J. Assoc. Physicians India 2017, 65, 78–81.
28. Shetty, S.H.; Shetty, S.; Singh, C.; Rao, A. Supervised Machine Learning: Algorithms and Applications. In Fundamentals and Methods of Machine and Deep Learning: Algorithms, Tools and Applications; Singh, P., Ed.; John Wiley & Sons: Hoboken, NJ, USA, 2022; pp. 1–16.
29. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
30. Bhati, B.S.; Chugh, G.; Al-Turjman, F.; Bhati, N. An Improved Ensemble-Based Intrusion Detection Technique Using XGBoost. Trans. Emerg. Tel. Tech. 2020, 32, e4076.
31. Le, T.-T.-H.; Oktian, Y.E.; Kim, H. XGBoost for Imbalanced Multiclass Classification-Based Industrial Internet of Things Intrusion Detection Systems. Sustainability 2022, 14, 8707.
32. Zhang, K.; Xu, P.; Zhang, J. Explainable AI in Deep Reinforcement Learning Models: A SHAP Method Applied in Power System Emergency Control. In Proceedings of the IEEE 4th Conference on Energy Internet and Energy System Integration (EI2), Wuhan, China, 30 October–1 November 2020.
33. Mangalathu, S.; Hwang, S.H.; Jeon, J.-S. Failure Mode and Effects Analysis of RC Members Based on Machine-Learning-Based SHapley Additive exPlanations (SHAP) Approach. Eng. Struct. 2020, 219, 110927.
34. Nohara, Y.; Matsumoto, K.; Soejima, H.; Nakashima, N. Explanation of Machine Learning Models Using Shapley Additive Explanation and Application for Real Data in Hospital. Comput. Methods Programs Biomed. 2022, 214, 106584.
35. Barredo Arrieta, A.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; Garcia, S.; Gil-Lopez, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence: A Systematic Review. Inf. Fusion 2020, 58, 82–115.
Data | Source/Scale |
---|---|
Elevation | https://rsdi.regione.basilicata.it/ (accessed on 2 May 2025) (5 m spatial resolution) |
Slope | Derived |
Aspect | Derived |
Relative elevation | Derived |
Distance to network | Derived |
Drainage density | Derived |
Land Use | https://rsdi.regione.basilicata.it/ (accessed on 2 May 2025) (1:5000) |
Lithology | (1:50,000) |
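
The conditioning factors listed in this table enter the model as columns of a single table, one row per raster cell. The following is a minimal sketch of the flattening, cleaning, and Pearson correlation steps described in Section 2.5, written with the Python standard library only. The toy grids, the NoData marker `-9999`, and all function names are assumptions made for illustration; the actual workflow presumably relied on raster and DataFrame libraries.

```python
import math

NODATA = -9999.0  # assumed NoData marker; real rasters declare their own


def flatten(grid):
    """Turn a 2D raster (list of rows) into a 1D list, NoData -> NaN."""
    return [math.nan if v == NODATA else float(v) for row in grid for v in row]


def drop_incomplete(columns):
    """Keep only the cells where every column holds a valid value."""
    names = list(columns)
    rows = zip(*(columns[n] for n in names))
    kept = [r for r in rows if not any(math.isnan(v) for v in r)]
    return {n: [r[k] for r in kept] for k, n in enumerate(names)}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy 2x3 rasters (illustrative values only), already aligned cell by cell.
elevation = [[10.0, 20.0, 30.0], [40.0, NODATA, 60.0]]
slope = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
flooded = [[1, 1, 0], [0, 0, 0]]  # target: 1 = flooded cell

table = drop_incomplete({
    "elevation": flatten(elevation),
    "slope": flatten(slope),
    "flooded": flatten(flooded),
})
r_elev_slope = pearson(table["elevation"], table["slope"])
r_elev_flood = pearson(table["elevation"], table["flooded"])
```

In this toy data the two predictors are perfectly correlated, which is the kind of redundancy the correlation analysis is meant to detect; the study reports no such pair among its conditioning factors.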
Classification Report

Class | Precision | Recall | F1 Score | Support |
---|---|---|---|---|
0 | 0.999 | 0.998 | 0.999 | 1,025,456 |
1 | 0.837 | 0.958 | 0.893 | 11,150 |
accuracy | | | 0.997 | 1,036,606 |
macro avg | 0.918 | 0.978 | 0.946 | 1,036,606 |
weighted avg | 0.998 | 0.998 | 0.998 | 1,036,606 |
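
As a reading aid for this report, the sketch below shows how the F1 score follows from precision and recall and how the macro average is formed. It uses only the rounded per-class values printed in the table, so it is a consistency check under that assumption rather than a reproduction of the original computation; the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Per-class values as printed in the classification report above.
reported = {
    0: {"precision": 0.999, "recall": 0.998, "f1": 0.999},
    1: {"precision": 0.837, "recall": 0.958, "f1": 0.893},
}

# F1 of class 1 recomputed from its precision and recall.
f1_class1 = f1_score(reported[1]["precision"], reported[1]["recall"])


def macro(metric):
    """Unweighted mean of a metric over the classes."""
    return sum(c[metric] for c in reported.values()) / len(reported)


macro_avg = {m: round(macro(m), 3) for m in ("precision", "recall", "f1")}
```

Because the macro average gives both classes equal weight, it is pulled down by the minority class 1, whereas the weighted average stays close to the majority-class values.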
Cross Validation

Fold | Precision | Recall | F1 Score | Accuracy |
---|---|---|---|---|
1 | 0.801 | 0.968 | 0.882 | 0.997 |
2 | 0.815 | 0.966 | 0.884 | 0.997 |
3 | 0.817 | 0.968 | 0.886 | 0.997 |
4 | 0.803 | 0.968 | 0.878 | 0.997 |
5 | 0.808 | 0.968 | 0.881 | 0.997 |
6 | 0.819 | 0.967 | 0.887 | 0.997 |
7 | 0.812 | 0.964 | 0.882 | 0.997 |
8 | 0.803 | 0.968 | 0.878 | 0.997 |
9 | 0.807 | 0.967 | 0.880 | 0.997 |
10 | 0.808 | 0.965 | 0.879 | 0.997 |
Mean (μ) | 0.809 | 0.967 | 0.882 | 0.997 |
Test sample | 0.837 | 0.958 | 0.893 | 0.997 |
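
The mechanics behind a table like this one can be sketched as follows: the samples are split into 10 folds, each fold serves once as the validation set, and the fold scores are averaged. The index generator below is a plain shuffled k-fold written for illustration; whether the study used stratification or a particular library routine is not stated, so `k_fold_indices` and its parameters are assumptions. The precision values are copied from the table above.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index lists for k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val


# Toy split: 25 samples, 10 folds; each sample is validated exactly once.
splits = list(k_fold_indices(n_samples=25, k=10))

# Fold-wise precision values as printed in the table above.
precision = [0.801, 0.815, 0.817, 0.803, 0.808,
             0.819, 0.812, 0.803, 0.807, 0.808]
mean_precision = round(mean(precision), 3)
```

The small spread of the fold scores around their mean suggests that the model's behaviour does not depend strongly on which cells end up in the training set.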
Classification Report

Class | Precision | Recall | F1 Score | Support |
---|---|---|---|---|
0 | 0.96 | 0.94 | 0.95 | 949,477 |
1 | 0.73 | 0.79 | 0.76 | 196,004 |
accuracy | | | 0.92 | 1,145,481 |
macro avg | 0.85 | 0.87 | 0.86 | 1,145,481 |
weighted avg | 0.92 | 0.92 | 0.92 | 1,145,481 |
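
The weighted averages in this report are dominated by the majority class. The sketch below recomputes the support-weighted precision and F1 score from the rounded per-class values printed above, and the share of class 1 cells in the test set; it is an illustrative check under that rounding assumption, and the names are hypothetical.

```python
# Per-class values and supports as printed in the report above.
classes = [
    {"precision": 0.96, "f1": 0.95, "support": 949_477},  # class 0
    {"precision": 0.73, "f1": 0.76, "support": 196_004},  # class 1
]


def weighted_avg(metric):
    """Mean of a per-class metric weighted by class support."""
    total = sum(c["support"] for c in classes)
    return sum(c[metric] * c["support"] for c in classes) / total


weighted_precision = round(weighted_avg("precision"), 2)
weighted_f1 = round(weighted_avg("f1"), 2)

# Fraction of test cells that belong to class 1.
minority_share = classes[1]["support"] / sum(c["support"] for c in classes)
```

Since class 1 accounts for only about 17% of the cells, the per-class row for class 1 is the more informative indicator of how well the minority class is detected.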
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Rondinone, M.; Dal Sasso, S.F.; Aung, H.H.; Contillo, L.; Dimola, G.; Schiattarella, M.; Fiorentino, M.; Telesca, V. Assessing Flood and Landslide Susceptibility Using XGBoost: Case Study of the Basento River in Southern Italy. Appl. Sci. 2025, 15, 5290. https://doi.org/10.3390/app15105290