From Localized Collapse to City-Wide Impact: Ensemble Machine Learning for Post-Earthquake Damage Classification

Ein Larouzi, Bilal; Fahjan, Yasin

doi:10.3390/infrastructures11010025

by Bilal Ein Larouzi ^* and Yasin Fahjan

Department of Civil Engineering, Faculty of Engineering, Ayazağa Campus, Istanbul Technical University, 34469 Istanbul, Türkiye

^* Author to whom correspondence should be addressed.

Infrastructures 2026, 11(1), 25; https://doi.org/10.3390/infrastructures11010025

Submission received: 26 November 2025 / Revised: 29 December 2025 / Accepted: 6 January 2026 / Published: 12 January 2026

(This article belongs to the Topic Disaster Risk Management and Resilience)

Abstract

Effective disaster management depends on rapidly understanding earthquake damage, yet traditional methods struggle to operate at scale and rely on expert inspections that become difficult when access is limited or time is critical. Satellite-based damage detection also faces limitations, particularly under adverse weather conditions and delays associated with satellite overpass schedules. This study introduces a machine learning-based approach to assess post-earthquake building damage using real observations collected after the event. The aim is to develop fast and reliable estimation techniques that can be deployed immediately after the mainshock by integrating structural, seismic, and geographic data. Three machine learning models—Random Forest, Histogram Gradient Boosting, and Bagging Classifier—are evaluated across both reinforced concrete and masonry buildings and across multiple spatial levels, including building, district, and city scales. Damage is categorized using practical three-class (traffic light) and detailed four-class systems. The models generally perform better in simpler classifications, with the Bagging Classifier offering the most consistent results across different scales. Although detecting severely damaged buildings remains challenging in some cases, the three-class system proves especially effective for supporting rapid decision-making during emergency response. Overall, this study demonstrates how machine learning can provide faster, scalable, and practical earthquake damage assessments that benefit emergency teams and urban planners.

Keywords:

seismic damage assessment; machine learning; structural damage classification; post-earthquake evaluation; predictive modeling

1. Introduction

Natural hazards and disasters significantly impact human life, not only through loss of life but also through substantial economic consequences, such as the damage and collapse of buildings. Among these hazards, earthquakes are particularly devastating due to their intense ground shaking and, more critically, their unpredictability. On 6 February 2023, a major seismic sequence struck southern Türkiye, consisting of two large earthquakes: the Mw 7.7 Pazarcık earthquake followed by the Mw 7.6 Elbistan earthquake. Rather than a single rupture, these events formed a complex multi-fault earthquake sequence. The earthquakes resulted in more than 53,000 fatalities and approximately 107,000 injuries, while affecting around 4 million buildings across the impacted region.

Effective disaster management and rapid damage assessment are essential for mitigating the consequences of earthquakes, particularly in reducing casualties and injuries through the implementation of timely mitigation plans and immediate response actions. Traditional post-earthquake damage assessment methods, especially at the building scale, present significant challenges. These classical methods—such as visual inspection, spectrum analysis, and nonlinear time history analysis—are often impractical in large-scale seismic events due to their time-consuming nature, the requirement for highly qualified engineers, and the logistical difficulties of accessing affected areas due to infrastructure damage. Fragility analysis, a widely used method for regional assessments, employs statistical approaches to estimate damage. However, constructing fragility curves can be especially challenging in regions with diverse building typologies.

Over the past decade, artificial intelligence (AI) has emerged as a transformative tool in traditional scientific research, including structural and earthquake engineering. For instance, Kim et al. (2018) [1] developed a deep neural network with convolutional layers to predict maximum responses of nonlinear hysteretic systems subjected to earthquake excitations. Trained on 54,090 hysteretic behaviors and 1499 ground motions, the model demonstrated high predictive accuracy, effectively extracting features and generalizing seismic responses. Zhou et al. (2024) [2] applied machine learning models to predict structural damage under mainshock–aftershock sequences using 662 ground motion pairs. Among the tested models, gradient boosting showed the highest accuracy, with key predictors varying by structural period, notably peak ground velocity and displacement. Huang et al. (2019) [3] applied machine learning to classify in-plane failure modes of reinforced concrete frames with masonry infills using a database of 114 specimens. Among six tested algorithms, Adaptive Boosting and Support Vector Machine achieved the highest accuracy (85.7%). Tree-based models performed well, while Logistic Regression offered interpretability. Lee et al. (2023) [4] proposed a structural damage detection method using finite element (FE) models and a deep neural network (DNN) to identify the location and extent of damage in a three-story frame structure. The DNN, trained on natural frequency data from FE simulations, predicted stiffness with high accuracy (RMSE < 0.85%), indicating its potential for practical structural health monitoring. Qiu et al. (2023) [5] introduced modular AI models (ANN and SVR) for predicting fire behavior in steel columns, trained on high-resolution FE models and experimental data. These models accurately predicted axial forces under both steady-state and transient heating scenarios.

Seismic risk and damage assessment have also benefited from integrating engineering knowledge with AI and machine learning. These integrations have yielded highly accurate predictive models, highlighting AI’s potential in this research domain. Ghimire et al. (2022) [6] used machine learning to assess seismic damage following the 2015 Gorkha, Nepal earthquake, analyzing data from 762,106 buildings. The random forest regression model achieved 68% accuracy in classifying damage using a traffic-light system. Basic building attributes (e.g., number of stories, height, area, and age) facilitated reliable predictions. Zhang et al. (2022) [7] applied Random Forest, Extreme Gradient Boosting, and Active Learning algorithms for rapid damage assessment of reinforced concrete frames, using data from 198 RC frames and 50 ground motions. Active Learning yielded the highest accuracy (84%) and was validated against data from an earthquake in Taiwan. Bhatta et al. (2024) [8] explored quantum-enhanced machine learning (QML) for post-earthquake safety assessments of reinforced concrete buildings. A variational quantum classifier (VQC), trained on simulation datasets and validated with 2015 Nepal earthquake data, achieved 75% accuracy, outperforming most classical models, though slightly behind Random Forest (83.3%). QML demonstrated efficiency in seismic damage classification. Won et al. (2021) [9] developed an artificial neural network model to rapidly predict seismic responses of buildings considering soil–structure interaction. Using 11 input parameters and a multistep analysis process, the model reliably generated seismic responses and performance levels, enabling fast and accurate regional seismic damage assessment for disaster preparedness.

In addition, image-based assessment integrated with AI and deep learning has gained significant attention. Lu et al. (2021) [10] proposed a deep learning method utilizing convolutional neural networks (CNNs) and time-frequency distributions (TFDs) of ground motions for post-earthquake damage assessment. Tested on a dataset comprising 1 and 619 buildings at Tsinghua University, the method achieved up to 92.6% accuracy for individual predictions and 82.8% at a regional level, outperforming traditional fragility analyses. Braik et al. (2024) [11] combined satellite imagery, GIS, and deep learning techniques for building damage assessment following Hurricane Ike in Galveston County. A CNN model trained on the xBD dataset achieved an F1 score of 88% on test data and 86% after fine-tuning with real-world data, demonstrating the potential of AI in generating accurate, large-scale damage maps for emergency response. Hacıefendioglu et al. (2024) [12] applied deep learning-based image segmentation models (U-Net, LinkNet, FPN, and PSPNet) to identify collapsed buildings from satellite images after the 2023 Kahramanmaras earthquakes in Türkiye. All models achieved over 96% accuracy, with FPN and U-Net performing best: FPN excelled in accuracy and precision, while U-Net achieved the highest recall, F1, AUC, and IoU scores. Kaur et al. (2023) [13] proposed DAHiTrA, a hierarchical transformer-based deep learning model for building damage assessment from satellite images after disasters. Leveraging multi-resolution spatial features and temporal differences, DAHiTrA achieved state-of-the-art performance on the xBD and LEVIR-CD datasets. The study also introduced a high-resolution Ida-BD dataset for domain adaptation, demonstrating effective damage prediction with limited fine-tuning on new disaster areas. Wang et al. (2022) [14] proposed a two-step deep learning approach for automatic building damage detection from satellite imagery, addressing both building localization and damage classification. Using a novel learning strategy to handle highly imbalanced data, their model achieved 97.3% accuracy and 0.538 IoU for localization across three disasters, and 99.6% accuracy with a 0.995 weighted F1-score for damage classification on extracted building patches. However, satellite image-based damage estimation faces several challenges, including the timely availability of images following seismic events and image clarity, particularly under bad weather conditions. Additionally, these methods are generally effective for detecting fully collapsed structures but provide no information on slight and moderate damage states.

To address the limitations of simulation-based earthquake damage studies, this research introduces a real-data-driven machine learning framework for post-earthquake building damage assessment. In contrast to most existing studies that rely on numerically simulated datasets governed by idealized mathematical models, the proposed framework is developed and validated exclusively using real post-earthquake field data, capturing actual structural behavior, construction deficiencies, material variability, and true ground-motion effects observed during seismic events.

The main innovation of this study is the use of a newly integrated multi-source real-world dataset, combining post-earthquake damage surveys, structural characteristics, ground-motion records, and site-specific information. This integration results in a solid and realistic dataset that enables machine learning models to learn true damage mechanisms rather than idealized responses. No synthetic or numerically simulated damage data are employed in model development. In addition, a multi-scale damage prediction framework is proposed, enabling damage estimation at the building, district, and city levels within a unified approach. This capability directly supports post-earthquake disaster management by facilitating both detailed engineering assessment and large-scale emergency response planning. To strengthen the connection between technical evaluation and operational decision-making, two complementary damage state classification schemes are defined: (i) a detailed multi-level damage scale for engineering applications and (ii) a simplified three-level “traffic-light” scheme for rapid safety tagging, building prioritization, and emergency resource allocation. The framework is applicable to both reinforced concrete and masonry buildings, covering the most common structural typologies in earthquake-prone regions. Although established ensemble machine learning algorithms are used, the novelty of this work lies in the real-data foundation, multi-source dataset construction, and multi-scale damage prediction strategy, which clearly distinguish it from existing simulation-based approaches. Consequently, the proposed framework provides a practically deployable, data-realistic, and disaster-oriented solution for post-earthquake damage assessment.

2. Methodology of Damage Estimation Using AI and Machine Learning

This section presents a novel, data-driven methodology for post-earthquake building damage estimation using artificial intelligence (AI) and machine learning (ML) techniques. Unlike traditional approaches that rely on simulated data based on mathematical models, this framework is built and validated on real post-earthquake observations, capturing the true structural behavior and damage patterns of affected buildings. The methodology encompasses data collection, preprocessing, model training, and evaluation using powerful ensemble algorithms such as Random Forest, Histogram-Based Gradient Boosting, and Bagging classifiers. A key innovation is the use of a multi-source dataset, combining structural characteristics, ground motion records, and site-specific conditions, which solidifies the dataset and enhances model reliability and generalization. The framework supports multi-scale damage prediction through two complementary damage classification schemes: a detailed four-level scheme for engineering assessment and a simplified three-level “traffic light” scheme for rapid disaster response. These classification schemes are designed to improve post-earthquake disaster management, enabling faster building prioritization, emergency planning, and resource allocation. Overall, this methodology represents a new, real-data-based predictive model that bridges the gap between technical engineering assessment and operational disaster response. Figure 1 presents a flowchart of the proposed methodology.

2.1. Machine Learning Algorithms

Machine learning (ML) is a key subfield of artificial intelligence (AI) that enables computers to improve performance on tasks through experience. Instead of explicit programming, ML algorithms learn and generalize patterns from data, allowing systems to make decisions or predictions without human intervention. Models adapt over time, achieving greater efficiency across diverse applications.

2.1.1. Random Forest Classifier

Random Forest is an ensemble method that builds multiple decision trees on dataset sub-samples and aggregates outputs to boost accuracy and reduce overfitting. Each tree splits nodes on the best feature via exhaustive search. Randomness decorrelates trees, lowering variance (with a slight bias increase possible). In scikit-learn, it averages probabilistic predictions rather than majority voting. Though individual trees are deep and intensive, the ensemble generalizes well via the Law of Large Numbers and randomness, making it robust [15,16,17].
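The averaging behavior described above can be checked directly. The following is a minimal sketch on synthetic data (not the study's dataset); the sample size, feature count, and hyperparameter values are illustrative assumptions, not the settings used by the authors.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic three-class tabular data standing in for building records (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)

# max_features="sqrt" adds the feature-level randomness that decorrelates trees.
forest = RandomForestClassifier(n_estimators=50, max_features="sqrt", random_state=0)
forest.fit(X, y)

# Ensemble output: class probabilities for the first five samples.
forest_proba = forest.predict_proba(X[:5])

# Reproduce the ensemble output by averaging the per-tree probabilities by hand.
manual_proba = np.mean(
    [tree.predict_proba(X[:5]) for tree in forest.estimators_], axis=0
)

# The predicted class is the one with the highest averaged probability.
soft_vote = forest.classes_[np.argmax(manual_proba, axis=1)]
```

Comparing `forest_proba` with `manual_proba` shows that the forest prediction is the mean of the individual tree probabilities rather than a count of hard votes.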

2.1.2. Histogram Gradient Boosting Classifier

Histogram Gradient Boosting is an efficient Gradient Boosted Decision Trees (GBDT) implementation, extending boosting to differentiable loss functions for regression and classification, especially on tabular data. It discretizes continuous features into bins (e.g., 255 per feature), speeding training by evaluating splits only at boundaries using histograms and integer structures. This reduces computational cost with little accuracy loss and handles missing values inherently [18,19].

2.1.3. Bagging Classifier

Bagging (Bootstrap Aggregating) is an ensemble meta-estimator that cuts variance in models like decision trees by training multiple bases on random dataset subsets and aggregating via voting (classification) or averaging (regression). With replacement, it is Bagging; without, Pasting; with feature subsets, Random Subspaces; with both, Random Patches. It is a framework for creating diverse models and combining them for better stability and generalization [20].
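The four sampling variants named above map onto a few constructor arguments of scikit-learn's BaggingClassifier. The sketch below is illustrative only: the data are synthetic, and the sampling fractions (0.7 and 0.5) and the number of estimators are assumed values, not the study's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data (assumption; not the study's dataset).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=2
)

# Each variant differs only in how samples and features are drawn.
variants = {
    "bagging": dict(bootstrap=True, max_samples=1.0, max_features=1.0),
    "pasting": dict(bootstrap=False, max_samples=0.7, max_features=1.0),
    "random_subspaces": dict(bootstrap=False, max_samples=1.0, max_features=0.5),
    "random_patches": dict(bootstrap=False, max_samples=0.7, max_features=0.5),
}

models = {}
for name, kwargs in variants.items():
    # The base tree is passed positionally because its keyword name
    # differs between scikit-learn versions.
    model = BaggingClassifier(
        DecisionTreeClassifier(), n_estimators=25, random_state=2, **kwargs
    )
    models[name] = model.fit(X, y)
```

After fitting, `models["random_subspaces"].estimators_features_` lists the feature indices each base tree was trained on, which makes the subspace sampling visible.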

2.1.4. Comparative Analysis of Random Forest, Histogram Gradient Boosting, and Bagging

Random Forest, Histogram Gradient Boosting, and Bagging are ensemble methods using multiple learners for improved generalization and robustness over single models. Random Forest and Bagging build parallel deep trees on random data subsets and aggregate predictions, but Random Forest adds feature-level randomness for more decorrelation. Histogram Gradient Boosting uses sequential boosting, training each tree to fix prior errors and optimize a loss function, addressing both bias and variance. Bagging generally reduces variance via independent averaging, Random Forest refines it with extra randomization, and boosting optimizes stage-wise rather than in parallel.

2.2. Machine Learning Models

This section outlines the methodological framework adopted for developing and evaluating the machine learning models used in this study. A systematic workflow was implemented, encompassing data preprocessing, model training, and performance evaluation. Each step was designed to ensure the reliability, generalizability, and robustness of the predictive models for post-earthquake damage estimation. The machine learning algorithms were implemented using Python 3.10 and the scikit-learn library.

Data Cleaning and Filtering: Outlier detection and removal were conducted using the Z-Score method to enhance the robustness of the dataset. This step aimed to mitigate the influence of anomalous data points that could skew model performance, leading to poor model fitting and generalization. By removing outliers, the dataset was refined to improve reliability and model accuracy.
Data Splitting: To ensure comprehensive and representative model evaluation, the dataset was subjected to two distinct splitting procedures:
○ Initial Split: The dataset was divided into training (90%) and validation (10%) sets, with the validation set selected randomly to avoid bias. Stratification based on damage categories was applied to ensure proportional representation in both subsets. The validation set was exclusively reserved for post-training evaluation.
○ Secondary Split: The training set was further partitioned using 10-fold cross-validation. In each fold, 90% of the data were used for training and 10% for testing, ensuring that each data point was used for both training and testing across different iterations.
K-Fold Cross-Validation: The 10-fold cross-validation method was employed to partition the training data into k = 10 equal subsets. For each iteration, the model was trained on k − 1 folds (nine) and tested on the remaining fold. This iterative process enhances model robustness and helps prevent performance biases arising from imbalanced datasets by ensuring that all data points contribute equally to model evaluation.
Hyperparameter optimization: To optimize the performance of each machine learning algorithm, a grid search strategy was applied to systematically explore combinations of selected hyperparameters. The optimal parameter set for each model was identified based on classification performance. To control computational cost and avoid unnecessary model complexity, only the most influential hyperparameters for each algorithm were considered during the optimization process.
Model Training and Fitting: The training dataset was utilized to fit three classification models: Random Forest Classifier, Histogram-based Gradient Boosting Classifier, and Bagging Classifier based on Decision Trees. These models were selected based on their suitability for multi-class damage classification tasks.
Selection of Optimal Model: Model performance during each cross-validation iteration was evaluated using the SD-ND generalized confusion matrix, prioritizing alignment with the actual distribution of damage states. Although common metrics such as F1-score, accuracy, and recall were calculated, the SD-ND confusion matrix was considered the primary evaluation metric. The model yielding the highest performance based on this matrix was selected for final training.
Validation Set Evaluation: Following the identification of the optimal k-fold split, the model was retrained using the selected training and testing data. Its generalization capability was subsequently assessed on the reserved 10% validation set, which had remained unused during all previous training and evaluation phases.
Performance Evaluation and Results: Predictions on the validation set were evaluated using the confusion matrix, accuracy, and macro F1-score. Given the categorical nature of the damage classification, the confusion matrix and macro F1-score were prioritized. These metrics provided a detailed assessment of the model’s classification capabilities while accounting for class imbalances. The evaluation results offer valuable insights into the model’s real-world applicability for post-earthquake building damage prediction.
The Spearman correlation coefficient is a non-parametric statistical measure used to assess the strength and direction of a monotonic relationship between two variables. It is computed based on the ranked values of the data, making it robust to outliers and suitable for non-normally distributed variables. This measure captures both linear and non-linear monotonic trends without requiring strict distributional assumptions. Consequently, it is widely applied in engineering and scientific studies where data characteristics may be complex or irregular.
Feature Importance Analysis: Feature importance analysis in machine learning identifies which input variables contribute most to the model’s predictions, offering insights into underlying data–response relationships. It enhances model interpretability, helping to understand physical or structural factors driving outcomes.
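The workflow steps listed above can be sketched end to end with scikit-learn. This is a minimal illustration under several assumptions: the data are synthetic, the Z-score threshold of 3 and the small hyperparameter grid are placeholder choices, the folds are stratified, and macro F1 is used as the selection score in place of the SD-ND generalized confusion matrix, which is not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic, imbalanced three-class data (assumption; not the study's dataset).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_classes=3,
    weights=[0.6, 0.3, 0.1], random_state=3,
)

# 1) Data cleaning: drop rows where any feature has |z| >= 3 (assumed threshold).
z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
keep = (z < 3.0).all(axis=1)
X, y = X[keep], y[keep]

# 2) Initial split: 90% training, 10% held-out validation, stratified by class.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=3
)

# 3) 10-fold cross-validation combined with a small grid search (assumed grid).
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=3)
param_grid = {"n_estimators": [25, 50], "max_depth": [None, 10]}
search = GridSearchCV(
    RandomForestClassifier(random_state=3), param_grid, scoring="f1_macro", cv=cv
)
search.fit(X_train, y_train)

# 4) Final check on the validation set, which was never seen during the search.
y_pred = search.best_estimator_.predict(X_val)
val_macro_f1 = f1_score(y_val, y_pred, average="macro")
val_cm = confusion_matrix(y_val, y_pred)
```

Swapping `RandomForestClassifier` for `HistGradientBoostingClassifier` or `BaggingClassifier`, with a grid suited to each, would follow the same pattern.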

2.3. Dataset Creation

The process of data collection represents the foundational step in the development of machine learning models. The innovation of this study lies in the integration and synthesis of data from multiple independent real-world sources, which significantly enhances dataset quality and enables the development of more accurate and reliable models. In the context of earthquake damage assessment, datasets are constructed through the explicit fusion of real post-earthquake field surveys, specialized software, and publicly available online databases, resulting in a solid, realistic, and comprehensive dataset that captures actual structural behavior under seismic loading.

Field Data Collection: Data collected directly from the field following significant seismic events play a critical role in both damage assessment and the broader understanding of structural behavior under earthquake-induced loading. With the growing adoption of machine learning methodologies in structural damage evaluation, the importance of high-quality, real-world data has further intensified. Field data encompass a wide range of information, typically including structural characteristics (e.g., building type, number of stories, construction materials, and age), geographical location (e.g., latitude and longitude), and, most critically, observed damage levels. Damage assessments are conducted by trained engineers who rely on standardized evaluation protocols, visual inspections, and professional engineering judgment to ensure consistency and accuracy.
The Rapid Earthquake Damage Assessment System (REDAS): REDAS is a comprehensive tool developed through the REDACt project to support earthquake preparedness and response across Southeastern Europe and the Black Sea region. It consists of three integrated components: REDA.p (a desktop platform for seismic damage assessment), a mobile app for quick field updates, and an Educational Hub (Edu.Hub) for capacity building. REDA.p provides scenario-based and near-real-time evaluations of damage to infrastructure such as buildings, pipelines, and geotechnical structures, using harmonized ground motion prediction models and a unified building taxonomy. By incorporating real-time data from regional seismic networks, REDAS delivers spatial damage maps that aid emergency responders and inform long-term urban resilience planning [21].
OpenStreetMap (OSM): OSM is a global, open-source mapping project where contributors collaboratively build a free, editable map of the world. The OSM database includes detailed geospatial data (such as roads, buildings, land use, and natural features) and is continually updated by volunteers using GPS, aerial imagery, and public datasets. Released under the Open Database License (ODbL), OSM data can be freely used, modified, and shared for both commercial and non-commercial purposes, provided it remains open and properly attributed. Its wide coverage and accessibility make OSM a vital resource for research, disaster response, urban planning, and countless location-based services [22,23]. Figure 2 shows building footprints for the study area.
The Open-Elevation API: Open-Elevation is a public service that offers elevation data based on latitude and longitude coordinates, sourced from global digital elevation models like NASA’s SRTM and ASTER. Its simplicity and open-access model make it a valuable tool for environmental research, transportation modeling, and geospatial analysis. When integrated with mapping data from sources like OSM, it enables detailed terrain visualization, hydrological modeling, and elevation profiling without requiring complex local data processing. Lightweight yet powerful, the Open-Elevation API enhances the functionality of geospatial applications across both academic and commercial domains [24].
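To show how elevation could be attached to building coordinates, the sketch below builds a lookup request and parses a response. The endpoint URL and the response layout follow the public Open-Elevation service as commonly documented and should be treated as assumptions to verify before use; the helper names are hypothetical, no network call is made, and the sample payload contains dummy elevation values, not real measurements.

```python
import json
from urllib.parse import urlencode

# Assumed public endpoint of the Open-Elevation service; verify before relying on it.
BASE_URL = "https://api.open-elevation.com/api/v1/lookup"


def build_lookup_url(points):
    """Build a GET URL for a list of (latitude, longitude) pairs (hypothetical helper)."""
    locations = "|".join(f"{lat:.6f},{lon:.6f}" for lat, lon in points)
    return f"{BASE_URL}?{urlencode({'locations': locations})}"


def parse_elevations(payload):
    """Extract (latitude, longitude, elevation) tuples from a JSON response string."""
    data = json.loads(payload)
    return [(r["latitude"], r["longitude"], r["elevation"]) for r in data["results"]]


# Example coordinates roughly in the study region (illustrative only).
url = build_lookup_url([(37.58, 36.93), (37.57, 36.92)])

# Dummy payload in the assumed response format; elevations are placeholders.
sample_payload = json.dumps({
    "results": [
        {"latitude": 37.58, "longitude": 36.93, "elevation": 100},
        {"latitude": 37.57, "longitude": 36.92, "elevation": 200},
    ]
})
elevations = parse_elevations(sample_payload)
```

In practice the URL would be fetched with an HTTP client and the returned text passed to `parse_elevations`, with the resulting elevation joined to each building record by its coordinates.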

2.4. Damage States Classification

The damage state represents a fundamental piece of information in disaster management and plays a crucial role in guiding emergency response and rescue operations. This study introduces two damage state classification schemes, designed to develop a more effective predictive model that supports and enhances post-earthquake disaster management and decision-making processes.

2.4.1. Actual Damage States

The building damage classification was conducted by qualified engineering teams as part of the post-earthquake assessment efforts. The classification was based on visual inspections and engineering judgment, categorizing buildings into four damage levels:

Severe-Damaged Class (SD): This class includes buildings that remained standing but sustained severe damage, as well as buildings that collapsed either partially or entirely.
Moderate-Damaged Class (MD): Buildings in this category exhibited moderate damage and required further assessment before reoccupation.
Low-Damaged Class (LD): This category includes buildings that sustained minor but visible damage, for which further inspection and repairs are recommended.
Non-Damaged Class (ND): Buildings in this category did not exhibit any observable damage during the earthquake.

2.4.2. Proposed Damage States

Damage classification plays a crucial role in damage assessment. To achieve the most accurate modeling, multiple damage classification schemes have been proposed (Table 1):

Four Damage Classes: The four damage state classifications represent the actual damage categorization of the data collected from the field, based on engineering judgment and visual inspections.
Three Damage Classes: The three-level damage classification, often referred to as the “traffic light” classification, is particularly useful for disaster management. It categorizes buildings into:
○ Unsafe: Buildings classified as Severe-damaged (SD).
○ Need Further Assessment: Buildings classified as either moderate-damaged (MD) or low-damaged (LD).
○ Safe: Buildings classified as non-damaged (ND).

Table 1. Comparison between proposed damage state schemes and HAZUS damage states.

Damage State Schemes	Damage States
HAZUS	NONE (DS0)	SLIGHT (DS1)	MODERATE (DS2)	HEAVY (DS3)	COMPLETE (DS4)
Proposed 4 Damage states	NO DAMAGE	LOW DAMAGE	MODERATE DAMAGE	SEVERE DAMAGE	SEVERE DAMAGE
Proposed 3 Damage states	NO DAMAGE	MODERATE DAMAGE	MODERATE DAMAGE	SEVERE DAMAGE	SEVERE DAMAGE

The adoption of the three-damage state classification was proposed to improve the reliability and generalizability of the predictive models. In the actual collected data, inconsistencies often arise between adjacent damage levels—particularly between low and moderate damage states—mainly due to variations in engineers’ judgments during post-earthquake inspections. By consolidating these categories into a simplified three-level classification, the approach reduces subjectivity and labeling uncertainty in the dataset. Furthermore, this scheme enhances the practical applicability of the model in disaster management, supporting clearer decision-making for emergency response, building reoccupation, and prioritization of detailed assessments.
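The consolidation from four classes to the three-level traffic-light scheme amounts to a fixed label mapping. The sketch below encodes the mapping stated above; the function name and the example labels are illustrative and do not come from the study's data.

```python
from collections import Counter

# Four-class field labels mapped to the three-class traffic-light scheme.
FOUR_TO_THREE = {
    "SD": "Unsafe",                   # severe damage, including partial or total collapse
    "MD": "Need Further Assessment",  # moderate damage
    "LD": "Need Further Assessment",  # low damage
    "ND": "Safe",                     # no observable damage
}


def to_traffic_light(labels):
    """Convert a sequence of four-class labels to three-class tags (hypothetical helper)."""
    return [FOUR_TO_THREE[label] for label in labels]


# Made-up survey labels for illustration only.
surveyed = ["ND", "LD", "MD", "SD", "LD", "ND"]
tagged = to_traffic_light(surveyed)
tag_counts = Counter(tagged)
```

Because LD and MD collapse into one tag, disagreements between inspectors over those two adjacent levels no longer change the training label.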

2.5. Damage Estimation Across Multiple Spatial Scales

This study presents a multi-scale framework for post-earthquake damage estimation, aiming to identify the most effective spatial resolution that enhances disaster management and supports efficient decision-making by authorities.

At the building scale, assessments were conducted on a building-by-building basis, enabling the identification of individual structures requiring urgent intervention. This level of analysis supports prioritization in evacuation procedures, rescue operations, and detailed structural inspections.

At the district scale, the assessment focused on identifying spatial clusters of damage across urban districts. This approach allows authorities to delineate the most severely affected areas, facilitating the allocation of resources and the development of targeted emergency response and recovery plans.

At the city scale, the framework enables rapid estimation of overall damage distributions across the entire urban area. Such large-scale analyses provide valuable insights for decision-makers to estimate total losses, assess infrastructure resilience, and determine the necessary logistics and support for effective disaster management.

By integrating analyses across these three spatial levels, the framework supports both micro- and macro-level decision-making processes. This multi-scale approach enhances situational awareness immediately after an earthquake and strengthens the capacity of urban resilience planning at the regional and national levels.
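One possible way to move between the three spatial levels is to aggregate building-level tags into class shares per district and for the whole city. The sketch below illustrates that reading only; the text does not state that district and city estimates are obtained this way, and the district names and tags are hypothetical.

```python
from collections import Counter, defaultdict


def damage_shares(labels):
    """Return the fraction of buildings in each damage class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical building-level predictions as (district, traffic-light tag) pairs.
predictions = [
    ("District A", "Unsafe"),
    ("District A", "Safe"),
    ("District A", "Need Further Assessment"),
    ("District A", "Unsafe"),
    ("District B", "Safe"),
    ("District B", "Safe"),
    ("District B", "Need Further Assessment"),
    ("District B", "Safe"),
]

# District scale: group building tags by district, then compute class shares.
by_district = defaultdict(list)
for district, label in predictions:
    by_district[district].append(label)
district_shares = {d: damage_shares(tags) for d, tags in by_district.items()}

# City scale: class shares over all buildings regardless of district.
city_shares = damage_shares([label for _, label in predictions])
```

A district with a high "Unsafe" share would be flagged for resource allocation, while the city-level shares give a quick estimate of the overall damage distribution.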

2.6. Dataset Uncertainties and Limitations

Several sources of uncertainty were identified during the dataset development process, which may have influenced the overall performance of the predictive models. The damage states used in this study were primarily determined through visual inspections and engineering judgment conducted in the field. This process inherently involves a degree of subjectivity, particularly when distinguishing between moderate and severe damage levels. For instance, in some cases, a building may be classified as severely damaged if any of its structural columns exhibit cracking or spalling, whereas others might categorize similar conditions as moderate damage. Such inconsistencies in evaluation criteria can lead to variability in the recorded data and may affect the model’s ability to generalize accurately.

In addition, in the Kahramanmaras region, where the dataset was collected to represent typical structural characteristics, several construction-related irregularities are prevalent. These include soft-story configurations and the pounding effect, where adjacent buildings are constructed without sufficient separation gaps. These irregularities can significantly influence structural performance during seismic events but are not captured within the dataset, thereby introducing additional uncertainty into the modeling process and partially explaining variations in predictive accuracy.

3. Case Study

This case study presents a novel approach to understanding earthquake impacts in the Kahramanmaras region following the February 2023 earthquakes. It uniquely integrates structural, geospatial, and seismic data to comprehensively characterize the building stock and map damage distribution across districts. By combining field-assessed building damage with scenario-based ground motion parameters, the study establishes an advanced, data-driven foundation for developing machine learning models capable of accurate post-earthquake damage prediction. This integrated framework captures the complex interactions between building characteristics, seismic demand, and local site conditions, providing a level of analytical precision and applicability that surpasses traditional assessment methods.

3.1. Distribution of Building Damages in Kahramanmaras

On 6 February 2023, southeastern Turkey experienced two devastating earthquakes, with magnitudes of 7.8 and 7.6, centered near the city of Kahramanmaras. These twin seismic events caused widespread destruction, with Kahramanmaras bearing some of the heaviest impacts due to its proximity to the epicenters. Satellite-based Synthetic Aperture Radar (SAR) change detection analysis indicated that approximately 17.37% of the city’s urban area was destroyed [25]. Within the city, the districts of Dulkadiroğlu and Onikişubat suffered the most severe damage. These areas, characterized by dense residential and commercial development, experienced the collapse or irreparable damage of thousands of buildings—many of which were constructed before the implementation of stricter seismic design codes [26]. While precise casualty figures at the city level are limited, the broader Kahramanmaras province reported over 12,600 fatalities, a substantial proportion of which likely occurred in the urban core [27].

From an engineering perspective, the earthquakes revealed several critical structural vulnerabilities:

Inadequate Seismic Design: Many collapsed buildings lacked fundamental seismic design elements, such as ductile detailing, proper column confinement, and energy-dissipating systems like shear walls [26].
Soft-Story Configurations: Numerous mid-rise structures featured soft-story designs, typically with open ground floors used for commercial or parking purposes. These levels proved especially weak under lateral loads, often initiating progressive, pancaking failures [26].
Soil Amplification: The city center rests on soft alluvial deposits, which amplified ground motion during the earthquakes. Deformation mapping revealed an average of 0.46 m of ground displacement, contributing to the widespread structural failures [25].
Construction Deficiencies: Many of the affected buildings were constructed before the 2000s and did not conform to modern seismic standards. Common issues included low-strength concrete, inadequate transverse reinforcement, and improper rebar splicing [26].
Resonance Effects: A notable number of the collapsed buildings were within the 5–8 story range—heights that coincided with the dominant frequency of the seismic waves. This resonance effect likely amplified lateral displacements, leading to stress demands that exceeded the structural capacity of these buildings [25].

The 2023 Kahramanmaras earthquakes underscored the critical need for resilient seismic design, particularly in regions of high seismicity. The events highlighted the consequences of outdated construction practices and underscored the importance of incorporating principles such as ductility, redundancy, and dynamic response considerations into both new and existing building stock.

3.2. Dataset

The dataset serves as the foundation for any machine learning model. This section provides an overview of the dataset utilized in the present study. The dataset was constructed by integrating multiple databases to ensure comprehensive coverage and data diversity. It primarily represents and describes the entire Kahramanmaras region, which was severely affected by the earthquakes that occurred on 6 February 2023.

3.2.1. Dataset Collection

The dataset was created by integrating several data sources with the main database, which was collected in the field after the earthquakes struck the southern region of Turkiye and which describes the buildings of the Kahramanmaras province and their damage. This database was combined with REDAS output, which provides scenario-based ground motion parameters for the earthquake, in order to enrich the spatial dataset.

3.2.2. Dataset Distribution and Visualizations

The study’s datasets comprise two categories, differentiated by structural system type. Both datasets were collected through field assessments conducted in the Kahramanmaras province after the two major seismic events of Mw 7.8 and Mw 7.6.

Masonry Buildings: The masonry dataset consists of approximately 9700 data points with the following damage distribution: 25% severe-damaged, 1% moderate-damaged, 32% low-damaged, and 42% non-damaged buildings. The data originates from a near-fault region with a rupture distance (Rrup) of approximately 35 km and elevations ranging from 455 m to 1764 m above sea level. Data were collected across approximately 100 districts within the region. The building damage distribution is illustrated in Figure 3.
Reinforced Concrete Structures: The reinforced concrete (RC) dataset comprises approximately 31,000 data points with the following damage distribution: 17% severe-damaged, 3% moderate-damaged, 37% low-damaged, and 43% non-damaged buildings. The data were collected from the same geographical region as the masonry dataset. Building ages ranged from newly constructed (1 year) to approximately 70 years old, with the number of stories reaching up to 15 floors. Data were collected from 107 districts. The building damage distribution is presented in Figure 4.

3.2.3. Dataset Features

The dataset employed in this study comprises multiple features that can be grouped into three main categories: structural characteristics, ground motion parameters, and site-specific conditions as shown in Table 2.

The structural characteristics include features describing the building’s physical attributes. Building age plays a critical role in determining seismic vulnerability, as older structures are generally more susceptible to damage due to material deterioration and outdated construction practices. Number of floors is another significant variable, as taller buildings tend to experience higher amplification effects at longer vibration periods and are more sensitive to variations in peak ground acceleration (PGA).
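The link between number of floors and vibration period can be illustrated with the widely used rule of thumb T ≈ 0.1 N seconds for regular moment-frame buildings with N floors. The sketch below is illustrative only: the rule, the function name, and the floor counts are assumptions introduced here and are not part of the study's feature set or models.

```python
def approx_fundamental_period(n_floors: int, seconds_per_floor: float = 0.1) -> float:
    """Rule-of-thumb fundamental period in seconds (assumed T ~ 0.1 * N)."""
    if n_floors < 1:
        raise ValueError("n_floors must be at least 1")
    return seconds_per_floor * n_floors


# Hypothetical floor counts spanning the range reported for the RC dataset.
for n in (2, 5, 8, 15):
    print(f"{n:>2} floors -> T ~ {approx_fundamental_period(n):.1f} s")
```

Under this approximation, low-rise buildings sit at short periods, where PGA is the more relevant demand measure, while taller buildings shift toward longer periods.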

The ground motion parameters characterize the seismic demand acting on each building. Peak Ground Acceleration (PGA) represents the maximum ground acceleration recorded during an earthquake and serves as a key indicator of shaking intensity, particularly influencing low-rise buildings with short natural periods. Peak Ground Velocity (PGV) denotes the maximum ground motion velocity at a specific location and provides complementary information regarding the energy content of seismic waves. Intensity offers a qualitative measure of shaking severity at a given site, derived from observed effects on people, structures, and the surrounding environment. Rupture distance (Rrup) represents the shortest distance between a site and the fault rupture, which strongly governs the amplitude and frequency characteristics of the seismic waves. These parameters were computed using the REDAS (1.2.3 Version) software through a scenario-based analysis procedure.

The site-specific conditions describe the local soil and topographic characteristics. Shear wave velocity (VS30), defined as the average shear wave velocity within the upper 30 m of soil, is a widely recognized indicator of site stiffness; higher VS30 values are typically associated with stiffer soils, whereas lower values indicate softer soils that tend to amplify seismic motions. Elevation provides insight into the topographic influence on ground motion, as variations in terrain can significantly affect seismic wave propagation and local amplification effects.

Additionally, the building footprint area was initially derived from OpenStreetMap data by matching the geographic coordinates of each building provided by the ministry. However, this feature was ultimately excluded from the final analysis for two main reasons. First, inaccuracies were observed in calculating the areas of buildings with irregular geometries or shared boundaries, which introduced inconsistencies in the dataset. Second, preliminary analyses indicated that including this feature did not yield a significant improvement in the model’s predictive accuracy.

Figure 5 shows the correlation heat map of the input features used in the analysis. Building age, number of floors, and elevation exhibit weak correlations with the seismic intensity measures, indicating limited redundancy among these parameters. In contrast, strong correlations are observed among PGA, PGV, and macroseismic intensity, which is consistent with their physical interdependence. VS30 demonstrates a negative correlation with ground motion parameters, highlighting the influence of site conditions on seismic response. Overall, the observed correlations are physically reasonable and support the suitability of the selected features for damage prediction.
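A correlation matrix of this kind can be computed in a few lines. The following is a minimal sketch assuming Pearson correlation (the coefficient used for Figure 5 is not stated here); the column names and the five toy values per feature are hypothetical and do not come from the study's dataset.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(columns):
    """Pairwise correlations as a nested dict: matrix[a][b]."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy values, chosen only to mimic the qualitative pattern described.
features = {
    "pga_g": [0.20, 0.35, 0.50, 0.65, 0.80],
    "pgv_cm_s": [18.0, 30.0, 47.0, 60.0, 78.0],
    "vs30_m_s": [760.0, 620.0, 480.0, 400.0, 300.0],
    "building_age": [12, 45, 8, 30, 22],
}

corr = correlation_matrix(features)
for name, row in corr.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```

In this toy example PGA and PGV are strongly positively correlated, VS30 is negatively correlated with both, and building age is nearly uncorrelated with the shaking measures.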

4. Results

This section presents the performance evaluation of the three machine learning models on the validation dataset. The evaluation was conducted at three distinct spatial scales: building-level, district-level, and city-level. Sections 4.1 and 4.2 address building-scale performance, for which predictive accuracy and macro F1-scores were computed and confusion matrices were analyzed to assess the classification capabilities of the models.

4.1. Model Accuracy and Performance

In the three-damage state classification, a moderate decline in performance was observed, with both accuracy and F1-scores averaging around 63% across all algorithms. For the four-damage state classification, a further reduction in predictive accuracy was noted. In the case of masonry buildings, the models achieved accuracies of approximately 46% with corresponding F1-scores of 42%. For reinforced concrete buildings, the accuracy exceeded 51%, with an average F1-score of around 48% as shown in Figure 6, Figure 7 and Figure 8.
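For readers unfamiliar with the two metrics, the sketch below computes accuracy and the macro-averaged F1-score from scratch. The eight labels are hypothetical and serve only to show why the macro F1-score can fall below accuracy when a minority class is predicted poorly; they are not results from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of buildings whose predicted class matches the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical observed and predicted traffic-light labels.
y_true = ["Safe", "Safe", "Safe", "Safe",
          "To be Checked", "To be Checked", "Unsafe", "Unsafe"]
y_pred = ["Safe", "Safe", "Safe", "To be Checked",
          "To be Checked", "Safe", "Unsafe", "To be Checked"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Because the macro average weights each class equally, weak performance on a small class such as moderate damage lowers the score more than it lowers accuracy.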

4.2. Building-Level Evaluation

Masonry buildings three Damage States: In the three damage states scenario, the confusion matrix (Figure 9) shows that misclassification of “Unsafe” buildings as “Safe” was limited to around 10%, while approximately 30% were incorrectly categorized as requiring “Further Assessment.” The “Safe” class was accurately predicted in over 75% of cases, particularly when using the Histogram-Based Gradient Boosting classifier.

Table 3 presents the precision and recall values for each damage state class. The results indicate balanced performance for the Unsafe and Safe classes, while slightly lower values for the To be Checked class reflect the inherent overlap between intermediate damage levels.
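The percentages quoted from the confusion matrices are row-normalized: each row gives the share of buildings of one observed class that was assigned to each predicted class, so the diagonal entry equals the recall of that class. A minimal sketch follows, using hypothetical labels chosen only to resemble the order of magnitude reported above.

```python
def confusion_matrix_percent(y_true, y_pred, labels):
    """Row-normalized confusion matrix: percent[observed][predicted]."""
    counts = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    percent = {}
    for t in labels:
        row_total = sum(counts[t].values())
        percent[t] = {
            p: (100.0 * counts[t][p] / row_total if row_total else 0.0)
            for p in labels
        }
    return percent


labels = ["Unsafe", "To be Checked", "Safe"]

# Hypothetical validation labels: 10 Unsafe, 4 Safe, 2 To be Checked buildings.
y_true = ["Unsafe"] * 10 + ["Safe"] * 4 + ["To be Checked"] * 2
y_pred = (["Unsafe"] * 6 + ["To be Checked"] * 3 + ["Safe"] * 1   # observed Unsafe
          + ["Safe"] * 3 + ["To be Checked"] * 1                   # observed Safe
          + ["To be Checked"] * 1 + ["Safe"] * 1)                  # observed To be Checked

pct = confusion_matrix_percent(y_true, y_pred, labels)
for observed in labels:
    print(observed, {k: round(v, 1) for k, v in pct[observed].items()})
```

Precision is obtained in the same way by normalizing over columns (predicted classes) instead of rows.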

Masonry buildings four Damage Classes: For the four-class damage state of masonry structures, the confusion matrix (Figure 10) indicates significant misclassification between adjacent damage states. For instance, approximately 25% of “Severe-Damaged” buildings were incorrectly predicted as “Low-Damaged.” Similarly, a considerable portion of “Non-Damaged” buildings was misclassified as “Low-Damaged.” However, direct confusion between the “Severe-Damaged” and “Non-Damaged” categories was relatively rare, occurring in less than 10% of cases.

Table 4 presents the model’s performance for each damage state class. The results show that Severe and No Damage classes are identified fairly well, while Moderate and Low Damage classes have lower performance, reflecting the difficulty in distinguishing intermediate damage levels.

Reinforced Concrete buildings three Damage Classes: In the three-class configuration for RC buildings, the confusion matrix (Figure 11) shows that fewer than 10% of “Unsafe” buildings were misclassified as “Safe,” while approximately 35% were classified as requiring “Further Assessment.” The model demonstrated strong predictive capability for the “Safe” category, correctly classifying over 75% of such cases, with the Histogram-Based Gradient Boosting model providing the most consistent results.

Table 5 presents the precision and recall for each damage state class. The results indicate that the Unsafe and Safe classes are identified fairly well, while the To be Checked class shows slightly lower performance, reflecting the challenge of distinguishing intermediate damage levels.

Reinforced Concrete buildings four Damage Classes: When extended to four damage categories, the model’s performance for RC buildings exhibited increased misclassification between adjacent damage levels. The confusion matrix (Figure 12) shows that around 30% of “Severe-Damaged” buildings were predicted as “Low-Damaged,” and approximately 20% of “Non-Damaged” buildings were also misclassified into this category. However, direct confusion between the “Severe-Damaged” and “Non-Damaged” classes was relatively limited, occurring in less than 10% of cases.

Table 6 presents the model’s performance for each damage state class. The results show that Severe and No Damage classes are identified relatively well, while Moderate and Low Damage classes have lower performance, highlighting the challenge of correctly classifying intermediate damage levels.

4.3. District-Scale Evaluation

The district-scale evaluation focuses on assessing the predictive performance of the machine learning model at an intermediate spatial resolution. Specifically, building-level damage predictions were aggregated for ten randomly selected districts out of approximately one hundred within the study area. The following analysis presents the results obtained using the Bagging Classifier.

Masonry buildings three Damage Classes: In the three-class damage classification scenario, the model generally performed well in approximating the actual number of buildings per damage class. However, the “Need Further Assessment” category was occasionally overestimated in certain districts (Figure 13).
Masonry buildings four Damage Classes: For the four-class damage classification of masonry buildings, the model showed satisfactory accuracy in predicting the count of buildings within the “Severe-Damaged” category. Nevertheless, some inconsistencies were observed in the “Low-Damaged” and “Non-Damaged” categories (Figure 14).
Reinforced Concrete buildings three Damage Classes: The three-class damage prediction for RC buildings revealed a higher degree of misclassification, resulting in significant overlap and confusion among damage states in multiple districts (Figure 15).
Reinforced Concrete buildings four Damage Classes: In the four-class scenario for RC structures, the model exhibited substantial challenges in correctly classifying damage levels, particularly for the “Low-Damaged” category (Figure 16).
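The aggregation step behind these district-scale comparisons can be sketched as follows. This is a minimal illustration, not the authors' custom code: the function names, district identifiers, and labels are hypothetical.

```python
from collections import Counter, defaultdict


def counts_by_district(district_ids, labels):
    """Number of buildings per damage class within each district."""
    out = defaultdict(Counter)
    for district, label in zip(district_ids, labels):
        out[district][label] += 1
    return out


def count_differences(observed, predicted, classes):
    """Predicted minus observed building counts per district and class."""
    return {
        d: {c: predicted[d][c] - observed[d][c] for c in classes}
        for d in observed
    }


classes = ["Safe", "To be Checked", "Unsafe"]

# Hypothetical building-level records for two districts.
districts = ["A", "A", "A", "B", "B", "B", "B"]
y_obs = ["Safe", "Unsafe", "To be Checked", "Safe", "Safe", "Unsafe", "Unsafe"]
y_hat = ["Safe", "To be Checked", "To be Checked", "Safe", "Safe", "Unsafe", "Safe"]

observed = counts_by_district(districts, y_obs)
predicted = counts_by_district(districts, y_hat)
diff = count_differences(observed, predicted, classes)
print(diff)
```

Using a single key for all buildings instead of a district identifier yields the city-scale counts. Note that opposite building-level errors can cancel when counts are summed, so aggregated counts may agree with observations more closely than building-level accuracy alone would suggest.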

4.4. City-Scale Evaluation

The city-scale evaluation extends the analysis of model performance by aggregating predictions across the entire validation dataset. This approach enables a broader assessment of the models’ ability to reproduce the overall damage distribution at the urban level, similar to the methodology employed in the district-scale analysis. The performance of each classification model was compared against the actual damage distribution, with a particular focus on the Bagging Classifier due to its consistently strong performance across previous evaluations.

Masonry buildings three Damage Classes: For the three-class classification of masonry buildings, the Bagging Classifier achieved strong predictive performance (Figure 17).
Masonry buildings four Damage Classes: The city-scale prediction for masonry buildings across four damage categories showed a high level of consistency with the actual damage data (Figure 18).
Reinforced Concrete buildings three Damage Classes: In the three-class scenario, the Bagging Classifier again demonstrated high accuracy (Figure 19).
Reinforced Concrete buildings four Damage Classes: For the four-class damage classification of RC buildings, the Bagging Classifier exhibited strong performance (Figure 20).

4.5. Feature Importance Analysis

Feature importance analysis was performed for the three damage state classification schemes for both masonry and reinforced concrete (RC) buildings.

Masonry Buildings: For the Random Forest model, the ground motion parameters emerged as the most significant features. In the Histogram-based model, rupture distance was identified as the dominant feature. For the Bagging classifier, PGA, building age, elevation, and rupture distance were found to have comparable influence (Figure 21).
Reinforced Concrete Buildings: For the Random Forest model, the ground motion parameters were again identified as the most influential features. In the Histogram-based model, rupture distance was the dominant parameter. For the Bagging classifier, PGV was identified as the most important feature (Figure 22).
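This section does not state which importance measure was used for each model. One model-agnostic option that can be applied uniformly to all three classifiers is permutation importance: a feature's values are shuffled and the resulting drop in accuracy is recorded. The sketch below implements the idea from scratch; the stand-in classifier `toy_predict`, its 0.4 g threshold, and the six data rows are hypothetical and are not taken from the study.

```python
import random


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)

    def acc(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = acc(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - acc(shuffled))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


def toy_predict(row):
    """Hypothetical stand-in model: uses PGA only and ignores building age."""
    pga_g, _age_years = row
    return "Unsafe" if pga_g > 0.4 else "Safe"


# Hypothetical rows: [PGA in g, building age in years].
X = [[0.2, 10], [0.3, 40], [0.5, 15], [0.6, 35], [0.25, 5], [0.7, 50]]
y = [toy_predict(row) for row in X]

baseline, importances = permutation_importance(toy_predict, X, y)
print("baseline accuracy:", baseline)
print("importance [PGA, age]:", importances)
```

Because the stand-in model ignores building age, shuffling that column leaves accuracy unchanged and its importance is zero, whereas shuffling PGA degrades the predictions.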

5. Discussion

The decline in model performance observed when transitioning from three to four damage states highlights the increasing complexity associated with finer damage classification. The overlap between adjacent damage categories and the presence of class imbalance make it more challenging for machine learning models to distinguish intermediate damage levels, which explains the reduced accuracy and F1-scores observed across all algorithms.

The results demonstrate that three-damage-state classification frameworks, for both masonry and reinforced concrete buildings, are more robust and reliable for rapid post-earthquake applications. In particular, the consistently low confusion between the “Safe” and “Unsafe” categories suggests that these frameworks are well suited for emergency response scenarios, where minimizing false-safe predictions is critical for public safety and resource allocation.

At larger spatial scales, the Bagging Classifier showed strong capability in reproducing observed damage distributions at both district and city levels. This consistency indicates that ensemble-based approaches are effective in aggregating building-level predictions into meaningful urban-scale damage patterns, supporting their use in large-scale seismic risk assessment and decision-support systems.

The feature importance analysis further confirms the dominant influence of ground motion parameters and proximity-related features, such as rupture distance, PGA, and PGV, across both structural typologies. The consistent importance of building age and elevation highlights the combined role of structural vulnerability and site effects in determining damage outcomes. These findings are physically meaningful and align with established earthquake engineering knowledge, reinforcing confidence in the proposed modeling framework.

6. Conclusions

This study introduced a robust machine learning framework for post-earthquake damage assessment by integrating structural, geospatial, and seismic data across multiple spatial scales—building, district, and city. Using Random Forest, Histogram Gradient Boosting, and Bagging Classifier algorithms, the framework demonstrated strong predictive performance for both reinforced concrete and masonry buildings affected by the 2023 Turkiye earthquakes. The use of two damage state classification schemes—three-level (traffic light) and four-level—allowed the model to address a range of assessment needs, from rapid emergency response to more detailed structural evaluations.

The calculations were performed using scikit-learn functions and custom Python code, with standard tools handling model training and evaluation, and custom code enabling damage aggregation at district and city levels. The framework is a fully implementable pipeline that can use new post-earthquake data for immediate predictions in similar regions. For areas with different building types, the models can be retrained manually using local data to ensure accurate results. As more earthquake data become available, the system can be updated to continuously improve prediction reliability and applicability.

Model performance varied with both the level of classification detail and spatial resolution. The three-damage-state “traffic light” scheme proved particularly valuable for practical decision-making by distinguishing clearly between safe, uncertain, and unsafe buildings. This configuration offered an effective balance between interpretability and the complexity of real-world damage scenarios, making it highly suitable for rapid assessment and emergency planning. In contrast, the four-damage-state system introduced greater ambiguity, particularly between adjacent damage levels, though it rarely resulted in direct misclassification between undamaged and severely damaged structures.

At the district level, the Bagging Classifier consistently identified localized damage trends, especially in masonry buildings. At the city level, models using the three-damage state classification effectively captured overall damage distributions, aligning closely with observed post-earthquake conditions. Among the tested algorithms, the Bagging Classifier emerged as the most stable across different spatial scales and classification schemes, owing to its ensemble nature and robustness against overfitting.

The proposed three damage states (traffic light) configuration is recommended at the building scale due to its minimal conflict between severely damaged and non-damaged classifications, ensuring more reliable and actionable results. The framework is most suitable for regions with similar construction practices and material properties, such as the Eastern Anatolian region of Turkiye. To effectively apply the model in other areas, it is essential to collect representative post-earthquake field data that accurately reflects local building characteristics and seismic behavior.

A key practical contribution of this framework lies in its potential to support fragility curve development and rapid post-earthquake assessment. While conventional fragility-based analyses require extensive data collection and processing time, the proposed machine learning model can provide immediate, spatially explicit damage estimations. Therefore, it should be viewed not as a replacement but as a complementary approach that enhances fragility-based methods by offering faster, map-based predictions of damage distribution across different spatial scales.

In its current form, the model can be effectively applied in regions with building characteristics similar to those of the Eastern Anatolian region, as the training dataset reflects typical structural behaviors and material properties observed in that area. Accordingly, the model can be directly adopted in other regions where construction practices, building typologies, and material properties are comparable. Otherwise, to ensure reliable performance in regions with different construction characteristics, new models should be developed using local post-earthquake field survey data that capture region-specific building and construction features and seismic performance. Nevertheless, the overall methodology remains adaptable to any region through the incorporation of representative local data. In practice, the model can be applied at different spatial scales to meet specific assessment objectives, with the three-damage state (traffic light) configuration recommended for rapid and reliable prioritization of inspection and repair. These scalable outputs enable efficient decision-making during the critical hours following an earthquake, bridging the gap between detailed fragility-based assessments and immediate operational needs.

To further strengthen the real-world applicability of the framework, future research should explore the integration of remote sensing data, such as Synthetic Aperture Radar (SAR) and optical imagery, which can provide large-scale, near-real-time information on surface deformation and structural damage. Coupling these data sources with the proposed model would enable more comprehensive and spatially continuous assessments in data-scarce regions. Additionally, greater attention should be given to uncertainties in the input data. Quantifying and reducing these uncertainties would significantly enhance the reliability and operational usability of the model’s predictions in real emergency contexts.

Overall, the findings highlight the potential of machine learning to deliver scalable, adaptable, and accurate post-earthquake damage assessments. When supported by high-quality field observations, remote sensing inputs, and reliable seismic data, such frameworks can significantly enhance urban resilience planning and real-time disaster response.

Author Contributions

All authors contributed to the study conception and design. Material preparation and analysis were performed by B.E.L. and Y.F. The first draft of the manuscript was written by B.E.L., and all authors commented on previous versions of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Scientific Research Projects Coordination Unit of Istanbul Technical University (grant number 45808).

Data Availability Statement

The data presented in this study are available upon request from the corresponding author due to legal reasons.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Kim, T.; Kwon, O.; Song, J. Response prediction of nonlinear hysteretic systems by deep neural networks. Neural Netw. 2018, 111, 1–10.
2. Zhou, Z.; Wang, M.; Han, M.; Yu, X.; Lu, D. Prediction of damage potential in mainshock–aftershock sequences using machine learning algorithms. Earthq. Eng. Eng. Vib. 2024, 23, 919–938.
3. Huang, H.; Burton, H.V. Classification of in-plane failure modes for reinforced concrete frames with infills using machine learning. J. Build. Eng. 2019, 25, 100767.
4. Lee, Y.; Kim, H.; Min, S.; Yoon, H. Structural damage detection using deep learning and FE model updating techniques. Sci. Rep. 2023, 13, 46141.
5. Qiu, J.; Jiang, L. Development of modular and reusable AI models for fast predicting fire behaviour of steel columns in structural systems. Eng. Struct. 2023, 297, 116994.
6. Ghimire, S.; Guéguen, P.; Giffard-Roisin, S.; Schorlemmer, D. Testing machine learning models for seismic damage prediction at a regional scale using building-damage dataset compiled after the 2015 Gorkha Nepal earthquake. Earthq. Spectra 2022, 38, 2970–2993.
7. Zhang, H.; Cheng, X.; Li, Y.; He, D.; Du, X. Rapid seismic damage state assessment of RC frames using machine learning methods. J. Build. Eng. 2022, 65, 105797.
8. Bhatta, S.; Kang, X.; Dang, J. Machine learning prediction models for ground motion parameters and seismic damage assessment of buildings at a regional scale. Resilient Cities Struct. 2024, 3, 84–102.
9. Won, J.; Shin, J. Machine learning-based approach for seismic damage prediction method of building structures considering soil–structure interaction. Sustainability 2021, 13, 4334.
10. Lu, X.; Xu, Y.; Tian, Y.; Cetiner, B.; Taciroglu, E. A deep learning approach to rapid regional post-event seismic damage assessment using time-frequency distributions of ground motions. Earthq. Eng. Struct. Dyn. 2021, 50, 1612–1627.
11. Braik, A.M.; Koliou, M. Automated building damage assessment and large-scale mapping by integrating satellite imagery, GIS, and deep learning. Comput.-Aided Civ. Infrastruct. Eng. 2024, 39, 2389–2404.
12. Hacıefendioğlu, K.; Başağa, H.B.; Kahya, V.; Özgan, K.; Altunışık, A.C. Automatic detection of collapsed buildings after the 6 February 2023 Türkiye earthquakes using post-disaster satellite images with deep learning-based semantic segmentation models. Buildings 2024, 14, 582.
13. Kaur, N.; Lee, C.; Mostafavi, A.; Mahdavi-Amiri, A. Large-scale building damage assessment using a novel hierarchical transformer architecture on satellite images. Comput.-Aided Civ. Infrastruct. Eng. 2023, 38, 2072–2091.
14. Wang, Y.; Chew, A.W.Z.; Zhang, L. Building damage detection from satellite images after natural disasters on extremely imbalanced datasets. Autom. Constr. 2022, 140, 104328.
15. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
16. Genuer, R.; Poggi, J.; Tuleau-Malot, C. Variable selection using random forests. Pattern Recognit. Lett. 2010, 31, 2225–2236.
17. Reis, I.; Baron, D.; Shahaf, S. Probabilistic random forest: A machine learning algorithm for noisy data sets. Astron. J. 2019, 157, 16.
18. Shi, Y.; Ke, G.; Chen, Z.; Zheng, S.; Liu, T. Quantized training of gradient boosting decision trees. arXiv 2022, arXiv:2207.09682.
19. Zhang, H.; Si, S.; Hsieh, C. GPU-acceleration for large-scale tree boosting. arXiv 2017, arXiv:1706.08359.
20. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
21. Papatheodorou, K.; Theodoulidis, N.; Klimis, N.; Zulfikar, C.; Vintila, D.; Cardanet, V.; Kirtas, E.; Toma-Danila, D.; Margaris, B.; Fahjan, Y.; et al. Rapid earthquake damage assessment and education to improve earthquake response efficiency and community resilience. Sustainability 2023, 15, 16603.
22. Haklay, M.; Weber, P. OpenStreetMap: User-generated street maps. IEEE Pervasive Comput. 2008, 7, 12–18.
23. Ramm, F.; Topf, J.; Chilton, S. OpenStreetMap: Using and Enhancing the Free Map of the World; UIT Cambridge Ltd.: Cambridge, UK, 2011.
24. Open-Elevation. Available online: https://open-elevation.com/ (accessed on 22 April 2025).
25. Wang, X.; Feng, G.; He, L.; An, Q.; Xiong, Z.; Lu, H.; Wang, W.; Li, N.; Zhao, Y.; Wang, Y.; et al. Evaluating urban building damage of 2023 Kahramanmaras, Turkey earthquake sequence using SAR change detection. Sensors 2023, 23, 6342.
26. Avğın, S.; Köse, M.M.; Özbek, A. Damage assessment of structural and geotechnical damages in Kahramanmaraş during the February 6, 2023 earthquakes. Eng. Sci. Technol. Int. J. 2024, 57, 101811.
27. Wertheimer, B.T. Turkey Earthquake: Death Toll Could Increase Eight-Fold, WHO Says. Available online: https://www.bbc.com/news/world-europe-64533851 (accessed on 6 February 2023).

Figure 1. Flowchart of the proposed framework showing dataset sources, data processing, machine learning model training and optimization, and multi-scale damage assessment.

Figure 2. Building footprint for Kahramanmaras from OSM.

Figure 3. Masonry buildings dataset damage geographical distribution.

Figure 4. RC buildings dataset damage geographical distribution map.

Figure 5. Correlation heat map showing the relationships among structural, ground motion, and site-related input features used in the machine learning models.

Figure 6. Accuracies and F1-Scores for Random Forest Classifier.

Figure 7. Accuracies and F1-Scores for Histogram Gradient Boosting Classifier.

Figure 8. Accuracies and F1-Scores for Bagging Classifier.

Figure 9. Confusion matrices for masonry buildings with three damage states, showing that Unsafe buildings are rarely predicted as Safe.

Figure 10. Confusion matrices for masonry buildings with four damage states, showing high confusion between adjacent damage states.

Figure 11. Confusion matrices for RC buildings with three damage states, showing that Unsafe buildings are rarely predicted as Safe.

Figure 12. Confusion matrices for RC buildings with four damage states, showing high confusion in prediction between adjacent damage states.

Figure 13. Building counts by damage state for masonry buildings with three damage states at the district scale, showing large discrepancies in damage counts.

Figure 14. Building counts by damage state for masonry buildings with four damage states at the district scale, showing high confusion between LD and MD counts.

Figure 15. Building counts by damage state for RC buildings with three damage states at the district scale, showing large discrepancies in predicted building counts.

Figure 16. Building counts by damage state for RC buildings with four damage states at the district scale, showing high misclassification in damage state prediction.

Figure 17. Building counts by damage state for masonry buildings with three damage states at the city scale, showing reasonable accuracy.

Figure 18. Building counts by damage state for masonry buildings with four damage states at the city scale, showing high accuracy for the Bagging Classifier.

Figure 19. Building counts by damage state for RC buildings with three damage states at the city scale, showing reasonable accuracy.

Figure 20. Building counts by damage state for RC buildings with four damage states at the city scale, showing reasonable accuracy for the Bagging Classifier.

Figure 21. Feature importance for masonry buildings by ML models.

Figure 22. Feature importance for RC buildings by ML models.

Table 2. Input features considered in the machine learning models, their corresponding categories, and data sources, including structural, seismic, and site-related parameters.

Category	Feature	Data Source
Structural Characteristics	Building Age	Ministry building inventory
Structural Characteristics	Number of Floors	Ministry building inventory
Ground Motion Parameters	Peak Ground Acceleration (PGA)	REDAS (scenario-based analysis)
Ground Motion Parameters	Peak Ground Velocity (PGV)	REDAS (scenario-based analysis)
Ground Motion Parameters	Intensity	REDAS (scenario-based analysis)
Ground Motion Parameters	Rupture Distance (Rrup)	REDAS (scenario-based analysis)
Site-Specific Conditions	VS30	REDAS (scenario-based analysis)
Site-Specific Conditions	Elevation	Open Elevation
Excluded Feature	Building Footprint Area	OpenStreetMap (OSM)

Table 3. Precision and recall values for masonry buildings with three damage states.

Damage State Class	Precision	Recall
Unsafe	0.69	0.65
To be Checked	0.54	0.51
Safe	0.69	0.71

Table 4. Precision and recall values for masonry buildings with four damage states.

Damage State Class	Precision	Recall
SD	0.47	0.64
MD	0.73	0.04
LD	0.31	0.45
ND	0.62	0.70

Table 5. Precision and recall values for RC buildings with three damage states.

Damage State Class	Precision	Recall
Unsafe	0.73	0.57
To be Checked	0.53	0.61
Safe	0.67	0.73

Table 6. Precision and recall values for RC buildings with four damage states.

Damage State Class	Precision	Recall
SD	0.56	0.60
MD	0.53	0.10
LD	0.42	0.59
ND	0.63	0.74
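
Tables 3 to 6 list precision and recall separately, which can make the per-class trade-off hard to read at a glance. As a minimal sketch, the snippet below combines each pair into a per-class F1 value using the standard harmonic-mean definition, F1 = 2PR / (P + R). The precision and recall numbers are transcribed from the tables above; the names `TABLES`, `f1_score`, and `f1_by_class` are illustrative and do not come from the article, and the resulting values are not necessarily the F1-scores reported in the figures, which may use a different averaging scheme.

```python
# Precision and recall pairs transcribed from Tables 3-6 of this article.
# The dictionary keys are illustrative labels, not names used by the authors.
TABLES = {
    "Masonry, 3 classes": {
        "Unsafe": (0.69, 0.65),
        "To be Checked": (0.54, 0.51),
        "Safe": (0.69, 0.71),
    },
    "Masonry, 4 classes": {
        "SD": (0.47, 0.64),
        "MD": (0.73, 0.04),
        "LD": (0.31, 0.45),
        "ND": (0.62, 0.70),
    },
    "RC, 3 classes": {
        "Unsafe": (0.73, 0.57),
        "To be Checked": (0.53, 0.61),
        "Safe": (0.67, 0.73),
    },
    "RC, 4 classes": {
        "SD": (0.56, 0.60),
        "MD": (0.53, 0.10),
        "LD": (0.42, 0.59),
        "ND": (0.63, 0.74),
    },
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_by_class(rows):
    """Map each damage state label to its F1 value."""
    return {label: f1_score(p, r) for label, (p, r) in rows.items()}


if __name__ == "__main__":
    for name, rows in TABLES.items():
        print(name)
        for label, f1 in f1_by_class(rows).items():
            print(f"  {label:<14} F1 = {f1:.2f}")
```

Under this definition, the MD class in both four-class tables gets a very low F1 despite moderate to high precision, because its recall is 0.04 (masonry) and 0.10 (RC). This matches the confusion between adjacent damage states noted in the captions of Figures 10 and 12.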

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

