Application of Machine Learning for Predicting Seismic Damage in Base-Isolated Reinforced Concrete Buildings

Algamati, Mohamed; Al-Sakkaf, Abobakr; Bagchi, Ashutosh

doi:10.3390/civileng7010004


by Mohamed Algamati ^1,2,*, Abobakr Al-Sakkaf ^1,3,* and Ashutosh Bagchi ^1

^1 Department of Building, Civil, and Environmental Engineering, Concordia University, Montreal, QC H3G 1M8, Canada

^2 Department of Civil Engineering, Azzaytuna University, Tarhuna P.O. Box 5338, Libya

^3 Department of Architecture & Environmental Planning, College of Engineering & Petroleum, Hadhramout University, Mukalla 50512, Yemen

^* Authors to whom correspondence should be addressed.

CivilEng 2026, 7(1), 4; https://doi.org/10.3390/civileng7010004

Submission received: 6 October 2025 / Revised: 14 November 2025 / Accepted: 4 January 2026 / Published: 9 January 2026

(This article belongs to the Section Structural and Earthquake Engineering)


Abstract

Base isolation is a widely used technique for the seismic upgrading of reinforced concrete buildings. Predicting damage levels based on relative inter-story drift plays an important role in designing optimal base isolation systems. However, existing codes usually rely on the acceleration spectrum to calculate the relative inter-story drift and do not estimate it accurately. To address this research gap, machine learning algorithms were trained and used to identify damage levels in retrofitted reinforced concrete buildings. More than 7000 data samples were derived using nonlinear time-history and incremental dynamic analysis. A total of 48 reinforced concrete buildings with different numbers of stories and bays were designed based on an older version of existing building codes, and base isolation systems were then designed for the seismic retrofit. The machine learning algorithms used were Decision Tree, Random Forest, Support Vector Machine, Extreme Gradient Boosting, and an Artificial Neural Network. Based on the results, four of these algorithms predicted the damage level with an accuracy of more than 85%, with the best performance reached by Extreme Gradient Boosting at an accuracy of 89%. Finally, the most important parameters affecting the damage levels of retrofitted reinforced concrete buildings were derived.

Keywords: base isolation; seismic retrofit; machine learning algorithms; inter-story drift; seismic damage prediction

1. Introduction

Earthquakes are devastating natural disasters that cause extensive damage to buildings and infrastructure. Therefore, engineers must design buildings to withstand seismic vibrations. Various technologies are available in the modern construction industry to reduce the financial and human losses in buildings due to ground shaking. Among these technologies, the installation of energy-dissipating devices is widely recognized as one of the most effective ways to lower the damage risk of buildings exposed to an earthquake [1,2,3,4,5,6,7,8,9]. Energy-dissipating devices offer many advantages over standard techniques for seismic upgrading. For instance, they can reduce both the total acceleration and the inter-story drift in buildings experiencing an earthquake, leading to the effective protection of both the structural and non-structural components [10].

Base isolation (BI) is a well-known energy-dissipating tool installed in many buildings. Buildings equipped with BI have shown acceptable performance during some recent strong seismic events. For instance, in the 2023 earthquake event in Turkey, good performance was recorded for hospital buildings with BI systems, which verified the seismic protection capabilities of the BI devices [11]. However, due to the complicated nature of seismic loads, using BI systems can present challenges. These challenges can be observed in both far-field and near-field seismic events [12,13]. Consequently, base isolation has been widely studied in recent years given its associated advantages and challenges. Perez-Rocha et al. studied the performance of BI for mid-rise buildings considering soil–structure interaction effects [14]. Li et al. evaluated the performance of a nonlinear hybrid base isolation system under ground motions [15]. Wang et al. studied methods for enhancing the performance of base-isolated structures by employing an optimized, configurable friction isolator-tuned inerter damper system [16]. Liu et al. examined the seismic response of three-dimensional base-isolated structures while considering the influence of rocking and P-Δ effects [17].

Confirming the effectiveness of the base isolation system requires a thorough evaluation of the structural response. Such evaluations may be conducted by means of experimental tests, numerical simulation and modeling, or advanced artificial intelligence (AI) methods. Among these methods, experimental testing is typically the most expensive, whereas numerical modeling and AI-based approaches are usually favored, except in cases where the structural system demonstrates highly complex or unpredictable behavior. Nonlinear time-history analysis is known as one of the best techniques for evaluating the behavior of structures during an earthquake. However, it can be a time-consuming process involving a large number of computations, and it requires creating a finite element model of the structure [18]. The advent of advanced AI and soft computing techniques such as machine learning (ML) has made it possible to predict the response of structures under seismic events. Therefore, in recent years, many scholars have focused on evaluating different structural systems using AI tools [19,20,21,22,23]. Nguyen employed ML for predicting the seismic drift responses of planar steel moment frames [24]. Mangalathu et al. used ML interpretability techniques for seismic performance assessment of infrastructure systems [25]. Hwang et al. estimated the economic seismic loss of steel moment-frame buildings using an ML algorithm [26]. Zhang et al. presented data-driven ML methods for seismic response prediction of a damped structure [27]. Yu et al. employed a hybrid approach based on AI to evaluate the condition of bridge decks [28]. Yu et al. also used an EBA-optimized, chemistry-informed, interpretable deep learning model to predict the compressive strength of fly ash/slag-based geopolymer concrete [29].

Numerous studies have investigated the assessment of building damage under seismic events [30,31,32,33,34]. However, despite the extensive research on predicting building damage caused by earthquakes, limited attention has been given to buildings equipped with BI systems. Moreover, among the many studies employing ML techniques to predict the seismic performance of buildings, very few have focused on evaluating the response of retrofitted structures, particularly those retrofitted with BI [19,20,21,22,23]. Therefore, new studies seem essential to explore the capability of ML models in assessing the seismic response of BI-retrofitted buildings. Accordingly, the present study focuses on applying ML methods to predict the seismic damage of BI-equipped buildings. It is evident that accurate prediction of damage in BI-retrofitted reinforced concrete buildings can support engineers in rapid post-earthquake assessment, optimization of isolation design, and cost-effective retrofit decision-making, thereby enhancing the resilience of the built environment [24,25,26,27].

2. Methodology

In the current study, a six-step systematic procedure was employed. This method has been used in several previously published papers. The steps were performed as follows:

Design the RC buildings;
Design base isolation for seismic retrofitting;
Conduct nonlinear time-history analysis to generate the datasets;
Train the machine learning models (RF, ANN, DT, SVM, and XGBoost 3.1.1);
Classify the damage using the trained machine learning models;
Check the accuracy of the machine learning models for damage prediction.

An overview of the research workflow is as follows: First, various reinforced concrete (RC) buildings, originally designed according to older versions of commonly used seismic codes in the United States, were considered. The code references were ASCE 7 and ACI 318, both published in 2002. Subsequently, BI systems were designed following the provisions of widely recognized standards, such as ASCE 7-22, and installed beneath the bottom columns of these buildings. Various seismic ground motions with differing intensities were then applied to the BI-retrofitted buildings. A portion of the resulting data, representing damage levels, was used to train ML models, while the remaining data was reserved for validating the accuracy of the ML algorithms. Finally, the damage levels were determined, and the results of the ML models were compared with those obtained from time-history analysis. Further details on each step are provided in the following sections.

The dataset produced for the machine learning models covers a wide range of configurations, including the number of stories, type of moment-resisting frame, span length of the RC buildings, and various ground motions with different intensities, distance from the fault, and frequency characteristics.

The machine learning models used in this study—DT, RF, XGBoost, SVM, and an ANN—were selected because they can effectively capture nonlinear and interaction effects among input parameters which are characteristic of the seismic response of base-isolated buildings [35,36,37].

The features used in the machine learning algorithms were classified into three main groups: building parameters (e.g., natural period), seismic ground motion parameters (e.g., peak ground acceleration), and soil property parameters (e.g., soil type). The feature selection process was guided by previous studies, correlation analysis, and feature importance ranking to remove redundant variables and retain only the most influential ones. This approach enhances both model accuracy and interpretability [22,23,24,25]. The dataset was divided into training (83%) and testing (17%) subsets using a fixed random seed to ensure reproducibility. Five-fold cross-validation was applied to the training data to obtain stable and unbiased estimates of model performance.
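As a rough illustration of the five-fold cross-validation mentioned above, the following sketch partitions sample indices into five folds. It assumes the samples have already been shuffled; the function name `k_fold_indices` and the sample count of 12 are hypothetical and are not taken from the study.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, validation_indices) pairs.
    Assumes the samples were shuffled beforehand.
    """
    # Distribute the remainder so fold sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, validation))
        start = stop
    return folds

# Toy example with 12 samples (hypothetical size, for illustration only).
folds = k_fold_indices(12, k=5)
```

Each sample appears in exactly one validation fold, so every training sample contributes once to the validation score that is averaged across folds.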

Evaluation metrics, including Precision, Recall, and F1-score, were used to assess the performance of the classification models. For a given damage class, Precision is the proportion of samples predicted as that class that truly belong to it, Recall is the proportion of samples truly in that class that the model detects, and the F1-score is the harmonic mean of the two. These metrics together provide a comprehensive and balanced evaluation of classification accuracy and reliability, especially for datasets with imbalanced class distributions.
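The class-wise metrics described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the label lists are made up for demonstration and do not come from the study's dataset.

```python
from collections import Counter

def classwise_metrics(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} for a multi-class problem."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but the true class was t
            fn[t] += 1  # a true member of class t was missed
    out = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        out[c] = (precision, recall, f1)
    return out

# Made-up damage levels (1-5), for illustration only.
y_true = [3, 3, 4, 4, 5, 2, 3, 4]
y_pred = [3, 4, 4, 4, 5, 2, 3, 3]
metrics = classwise_metrics(y_true, y_pred, labels=[1, 2, 3, 4, 5])
```

In this toy example, level 3 has two correct predictions, one false positive, and one false negative, so its Precision, Recall, and F1-score are all 2/3. Classes with no samples are assigned zeros here, which is a convention chosen for the sketch.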

The remainder of the study is organized as follows: Section 3 provides a brief overview of the ML algorithms employed. Section 4 describes the modeling of the buildings and the time-history analysis conducted using finite element (FE) software. Section 5 discusses the properties of the seismic events, while Section 6 presents the generation of the datasets used for training. Section 7 evaluates the accuracy of the ML algorithms, and the final section concludes the study.

3. Overview of Machine Learning Methods for Damage Prediction

In this section, a brief description is presented about the ML techniques employed in this study. Techniques used include an Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), and Extreme Gradient Boosting (XGBoost). Presenting a detailed discussion about these methods is beyond the scope of this study, and more information can be found in the cited articles [38]. These methods were selected because of their popularity and widespread application in the articles published about using ML for predicting the seismic behavior of civil engineering structures.

The working mechanism of ANNs was inspired by the structure of animal brains. The core structure of an ANN consists of interconnected nodes (or neurons), where each connection is assigned a specific weight [38]. During training, ANNs adjust the weights of connections between nodes to minimize the prediction error. Once trained, the network uses these learned weights to predict outputs based on input data. ANNs work by employing a feed-forward process to compute outputs and are typically trained using a backpropagation algorithm that updates the weights based on error minimization [19]. Three types of layers known as input, output, and hidden layer(s) exist in an ANN. Information is fed into the ANN through the input layer, while the output layer is responsible for presenting the results. The hidden layers, located between the output and input layers, learn internal representations that enable the ANN to perform complex mappings from inputs to outputs [38].

SVMs are supervised learning algorithms commonly used for data classification due to their high accuracy and low computational cost [39,40]. SVMs work by identifying the optimal hyperplane that maximizes the margin between data points of different classes. The data points that lie closest to this hyperplane are known as support vectors, and they play a key role in defining the decision boundary. SVMs can be classified into linear and nonlinear models, depending on the complexity of the data distribution [38].

A DT is a simple yet powerful tool for data classification. It classifies the data by building a tree-like model whose main elements are nodes, branches, and leaves. Internal nodes test the features, each branch represents an outcome of a test, and each leaf assigns a class label. A single DT can be prone to overfitting [38].

RF is an ensemble learning method that produces several decision trees in the training process and combines their outputs to improve accuracy and mitigate overfitting. Each tree is trained based on a random subset of the data with randomly selected features. The final classification of the data is determined through majority voting across all trees [19].
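The majority-voting step can be illustrated with a short sketch. The tie-breaking rule (choosing the smallest label) is an assumption made here for determinism; the source does not state how ties are resolved.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class predicted by most trees.

    Ties are broken by choosing the smallest label (an assumption).
    """
    counts = Counter(tree_predictions)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)

# Five hypothetical trees voting on the damage level of one sample.
prediction = majority_vote([3, 4, 3, 5, 3])
```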

XGBoost is an optimized implementation of gradient boosting designed for efficiency and performance. In this method, decision trees are built sequentially, and each new tree is fitted to correct the errors made by its predecessors. Like RF, XGBoost can be a good choice for reducing the overfitting issues of a single DT [19].

The selected ML algorithms were specifically chosen to address the nonlinear and multi-parameter nature of the seismic response in base-isolated reinforced concrete buildings. The behavior of these systems depends on a combination of the base-isolated building’s behavior and ground motion characteristics, resulting in complex nonlinear relationships that are difficult to represent analytically. Tree-based ensemble models such as RF and XGBoost were adopted for their robustness, ability to capture nonlinear feature interactions, and capability to rank variable importance—allowing identification of the most influential parameters governing structural response [41,42,43]. An ANN was included because it can approximate highly nonlinear mappings and learn hidden relationships between seismic input parameters and response outputs [44,45]. An SVM was used to benchmark performance on relatively small but high-quality datasets due to its strong generalization capability [46,47], while DT provides interpretability and transparency in identifying key decision boundaries [48].

4. Modeling and Design Details

The focus of this study is to evaluate the performance of retrofitted RC moment-resisting frame buildings using BI. The BI employed in this study is the lead rubber bearing (LRB). The selected buildings for retrofitting range from 5 to 8 floors with three-, four-, and five-bay configurations. In each building, columns are evenly spaced, with span lengths of 5 m, 6.25 m, and 8.33 m for buildings with five, four, and three bays, respectively. It is further assumed that the bay spacing is uniform in both the X and Y directions, and all modeled structures have a symmetrical rectangular plan.

For the purposes of this study, the RC buildings were originally designed without BI. It is assumed that the seismic load is calculated based on an older version of the ASCE 7 code, while the retrofitting was performed according to the latest version. Specifically, the older version refers to the ASCE 7 code published in 2002, while the new version refers to the ASCE 7 released in 2022. Although the general methodology of the two versions is similar, certain parameters, such as the base shear coefficient, have increased in some regions of the United States. For example, in Salt Lake City and Seattle, the base shear coefficient for site class C has increased by more than 25% between the older and newer versions of the ASCE 7 [49,50].

In the design of the RC buildings, the dead loads and live loads are as follows: The live load is assumed to be 200 kg per square meter, while the dead load is assumed to be 500 kg per square meter, excluding the RC structural members. The weight of the RC structural members, including beams and columns, is accounted for separately. Furthermore, the seismic mass of the building is considered as the sum of the dead loads and 20% of the live loads. According to ASCE 7-02, the base shear coefficient can then be calculated as follows [49]:

C_{S} = \frac{S_{DS}}{R/I}

(1)

where I represents the importance factor, which is taken as 1.0 for the designed buildings, R denotes the response modification factor (R-factor), and S_{DS} is the design spectral acceleration at short periods. The value of C_{S} should not exceed \frac{S_{D1}}{T (R/I)}, where T denotes the natural period of the building and S_{D1} represents the design spectral acceleration at a period of 1.0 s.
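As a worked illustration of Equation (1) and its upper bound, the following sketch evaluates the base shear coefficient for one set of inputs. The numerical values are hypothetical and are not taken from the designed buildings. Any lower bound that the code may impose is omitted because the text does not give one.

```python
def base_shear_coefficient(s_ds, s_d1, period, r_factor, importance=1.0):
    """Base shear coefficient C_S from Equation (1), capped by the
    upper bound S_D1 / (T * (R / I)) stated in the text."""
    cs = s_ds / (r_factor / importance)
    cs_max = s_d1 / (period * (r_factor / importance))
    return min(cs, cs_max)

# Hypothetical inputs: S_DS = 1.0 g, S_D1 = 0.6 g, R = 8, I = 1.0.
cs_long = base_shear_coefficient(s_ds=1.0, s_d1=0.6, period=0.8, r_factor=8)
cs_short = base_shear_coefficient(s_ds=1.0, s_d1=0.6, period=0.3, r_factor=8)
```

With these inputs, the upper bound governs at T = 0.8 s (0.6 / (0.8 × 8) = 0.09375), whereas Equation (1) governs at T = 0.3 s (1.0 / 8 = 0.125).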

The RC buildings modeled in this study are located on site classes B and C according to the ASCE 7 classification. The buildings have R-factors of 6 and 8, corresponding to intermediate (medium-ductility) and special (high-ductility) RC moment-resisting frames, respectively, and the detailing requirements associated with each ductility level are considered in the design process. In summary, the buildings differ in the number of stories (four options), number of bays (three options), site class (two options), and R-factor (two options), resulting in a total of 4 × 3 × 2 × 2 = 48 RC buildings modeled and analyzed.

The nonlinear time-history analysis in this study was conducted using the OpenSees framework through its Python interface, OpenSeesPy. Both OpenSees and OpenSeesPy provide frameworks for simulating the seismic response of structural systems; OpenSeesPy was selected because it operates within the Python 3.12 programming environment, which offers convenient libraries for ML applications. A brief overview of the modeling details in OpenSeesPy is provided below.

The structural models used for the time-history analysis were two-dimensional. Beams and columns were modeled using nonlinearBeamColumn elements. The materials selected for modeling the structural concrete and steel rebar were Concrete02 and Steel02, respectively, which allow consideration of the cyclic degradation in RC sections. The properties of the steel are listed in Table 1, where Fy and E denote the yield stress and Young's modulus of the steel rebar, whereas the parameters R0, R1, and R2 control the transition from the elastic to the plastic branch [51]. The properties of the concrete materials used in the numerical modeling are listed in Table 2. The parameters f'_{c} and ε_{c0} denote the compressive strength and the corresponding compressive strain, while f_{pcu} and ε_{cu} represent the ultimate compressive strength and ultimate strain of the concrete material. Moreover, λ is defined as the ratio between the unloading slope at ε_{cu} and the initial slope [51]. The mechanical parameters of the concrete are defined for both core and cover concrete, and, as seen in Table 2, the core concrete has a higher compressive strength and ultimate strain. Additionally, the stiffness and strength of the concrete under tension are ignored in the modeling [52].

It is noted that the type of BI selected for this study is the lead rubber bearing (LRB). According to the provisions of ASCE 7-22, several factors must be considered in the design of the LRBs. The design of LRBs as seismic isolation devices under ASCE 7-22 focuses on ensuring adequate energy dissipation, sufficient vertical and lateral load-carrying capacity, and stability under both seismic and gravity loads. Another critical aspect of LRB design under ASCE 7-22 is the evaluation of displacement demand under the Design Earthquake (DE) and Maximum Considered Earthquake (MCE) levels. The isolation system must accommodate the maximum expected displacement without instability or excessive strain [50].

The nonlinear hysteretic behavior of the BI system is modeled as a bilinear hysteretic material as illustrated in Figure 1. The LRB material is characterized by a post-yielding shear stiffness of 60 psi (0.4136 MPa) [53].
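To illustrate what a bilinear hysteretic force-displacement relationship looks like, the sketch below traces the force for a prescribed displacement history. It is a generic quasi-static idealization, not the OpenSees material used in the study, and all stiffness and strength values are hypothetical and unitless.

```python
def bilinear_force_history(displacements, k_initial, k_post, f_yield):
    """Force history of a bilinear hysteretic spring (kinematic hardening).

    Idealized as a linear spring (k_post) in parallel with an
    elastic-perfectly-plastic spring of stiffness (k_initial - k_post).
    """
    k_ep = k_initial - k_post
    # Force at which the elastic-perfectly-plastic branch starts to slip.
    q = f_yield * (1.0 - k_post / k_initial)
    u_plastic = 0.0
    forces = []
    for u in displacements:
        f_ep = k_ep * (u - u_plastic)
        if abs(f_ep) > q:
            # Plastic slip: cap the force and update the plastic offset.
            f_ep = q if f_ep > 0 else -q
            u_plastic = u - f_ep / k_ep
        forces.append(k_post * u + f_ep)
    return forces

# Hypothetical parameters: yield displacement = f_yield / k_initial = 1.0.
history = [0.5, 1.0, 2.0, 1.0, 0.0, -2.0]
forces = bilinear_force_history(history, k_initial=10.0, k_post=1.0, f_yield=10.0)
```

In this example, the spring yields at a force of 10, follows the post-yield slope up to 11 at a displacement of 2, unloads along the initial stiffness, and yields again in the opposite direction, which produces the closed loop that dissipates energy.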

In this study, the output of the ML models is the classification of damage levels following seismic excitations. Damage levels are defined based on maximum inter-story drift ratios (MISDRs), using a five-grade classification: null to slight, moderate, heavy, very heavy, and collapse [54]. The MISDR range corresponding to each damage level is shown in Table 3.

The RC buildings modeled in the current study represent typical mid-rise structures retrofitted by installing BIs. The dataset covers a wide range of building heights, fundamental periods, span lengths, site classes, and ductility levels to ensure representativeness of realistic design variations. The isolation system parameters, including yield strength, post-yield stiffness ratio, and effective isolation period, were selected within ranges reported in design guidelines and validated experimental studies, ensuring that the models reflect realistic isolation behavior observed in practice [50,54,55].

During the design stage, the variability of the BI system and the superstructure properties was explicitly incorporated to represent the expected range of physical and construction uncertainties. The mechanical properties of the isolators—such as effective stiffness, post-yield stiffness ratio, yield strength, and damping—were varied within the upper and lower bounds recommended by ASCE 7-22 to reflect manufacturing tolerances, aging effects, and testing uncertainties inherent in practical isolation systems. Similarly, a ±10% variation in the first fundamental period of the base-isolated building was introduced to account for plausible deviations in material stiffness, concrete strength, and reinforcement detailing that typically influence global dynamic characteristics. These variations ensure that the modeled structures represent realistic design scenarios while capturing the influence of parameter uncertainty on the predicted seismic performance of base-isolated reinforced concrete buildings.

The nonlinear analyses in this study were conducted using two-dimensional (planar) models of the base-isolated structure. While three-dimensional (3D) effects such as torsion and out-of-plane coupling can influence seismic response, the use of planar (2D) modeling remains widely accepted in design codes and research for plan-regular buildings with isolation systems. For example, recent reviews of base isolation systems emphasize that much of the parametric- and performance-based research continues to employ 2D idealizations to balance computational cost and input-motion sampling [56,57]. By adopting this simplified modeling framework, we were able to conduct an extensive parametric and machine learning investigation.

In this study, the soil–structure interaction (SSI) effects were neglected to focus on developing and validating the ML framework for predicting the seismic response of base-isolated buildings. The primary objective was to evaluate the capability of the proposed ML model to capture the nonlinear behavior of isolated superstructures under various ground motions, assuming a rigid support beneath the isolation system (i.e., no foundation flexibility). Many studies that use machine learning to predict the behavior of base-isolated buildings likewise ignore the effects of SSI [58].

It should be noted that during the application of ML, the mechanical properties of the base isolation system and the structural parameters were treated as deterministic values. This assumption is consistent with the purpose of the study, which focuses on developing and validating an ML framework for predicting the seismic response of base-isolated buildings rather than performing a full probabilistic risk analysis. The variability of isolator and structural properties had already been represented during the design stage through code-based parameter bounds and the ±10% variation in the fundamental period, ensuring that the dataset used for training adequately spans the realistic design domain. Using deterministic inputs allows the ML models to learn clear cause–effect relationships between seismic demand parameters and structural responses without introducing additional stochastic noise [59,60,61].

Table 3. The range of maximum inter-story drift ratios for each class of damage in the retrofitted RC buildings [62].

Damage Level	Null to Slight (Level 1)	Moderate (Level 2)	Heavy (Level 3)	Very Heavy (Level 4)	Collapse (Level 5)
Range of MISDR	0 to 0.25%	0.25% to 0.5%	0.5% to 1%	1% to 2%	>2.0%
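The mapping in Table 3 can be expressed as a small lookup function. This is a sketch only: the table's ranges share their end points, so assigning a value that falls exactly on a boundary to the lower damage level is an assumption made here.

```python
import bisect

# Upper MISDR bounds (in percent) of damage levels 1-4, from Table 3.
# Anything above the last bound is level 5 (collapse).
MISDR_BOUNDS_PERCENT = [0.25, 0.5, 1.0, 2.0]
LEVEL_NAMES = ["null to slight", "moderate", "heavy", "very heavy", "collapse"]

def damage_level(misdr_percent):
    """Return the damage level (1-5) for a maximum inter-story drift ratio."""
    if misdr_percent < 0:
        raise ValueError("MISDR must be non-negative")
    # bisect_left keeps a value lying exactly on a bound in the lower level
    # (an assumption; Table 3 does not specify boundary handling).
    return bisect.bisect_left(MISDR_BOUNDS_PERCENT, misdr_percent) + 1

level = damage_level(0.8)
name = LEVEL_NAMES[level - 1]
```

For instance, an MISDR of 0.8% falls in the 0.5% to 1% range and is therefore classified as level 3 (heavy).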

5. Seismic Excitation Properties

Eleven ground motion records were selected for the time-history analysis. The acceleration history of each record is presented in Appendix A, with all records scaled to a maximum peak ground acceleration (PGA) of 1.0 g. For the Kobe (1995) and Tabas (1978) earthquakes, two ground motions from two different stations are considered. The selected records comprise both near-field and far-field earthquakes, which can result in differing performance of base isolation systems. Other properties of the ground motions are listed in Table 4.

To enable a more detailed comparison, the displacement, acceleration, and velocity response spectra of the seismic events were analyzed for a structure with 5% of critical damping. In the derivation process of each spectrum, the PGA was set to be 1.0 g. The resulting spectral displacement, velocity, and acceleration are shown in Figure 2. As observed, near-field events such as Loma Prieta (1989), Kobe I (1995), and Chi-Chi (1999) exhibit higher spectral velocity peaks.

As seen in Figure 2, Table 4, and Appendix A, the selected ground motions represent a wide variety, which provides insight into the buildings' performance under seismic vibrations with different properties. The unscaled magnitude of the ground motions varies from 6.5 to 7.62, while the PGA of the original events covers a band from 0.09 g to 1.23 g. Furthermore, the selected ground motions include both far-fault and near-fault seismic events: El Centro, Northridge, Tabas I, Kobe II, Tabas II, San Fernando, and Friuli are far-field ground motions, while Kobe I, Chi-Chi, Duzce, and Loma Prieta are near-field ground motions. Moreover, according to Figure 2c, the main peak of the acceleration spectrum covers a range of periods (T) from about 0.1 s to 0.85 s, showing the wide range of dominant periods and frequency contents in the selected ground motions.

For dataset generation, each seismic event was scaled across a range of PGAs from 0.05 g to 0.70 g. Initially, each seismic event was scaled to a PGA of 0.05 g, and the corresponding damage level was recorded. Incremental dynamic analysis (IDA) was then conducted, increasing the PGA in 0.05 g increments until structural collapse occurred. This procedure resulted in the generation of more than 7000 data samples.
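The PGA scaling loop described above can be sketched as follows. The callback `scale_and_analyze` is a hypothetical stand-in for a nonlinear time-history run that returns the MISDR in percent, and the linear toy response is invented purely so that the example runs. The collapse threshold of 2.0% follows Table 3.

```python
def pga_levels(start=0.05, stop=0.70, step=0.05):
    """PGA levels in g, from start to stop inclusive."""
    n = round((stop - start) / step) + 1
    return [round(start + i * step, 2) for i in range(n)]

def run_ida(scale_and_analyze, levels, collapse_drift_percent=2.0):
    """Step through PGA levels until the MISDR exceeds the collapse limit.

    scale_and_analyze(pga) is a hypothetical callback returning MISDR (%).
    """
    results = []
    for pga in levels:
        misdr = scale_and_analyze(pga)
        results.append((pga, misdr))
        if misdr > collapse_drift_percent:
            break  # collapse reached; stop increasing the intensity
    return results

levels = pga_levels()
# Upper bound on the sample count if no run stopped before 0.70 g:
# 48 buildings x 11 ground motion records x 14 PGA levels.
max_samples = 48 * 11 * len(levels)

# Toy response (invented): MISDR grows linearly with PGA.
toy_results = run_ida(lambda pga: 4.0 * pga, levels)
```

The 14 PGA levels, 48 buildings, and 11 records give at most 7392 analyses, which is of the same order as the reported sample count. This is a rough consistency check rather than a reconstruction of the authors' exact procedure.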

6. Dataset Generation for Model Training

This study considers both the structural system and the seismic excitation properties used to train the ML models, enabling the prediction of damage levels in retrofitted RC buildings.

As seen in Table 5, 28 parameters were used as input features to train the ML algorithms. Among these 28 variables, 15 describe the ground motions, 12 describe the properties of the retrofitted buildings, such as height and first-mode vibration period, and 1 represents the site type of the buildings.

It is noted that the input parameters shown in Table 5 were not explicitly normalized or standardized prior to model training. As the majority of the employed algorithms (DT, RF, and XGBoost) are tree-based and inherently insensitive to feature scaling, normalization was not required for those models. However, the SVM and ANN can be sensitive to input magnitudes. In this study, since all input features were within a comparable numerical range (after checking for outliers and unit consistency), the models achieved stable convergence without additional normalization [63,64,65,66,67].

Prior to training the ML models, the data were divided into two groups: one for training the models and the other for evaluating their performance. It is well established that the accuracy of ML models is highly dependent on the quality and composition of the training data. In this study, approximately 83% of the dataset was used for training the ML models, while the remaining 17% was reserved for testing and validating model accuracy.

To ensure robust performance, the training data were carefully selected to represent the full range of structural and seismic conditions. Subsequently, the performance of each ML algorithm was evaluated using the randomly excluded (held-out) data. As previously described, the modeled buildings vary in the number of stories, number of bays, and type of RC moment-resisting frame. For each frame with a specified number of stories, the data corresponding to one bay configuration and one type of moment-resisting frame were randomly excluded from the training dataset. The bay configurations included three, four, and five bays, while the moment-resisting frame types were designated as Type 1 and Type 2, corresponding to special and intermediate frames, respectively.
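The exclusion rule described above can be sketched at the building level as follows. The sketch assumes that both site classes of a held-out configuration are excluded together, which the text does not state explicitly, and the random seed is arbitrary.

```python
import itertools
import random

STORIES = [5, 6, 7, 8]
BAYS = [3, 4, 5]
FRAME_TYPES = [1, 2]        # 1 = special, 2 = intermediate
SITE_CLASSES = ["B", "C"]

def split_buildings(seed=0):
    """Hold out one (bay, frame type) pair per story count."""
    rng = random.Random(seed)
    held_out = {s: (rng.choice(BAYS), rng.choice(FRAME_TYPES)) for s in STORIES}
    train, test = [], []
    for building in itertools.product(STORIES, BAYS, FRAME_TYPES, SITE_CLASSES):
        stories, bays, frame, _site = building
        # Assumption: both site classes of a held-out pair go to the test set.
        if held_out[stories] == (bays, frame):
            test.append(building)
        else:
            train.append(building)
    return train, test

train, test = split_buildings()
test_share = len(test) / (len(train) + len(test))
```

Under this assumption, 8 of the 48 buildings are held out, i.e., about 17% of the buildings, which is consistent with the approximate 83%/17% division reported above if each building contributes a similar number of samples.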

The damage data derived from the time-history analyses were subsequently examined. Figure 3 presents the percentage distribution of each damage level across the entire dataset. As observed, damage levels 3 and 4 are the most prevalent. It is important to note, based on Table 3, that the MISDR range corresponding to these damage levels is wider than that of the lower levels, which explains their higher frequency in the dataset. The share of data samples corresponding to damage level 5 was about 13%, and it would increase if more data were generated at higher PGAs. However, earthquakes with PGAs higher than 0.70 g are very rare, and in the time-history analysis conducted in this study, the PGA values do not exceed 0.70 g.

As shown in Figure 3, levels 3–4 have higher frequencies. In this study, no oversampling or under-sampling technique was applied, as the models were trained on the original dataset to preserve its natural distribution.

To mitigate potential bias, a stratified train–test split was adopted, ensuring that each damage level was proportionally represented in both subsets. Furthermore, class-wise performance metrics (Precision, Recall, and F1-score) were examined, which are presented in the next sections of this paper.

Although this approach maintains the physical realism of the data, it may introduce a minor bias toward the more frequent classes (levels 3–4). This reflects the fact that heavy to very heavy damage states are more prevalent in the simulated scenarios and thus represent realistic but slightly imbalanced conditions. The influence of this imbalance was carefully evaluated through class-wise metrics and confusion matrices to confirm that model performance remained robust across all damage levels.

It is noted that the dataset was generated using numerical simulations based on well-established material and isolation models. To evaluate the sensitivity of the results to modeling assumptions, a range of structural and isolation parameters, such as stiffness, natural period, and height, were used within realistic bounds recommended by design codes. This parametric variability allowed the dataset to capture diverse structural behaviors and reduced dependence on specific modeling choices. Nevertheless, some simplifications were adopted to maintain computational efficiency, such as two-dimensional representation of the buildings and deterministic material properties. These assumptions may influence absolute response values but are not expected to affect the overall trends learned by the machine learning models [68].

Figure 4 shows the histogram of the dataset produced after conducting the time-history analysis. The data are approximately uniformly distributed for MISDR values between 0.1% and 1.5%. Beyond MISDR = 1.5%, the density of data decreases, reflecting the lower probability of very strong seismic events. Consequently, the number of cases exhibiting severe damage declines in the dataset.

7. Results of Machine Learning Classification

To achieve high accuracy in the ML algorithms, the hyperparameters of each model need to be selected within an appropriate range. The hyperparameters selected for each ML algorithm are listed in Table 6. They were determined through a systematic random search aimed at minimizing the prediction error on the validation dataset: several combinations of parameters were tested, and the configuration yielding the best performance metrics was selected for model training. It is noted that the depths of the DT and RF models were varied between 10 and 25, and the maximum reduction in their accuracy was less than 1%, indicating that both models were relatively insensitive to tree depth within this range and showed little sign of overfitting. Furthermore, the lower accuracy of the DT model demonstrates its limited ability to capture the complex nonlinear behavior of structures under earthquake excitation.

The performance of ML algorithms for the prediction of damage levels in the retrofitted buildings using BI must be evaluated using appropriate criteria. Based on published papers in the field of ML application in the prediction of damage level of buildings exposed to earthquakes, three criteria known as Precision, F1-score, and Recall were selected to assess and compare the accuracy of each model. The mentioned criteria are defined in Table 7. Note that in Table 7, True Positives (TPs) refer to instances where the model correctly predicts a given class. False Positives (FPs) and False Negatives (FNs) represent cases where the model incorrectly assigns a class or fails to identify it, respectively.
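The class-wise criteria can be computed directly from a confusion matrix, as in the sketch below. The function names and the 3-class toy matrix are illustrative assumptions and do not reproduce the study's results; the formulas are the standard definitions in terms of TP, FP, and FN.

```python
def class_metrics(cm):
    """Per-class metrics; cm[i][j] = samples of true class i predicted as class j."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true class k, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({"precision": precision, "recall": recall,
                        "f1": f1, "support": sum(cm[k])})
    return results


def weighted_f1(results):
    """Average of class F1-scores weighted by the number of true samples per class."""
    total = sum(r["support"] for r in results)
    return sum(r["f1"] * r["support"] for r in results) / total


# Toy 3-class confusion matrix (illustrative values only)
cm = [[8, 2, 0],
      [1, 7, 2],
      [0, 1, 9]]
metrics = class_metrics(cm)
avg_f1 = weighted_f1(metrics)
```

The support-weighted average gives frequent classes more influence on the summary score, which is why class-wise values should be inspected alongside it when the dataset is imbalanced.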

Table 8 presents the F1-score of the employed ML algorithms for predicting different damage levels (DLs) in the retrofitted buildings. Note that the averages shown in the table denote the weighted averages. Overall, the average accuracy for all algorithms exceeds 84%. The maximum accuracy is achieved by the XGBoost method, while the minimum accuracy is observed in the performance of DT with a difference of 5% between the two algorithms. For most ML methods, the maximum accuracy is observed in DL1, while the minimum accuracy is seen for the results corresponding to DL5. It is noteworthy that approximately 13% of the dataset represents DL5, which may contribute to the reduced accuracy for this class. Across all ML models, the F1-scores exhibit a non-monotonic trend: accuracy decreases from DL1 to DL2, increases from DL2 to DL4, and declines again from DL4 to DL5. Figure 5 shows a bar plot of the accuracy metrics for the ML methods, and similar trends are observed for Precision and Recall.

In Table 8, although the imbalanced data distribution affects predictive accuracy for extreme cases, the models still achieve acceptable performance: the difference between the overall average F1-score and that of damage level 5 was less than 10%. This suggests that the minority class was reasonably well captured despite its smaller representation.

To quantitatively confirm model differences, paired t-tests were conducted on the 5-fold macro-F1-scores of all models. The results indicated that XGBoost achieved statistically higher performance than the Decision Tree, Random Forest, SVM, and ANN (p < 0.01 for all comparisons). These results confirm that the superiority of XGBoost, although visually subtle in Figure 5, is statistically significant. The corresponding t-statistics and p-values are reported in Table 9. The t-statistic measures the mean difference between two models relative to its variability across folds, while the p-value is the probability of observing a difference at least this large if the two models performed equally well. Values of p < 0.05 were considered statistically significant.
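The paired t-statistic for two models evaluated on the same folds can be computed as sketched below. The fold scores are hypothetical placeholders, not the values behind Table 9, and the comparison uses the tabulated two-sided 1% critical value for 4 degrees of freedom (4.604) in place of an exact p-value.

```python
import math


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for paired samples scores_a and scores_b."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance of differences
    return mean / math.sqrt(var / n), n - 1


# Hypothetical 5-fold macro-F1 scores for two models (placeholders only)
model_a = [0.89, 0.88, 0.90, 0.89, 0.88]
model_b = [0.85, 0.85, 0.86, 0.84, 0.85]

t_stat, dof = paired_t_statistic(model_a, model_b)

T_CRIT_1PCT_DOF4 = 4.604  # two-sided critical value of Student's t, alpha = 0.01, dof = 4
significant_at_1pct = abs(t_stat) > T_CRIT_1PCT_DOF4
```

Pairing by fold removes the fold-to-fold variation shared by both models, so even a small but consistent difference in F1-score can yield a large t-statistic.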

The prediction accuracy of the ML techniques was further evaluated using confusion matrices as depicted in Figure 6. The confusion matrices are presented for five algorithms: DT, RF, SVM, XGBoost, and an ANN. For instance, the algorithm XGBoost, which achieved the best performance among the ML techniques used in this paper, correctly labeled and predicted 103 out of 109 samples for DL1. In contrast, for the highest damage level (DL5), the number of correct predictions was lower with 91 correct predictions out of 117 samples. Again, the worst performance can be seen for the highest DL.
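As a worked example, the two XGBoost counts quoted above translate into class-wise Recall values as follows (variable names are illustrative):

```python
# Correct predictions and class sizes quoted in the text for XGBoost
correct_dl1, total_dl1 = 103, 109
correct_dl5, total_dl5 = 91, 117

recall_dl1 = correct_dl1 / total_dl1  # share of DL1 samples correctly identified
recall_dl5 = correct_dl5 / total_dl5  # share of DL5 samples correctly identified

print(f"DL1 recall: {recall_dl1:.1%}, DL5 recall: {recall_dl5:.1%}")
# prints "DL1 recall: 94.5%, DL5 recall: 77.8%"
```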

The superior performance of XGBoost and the ANN compared to DT and RF can be attributed to their ability to capture the strongly nonlinear relationships governing the seismic response of base-isolated RC structures. The behavior of such systems is highly dependent on the interaction between isolator stiffness, ground motion properties, and superstructure flexibility, which leads to complex patterns in drift and damage responses. XGBoost, through gradient boosting, in which each new tree is fitted to the residual errors of the current ensemble, effectively learns these intricate dependencies, while the ANN benefits from its multi-layer architecture to approximate nonlinear mappings between input features and damage levels. In contrast, DT and RF, although interpretable, rely on axis-aligned splits that limit their ability to fully represent the coupled influence of multiple isolation parameters, resulting in comparatively lower prediction accuracy [69].

Examination of the confusion matrices showed that most misclassifications occurred between adjacent damage levels, particularly between DL3 and DL4. This pattern is expected because structural damage evolves continuously rather than in discrete steps, and the boundary between moderate and severe damage is often ambiguous. Misclassification in the collapse category (DL5) stems mainly from the limited number of high-intensity samples and the inherent uncertainty of extreme responses in nonlinear dynamic analyses [70]. Despite these challenges, the models—especially XGBoost and the ANN—maintained consistent predictive accuracy across the range of seismic intensities represented in the dataset. This robustness suggests that the learned relationships between isolation properties, seismic demand, and damage level are stable and generalizable under varying loading conditions.

Finally, the most influential parameters in identifying the damage level of RC buildings retrofitted with a BI system are shown in Figure 7. The ML models were trained considering 28 factors, and Figure 7 shows the most important ones. The top three factors are the spectral velocity at the period of the first mode, S_V(T₁), peak ground velocity (PGV), and Housner intensity (HI). These three parameters are highly dependent on the velocity history of the seismic event and the natural period of the retrofitted buildings. Based on the ranking, the value of the acceleration spectrum at the period T₁ is much less important than the corresponding value of the velocity spectrum. The results shown in Figure 7 are consistent with the view that the seismic performance of base-isolated (BI) buildings is highly influenced by the distance of the seismic event from the fault. It is also important to note that near-field ground motions typically exhibit high maximum spectral velocity (MSV) due to the pulse-like characteristics present in their time-histories.

As seen in Table 8, most of the ML algorithms exhibit their highest accuracy for the lowest damage level (DL1), followed by a noticeable drop from DL1 to DL2. This trend indicates that predicting the onset of nonlinear response in base-isolated buildings is more challenging than identifying the undamaged state. As shown in Figure 7, the spectral velocity at the fundamental period, S_V(T₁), is the most influential parameter in determining the damage level. The observed decrease in accuracy can therefore be attributed to the shift in the fundamental period of the structure as the isolation system enters higher nonlinearity with increasing seismic demand. This period elongation alters S_V(T₁) and reduces the ability of models trained under lower-intensity conditions to accurately predict transitions from elastic to inelastic behavior.

8. Conclusions

This study focused on using ML models to predict the damage levels of retrofitted RC buildings. Initially, 48 RC buildings were designed based on previous versions of the ASCE standard. The buildings differed in the number of stories (5–8), number of bays (3–5), site classification (types B and C), and type of moment-resisting frame (special and intermediate).

After designing the buildings based on older versions of the ASCE and ACI codes, the BI system was designed for upgrading the buildings. The selected type of BI system was the lead rubber bearing (LRB), designed according to the provisions of ASCE/SEI 7-22. Afterward, a finite element model was produced for each building using the OpenSees software package. Seismic excitations with variable intensities were imposed on the modeled buildings, and nonlinear time-history and incremental dynamic analyses were conducted. The output of the time-history analysis was the damage level in each building, assigned to one of five levels defined by the maximum inter-story drift.

Afterward, five ML models, namely DT, RF, SVM, XGBoost, and an ANN, were employed to predict the damage level in the buildings. The performance of ML models depends on the data used for training them. Therefore, a randomly selected portion of the data was used to train the ML models, and the minimum accuracy obtained for each method was recorded. This procedure limited the influence of the particular training subset on the reported performance of the ML models.

The performance of ML algorithms was compared by defining appropriate parameters named F1-score, Precision, and Recall. Furthermore, confusion matrices were assembled to present a better comparison of the ML methods in each damage level. Based on the results, the ML algorithms were able to predict the damage level with an accuracy between 84% and 89%. The minimum accuracy was derived for DT, while the maximum accuracy was derived for XGBoost. For each ML algorithm, the best performance was observed for damage level 1, while the worst performance was observed for damage level 5. Note that damage level 1 corresponds to minimum inter-story drift and damage, while damage level 5 corresponds to the maximum damage level.

The most influential parameters affecting the damage levels of the RC buildings retrofitted with BI systems were identified. Based on the results, the velocity time-history of the ground motion emerged as the dominant factor in the damage level of the retrofitted RC buildings using BI. The top three parameters affecting the damage level were the spectral velocity corresponding to the period of the first mode of the retrofitted building, peak ground velocity (PGV), and Housner intensity. These parameters are strongly influenced by the distance of the seismic events from the faults, distinguishing between far-field and near-field events, which is also a very important parameter in the design process of the BI systems for retrofitting the RC buildings. Furthermore, the velocity time-history was found to have a greater impact on damage prediction than the acceleration time-history.

In this study, it was demonstrated that the proposed ML models can predict the damage levels of RC buildings retrofitted with base isolation with suitable accuracy. Such predictive capability provides engineers with a practical decision-support tool when evaluating retrofit options for existing RC buildings, particularly those designed according to older seismic codes with lower base-shear capacity. By using the ML-based predictions, engineers can better assess whether base isolation will achieve the desired performance objectives, estimate the potential reduction in damage under different seismic intensities, and compare the cost-effectiveness of BI retrofitting against alternative strengthening methods. This highlights the practical relevance of the proposed approach for real engineering decision-making in seismic retrofit projects.

Finally, despite the large number of time-history analyses conducted in this study, several limitations remain and are recommended for future investigation. First, this research employed two-dimensional models of retrofitted RC buildings. Although 2D modeling is widely accepted in many well-known seismic design codes, extending the analyses to fully three-dimensional models would provide a more comprehensive understanding of the structural behavior, especially for seismically isolated systems. In addition, the effects of soil–structure interaction were not considered in the present work and should be examined in future studies to better capture the realistic structural response. Moreover, both the properties of the retrofitted buildings and the mechanical characteristics of the base isolation systems can exhibit significant variability. Incorporating sensitivity analyses and explicitly accounting for these uncertainties within the machine learning framework would enhance the robustness and generalizability of the predictive models.

Author Contributions

M.A., A.A.-S., and A.B. developed the methodology and concept. A.B., A.A.-S., and A.B. aided in developing the methodology and concept. A.B., A.A.-S., and M.A. analyzed the findings and the results of the models and aided in writing the article. A.B. supervised this study. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. The Acceleration Time-History of the Ground Motions

Figure A1. Time-histories of the selected earthquake records: (a) El Centro (1940); (b) Kobe I (1995); (c) Loma Prieta (1989); (d) Northridge (1994); (e) Tabas I (1978); (f) Chi-Chi (1999); (g) Duzchi (1999); (h) Kobe II (1995); (i) Tabas II (1978); (j) San Fernando (1971); (k) Friuli (1976).

References

Rezazadeh, H.; Amini, F.; DoganiAghcheghloo, P.; Khansefid, A. Effects of geometrical nonlinearity on the performance of bidirectional tuned mass dampers. Earthq. Eng. Struct. Dyn. 2021, 50, 3220–3242.
Rezazadeh, H.; Jafarzadeh, V.; Atabakhsh, S.; Aghcheghloo, P.D. A novel passive nonlinear two-DOF internal resonance-based tuned mass damper. Mech. Syst. Signal Process. 2023, 204, 110788.
Amini, F.; Rezazadeh, H.; Afshar, M.A. Adaptive control of rotationally non-linear asymmetric structures under seismic loads. Struct. Eng. Mech. Int. J. 2018, 65, 721–730.
Losanno, D.; Ravichandran, N.; Parisi, F.; Calabrese, A.; Serino, G. Seismic performance of a Low-Cost base isolation system for unreinforced brick Masonry buildings in developing countries. Soil Dyn. Earthq. Eng. 2021, 141, 106501.
Md, Z.N.; Mohan, S.C.; Jyosyula, S.K.R. Development of low-cost base isolation technique using multi-criteria optimization and its application to masonry building. Soil Dyn. Earthq. Eng. 2023, 172, 108024.
Rahgozar, A.; Estekanchi, H.E.; Mirfarhadi, S.A. On optimal lead rubber base-isolation design for steel moment frames using value-based seismic design approach. Soil Dyn. Earthq. Eng. 2023, 164, 107520.
Rezazadeh, H.; Amini, F.; Afshar, M.A. Effect of inertia nonlinearity on dynamic response of an asymmetric building equipped with tuned mass dampers. Earthq. Eng. Eng. Vib. 2020, 19, 499–513.
Rezazadeh, H.; Amini, F. The effect of non-linear inertia on dynamic response of asymmetric multi-story buildings. Vibroengineering Procedia 2018, 17, 31–36.
Bargahi, R.; Rezazadeh, H.; Esmaeilabadi, R.; Atabakhsh, S.; Aminnejad, B. Exploiting the vertical vibration to enhance the performance of a tuned mass damper exposed to the horizontal excitation. Structures 2025, 81, 110317.
Tubaldi, E.; Barbato, M.; Dall’Asta, A. Performance-based seismic risk assessment for buildings equipped with linear and nonlinear viscous dampers. Eng. Struct. 2014, 78, 90–99.
Kandemir, E.C. The effect of frequency-filtered earthquakes for optimum base isolation parameters across varied soil conditions. Soil Dyn. Earthq. Eng. 2025, 193, 109330.
Li, Y.; Li, C.; Zhao, G.H. Seismic isolation design for simply-supported beam bridges based on the energy balance method under near-fault ground motions. Soil Dyn. Earthq. Eng. 2021, 145, 106730.
Zhang, Y.; Hu, Y.; Li, N.; Xie, L.; Wang, Z.; Liu, D. Isolation performance evaluation of base-isolated system with active nonlinear negative stiffness devices. Soil Dyn. Earthq. Eng. 2024, 179, 108565.
Pérez-Rocha, L.E.; Avilés-López, J.; Tena-Colunga, A. Base isolation for mid-rise buildings in presence of soil-structure interaction. Soil Dyn. Earthq. Eng. 2021, 151, 106980.
Li, C.; Chang, K.; Cao, L.; Huang, Y. Performance of a nonlinear hybrid base isolation system under the ground motions. Soil Dyn. Earthq. Eng. 2021, 143, 106589.
Wang, Y.; Ye, K.; Hu, L. Enhancing base-isolation structures by optimized configurable friction isolator-tuned inerter damper system. Eng. Struct. 2025, 329, 119818.
Liu, X.Y.; Xu, Z.D.; Huang, X.H.; Tao, Y.; Du, X.; Miao, Q. Seismic response analysis of three-dimensional base isolation structures considering rocking and P-Δ effects. Eng. Struct. 2025, 328, 119689.
Shabbir, K.; Noureldin, M.; Sim, S.H. Data-driven model for seismic assessment, design, and retrofit of structures using explainable artificial intelligence. Comput.-Aided Civ. Infrastruct. Eng. 2025, 40, 281–300.
Asgarkhani, N.; Kazemi, F.; Jakubczyk-Gałczyńska, A.; Mohebi, B.; Jankowski, R. Seismic response and performance prediction of steel buckling-restrained braced frames using machine-learning methods. Eng. Appl. Artif. Intell. 2024, 128, 107388.
Harirchian, E.; Hosseini, S.E.A.; Novelli, V.; Lahmer, T.; Rasulzade, S. Utilizing advanced machine learning approaches to assess the seismic fragility of non-engineered masonry structures. Results Eng. 2024, 21, 101750.
Hwang, S.H.; Mangalathu, S.; Shin, J.; Jeon, J.S. Machine learning-based approaches for seismic demand and collapse of ductile reinforced concrete building frames. J. Build. Eng. 2021, 34, 101905.
Shafighfard, T.; Kazemi, F.; Bagherzadeh, F.; Mieloszyk, M.; Yoo, D.Y. Chained machine learning model for predicting load capacity and ductility of steel fiber–reinforced concrete beams. Comput.-Aided Civ. Infrastruct. Eng. 2024, 39, 3573–3594.
Alhusban, M.; Alhusban, M.; Alkhawaldeh, A.A. The Efficiency of Using Machine Learning Techniques in Fiber-Reinforced-Polymer Applications in Structural Engineering. Sustainability 2024, 16, 11.
Nguyen, H.D.; Dao, N.D.; Shin, M. Prediction of seismic drift responses of planar steel moment frames using artificial neural network and extreme gradient boosting. Eng. Struct. 2021, 242, 112518.
Mangalathu, S.; Karthikeyan, K.; Feng, D.C.; Jeon, J.S. Machine-learning interpretability techniques for seismic performance assessment of infrastructure systems. Eng. Struct. 2022, 250, 112883.
Hwang, S.H.; Mangalathu, S.; Shin, J.; Jeon, J.S. Estimation of economic seismic loss of steel moment-frame buildings using a machine learning algorithm. Eng. Struct. 2022, 254, 113877.
Zhang, T.; Xu, W.; Wang, S.; Du, D.; Tang, J. Seismic response prediction of a damped structure based on data-driven machine learning methods. Eng. Struct. 2024, 301, 117264.
Yu, Y.; Rashidi, M.; Dorafshan, S.; Samali, B.; Farsangi, E.N.; Yi, S.; Ding, Z. Ground penetrating radar-based automated defect identification of bridge decks: A hybrid approach. J. Civ. Struct. Health Monit. 2025, 15, 521–543.
Yu, Y.; Al-Damad, I.M.A.; Foster, S.; Nezhad, A.A.; Hajimohammadi, A. Compressive strength prediction of fly ash/slag-based geopolymer concrete using EBA-optimised chemistry-informed interpretable deep learning model. Dev. Built Environ. 2025, 23, 100736.
Bazzurro, P.; Cornell, C.A.; Menun, C.; Motahari, M. Guidelines for seismic assessment of damaged buildings. In Proceedings of the World Conference on Earthquake Engineering, Vancouver, BC, Canada, 1–6 August 2004; Volume 1708.
Feese, C.; Li, Y.; Bulleit, W.M. Assessment of seismic damage of buildings and related environmental impacts. J. Perform. Constr. Facil. 2015, 29, 04014106.
Placidi, L.; Di Marzo, M.; Mannarino, M.; Tomassi, A. The Lekszycki method for damage detection in structures with viscous damping. Math. Mech. Complex Syst. 2025, 13, 501–517.
Li, Q.; Ellingwood, B.R. Performance evaluation and damage assessment of steel frame buildings under main shock–aftershock earthquake sequences. Earthq. Eng. Struct. Dyn. 2007, 36, 405–427.
Dolce, M.; Goretti, A. Building damage assessment after the 2009 Abruzzi earthquake. Bull. Earthq. Eng. 2015, 13, 2241–2264.
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
Chen, J.Y.; Feng, Y.W.; Teng, D.; Lu, C. Support vector machines-based pre-calculation error for structural reliability analysis. Eng. Comput. 2024, 40, 477–491.
Suryanita, R. The Application of Artificial Neural Networks in Predicting Structural Response (Story Drift) of Multi-Story RC Building under Earthquake Load. KnE Eng. 2016.
Zhang, B.; Yu, Y.; Yi, S.; Ding, Z.; Yousefi, A.M.; Li, J.; Lyu, X. Machine learning methods for compression capacity prediction and sensitivity analysis of concrete-filled steel tubular columns: State-of-the-art review. Structures 2025, 72, 108259.
Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Their Appl. 1998, 13, 18–28.
Pisner, D.A.; Schnyer, D.M. Support vector machine. In Machine Learning; Elsevier: Amsterdam, The Netherlands, 2020; pp. 101–121.
Algamati, M.; Al-Sakkaf, A.; Bagchi, A. Energy Dissipation Technologies in Seismic Retrofitting: A Review. CivilEng 2025, 6, 23.
Nguyen, H.D.; Pham, T.M.; Thai, H.-T. Machine learning-based prediction for maximum lateral displacements of seismic isolation systems subjected to earthquakes. Structures 2022, 38, 1974–1989.
Algamati, M.; Al-Sakkaf, A.; Mohammed Abdelkader, E.; Bagchi, A. Studying and Analyzing the Seismic Performance of Concrete Moment-Resisting Frame Buildings. CivilEng 2023, 4, 34–54.
Chowdhury, S.R.; Das, S. Predicting seismic response of buildings using artificial neural networks. J. Build. Eng. 2021, 44, 102643.
Altheeb, A.; Hussain, R.R.; Dindarloo, S.R. Artificial neural network model for predicting nonlinear seismic behavior of reinforced concrete buildings. Appl. Sci. 2022, 12, 3827.
Tzionas, P.; Kougioumtzoglou, I.A.; De Angelis, M. Support vector machines for earthquake-induced damage classification in base-isolated buildings. Soil Dyn. Earthq. Eng. 2020, 139, 106400.
Kiani, B.; Mahdi, T.; Kamgar, R. Support vector machine-based damage prediction of reinforced concrete structures under seismic loads. Adv. Eng. Softw. 2023, 178, 103322.
Chakraborty, S.; Chowdhury, R. Data-driven modeling of structural responses using decision trees and random forests. Eng. Struct. 2020, 209, 110276.
ASCE/SEI 7-02; Minimum Design Loads for Buildings and Other Structures. American Society of Civil Engineers (ASCE): Reston, VA, USA, 2002.
ASCE/SEI 7-22; Minimum Design Loads and Associated Criteria for Buildings and Other Structures. American Society of Civil Engineers (ASCE): Reston, VA, USA, 2022.
Mazzoni, S.; McKenna, F.; Scott, M.H.; Fenves, G.L. OpenSees Command Language Manual. Pac. Earthq. Eng. Res. Cent. 2006, 264, 137–158.
Aloisio, A.; Pelliciari, M.; Sirotti, S.; Boggian, F.; Tomasi, R. Optimization of the structural coupling between RC frames, CLT shear walls and asymmetric friction connections. Bull. Earthq. Eng. 2022, 20, 3775–3800.
Constantinou, M.C.; Kalpakidis, I.V.; Filiatrault, A.; Lay, R.E. LRFD-Based Analysis and Design Procedures for Bridge Bearings and Seismic Isolators; MCEER: Buffalo, NY, USA, 2011; pp. 11–0004.
Kelly, J.M. Earthquake-Resistant Design with Rubber; Springer: London, UK, 1993; Volume 7.
Nagarajaiah, S.; Ferrell, K. Stability of Elastomeric Isolation Bearings: Experimental Study. J. Struct. Eng. 1999, 125, 946–954.
Sheikh, H.; Ruparathna, R.; Van Engelen, N.C. Investigation of seismic fragility curves of unbonded FREIs: Adaptive characteristics and modeling sensitivity. Earthq. Eng. Struct. Dyn. 2024, 53, 1826–1840.
Bermany, T.H.; Osman, S.A.; Yatim, M.Y.M. A state-of-the-art analysis of base isolation systems and future directions for developing a novel multi-directional smart-hybrid isolation system integrated with earthquake early warning system for building structures. Results Eng. 2025, 25, 104501.
Nguyen, H.D.; Dao, N.D.; Shin, M. Machine learning-based prediction for maximum displacement of seismic isolation systems. J. Build. Eng. 2022, 51, 104251.
Lu, Y.; Li, X.; Chen, J. Seismic response prediction of base-isolated buildings using machine learning techniques. Eng. Struct. 2020, 222, 111144.
Zhao, B.; Liu, H.; Xu, C. Machine learning–based prediction models for seismic response of base-isolated structures. Eng. Struct. 2021, 242, 112608.
Zhu, Z.; Li, X.; Lu, Y. Machine learning-based seismic fragility analysis of base-isolated structures. Structures 2025, 80, 1468–1482.
Bhatta, S.; Dang, J. Seismic damage prediction of RC buildings using machine learning. Earthq. Eng. Struct. Dyn. 2023, 52, 3504–3527.
Géron, A. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd ed.; O’Reilly Media: Sebastopol, CA, USA, 2022.
Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: Berlin/Heidelberg, Germany, 2013.
Brownlee, J. Better Deep Learning: Train Faster, Reduce Overfitting, and Make Better Prediction; Machine Learning Mastery: Vermont, VT, USA, 2019.
Honarjoo, A.; Darvishan, E.; Rezazadeh, H.; Kosarieh, A.H. Damage detection and localization of structural cracks based on dynamic attention based transformer. Int. J. Build. Pathol. Adapt. 2024.
Honarjoo, A.; Darvishan, E.; Rezazadeh, H.; Kosarieh, A.H. SigBERT: Vibration-based steel frame structural damage detection through fine-tuning BERT. Int. J. Struct. Integr. 2024, 15, 851–872.
Liu, Z.; Wu, D.; Liu, Y.; Han, Z.; Lun, L.; Gao, J.; Jin, G.; Cao, G. Accuracy analyses and model comparison of machine learning adopted in building energy consumption prediction. Energy Explor. Exploit. 2019, 37, 1426–1451.
Kazemi, F.; Asgarkhani, N.; Shafighfard, T.; Jankowski, R. Machine-Learning Methods for Estimating Performance of Structural Concrete Members Reinforced with Fiber-Reinforced Polymers. Arch. Comput. Methods Eng. 2025, 32, 571–603.
Kashani, A.H.; Ghalehnovi, M.; Etemadfard, H. ML modelling of ultimate and relative bond strength for corroded reinforced concrete—an explainable machine-learning approach. Sci. Rep. 2025, 15, 09532.

Figure 1. Bilinear hysteretic material behavior used for the LRB in the nonlinear time-history analysis [53].

Figure 2. Response spectra of the selected seismic events for a single-degree-of-freedom system with 5% damping ratio: (a) spectral displacement; (b) spectral velocity; (c) spectral acceleration.

Figure 3. The percentage of each damage level in the produced dataset using time-history analysis of the modeled buildings.

Figure 4. Histogram of the datasets produced after time-history analysis, showing the number of data points in each MISDR range.

Figure 5. Comparison of Precision, Recall, and F1-scores for the ML algorithms: (a) DT, (b) RF, (c) SVM, (d) XGBoost, (e) ANN.

Figure 6. Confusion matrices of the ML algorithms for different seismic damage levels: (a) DT, (b) RF, (c) SVM, (d) XGBoost, (e) ANN.

Figure 7. Feature importance ranking showing the most influential parameters in identifying the damage level of BI-retrofitted buildings.

Table 1. Properties of the modeled steel materials for time-history analysis in OpenSeesPy.

Steel02
F_y (MPa)	E (MPa)	Strain Hardening Ratio	R0	R1	R2
400	2 × 10⁵	0.01	15	0.925	0.15

Table 2. Properties of the modeled concrete materials for time-history analysis in OpenSeesPy, including core and cover fibers.

	Concrete02
	$f'_c$ (MPa)	$\varepsilon_{c0}$	$f_{pcu}$ (MPa)	$\varepsilon_{cu}$	$\lambda$
Core fibers	33	0.002	0.0035	0.1	0.1
Cover fibers	30	0.002	0.0030	0.1	0.1

Table 4. Properties of the historical ground motions used in the current study.

Event Name	Unscaled Magnitude	Station	Original PGA (g)
El Centro (1940)	7.0	Pecknold version	0.32
Kobe I (1995)	6.9	KJMA	0.82
Loma Prieta (1989)	6.93	Corralitos	0.48
Northridge (1994)	6.69	Mulhol	0.62
Tabas I (1978)	7.35	Tabas	0.85
Chi-Chi (1999)	7.62	CHY080	0.97
Duzce (1999)	7.14	Bolu	0.74
Kobe II (1995)	6.9	Abeno	0.22
Tabas II (1978)	7.35	Bajestan	0.09
San Fernando (1971)	6.61	Pacoima Dam	1.23
Friuli (1976)	6.5	Barcis	0.29

Table 5. Parameters considered for training the dataset in the ML algorithms.

Structural Parameters	Seismic Excitation Parameters (Name)	Seismic Excitation Parameters (Definition)
Period of first natural mode without BI ($T_{1,noBI}$)	Effective peak velocity (EPV)	$\frac{1}{2.5} S_{V}(T = 1.0\ \mathrm{s})$; $S_{V}$ is the 5% damped spectral velocity
Period of second natural mode without BI	Arias intensity of earthquake (AIOE)	$\frac{π}{2g} \int_{0}^{T_{end}} a^{2}(t)\,dt$; $a(t)$ denotes the ground acceleration at time $t$
Period of first natural mode with BI ($T_{1}$)	Housner intensity (HI)	$\int_{0.1}^{2.5} PS_{V}(T)\,dT$; $PS_{V}$ denotes the pseudo-velocity spectrum for the 5% damped system
Period of second natural mode with BI	Modified cumulative absolute velocity (MCAV)	$\int_{0}^{T_{end}} |a(t)|\,dt$
Type of moment-resisting frame (special or intermediate)	Peak ground acceleration (PGA)	$\max |a(t)|$
Height of the building	Peak ground velocity (PGV)	$\max |v(t)|$; $v(t)$ denotes the ground velocity at time $t$
Number of bays	Peak ground displacement (PGD)	$\max |d(t)|$; $d(t)$ denotes the ground displacement at time $t$
Ratio of bays to the story height	Spectral acceleration at $T_{1}$, $S_{A}(T = T_{1})$	$T_{1}$ denotes the natural period of the retrofitted building in the first natural mode
Ratio of BI thickness to the height of the story	Spectral velocity at $T_{1}$, $S_{V}(T = T_{1})$
Ratio of BI stiffness after yielding to BI stiffness before yielding	Maximum spectral acceleration (MSA)
Ratio of lateral yielding displacement to the height of BI	Maximum spectral velocity (MSV)
Number of stories	Period corresponding to maximum value of acceleration spectrum response
Site type	Period corresponding to maximum value of velocity spectrum response
-----	Ratio of maximum spectral acceleration to spectral acceleration of the first mode, $MSA / S_{A}(T = T_{1})$
-----	Ratio of maximum spectral velocity to spectral velocity of the first mode, $MSV / S_{V}(T = T_{1})$
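
As an illustration of the time-domain definitions in Table 5, the following minimal sketch computes PGA, the Arias intensity, and the integral of |a(t)| listed for MCAV from a uniformly sampled acceleration record. The function names, the trapezoidal integration, the SI units, and the sine-wave record are assumptions made for this example; they are not taken from the study's own processing scripts.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed SI units)


def trapezoid(values, dt):
    """Integrate uniformly sampled values with the trapezoidal rule."""
    if len(values) < 2:
        return 0.0
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def intensity_measures(accel, dt):
    """Return PGA, Arias intensity, and the integral of |a(t)| for one record.

    accel: ground acceleration samples in m/s^2, sampled every dt seconds.
    """
    pga = max(abs(a) for a in accel)
    arias = math.pi / (2.0 * G) * trapezoid([a * a for a in accel], dt)
    mcav = trapezoid([abs(a) for a in accel], dt)
    return {"PGA": pga, "AIOE": arias, "MCAV": mcav}


# Hypothetical record (not one of the ground motions in Table 4):
# a 1 Hz sine wave with 2 m/s^2 amplitude and 10 s duration.
dt = 0.01
record = [2.0 * math.sin(2.0 * math.pi * i * dt) for i in range(1001)]
measures = intensity_measures(record, dt)
```

The spectral quantities in the table (EPV, HI, MSA, MSV) would additionally require a response-spectrum solver, which is outside the scope of this sketch.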

Table 6. Parameters selected for training each ML algorithm.

DT	RF	SVM	ANN	XGBoost
Criterion: Entropy; Depth: 15	Criterion: Entropy; Estimators: 200; Depth: 15	Kernel: RBF; Gamma: scale; C: 1	Optimizer: Adam; Input: ReLU; Batch size: 32; Epochs: 200	Maximum depth: 6; Number of trees: 300; Multiclass logarithmic loss: mlogloss
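
For readers who want to reproduce a comparable setup, the settings in Table 6 can be written as a plain configuration mapping. The keyword names below follow the conventions of scikit-learn, XGBoost, and Keras; that choice is an assumption for illustration, since this section does not state which libraries were used to train the models.

```python
# Table 6 settings as keyword-style dictionaries.
# Key names are assumed (scikit-learn / XGBoost / Keras conventions);
# the values are the ones listed in Table 6.
HYPERPARAMS = {
    "DT": {"criterion": "entropy", "max_depth": 15},
    "RF": {"criterion": "entropy", "n_estimators": 200, "max_depth": 15},
    "SVM": {"kernel": "rbf", "gamma": "scale", "C": 1.0},
    "ANN": {
        "optimizer": "adam",
        "activation": "relu",
        "batch_size": 32,
        "epochs": 200,
    },
    "XGBoost": {
        "max_depth": 6,
        "n_estimators": 300,
        "eval_metric": "mlogloss",
    },
}


def describe(model_name):
    """Return a one-line summary of the settings for a model."""
    params = HYPERPARAMS[model_name]
    return model_name + ": " + ", ".join(
        f"{key}={value}" for key, value in params.items()
    )


summary = describe("RF")
```

Under the stated assumption, each dictionary could be unpacked into the matching estimator constructor, for example `RandomForestClassifier(**HYPERPARAMS["RF"])`.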

Table 7. Criteria considered for evaluating and comparing the accuracy of ML algorithms.

	Precision (P)	F1-Score	Recall (R)
Definition	$P = \frac{TP}{TP + FP}$	$F_{1} = \frac{2 P \times R}{P + R}$	$R = \frac{TP}{TP + FN}$
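
The definitions in Table 7 can be applied per damage level to a multiclass confusion matrix by treating each class in turn as the positive class. The sketch below is a minimal illustration; the function name and the 3-class matrix are hypothetical and do not reproduce the matrices in Figure 6.

```python
def per_class_scores(confusion):
    """Compute precision, recall, and F1 for every class.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp  # true k, predicted class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2.0 * precision * recall / denom if denom else 0.0
        scores.append({"precision": precision, "recall": recall, "f1": f1})
    return scores


# Hypothetical 3-class confusion matrix (not data from the study).
example = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
scores = per_class_scores(example)
```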

Table 8. F1-score for ML algorithms with different damage levels (DLs).

Model	DL1	DL2	DL3	DL4	DL5	Average
DT	0.91	0.84	0.80	0.85	0.76	0.84
RF	0.90	0.82	0.83	0.89	0.80	0.86
SVM	0.90	0.85	0.86	0.90	0.77	0.87
XGBoost	0.93	0.88	0.87	0.91	0.83	0.89
ANN	0.89	0.82	0.86	0.91	0.79	0.87
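
As a quick arithmetic check on Table 8, the sketch below takes the unweighted mean of the five rounded per-level F1-scores for each model and ranks the models. This section does not state how the Average column was computed; it may be based on unrounded or support-weighted scores, so the unweighted means below are not expected to match that column exactly. The variable names are illustrative.

```python
# Per-level F1-scores copied from Table 8 (DL1 to DL5).
F1_BY_LEVEL = {
    "DT": [0.91, 0.84, 0.80, 0.85, 0.76],
    "RF": [0.90, 0.82, 0.83, 0.89, 0.80],
    "SVM": [0.90, 0.85, 0.86, 0.90, 0.77],
    "XGBoost": [0.93, 0.88, 0.87, 0.91, 0.83],
    "ANN": [0.89, 0.82, 0.86, 0.91, 0.79],
}

# Unweighted (macro) mean over the five damage levels.
unweighted_mean = {
    model: sum(values) / len(values) for model, values in F1_BY_LEVEL.items()
}

# Models ordered from highest to lowest unweighted mean F1.
ranking = sorted(unweighted_mean, key=unweighted_mean.get, reverse=True)
best_model = ranking[0]
```

Under this unweighted average, XGBoost remains the top-ranked model, consistent with the ordering reported in the table.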

Table 9. Statistical significance analysis (paired t-test) of macro-F1 differences between XGBoost and other models.

Comparison	t-Statistic	p-Value	Significance
XGBoost vs. Decision Tree	21.395	<0.001	✓ Significant
XGBoost vs. Random Forest	7.942	0.0014	✓ Significant
XGBoost vs. SVM	14.989	0.0001	✓ Significant
XGBoost vs. ANN	11.559	0.0003	✓ Significant
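
The paired t-test in Table 9 compares two models on matched evaluation runs, such as the same cross-validation folds. The sketch below shows how the t-statistic is formed from the per-run differences. The fold scores are hypothetical, and the use of five paired scores is an assumption for illustration; the actual per-run macro-F1 values are not listed in this section.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return the paired t-statistic and its degrees of freedom.

    scores_a and scores_b hold one score per matched run (e.g. per fold).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical macro-F1 scores on five matched folds (not data from the study).
model_a = [0.89, 0.88, 0.90, 0.89, 0.88]
model_b = [0.86, 0.86, 0.87, 0.85, 0.86]
t_stat, dof = paired_t_statistic(model_a, model_b)
```

The p-value then follows from the Student t distribution with `dof` degrees of freedom; a statistics library routine such as SciPy's `scipy.stats.ttest_rel` performs both steps in one call.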


Share and Cite

MDPI and ACS Style

Algamati, M.; Al-Sakkaf, A.; Bagchi, A. Application of Machine Learning for Predicting Seismic Damage in Base-Isolated Reinforced Concrete Buildings. CivilEng 2026, 7, 4. https://doi.org/10.3390/civileng7010004
