Machine Learning-Based Approach for Seismic Damage Prediction Method of Building Structures Considering Soil-Structure Interaction

Abstract: Conventional seismic performance evaluation methods for building structures with soil-structure interaction effects are inefficient for regional seismic damage assessment as a predisaster management system. Therefore, this study presents a framework for developing an artificial neural network-based model that can rapidly predict seismic responses with soil-structure interaction effects and determine the seismic performance levels. To train, validate, and test the model, 11 input parameters were selected, and the seismic responses with soil-structure interaction were generated using a multistep analysis process proposed in this study. The artificial neural network model generated reliable seismic responses with soil-structure interaction effects, and it rapidly extended the seismic response database using simple structure and soil information. This fast and accurate data generation method can be utilized as a regional seismic assessment tool for safe and sustainable structures against natural disasters.


Introduction
Evaluating the seismic performance of building structures is critical for the seismic-resistant design and resilience of structures [1,2]. In addition to the free-field ground motion, soil-structure interaction (SSI) effects have been considered for the reliable evaluation of the seismic performance of structures [3,4] because the dynamic response of a structure depends on the soil properties (i.e., geological conditions). For example, the 1985 Mexico City and 1989 Loma Prieta earthquakes demonstrated the extremely amplified seismic response of buildings on soft soil [5-7]. Moreover, recent approaches have included SSI models coupled with liquefaction and post-liquefaction instability [8], which occurred in the 2011 Christchurch [9] and 2017 Pohang earthquakes [10].
Building structures located in seismic hazard regions are subject to earthquake-induced damage. However, regional seismic damage assessment using a finite element (FE) analysis of each individual building is not feasible because FE-based simulation can be extremely time-consuming owing to the complexity of the individual analytical models (e.g., geometric effects and a large number of elements). Therefore, many previous studies based regional seismic damage assessment on the seismic performance evaluated with a single-degree-of-freedom (SDOF) structure [11-16]. For example, the HAZUS-MH earthquake model [14] was developed using the SDOF model for regional seismic damage simulation. This earthquake model adopted the capacity spectrum method (CSM), which evaluates the seismic damage of buildings from the capacity curve of a structural model and a seismic demand spectrum. The CSM-based method can generate a large amount of data on seismic damage levels in a short time. However, most of these studies excluded the impact of SSI, which may substantially underestimate the seismic response of buildings under given ground motions. Moreover, the available FE-based simulation techniques with SSI [17-20] for evaluating earthquake-induced building damage require high computational effort to build the database needed for a region-based prediction model. Therefore, rapid decision-making requires a region-based prediction model that accounts for SSI without relying on FE simulation.
Artificial neural networks (ANNs) are a reliable alternative for developing a rapid decision-making tool to evaluate the seismic performance of buildings. Because they allow a database to be constructed with low computational effort, ANNs have been widely applied to engineering problems such as crack detection [21,22], seismic fragility assessment of bridges, skin friction of piles in clay deposits [23,24], and prediction of ground motion parameters [25]. However, only a limited number of previous studies investigated the seismic performance of buildings using ANNs [26,27], and they did not incorporate SSI effects in the model. Because the soil properties and the geology of the ground substantially affect the seismic performance of buildings, an ANN-based prediction model that considers SSI effects is required for more accurate prediction of seismic responses such as shear force or displacement. Given the absence of such a model, this study proposes an ANN-based rapid decision-making tool to determine earthquake-induced building damage levels with SSI effects. First, each step of the three-step analysis is elaborated with a visualization of the SDOF system. Then, the dataset for the nonlinear seismic response of a building structure is generated using the SDOF model, and the accuracy of the developed ANN model is assessed. In addition, the confusion matrix is evaluated to assess the ability of the proposed ANN model to predict the drift-based performance of buildings.

Linear Simplified Model with SSI Effect
The dynamic equation of motion for the structure in the frequency domain can be written as Equation (1), where ω is the angular frequency (T^-1), i is the imaginary unit, {U} is the displacement amplitude vector, the first two entries of {R} are equal to one while the remaining entries are zero, Ü_g is the amplitude of the harmonic ground acceleration, and [M], [C], and [K] are the mass, damping, and stiffness matrices of the structure, respectively. For an SDOF model, these matrices can be expressed as in Equation (2) [28], where m is the mass (M), J is the mass moment of inertia (ML^2), c is the dynamic damping coefficient (MT^-1), k is the dynamic stiffness (MT^-2), and h is the height of the SDOF system (L). The subscripts s, f, h, and θ denote the structure, foundation, horizontal direction, and rotational (rocking) direction, respectively. Note also that k_h and k_θ are functions of the shear wave velocity (V_s) and Poisson's ratio (ν) of the soil.
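As a point of reference, a frequency-domain equation of motion built from the symbols defined above is commonly written in the form below. This is a sketch that assumes the usual sign convention for harmonic base excitation; it is not taken from the source, and the SDOF matrices of [28] are not reconstructed here.

```latex
\left( -\omega^{2}[M] + i\,\omega\,[C] + [K] \right)\{U\} = -[M]\{R\}\,\ddot{U}_g
```

Under this assumed form, solving the linear system at each frequency ω gives the displacement amplitudes {U}, from which the time-domain response follows by inverse Fourier transform.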

Nonlinear Displacement of Structure with SSI
To estimate the nonlinear response of a structure with SSI effects (u_{SSI,NL}) in a simple manner, this study proposed a three-step analysis procedure using SDOF models. Figure 1 summarizes the proposed procedure for computing u_{SSI,NL}. The first step was to compute the linear seismic response with the SSI effect (u_SSI) using the frequency-domain method described in Section 2.1. The linear SDOF model with foundation effects used in this step captures the structural response together with the horizontal swaying displacement (u_h) and the rocking displacement (u_θ) of the foundation, as presented in Figure 1. After that, a frequency-domain analysis of the linear SDOF model without the foundation model was performed, and the resulting linear structural response (u_{S,L}) was used to find the displacement of the foundation model (u_FSI = u_h + u_θ) by subtracting u_{S,L} from u_SSI (= u_FSI + u_{S,L}). Thus, u_FSI was the pure foundation-soil interaction (FSI) effect without the linear structural response. In the next step, the nonlinear seismic response of the structure alone (u_{S,NL}) was computed from nonlinear time-history analyses of a nonlinear SDOF structural model using the OpenSees software [29]. The nonlinear SDOF model was built simply from a rigid element and a lumped mass. Its nonlinear behavior was reproduced by a nonlinear rotational spring (K_{S,NL}), which can capture structural damage responses such as strength reduction, stiffness degradation, and structural collapse. In the final step, u_{SSI,NL} was computed by combining u_{S,NL} with u_FSI. Figure 1 illustrates the seismic responses obtained throughout the proposed process as an example and schematically compares the structural responses with and without the SSI effect. As shown in Figure 2, the maximum displacement of u_{SSI,NL} is approximately 42.2% higher than that of u_{S,NL}.
This approach showed that the SSI effect significantly affects the overall structural response.
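The bookkeeping of the three-step procedure can be sketched as follows. This is a minimal illustration, assuming that "combining" the responses means a sample-by-sample sum of displacement histories on a common time grid; the function names and the short displacement histories are hypothetical and do not come from the study.

```python
def combine_ssi_response(u_ssi_lin, u_s_lin, u_s_nl):
    """Return (u_FSI, u_SSI,NL) from three displacement histories.

    u_ssi_lin: linear response with SSI (step 1, with foundation model)
    u_s_lin:   linear response without the foundation model
    u_s_nl:    nonlinear response of the structure alone (step 2)
    """
    if not (len(u_ssi_lin) == len(u_s_lin) == len(u_s_nl)):
        raise ValueError("all time histories must have the same length")
    # Pure foundation-soil interaction part: u_FSI = u_SSI - u_S,L
    u_fsi = [ssi - s for ssi, s in zip(u_ssi_lin, u_s_lin)]
    # Final step (assumed to be a direct sum): u_SSI,NL = u_S,NL + u_FSI
    u_ssi_nl = [nl + fsi for nl, fsi in zip(u_s_nl, u_fsi)]
    return u_fsi, u_ssi_nl


def peak_abs(history):
    """Peak absolute value of a response history."""
    return max(abs(v) for v in history)


# Hypothetical displacement histories (mm), for illustration only.
u_ssi_lin = [0.0, 3.0, -2.0, 1.0]
u_s_lin = [0.0, 1.0, -0.5, 0.5]
u_s_nl = [0.0, 4.0, -1.0, 2.0]

u_fsi, u_ssi_nl = combine_ssi_response(u_ssi_lin, u_s_lin, u_s_nl)
amplification = peak_abs(u_ssi_nl) / peak_abs(u_s_nl) - 1.0
print(u_fsi, u_ssi_nl, amplification)
```

The amplification ratio computed at the end is the same kind of quantity as the 42.2% increase quoted above, although the numbers used here are invented.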


Reference Building Structure
In this study, a three-story, three-bay nonductile reinforced concrete (RC) frame designed according to ACI 318-63 [30] was selected as the reference building. The building frame was assumed to be located on stiff soil (site class S_D) [31]. Figure 3 summarizes the structural model used in this study.
The building model is vulnerable to shear failure because of its seismically deficient detailing, such as a low transverse reinforcement ratio and 90-degree L-shaped corner hooks that lead to poor concrete confinement. The analytical model used a concrete compressive strength (f_c') of 25.9 MPa and a rebar yield strength (f_y) of 303.4 MPa with a 0.5% strain hardening ratio. The masses of the frame were lumped at the beam-column joint nodes at each story level. The lumped masses were computed with the load combination DL + 0.25LL, where DL is the dead load and LL is the live load. The self-weight (DL) was calculated from the cross-section and length of each element with an assumed concrete density of approximately 2400 kg/m^3. The effective mass (M_eff) of the building model was approximately 193.60 N·s^2/mm. More detailed information on the building model can be found in [32].
To derive the structural properties of the building model, this study performed eigenvalue and nonlinear pushover analyses. From the eigenvalue analysis, the periods of the first and second modes (T_1 and T_2) are 0.46 and 0.11 s, respectively, and the mass participation ratio is 90.2% for the first mode and 8.30% for the second mode. The building model was therefore governed by the first mode. The normalized first-mode vector of the reference model is 0.73, 0.88, and 1.0 for the first, second, and third stories, respectively. Figure 3b shows the nonlinear pushover curve of the reference model together with an idealized curve. The idealized force-displacement curve was derived as specified in Federal Emergency Management Agency (FEMA) 356 [33] and was used to compute the structural parameters at the yielding and ultimate conditions. These parameters include the yield strength (V_y) and the yield and ultimate displacements (Δ_y and Δ_u). Δ_u was taken at the point where the ultimate strength (V_u) had dropped by 20%. To implement the building model in the nonlinear SDOF model, the structural parameters obtained from the pushover analysis were transformed into a yield spectral acceleration (S_ay) using Equation (3a) and a displacement ductility capacity (μ) using Equation (3b) (for the flexure-shear failure mode, S_ay = 0.62 g and μ = 2.33). Based on these structural parameters, the structural input parameters representing the nonlinear behavior were determined in Section 3.2.

Input Parameters
To build the ANN model from the generated nonlinear responses with SSI effects, this study established a dataset with input and output parameters. Eleven input parameters were selected to represent the earthquake excitation, the nonlinear response of the structure, and the geotechnical conditions, as summarized in Table 1. Past earthquake records were collected from the Next Generation Attenuation (NGA) strong ground motion database [34]. From the large number of past records provided by the Pacific Earthquake Engineering Research (PEER) Center, this study randomly selected 1288 ground motions measured at the S_D soil condition (stiff soil). The S_D soil condition was chosen because the current building design code [31] allows structural engineers to select S_D as the default site class if the geotechnical information (e.g., average standard penetration test blow count, effective shear wave velocity for site soil conditions, and average undrained shear strength) is insufficient. In total, 3000 earthquake records were collected from the PEER strong ground motion database to cover various ground motion characteristics, and the 1288 records measured at the S_D soil condition were extracted from them. The number of earthquake records was chosen to enhance the accuracy of the learning model and to avoid overfitting. For the earthquake excitation, the earthquake magnitude (M_w), the epicenter distance (ED) from the site of the structure, the peak ground acceleration (PGA), the peak ground velocity (PGV), and the peak ground displacement (PGD) were selected as input parameters. In addition, the post-yielding stiffness (α_1), stiffness degradation ratio (α_2), and displacement ductility capacity (μ) were selected as the nonlinear structural parameters, as illustrated in Figure 4. These parameters make it possible to consider various failure modes (e.g., shear, flexure-shear, and flexure failure modes). The upper and lower limits of the structural parameters were determined from the capacity curves specified in FEMA P440A [35] and HAZUS-MH 2.1 [14]. The T and S_ay (Equation (3a)) values were fixed (T = 0.46 s and S_ay = 0.62 g) based on the analytical results of the reference model described in Section 3.1.
The Poisson's ratio (ν) and shear wave velocity (V_s) of the soil ranged from 0.15 to 0.5 and from 90 to 760 m/s, respectively, and were selected to cover soils from soft clay (high ν and low V_s) to dense sand (low ν and high V_s) [31,36,37]. In addition, a shallow circular foundation was assumed, with the ratio of structural mass to foundation mass (η) ranging from 0.09 to 0.7.
Notation for Table 1: M_w = earthquake magnitude; ED = epicenter distance; PGA = peak ground acceleration; PGV = peak ground velocity; PGD = peak ground displacement; α_1 = post-yielding stiffness; α_2 = strength degradation ratio; μ = displacement ductility capacity; η = ratio of structural mass to foundation mass; ν = Poisson's ratio of soil; and V_s = shear wave velocity of soil.
To cover the ranges of the structural and geotechnical parameters, 10 values of each parameter were randomly sampled from a uniform distribution between the minimum and maximum values presented in Table 1. The structural parameters were randomly paired with the geotechnical parameters to build the numerical models with SSI effects. These models were simulated under the 1288 earthquake records using the multistep analysis method. Thus, a total of 12,880 (= 10 × 1288) random samples were generated for developing the machine-learning model, and their u_{SSI,NL} responses were obtained following the multistep analysis procedure proposed in Section 2.2.
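The sampling scheme described above can be sketched as follows. This is one possible reading of the text, in which 10 randomly paired structure-soil models are each analyzed under all 1288 records. The geotechnical ranges are those quoted in the text, whereas the structural ranges are hypothetical placeholders because the Table 1 values are not reproduced here, and all function names are illustrative.

```python
import random

# Ranges quoted in the text for the geotechnical parameters.
GEOTECH_RANGES = {"nu": (0.15, 0.5), "Vs": (90.0, 760.0), "eta": (0.09, 0.7)}
# HYPOTHETICAL placeholder ranges for the structural parameters.
STRUCT_RANGES = {"alpha1": (0.0, 0.1), "alpha2": (0.0, 1.0), "mu": (2.0, 8.0)}


def sample_parameter_sets(n_sets, ranges, rng):
    """Draw n_sets parameter dictionaries from uniform distributions."""
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        for _ in range(n_sets)
    ]


def build_cases(n_sets, n_records, seed=0):
    """Pair structural and geotechnical sets at random, then cross with records."""
    rng = random.Random(seed)
    structural = sample_parameter_sets(n_sets, STRUCT_RANGES, rng)
    geotechnical = sample_parameter_sets(n_sets, GEOTECH_RANGES, rng)
    rng.shuffle(geotechnical)  # random linking of structure and soil
    models = [{**s, **g} for s, g in zip(structural, geotechnical)]
    # Each case is (record index, model parameters).
    return [(rec, model) for rec in range(n_records) for model in models]


cases = build_cases(n_sets=10, n_records=1288)
print(len(cases))
```

Each case would then be passed to the multistep analysis to obtain one u_{SSI,NL} response.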

Output Parameter
The u_{SSI,NL} response produced by the multistep analysis procedure considering SSI effects was used to calculate the peak interstory drift ratio (IDR), which was selected as the output parameter. The peak IDR can be used to determine the seismic performance of structures through the drift-based performance limits documented in FEMA 356 [33], as given in Table 2. Through the multistep analysis procedure, 12,880 u_{SSI,NL} responses were generated, and the seismic performance levels were determined from the corresponding peak IDR values. Of these samples, 2576 whose responses were overestimated because of numerical instability during the nonlinear time-history analyses were removed from the input and output datasets. The multistep analysis procedure thus yielded 10,304 samples for training, validating, and testing a machine-learning model.
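Mapping a peak IDR to a performance level can be sketched as below. Because Table 2 is not reproduced here, the thresholds are an assumption based on the commonly quoted FEMA 356 transient drift limits for concrete frames (1%, 2%, and 4%); the function and constant names are illustrative.

```python
from bisect import bisect_left

# ASSUMED drift limits (as ratios): IO <= 1%, LS <= 2%, CP <= 4%, else collapse.
DRIFT_LIMITS = (0.01, 0.02, 0.04)
LEVELS = ("IO", "LS", "CP", "Collapse")


def performance_level(peak_idr):
    """Return the drift-based performance level for a peak IDR (ratio, not %)."""
    if peak_idr < 0:
        raise ValueError("peak IDR must be non-negative")
    # bisect_left counts how many limits lie strictly below peak_idr.
    return LEVELS[bisect_left(DRIFT_LIMITS, peak_idr)]


for idr in (0.005, 0.015, 0.03, 0.05):
    print(idr, performance_level(idr))
```

With these assumed limits, a drift exactly equal to a threshold is assigned to the lower (less damaged) level.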

Development of ANN Model
In the past decades, the growing use of ANNs in engineering has broadened the applicability of this brain-inspired technique to many engineering problems with nonlinear relationships between inputs and outputs [39]. An ANN is composed of an input layer, zero or more hidden layers, and an output layer, which are interconnected by multiple nodes in each layer, as illustrated in Figure 5. The value of a neuron in the layer immediately following the input layer (x_j) is expressed as:
$$x_j = f\left(\sum_i w_{ij}\,x_i + b\right)$$
where i denotes the layer immediately preceding layer j, x_i is the input value of the neuron, f(·) is the transfer (activation) function, w_ij is the weight coefficient representing the importance of the connection between the ith and jth neurons, ∑_i w_ij x_i is the weighted summation, and b is the threshold (bias) value of the associated neuron. The neural network approach is a supervised process: the actual (target) outputs are known for the inputs used in model training, and the predicted outputs are compared against them. The errors between the target and predicted values are minimized by learning algorithms such as the back-propagation, quasi-Newton, and Levenberg-Marquardt algorithms. Through these learning algorithms, the weight and bias values of the hidden and output layers are found automatically.
Figure 6 illustrates the ANN structure developed in this study to produce the peak IDR with SSI effects. The model was constructed with the 11 input parameters and 1 output parameter described in Section 3. The neural network comprised four hidden layers: two layers with nine neurons each, one layer with four neurons, and one layer with two neurons. The number of hidden layers was determined by trial and error until the ANN model achieved more than 80% accuracy. As illustrated in Figure 6, a linear transfer function was assigned to the first hidden layer and the output layer, and a log-sigmoid transfer function was used in the remaining hidden layers. To solve the nonlinear problem, this study used the Levenberg-Marquardt algorithm (LMA) as the learning algorithm, which iteratively finds the best-fit values of the w and b parameters that minimize the mean squared error (MSE).
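A forward pass through a network of this shape can be sketched as follows. The layer order 11-9-9-4-2-1 is an assumption about how the hidden layers are arranged, the weights are random placeholders rather than trained values, and the Levenberg-Marquardt training itself is not shown.

```python
import math
import random

# ASSUMED layer order: 11 inputs, hidden layers of 9, 9, 4, 2 neurons, 1 output.
LAYER_SIZES = [11, 9, 9, 4, 2, 1]
# Linear in the first hidden layer and the output layer, log-sigmoid elsewhere.
ACTIVATIONS = ["linear", "logsig", "logsig", "logsig", "linear"]


def logsig(z):
    """Numerically stable log-sigmoid, 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def init_network(sizes, seed=0):
    """Create placeholder (untrained) weights and biases for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, activations, x):
    """Apply x_j = f(sum_i w_ij * x_i + b) layer by layer."""
    for (weights, biases), act in zip(layers, activations):
        z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
        x = [logsig(v) for v in z] if act == "logsig" else z
    return x


network = init_network(LAYER_SIZES)
prediction = forward(network, ACTIVATIONS, [0.1] * 11)
print(prediction)
```

In the study the single output corresponds to the peak IDR; here the printed value is meaningless because the weights are random.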

Figure 6. Artificial neural network structure developed in this study.

Model Training, Validation, and Testing
This study randomly split the entire dataset into a training set (70%), a validation set (10%), and a testing set (20%). The training set was used to develop the ANN model with the structure given in Figure 6, the validation set was used to tune the hyperparameters of the ANN model and to prevent overfitting during training, and the testing set was used to estimate the accuracy of the ANN model. This study trained, validated, and tested the ANN model using 10,304 sample data (8115 for the training set, 1159 for the validation set, and 2318 for the testing set), excluding the 2576 sample data filtered out because of numerical instability. Because the input and output parameters follow lognormal distributions, the natural logarithm was applied to them for training, validation, and testing.
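The random 70/10/20 split and the logarithmic transform can be sketched as follows. This is a minimal illustration with hypothetical function names; the study's own tooling and random seed are not stated in the text.

```python
import math
import random


def split_dataset(samples, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle samples and split them into training, validation, and testing sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the testing set
    return train, val, test


def log_transform(row):
    """Natural logarithm of each (strictly positive) parameter value."""
    return [math.log(v) for v in row]


train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))
print(log_transform([1.0, math.e]))
```

The transform requires strictly positive values, which holds for the input ranges quoted in the text.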
The performance of the ANN model was evaluated using the MSE:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - T_i\right)^2$$
where n is the number of samples, Y_i is the ith predicted value, and T_i is the ith target value. As shown in Figure 7, training of the ANN model was stopped at the 129th iteration, where the MSE was minimized; the MSE values for the training, validation, and testing sets were 0.003, 0.004, and 0.005, respectively. The difference in MSE between the training and testing sets was negligible. In addition, the performance of the ANN model was evaluated with a simple regression model, Y = T, which measures the correlation between the Y and T values. Figure 8 shows the relationship between the Y and T values for the training, validation, and testing datasets and for the entire dataset, together with the coefficient of determination (R^2) values. The R^2 is evaluated from the Y and T values, with Ȳ denoting the mean of the predicted values. As shown in Figure 8, the R^2 values for the training, validation, testing, and entire datasets were approximately 0.98, 0.96, 0.96, and 0.97, respectively. This indicates that the Y values predicted by the ANN model are highly correlated with the T values. Because the R^2 values for the validation and testing datasets differ only marginally from that of the training dataset, the ANN model was well developed without overfitting.
The confusion matrix was evaluated to assess how well the developed ANN model classifies the seismic performance levels. The target and predicted peak IDR values were used to build a confusion chart, which shows the accuracy of the developed ANN model in classifying the seismic performance levels according to the drift-based performance limit states in Table 2. The off-diagonal elements in Figures 9 and 10 therefore represent failures of the developed ANN model to predict the seismic performance level. The accuracy of the ANN model for the classification problem is defined as the ratio of the number of successful predictions to the total number of samples.
Averaged over the four levels (immediate occupancy (IO), life safety (LS), collapse prevention (CP), and collapse), the accuracy of the developed ANN model was 85.6% for the training dataset and 83.71% for the testing dataset. Notably, the accuracy for the IO level was extremely high (99.7% and 99.8% for the training and testing datasets, respectively), while the lowest accuracy was observed for the CP level in the training dataset and for the collapse level in the testing dataset. Overall, the accuracy of the developed ANN model decreases as the peak IDR increases. Nevertheless, the accuracy of 83.71% for the testing dataset indicates that the developed ANN model satisfied the target performance (>80%). A larger dataset would be required to increase the model accuracy at high IDR values. The investigation of the ANN model's performance (MSE, R^2 values, and confusion matrix) demonstrated that the ANN model developed from the generated peak IDR responses with SSI effects was well trained without overfitting and that its accuracy satisfied the target.
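The error and classification measures discussed above can be sketched as follows. The MSE follows the definition with n, Y_i, and T_i, and the accuracy is the ratio of successful predictions to all samples; the level labels and the small example data are hypothetical.

```python
LEVELS = ("IO", "LS", "CP", "Collapse")


def mse(predicted, target):
    """Mean squared error between predicted (Y) and target (T) values."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - t) ** 2 for y, t in zip(predicted, target)) / len(predicted)


def confusion_matrix(targets, predictions, labels=LEVELS):
    """Rows are target levels, columns are predicted levels."""
    index = {label: k for k, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(targets, predictions):
        matrix[index[t]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Diagonal (successful) predictions divided by the total number of samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else float("nan")


# Hypothetical target and predicted performance levels, for illustration only.
targets = ["IO", "IO", "LS", "CP", "Collapse", "CP"]
predictions = ["IO", "IO", "LS", "LS", "Collapse", "CP"]

matrix = confusion_matrix(targets, predictions)
print(matrix)
print(overall_accuracy(matrix))
print(mse([1.0, 2.0], [1.0, 4.0]))
```

Off-diagonal counts, such as the CP sample predicted as LS in this example, are the misclassifications that the confusion charts make visible per level.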

ANN-Based Rapid Decision-Making Tool
This section presents the potential use of the developed ANN model (with >80% accuracy) as a rapid decision-making tool for determining seismic performance levels. The peak IDR responses were generated by the developed ANN model without the analytical modeling of building structures or the multistep analysis procedure described in Section 2.2. Figure 11 presents 4000 of the 50,000 peak IDR values evaluated for 50,000 sets of randomly generated input parameters. Figure 11 also provides the fraction of performance levels within given ranges of the structure-related and geotechnical input parameters used in this study. Notably, producing the 50,000 responses with the ANN model took about 3 min on a computer with a 3.00 GHz CPU and 8 GB of RAM. This indicates that the proposed framework substantially reduces the computational time compared with traditional FE models.
The two most important structure-related input parameters and the two most important geotechnical input parameters, as determined by the Relief algorithm, were used for the x- and y-axes in Figure 11. The Relief algorithm has been widely used to analyze the impact of input parameters on the output by computing weights that represent the importance of each input parameter. In this study, the Relief algorithm showed that μ and α_2 among the structural parameters and η and V_s among the geotechnical parameters significantly affected the peak IDR.

ANN-Based Rapid Decision-Making Tool
This section presents the potential use of the developed ANN model (with >80% accuracy) as a rapid decision-making tool for determining the seismic performance levels. The peak IDR responses were generated by the developed ANN model without the analytical modeling of the building structures and multistep analysis procedure described in Section 2.2. Figure 11 presents the 4000 data points among 50,000 evaluated peak IDR values with respect to 50,000 sets of randomly generated input parameters. Figure 11 provides the fraction of performance levels at a given range of structure-related and geotechnical input parameters used in this study. Notably, to produce the 50,000 reliable responses using the ANN model, about 3 min were taken using a computer with 3.00 GHz CPU and 8 GB RAM. This indicates that the proposed framework in this study for developing the ANN model substantially reduced the computational time compared to the traditional FE models.
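As a rough illustration of how Relief-type importance weights can be computed, the following Python sketch implements a basic Relief scheme on class labels (for example, performance levels). It is not the implementation used in this study: the function name `relief_weights`, the synthetic data, the choice of Manhattan distance, and the use of class labels rather than continuous peak IDR values are all assumptions made for illustration.

```python
import numpy as np


def relief_weights(X, y, n_samples=200, seed=0):
    """Estimate feature weights with a basic Relief scheme (illustrative).

    X: (n, d) array of input parameters; y: (n,) array of class labels.
    Assumes at least two classes and at least two samples per class.
    Returns a (d,) array; larger weights suggest more influential features.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape

    # Scale every feature to [0, 1] so that differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0.0] = 1.0
    Xn = (X - X.min(axis=0)) / span

    rng = np.random.default_rng(seed)
    picks = rng.choice(n, size=min(n_samples, n), replace=False)
    weights = np.zeros(d)

    for i in picks:
        diffs = np.abs(Xn - Xn[i])   # per-feature differences to sample i
        dist = diffs.sum(axis=1)     # Manhattan distance to sample i
        dist[i] = np.inf             # never pick the sample itself
        same = y == y[i]

        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample

        # Reward features that differ across classes; penalize features
        # that differ between close neighbors of the same class.
        weights += diffs[miss] - diffs[hit]

    return weights / len(picks)


# Hypothetical data: four inputs, labels depend only on the first two.
rng = np.random.default_rng(1)
X_demo = rng.uniform(size=(400, 4))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 1.0).astype(int)

w = relief_weights(X_demo, y_demo)
ranking = np.argsort(w)[::-1]  # feature indices, most important first
print(ranking[:2])
```

In this sketch, the two highest-ranked indices play the role of the "most and second most important" parameters that would be placed on the plot axes. A study with a continuous output such as peak IDR would more likely rely on a regression-oriented variant of Relief.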

Conclusions
This study proposed a framework for developing an ANN model that predicts the seismic performance levels of building structures with soil-structure interaction effects. The soil-structure interaction effect was incorporated into the single-degree-of-freedom model through a proposed three-step analysis, and 11 input parameters were selected to account for the characteristics of the earthquake, the nonlinear behavior of the structure, and the soil characteristics. Unlike the ANN-based models in previous studies, the framework presented in this study enables incorporating SSI effects in the model. The ratio of structural mass to foundation mass (η), shear wave velocity (Vs), and Poisson's ratio (υ) can be reflected in the model in addition to structural and earthquake-related parameters. Based on the performance and the application of the developed ANN model, the following conclusions can be drawn:
(1) The proposed three-step analysis provides a methodology for generating the dataset of nonlinear responses in the SDOF system with consideration of SSI effects. The proposed analysis is expected to be applicable to investigating the nonlinear response of MDOF systems or FEM-based results.
(2) The evaluated MSE and R² values and the confusion matrix indicate that the developed ANN model achieved more than 80% accuracy without any overfitting issues. In addition, the confusion matrix presented in this study can be an efficient means of evaluating the accuracy of the developed model at each seismic performance level.
(3) The developed ANN model can rapidly generate an extensive database and determine the drift-based performance levels within the training range of the input parameters. Furthermore, using the developed ANN model significantly reduced the computational time needed to generate large datasets compared to traditional FE models.
Since the ANN model can rapidly generate reliable responses using brief structural and geotechnical information without any complex modeling processes, the model can be a useful alternative to the seismic damage assessment on a regional level for safe and sustainable structures.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: Datasets presented in this work can be provided upon request.