1. Introduction
A fundamental consideration in the seismic design of buildings is their dynamic response to ground-motion excitation. Design practice typically monitors global response quantities, such as inter-storey drift, base shear and fundamental period, to check serviceability and damage limits [1, 2, 3]. These response measures are estimated using simplified procedures incorporated in design codes, which cannot capture the ultimate structural response.
Alternatively, non-linear time-history analysis can be used to capture material and geometric non-linearity, local yielding and failure mechanisms at the expense of significant modelling effort and computational cost [4]. In common implicit integration schemes for dynamic time-history analysis, such as the constant average acceleration Newmark method, the stiffness is updated at each step and accuracy demands sufficiently small time increments, further increasing the computational burden for detailed structural models. In practice, this trade-off between fidelity and efficiency motivates complementary, data-driven surrogate models capable of predicting the non-linear structural response with lower computational cost.
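To make the cost argument concrete, the following is a minimal sketch of the constant average acceleration Newmark scheme for a linear single-degree-of-freedom oscillator. It is an illustration only, not the finite element solver used in this study: the function name, the oscillator properties and the time step are assumptions, and the stiffness is held constant, whereas a non-linear analysis would rebuild the effective stiffness from the tangent stiffness at every step.

```python
import math


def newmark_average_acceleration(m, c, k, p, dt, u0=0.0, v0=0.0):
    """Integrate m*u'' + c*u' + k*u = p(t) for a linear SDOF oscillator.

    p is a list of load samples spaced dt apart. Returns (u, v) histories.
    """
    gamma, beta = 0.5, 0.25  # constant average acceleration parameters
    a = (p[0] - c * v0 - k * u0) / m  # initial acceleration from equilibrium
    u, v = u0, v0
    # Effective stiffness: constant here because the system is linear.
    # A non-linear analysis would update it from the tangent stiffness each step.
    k_hat = k + gamma / (beta * dt) * c + m / (beta * dt ** 2)
    a_coef = m / (beta * dt) + gamma / beta * c
    b_coef = m / (2 * beta) + dt * (gamma / (2 * beta) - 1) * c
    u_hist, v_hist = [u], [v]
    for i in range(len(p) - 1):
        dp_hat = (p[i + 1] - p[i]) + a_coef * v + b_coef * a
        du = dp_hat / k_hat
        dv = (gamma / (beta * dt) * du - gamma / beta * v
              + dt * (1 - gamma / (2 * beta)) * a)
        da = du / (beta * dt ** 2) - v / (beta * dt) - a / (2 * beta)
        u, v, a = u + du, v + dv, a + da
        u_hist.append(u)
        v_hist.append(v)
    return u_hist, v_hist


# Hypothetical example: undamped free vibration, 1 s period, released from u0 = 1.
k = (2 * math.pi) ** 2
u_hist, v_hist = newmark_average_acceleration(
    m=1.0, c=0.0, k=k, p=[0.0] * 201, dt=0.01, u0=1.0
)
```

For this linear undamped case the scheme conserves energy, so the amplitude does not decay; the accuracy still depends on the time step through period elongation, which is why small increments are needed.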
A critical aspect of the seismic design of steel buildings is the choice of lateral load-resisting system during the design phase. Conventional solutions typically include diagonal cross bracing or moment-resisting frames. While effective, these systems can present challenges such as architectural constraints, limited energy dissipation capacity, or susceptibility to buckling [5]. In contrast, Steel Plate Shear Walls (SPSWs) have emerged as a highly effective alternative for providing lateral resistance [6]. An SPSW system consists of thin infill steel plates bounded by adjacent columns and beams, designed to yield and dissipate energy through the development of diagonal tension fields after buckling [7]. Compared to conventional bracing systems, SPSWs may offer higher stiffness, superior energy absorption, and stable hysteretic behaviour under cyclic loading. In addition, their use can lead to improved control of inter-storey drifts and more uniform distribution of seismic forces along the building height, thereby enhancing overall seismic resilience [7].
SPSWs are widely adopted as lateral force-resisting systems for multi-storey steel buildings, providing high initial stiffness and stable hysteretic energy dissipation through tension field action after panel buckling [8]. Fundamental analytical and experimental studies established the mechanics of SPSWs and informed design approaches [9, 10, 11]. Design provisions (e.g., AISC 341-22) consolidate the best practices for steel shear plates to achieve ductile behaviour and reliable energy dissipation [3]. Yet, even for code-compliant SPSWs, accurately resolving local plastic demand (e.g., strain localization bands and their spatial distribution) under recorded earthquakes still relies on non-linear time history finite element analysis, especially when plate slenderness, boundary frame flexibility, and connection details are considered to determine system-level performance. This creates an opportunity for learning-based surrogate models that map earthquake demands and structural attributes directly to high-resolution fields of damage-relevant quantities, such as plastic strain, rather than only to global scalar responses.
Over the last decade, machine learning (ML) has gained attention across earthquake engineering for failure prediction, post-event decision support, and structural health monitoring (SHM) [12, 13]. Reviews consistently report improved predictive capacity when ML augments or replaces parts of conventional workflows, while underscoring the need for physics-guided models and careful uncertainty treatment [14, 15, 16]. For structural response prediction under seismic loading, researchers have trained supervised learners (e.g., deep neural networks, random forests, kernel regressors) on simulation- or sensing-derived datasets to emulate the dynamic response and collapse capacity, often achieving much faster predictions compared with traditional finite element simulations [17, 18, 19].
Recent studies exploit deep learning, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs) with Gated Recurrent Units (GRUs), and hybrid Convolutional Neural Network–Long Short-Term Memory models (CNN-LSTM), to learn non-linear mappings from structural properties and ground-motion descriptors (or raw waveforms) to storey and time-history responses, achieving satisfactory accuracy and real-time prediction potential [20, 21, 22, 23]. Physics-based deep learning further improves generalization with limited data and helps maintain the physical description of the problem [24]. Despite this progress, the dominant emphasis remains on global responses (drift, shear response, acceleration) and scalar performance metrics, while comparatively few efforts target pixel-wise, full-field predictions of damage or plastic demand directly relevant to failure assessment.
Concurrently, computer vision has transformed how engineering fields handle spatially distributed quantities. CNNs excel in dense, pixel-wise tasks (segmentation, super-resolution, image-to-image translation), with U-Net and its variants being adopted for learning high-fidelity mappings between images [25]. In mechanics, CNN-based surrogate models have been used to predict stress/strain fields from geometry and loading, producing full-field predictions competitive with finite element solutions at a fraction of the computational cost, and also enabling rapid parametric analysis and design space exploration [26]. In this context, such ML models can be trained on non-linear dynamic finite element simulations to learn the spatial distribution of plastic strains that governs damage localization and failure progression.
Among ML models, CNNs are specifically designed to work with images: they hierarchically extract features, progressing from low-level local pixel patterns to high-level semantic representations [27].
Accurately capturing the structural response of multi-storey steel structures equipped with both lateral-resisting systems (cross-bracing and SPSWs) using conventional FEA can be computationally demanding and time consuming, particularly for extensive parametric investigations. Consequently, there is a clear need for an efficient alternative approach capable of providing rapid and reliable response predictions without the burden of repeated full numerical simulations. ML provides a solution to this problem by offering data-driven surrogate models capable of learning the complex non-linear relationships between structural parameters and seismic building response metrics. The potential of ML to serve as a fast and reliable predictive tool has been demonstrated in numerous engineering domains, yet its application to multi-storey steel structures remains relatively unexplored. The motivation for this research, therefore, stems from the need to bridge the gap between computational efficiency and accuracy in structural seismic predictions. By integrating finite element analysis with advanced machine learning techniques, this study aims to develop robust data-driven models that can accurately predict the structural seismic response. Achieving this integration will not only enable rapid seismic evaluation of multi-storey steel buildings, but will also contribute to the broader adoption of intelligent, performance-based design and digital twin frameworks in earthquake engineering.
For SPSW-equipped steel buildings, such full-field plasticity information is particularly valuable, since it can reveal local strain concentrations at panel corners, along diagonal tension fields and near boundary element interfaces, and can indicate whether the affected structural elements are repairable after severe seismic events. In addition, the capacity to predict the failure response of these elements without advanced and computationally expensive non-linear dynamic finite element simulation supports structural health monitoring of multi-storey steel buildings and their integration in structural digital twins.
Existing data-driven seismic studies seldom address SPSWs explicitly, and when they do, they typically focus on system-level indicators (e.g., base shear, storey drift) rather than pixel-wise plastic demand maps. To the authors’ best knowledge, there is a clear research gap at the intersection of (i) multi-hazard non-linear dynamic finite element-derived datasets using recorded ground motions on multi-storey SPSW frames and (ii) image-based deep learning models that predict the resulting plastic strain distributions and failure patterns with low computational cost. Bridging this gap could lead to rapid screening-level assessments of multi-storey steel buildings subject to seismic events, directly supporting performance-based design and detailing decisions.
The present study aims to cover this gap, proposing a computer vision-based data-driven framework that adopts non-linear time history finite element analysis to generate labelled image datasets of plastic strain fields for multi-storey steel frames with SPSWs under past seismic events. The plastic strain distributions derived from finite element analysis are treated as images and paired with input descriptors of geometric–structural configuration and ground motions. In our approach, we trained a CNN model that takes geometric and structural parameters as input and generates realistic plastic strain fields as output. In particular, CNN models were developed using U-Net-style encoder-decoder backbones augmented, as needed, with physics-motivated losses, to predict failure-relevant plastic strain maps for unseen structures and earthquakes. Compared to purely physics-based workflows, the learned surrogate model aims to provide (i) near-instantaneous plastic demand “images” suitable for screening and sensitivity analyses, (ii) the ability to generalize across topologies and plate parameters encountered in practical design, and (iii) compatibility with uncertainty quantification for model auditing and risk communication. By focusing on local field prediction rather than only on global metrics, the proposed approach aligns with emerging digital twin solutions in structural engineering, which combine high-fidelity simulation, measurement data, and ML to support rapid assessment and decision-making before and after earthquakes [16].
3. Introduction to the Adopted Deep Learning Framework
This project focuses on simulating and predicting the effects of earthquake loading on synthetic buildings composed of bays arranged in a 3D grid of size R × C × K, where R is the number of floors and C and K are the number of bays along the building’s width and depth, respectively. Each building configuration is subjected to simulated seismic events, and the resulting response is analysed using finite element analysis (FEA).
This synthetic dataset is then used to train a CNN model that can predict post-earthquake plastic strain distributions developed on SPSWs, based on frame geometry and loading. By learning from the synthetic dataset generated using FEA simulations, the CNN aims to estimate the resulting plastic strain distribution on SPSWs across the building bays with high spatial resolution. This approach can potentially enable the real-time assessment of structural vulnerability without the need for computationally expensive simulations.
3.1. Data Preparation: Images
The accumulated image data consist of a collection of 3403 finite element simulations of synthetic buildings rendered from four viewpoints, corresponding to the cardinal directions around the structure, i.e., front, back, left, and right view. For each building, we therefore have:
Four input images showing the structure from four cardinal directions;
Four output images visualizing the maximum plastic strain distribution on SPSWs after an earthquake, also from the same four directions.
These views are captured both before and after the simulated earthquake loadings:
The post-earthquake images use colour-coded bays to indicate the level of plastic strain: higher plastic strain regions are shown in warmer colours (e.g., red), and lower plastic strain regions in cooler tones (e.g., blue or green).
Figure 3 presents an example of one building’s input and its corresponding output after the application of the earthquake loading, both from the front view.
As shown in Figure 3, the structural analysis products are defined over a mesh of bays, where a bay represents a portion of a building bounded by structural supports. For example, the building in Figure 3 is composed of a grid of nine bays along the rows (r) and ten bays along the columns (c).
The analysis focuses only on a subset of the total bays within each building (in the example of Figure 3, the fourth and fifth columns of bays). In this representation, the finite elements are shown as a mesh that subdivides the central bays. The mesh resolution and bay layout are consistent across all buildings, ensuring that plastic strain patterns can be compared between different designs. However, the bays may correspond to different physical sizes. This size and layout information is stored in accompanying metadata files provided alongside the images (see Section 3.2).
Each building image pair is therefore uniquely identified by
In the remainder of the description of the proposed deep learning process, the accumulated plastic strain contour values correspond to the colours given in Table 3 and Figure 4. Physically, low values of plastic strain indicate failure of low intensity.
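As an illustration of how a colour-coded contour image can be converted back to discrete plastic strain levels, the sketch below assigns each pixel to the nearest reference colour. The three-entry palette and the function names are hypothetical placeholders; the actual colours and strain bounds are those of Table 3, which are not reproduced here.

```python
# Hypothetical palette: (reference RGB colour, strain level index).
# The real colours and strain bounds are those listed in Table 3.
PALETTE = [
    ((0, 0, 255), 0),  # blue: lowest plastic strain level
    ((0, 255, 0), 1),  # green: intermediate level
    ((255, 0, 0), 2),  # red: highest plastic strain level
]


def nearest_level(rgb):
    """Return the level whose reference colour is closest to rgb."""
    def squared_distance(reference):
        return sum((a - b) ** 2 for a, b in zip(rgb, reference))

    return min(PALETTE, key=lambda entry: squared_distance(entry[0]))[1]


def image_to_levels(image):
    """Map an image (rows of RGB tuples) to a grid of level indices."""
    return [[nearest_level(pixel) for pixel in row] for row in image]
```

Nearest-colour matching tolerates small rendering differences in the contour plot, at the cost of misclassifying pixels that belong to artefacts rather than to the colour scale.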
3.2. Metadata
In addition to the images, each building–earthquake pair is associated with a set of metadata describing the structural and seismic parameters. These include the following:
Length, width, and height: Length and width are measured in number of bays along the two horizontal directions; height is the height of each storey;
Wall thickness: Structural thickness of each SPSW;
PGA (peak ground acceleration): A measure of earthquake intensity;
POV (point of view): The view direction from one of the four sides (A, B, C, or D);
Hz: The fundamental frequency of each building (as obtained from modal analysis).
An example of these metadata is provided in Table 4.
3.3. Model Features
The metadata variables are used as predictors (X) in the CNN model, providing critical context about both the building’s geometry and the seismic input. They enable the CNN to learn how different structural configurations and earthquake characteristics affect the resulting plastic strain distribution.
While the metadata serve as the input features for the predictive model, the post-earthquake images—which show the plastic strain distribution—constitute the target variables (y). These images provide the ground truth output that the CNN is trained to predict.
In contrast, the pre-earthquake images were not used as inputs to the model; they were used only for preprocessing (see Section 3.4).
3.4. Data Preprocessing
Predicting the full plastic strain map of an SPSW as a single image is a highly complex task due to the high dimensionality of the output and the variability in structural layouts. To reduce this complexity and make the learning problem tractable, a “per-bay” prediction strategy was adopted. That is, rather than predicting the entire post-earthquake image at once, the model is trained to predict the plastic strain at the level of individual bays. These local predictions can then be reassembled to reconstruct the complete plastic strain map.
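The per-bay strategy can be sketched as a split and reassemble round trip. The example below is a simplified illustration and not the authors' pipeline: it assumes an image stored as a nested list of pixel rows whose height and width are exact multiples of the bay grid, and the function names are hypothetical.

```python
def split_into_bays(image, n_rows, n_cols):
    """Cut an image (list of pixel rows) into a dict {(r, c): bay tile}.

    Assumes the image height and width are exact multiples of the grid.
    """
    bay_h = len(image) // n_rows
    bay_w = len(image[0]) // n_cols
    tiles = {}
    for r in range(n_rows):
        for c in range(n_cols):
            rows = image[r * bay_h:(r + 1) * bay_h]
            tiles[(r, c)] = [row[c * bay_w:(c + 1) * bay_w] for row in rows]
    return tiles


def reassemble(tiles, n_rows, n_cols):
    """Stitch per-bay tiles back into one full image."""
    image = []
    for r in range(n_rows):
        for y in range(len(tiles[(r, 0)])):
            full_row = []
            for c in range(n_cols):
                full_row.extend(tiles[(r, c)][y])
            image.append(full_row)
    return image


# Hypothetical 4 x 6 pixel image split into a 2 x 3 grid of bays.
image = [[y * 6 + x for x in range(6)] for y in range(4)]
tiles = split_into_bays(image, n_rows=2, n_cols=3)
restored = reassemble(tiles, n_rows=2, n_cols=3)
```

In a prediction setting, the tiles passed to `reassemble` would be the model outputs for each bay rather than crops of an existing image.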
Additionally, it is noted that the images—like the examples shown—often contain various artefacts that the proposed model is not expected to predict. These include the following:
Ticks from the finite element mesh;
Labels or annotations from the visualization tool;
Slightly irregular or non-straight grid lines;
Artefacts introduced by the finite element software (e.g., white boxes);
Inconsistent bay sizes in pixel dimensions.
These elements are removed or reduced during preprocessing: the pipeline is specifically designed to filter out such noise and standardize the bay regions. This ensures that the model focuses solely on learning the meaningful plastic strain patterns and not on irrelevant visual distortions.
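One of the standardization steps, bringing bay crops of inconsistent pixel size to a common resolution, can be sketched with nearest-neighbour resampling. This is an assumed, simplified stand-in: the source does not specify the resampling method or the target size, and the function name is hypothetical.

```python
def resize_nearest(tile, out_h, out_w):
    """Resample a tile (list of pixel rows) to out_h x out_w pixels.

    Nearest-neighbour sampling copies existing pixel values, so it does not
    blend neighbouring contour colours into new, intermediate colours.
    """
    in_h, in_w = len(tile), len(tile[0])
    return [
        [tile[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Hypothetical example: a 2 x 2 crop brought to a common 4 x 4 size.
standardized = resize_nearest([[1, 2], [3, 4]], out_h=4, out_w=4)
```

Copying values rather than interpolating is a natural choice for colour-coded contour plots, where a blended colour would not correspond to any level of the colour scale.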
The result of this pipeline is a clean, well-aligned dataset of labelled bay-level image samples, which can be used to train a deep learning model. This strategy allows us to frame the problem as a structured, supervised learning task without the complexity of generating entire contour plot maps in one shot.
The predictor variables are assembled from the metadata introduced in Section 3.2, with two minor preprocessing steps. First, the POV feature is converted from a categorical variable to a numerical format using one-hot encoding (OHE), resulting in three new binary columns: POV_A, POV_B, and POV_C. The category corresponding to the fourth and last point of view, POV_D, is intentionally omitted to serve as the reference class and to avoid introducing artificial correlations in the data.
Second, since we are considering the images per bay, we add two extra columns, r and c, indicating the row and column indices of each bay within the building grid, allowing the model to learn spatial relationships between adjacent bays. The resulting metadata, which ultimately serve as input features to the machine learning model, appear as in Table 5.
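The two preprocessing steps can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary keys, the column order and the numerical values are hypothetical (the actual layout is that of Table 5), and only the reference-class encoding of POV and the added r and c indices are taken from the text.

```python
POV_COLUMNS = ("A", "B", "C")  # "D" is the omitted reference class
VALID_POVS = ("A", "B", "C", "D")


def encode_pov(pov):
    """One-hot encode the view direction with D as the all-zero reference."""
    if pov not in VALID_POVS:
        raise ValueError(f"unknown point of view: {pov!r}")
    return [1 if pov == column else 0 for column in POV_COLUMNS]


def bay_feature_rows(meta, bay_indices):
    """Build one feature row per bay: metadata, POV_A..POV_C, then r and c."""
    base = [meta["length"], meta["width"], meta["height"],
            meta["thickness"], meta["pga"], meta["hz"]]
    pov = encode_pov(meta["pov"])
    return [base + pov + [r, c] for (r, c) in bay_indices]


# Hypothetical building-earthquake pair; values and units are placeholders.
meta = {"length": 10, "width": 6, "height": 3.0, "thickness": 0.005,
        "pga": 0.3, "pov": "B", "hz": 1.2}
rows = bay_feature_rows(meta, bay_indices=[(0, 3), (0, 4), (1, 3)])
```

Each row repeats the building-level metadata and differs only in the bay indices, which is what lets a single model serve every bay of a building.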
3.5. Summary
After preprocessing and metadata integration, the dataset is organized into two components:
X: The input feature matrix, including geometric and seismic metadata along with one-hot encoded orientation;
y: The target data, consisting of images representing plastic strain contour plot distributions for individual bays.
Starting from 3403 simulations, each captured from four different viewpoints, we processed a total of approximately 14,000 images. From these, the pipeline extracted roughly 35,000 individual bay regions, which served as the model samples.
Table 6 summarizes the shapes of the input X and targets y.
6. Conclusions
This article provides a methodology for evaluating the failure response of steel plate shear walls adopted to support multi-storey steel buildings against lateral seismic actions using a deep learning approach. A dataset has been developed using non-linear dynamic finite element analysis on a range of multi-storey steel buildings. Input parameters included undeformed steel frame data providing geometry characteristics such as bay and storey numbers, as well as the thicknesses of steel plate shear walls, the fundamental frequencies of each building, and peak ground accelerations of the seismic loading events. Output parameters were images depicting failure on steel shear walls in the form of the accumulated plastic strain distribution.
To capture the plastic strain distribution, a Convolutional Neural Network model was developed and trained using the aforementioned input–output parameters. The results indicate that the predicted plastic strain distribution on the shear plates is close to the one provided in the dataset, with Mean Squared Error values between 0.002 and 0.017.
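For reference, the Mean Squared Error between a predicted and a target bay image can be computed as in the sketch below. It assumes single-channel images stored as nested lists with pixel values scaled to [0, 1]; the scaling and the function name are assumptions, since the normalization behind the reported values is not restated here.

```python
def mean_squared_error(predicted, target):
    """Average squared pixel difference between two equally sized images."""
    if len(predicted) != len(target):
        raise ValueError("images must have the same number of rows")
    total, count = 0.0, 0
    for pred_row, target_row in zip(predicted, target):
        if len(pred_row) != len(target_row):
            raise ValueError("image rows must have the same length")
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


# Hypothetical 1 x 2 pixel example: one exact pixel, one off by 0.5.
example_mse = mean_squared_error([[0.0, 1.0]], [[0.0, 0.5]])
```

Because the metric averages over all pixels, a low value can coexist with localized errors in small high-strain regions, so visual comparison of the maps remains useful.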
The proposed methodology can be adopted to provide a fast prediction of the ultimate response of multi-storey steel buildings under seismic actions. This approach can assist structural design tasks when evaluation of the response of a large number of buildings is needed and where the manual and computationally expensive simulation of every building is challenging.
The metadata used as input in this study are relatively coarse, which may limit the model’s applicability. The study can therefore be enhanced, and the model’s applicability improved, by generating more data points that describe a greater range of building patterns and by providing denser metadata inputs. Experimental validation, even on smaller, scaled buildings, would also increase the impact of the proposed approach.
The approach can also be extended by incorporating the image recognition part proposed here into more holistic digital twin solutions. This work can thus be considered a step towards more complete digital twin solutions in which the structural system is monitored in real time and relevant information is exchanged with the digital counterpart of the monitored building. Within this framework, accelerations of the investigated building will be measured, providing seismic response data. The digital twin platform can use these data, in addition to the geometric characteristics of the building, to predict the failure response of SPSWs using the computer vision ML approach shown in this study. The response of the overall structural system and the influence of SPSWs on the dynamic response can be further evaluated in future work.