1. Introduction
Recent devastating seismic events, such as the 2023 Turkey–Syria earthquake, call for investing more scientific and economic resources in earthquake-resistant structures. The growing focus on sustainable construction directs investment towards optimal seismic design. In addition to pursuing performance goals, optimizing the structural design of buildings can, in fact, meet sustainability goals such as saving materials and costs and reducing embodied carbon emissions [
1]. Conventionally, the design of earthquake-resistant buildings typically involves a tedious trial-and-error procedure of pre-dimensioning and performance validation, which is unlikely to fully satisfy the goal of optimal material use [
2]. On the other hand, the complexity and nonlinearity inherent in the problem make it challenging to find effective methods for the optimal design of earthquake-resistant buildings [
1,
3].
In addition to the random nature of seismic excitation, the nonlinearity of the problem lies in the ductile and dissipative behavior that is required for buildings by current regulations [
4,
5]. The ductility-based (or performance-based) approach ensures that earthquake-resistant buildings meet life-safety constraints under given seismic hazard levels. Since only a small amount of the input seismic energy is dissipated by structural inherent damping, the post-elastic ductile and dissipative behavior of structural elements is a beneficial way to withstand strong seismic actions. Therefore, according to strength hierarchy rules, plastic hinges should develop in “weak-designed” structural elements of reinforced concrete buildings to suitably dissipate seismic energy under severe earthquakes [
6,
7]. The task of dissipating energy is instead entrusted to the internal and external metallic joints in wooden buildings [
8], and to the diagonal bracing in steel frames [
Steel braces typically yield plastically under high tensile loads, and dissipative devices are sometimes added to enhance their performance [
10,
11].
Due to their excellent seismic behavior, moment-resisting steel frames are frequently adopted in earthquake-prone countries [
12] and many studies have been conducted to estimate their seismic behavior as well as local and global failure mechanisms under far-fault and near-fault earthquakes (cf. e.g., [
13,
14]). Bracing systems are typically considered in the design of new or retrofitted framed steel buildings to enhance their seismic performance [
15,
16]. It is worth noting that the failure modes of this structural typology typically occur because of the combined effect of seismic and gravitational loads acting simultaneously. Seismic loads generate lateral displacements in the building floors, which may lead to the plastic hinging of member sections (beams and columns) and to global buckling of the diagonals. In parallel, the increase in lateral displacements causes a shift in the floor mass centers, which amplifies the overturning effect induced by the lateral thrust of the seismic forces. The cumulative effect produced by increasing lateral displacements is known as the P-delta effect, which contributes to the collapse mechanism of the structure. Low-rise steel buildings often exhibit a collapse mechanism associated with the failure of first-story columns. The type of earthquake may play a decisive role in the failure modes. Experimental and numerical investigations on steel frames subjected to long-period earthquakes demonstrated that deterioration in welded connections accelerates collapse [
17]. On the other hand, the local behavior of steel members may strongly affect the ductility capacity where pulse-like near-field earthquakes are concerned [
18]. The configuration and detailing of concentric bracing systems may significantly affect collapse margin ratios, with certain layouts improving seismic safety more effectively than others [
19]. Seismic connection type and detailing may strongly affect progressive collapse resistance [
20], while aftershock sequences represent a critical factor in collapse risk, as buildings damaged during the mainshock exhibit significantly higher fragility under subsequent ground motions [
21]. Overall, the seismic collapse of steel structures is governed by a complex interaction between global instability phenomena, local connection behavior, and cumulative seismic demands under different kinds of earthquakes. The capacity design and strength hierarchy philosophy of modern seismic regulations, as well as improved detailing strategies, aim to enhance the seismic safety of steel buildings, although structural vulnerabilities such as weak-story mechanisms and aftershock fragility remain major concerns in collapse performance.
According to the regulations, different kinds of seismic analyses can be performed to assess the elastic and post-elastic performance of new or existing buildings: from the simplest linear static to the most complex non-linear dynamic and incremental ones [
6,
8,
22]. In general, the simpler the analysis, the less rigorous it is, since some dynamic and/or non-linear features are neglected. The choice of seismic analysis and relevant performance constraints adds further uncertainty and represents a key step within the optimization problem.
In recent decades, several methods have been proposed to optimize the seismic design of buildings [
1,
3]. Sizing, shape and topology optimization have been considered, depending on the design variables assumed in the study; deterministic or probabilistic approaches have been adopted depending upon how the loading or the structural properties are defined; single or multi-objective functions as well as gradient-based or non-gradient-based optimizers are alternatively used. A review of the methods proposed in the literature can be found in [
1,
3]. Confining our attention to steel moment-resisting frames, the most common choices are (see, e.g., [
23,
24]): performance-based sizing optimization; cross-sectional properties of structural members set as design variables; structural weight (or initial cost, or cycle-life cost of the building) chosen as single-objective function; code-based performance requirements considered as constraints.
Different algorithms are adopted for the performance-based sizing optimization of steel buildings [
3,
25]. Among them are genetic algorithms or evolution strategies balancing construction costs and seismic post-elastic performance [
24,
26]. Meta-heuristic algorithms like the Modified Firefly [
23], the Harmony Search [
25], the Ant Colony [
27] and the Particle Swarm [
28] are also applied. Although such approaches are usually successful, some drawbacks, relevant to control parameters and sensitivity to the specific problem, typically affect metaheuristic algorithms [
29].
Artificial Intelligence, and particularly Artificial Neural Networks (ANNs), may help to achieve optimization goals, addressing some weaknesses of other methods. Alone or in conjunction with other approaches, ANNs have in fact recently been used for the optimal seismic design of steel frames. ANNs were trained in [
29] on examples of optimum structural design of two-dimensional steel frames, obtained through an adaptive harmony search algorithm, thus making trained ANNs capable of predicting the optimal solution of new variants. Combined with discrete wavelet transform (which decomposes earthquake records into a useful low-frequency part and a discardable high-frequency part), ANNs were trained to predict structural responses entailed by non-linear dynamic analyses [
16] or by non-linear static (pushover) analyses [
30], thus speeding up the weight optimization of steel moment-resisting frames.
The inversion of a trained ANN was recently proposed in [
31] as a powerful multi-objective optimization tool to solve engineering inverse problems. A preliminary application of this tool to optimal seismic design was done in [
32] with reference to a reinforced concrete building. The neural network inversion algorithm, to the best of our knowledge, has no equivalent in the literature. It consists of solving the equations describing the previously trained neural network. Other methods proposed in the literature for solving inverse problems use trained neural networks to speed up the search for an optimal solution (cf., e.g., [
33]). This is usually achieved by replacing the function call, which can be very costly when carried out using a FEM calculation. Nonetheless, these optimization methods are subject to the limitations typical of this class of problems, such as defining feasible solutions or distinguishing local from global minima. These limitations are avoided by the proposed ANN inversion methodology.
Based on the inversion of Multi-Layer Perceptron (MLP) ANNs [
34], the present paper proposes an intelligent three-step procedure for the optimal design of earthquake-resistant buildings. The methodology is illustrated in
Section 2, where some background on the training of MLP neural networks (step 1), their inversion (step 2) and the cost-effective gradient-based search algorithm (step 3) adopted for the optimization is provided. To test the procedure, reference is made to a benchmark multi-story steel frame and to a set of earthquakes consistent with the Chilean response spectrum, while non-linear time-history numerical analyses are performed on the three-dimensional model of the building, as described in
Section 3. Four characteristic cross-sectional design parameters and five noteworthy capacity-design constraints are considered to obtain the dataset referring to 148 cases.
Section 4 illustrates the process of building up input and output matrices, while
Section 5 provides an application of the three-step procedure (training, inversion and optimization) to the archetypal steel building. Finally, the results are discussed and some conclusive considerations are given in Section 6.
2. Methodology
This study addresses the inverse structural problem of finding the optimal design parameters for an earthquake-resistant building under the constraint of meeting the code-based performance checks. According to a standard input–output dynamic test scheme, the study is carried out by applying to the structure a series of earthquakes as external excitation signals and evaluating the performance of structural elements. Each earthquake provides a wealth of information about structural behavior, highlighting potential issues. The time history of the stress/deformation patterns is obtained through numerical dynamic analyses carried out under each given earthquake, which leads to assessing the seismic performance of the structure. For optimization purposes, only information relevant to critical sections is taken as a performance indicator. The dataset for training consists of pairs of design parameter sets and their corresponding performance indicator sets.
It is worth noting that it is difficult to estimate in advance the number of earthquakes required to characterize a structure’s seismic behavior. Seismic regulations generally state that three seismic records are sufficient for this purpose, provided they are selected in accordance with the code rules. Therefore, three earthquakes will be used to create the training set.
The optimal design goal is achieved through a general three-step procedure. In the first step, the functional relationship between design parameters and structural performance variables is approximated through an MLP ANN [
34] trained on a series of numerically simulated cases. The second step concerns the inversion of the trained MLP ANN [
31] under given performance constraints to find feasible sets of design parameters. Based on the inverted MLP ANN, the third step deals with a gradient-based search algorithm to obtain optimal design parameters. The general methodology is illustrated in detail below. This three-step procedure will be applied to a framed steel building in the subsequent sections as a specific example.
2.1. Step 1: MLP Training
The method starts with a training phase. This consists of creating an MLP that associates any given set of design parameters with the performance variables that represent the structural behavior of the considered archetypal building under given seismic actions. This phase involves considering many scenarios, each characterized by a specific set of design parameters. A finite element model of the building is implemented, and the structural response (in terms of code-based performance variables) is evaluated for each scenario by carrying out non-linear dynamic analyses under different earthquakes. Checking the performance variables, scenario by scenario, identifies any non-compliance with code requirements and the structural elements responsible for it. Design scenarios in which no element violates the code requirements under any of the considered earthquakes are labeled “feasible”, while those in which at least one element does are labeled “unfeasible”. Feasibility/unfeasibility thus defines a mapping in the space of the design parameters. The aim is to make it easy to explore the feasibility region in search of the optimal design. The region of the design space near the frontier between the feasible and unfeasible domains is strategic for the training phase. In fact, the neural network cannot approximate the input/output relationship over the entire space; a bounded region must therefore be chosen where the function is to be approximated. The region across the frontier between the feasible and unfeasible domains (transition region) must be modeled carefully, because the optimal design is likely to be found in this area. Conversely, a design solution far from the frontier is also presumably far from the optimal solution; therefore, an inaccurate approximation in this zone does not negatively affect the optimization.
This strategy allows us to obtain a training set in which the inputs and outputs are strongly correlated, which is beneficial during the training phase.
The design scenarios are divided into a Training Set, a Validation Set and a Test Set. The training algorithm is applied to the training set. To avoid overtraining, which degrades generalization capability, the error on the validation set is monitored during training, and training is stopped as soon as this error begins to rise. Finally, the degree of approximation is evaluated on the test set, which is independent of the MLP training process. The main objective of the training is in fact to achieve a good approximation on the test set, because this provides a measure of the generalization capability of the trained MLP; poor performance on the test set indicates that the training set does not adequately cover the range of possible examples.
To limit the computational burden, a criterion is adopted for creating the training set. A first set of examples is generated by regularly sampling the parameter space. Since the aim is to focus on the transition region between feasible and unfeasible solutions, cases with large margins with respect to the frontier are excluded from the set. An MLP is trained several times on this set, changing the subdivision into training, validation and test sets at each run. Examples that are poorly approximated when they belong to the test set indicate that their surrounding region is not sampled properly. To generate new examples, convex linear combinations are formed, in the design parameter space, between well-approximated and poorly approximated examples. This procedure is iterated until all the examples included in the test set are well approximated.
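As a minimal sketch of this growing step, assuming (hypothetically) that the design scenarios are stored as rows of NumPy arrays, new candidates can be generated as convex combinations between a poorly approximated example and its nearest well-approximated neighbor; each generated point must still be evaluated by a FEM analysis before entering the training set:

```python
import numpy as np

def refine_dataset(x_matched, x_failed, alphas=(0.25, 0.5, 0.75)):
    """Generate candidate design scenarios as convex combinations between
    each poorly approximated example and its nearest well-approximated
    one. Hypothetical helper: the returned points are only candidates
    and must still be evaluated by a FEM analysis."""
    new_points = []
    for xf in np.atleast_2d(x_failed):
        # pair each failed example with its nearest matched example
        d = np.linalg.norm(np.atleast_2d(x_matched) - xf, axis=1)
        xm = np.atleast_2d(x_matched)[np.argmin(d)]
        for a in alphas:
            new_points.append(a * xf + (1.0 - a) * xm)  # convex combination
    return np.array(new_points)
```

The generated points densify the sampling along the segment joining the two examples, i.e., across the poorly represented region.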
Figure 1 shows the scheme of the adopted MLP (a shallow neural network). It consists of one input layer, whose neurons correspond to the design parameters; one output layer, whose neurons correspond to the performance variables; and one hidden layer, whose neurons provide the MLP’s degrees of freedom. It is worth noting that a greater number of hidden layers could facilitate the training. On the other hand, it has been shown that an MLP with a single hidden layer is a universal approximator [35]. This means that a single hidden layer with a greater number of neurons may produce the same mapping as several hidden layers with fewer neurons. As the inversion process is highly dependent on the number of hidden layers and only slightly affected by their size, it is convenient to use MLPs with a single hidden layer (the choice made in the present study). The final size of the hidden layer is determined by means of a trial-and-error procedure, where new neurons are added until the desired level of precision is achieved on the training set.
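The trial-and-error sizing of the hidden layer can be sketched as follows; `train_and_score` is a hypothetical callback (not part of the paper) that trains an MLP with a given hidden-layer size and returns its training-set mean squared error:

```python
def grow_hidden_layer(train_and_score, target_mse=1e-4, max_neurons=50):
    """Trial-and-error sizing of the hidden layer: neurons are added one
    at a time until the desired precision is reached on the training set.
    `train_and_score(n)` is a hypothetical routine that trains an MLP
    with n hidden neurons and returns its training-set MSE."""
    for n in range(1, max_neurons + 1):
        if train_and_score(n) <= target_mse:
            return n  # smallest hidden layer meeting the target
    return max_neurons  # fall back to the largest size tried
```

Keeping the hidden layer as small as possible, while meeting the precision target, also keeps the subsequent inversion tractable.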
The structure of
Figure 1 implements an input–output relationship that can be formulated as follows:

y = W2 h + b2,    (1a)
h = σ(k),    (1b)
k = W1 x + b1.    (1c)

Here, x is the vector of the MLP inputs; y is the vector of the MLP outputs; W1 is the connection matrix between the input and the hidden layer; b1 is the bias vector of the hidden layer; W2 is the connection matrix between the hidden layer and the output layer; b2 is the bias vector of the output layer. The two auxiliary variables k and h represent, respectively, the input and output vectors of the hidden layer. Finally, σ(·) is a sigmoidal activation function of the hidden neurons; for instance, a sigmoidal function can be the hyperbolic tangent, σ(k) = tanh(k) (see
Figure 2).
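As a minimal NumPy sketch (with assumed variable names W1, b1 for the input-to-hidden weights and bias, W2, b2 for the hidden-to-output ones, and tanh as the sigmoidal activation), the forward pass of such a single-hidden-layer MLP reads:

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Single-hidden-layer MLP of Equation (1): k = W1 x + b1 is the
    hidden-layer input, h = sigma(k) its output, y = W2 h + b2 the
    MLP output."""
    k = W1 @ x + b1
    h = np.tanh(k)          # sigmoidal activation, saturating in (-1, +1)
    y = W2 @ h + b2
    return y, h, k
```

Returning the auxiliary variables h and k alongside y is convenient here, because the inversion procedure of Section 2.2 operates precisely on those intermediate spaces.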
The activation function of the hidden layer creates a non-linear relationship between the input and output of the ANN, enabling it to solve non-linear problems. From
Figure 2 it can be inferred that this sigmoidal activation function maps the entire real axis (−∞, +∞) into the interval [−1, +1]. During the inversion process, this function imposes a constraint on feasible solutions: hidden-layer output values outside the interval [−1, +1] do not correspond to any input values and are therefore not feasible. This aspect is formalized through a set of constraints applied during the inversion process, as shown more clearly below.
The MLP is trained by means of routines developed with MATLAB R2024b [
36]. By the end of the training process, the MLP is expected to reproduce the input–output relationship accurately for all possible input combinations, not just the training examples. It is important to reiterate that choosing an appropriate test set is vital to achieving this goal.
2.2. Step 2: MLP Inversion
The second step is to invert the trained ANN, in order to define the input space domain that corresponds to the assigned codomain (as its image) in the output space. The trained ANN establishes a relationship (Equation (1)) between input space (design parameters) and output space (performance of the building). By exploiting such a relationship, the inversion of the ANN can associate an input point with a given output, namely the design parameters that guarantee the required performance. An algorithm for inverting MLPs is described in [
37]. Only what is functional to the present application is illustrated below.
The constraints imposed by the regulations on the output variables y can generally be expressed as a set of linear inequalities:

A y ≤ b,    (2)

where A is a matrix of coefficients and b is a vector of constant terms. The linearity of such constraints is by far the most frequent case, and it allows us to simplify the discussion. Substituting Equation (1a) into Equation (2) leads to expressing the feasibility domain of the output y in terms of the auxiliary variable h, that is:

A W2 h ≤ b − A b2.    (3)
Through Equation (3), the feasibility domain (2) is mapped from the MLP output space to the hidden layer output space. It is worth noting that the linearity of domain (2) is preserved in the mapping, due to the linear relationship between y and h. The variable h must fulfill supplementary constraints deriving from the fact that the transfer function has lower and upper saturation values (see
Figure 2). Only the values within the saturation interval correspond to inputs of the MLP. Thus, the following constraints must be added, which assume the form of a hypercube:

h_low ≤ h ≤ h_up,    (4)

where h_low and h_up are, respectively, the lower and upper bounds of the components of h (−1 and +1 for the activation function of Figure 2). The two linear systems (3) and (4) are merged into a unique system of linear inequalities:

Ā h ≤ b̄,    (5)

where Ā = [A W2; I; −I], b̄ = [b − A b2; h_up; −h_low] and I is the identity matrix.
Equation (1b) establishes a biunivocal correspondence between the input of the hidden layer (space K in
Figure 1) and a hypercube defined in the output of the hidden layer (space H). Domain (5) is nonlinearly mapped into space K, where the domain becomes:

Ā σ(k) ≤ b̄.    (6)

Finally, by exploiting Equation (1c), the feasibility domain of the output can be expressed in the input space as follows:

Ā σ(W1 x + b1) ≤ b̄.    (7)
All and only the inputs which fulfill constraints (7) are associated by the MLP to outputs fulfilling constraints (2). These are feasible solutions (see
Section 2.1).
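The constraint pipeline leading to constraints (7) can be sketched with NumPy as follows. The names A, b (code-based output constraints) and W1, b1, W2, b2 (trained MLP parameters) are assumptions for illustration, with tanh as the activation:

```python
import numpy as np

def build_merged_constraints(A, b, W2, b2, h_low=-1.0, h_up=1.0):
    """Merge the output constraints A y <= b with the saturation bounds
    of the hidden layer into a single linear system A_bar h <= b_bar in
    the hidden-layer output space (constraints (5))."""
    n_h = W2.shape[1]
    I = np.eye(n_h)
    A_bar = np.vstack([A @ W2, I, -I])
    b_bar = np.concatenate([b - A @ b2,
                            np.full(n_h, h_up),
                            np.full(n_h, -h_low)])
    return A_bar, b_bar

def is_feasible(x, W1, b1, A_bar, b_bar, tol=1e-9):
    """Check constraints (7): x is feasible if the hidden-layer image
    h = tanh(W1 x + b1) satisfies A_bar h <= b_bar."""
    h = np.tanh(W1 @ x + b1)
    return bool(np.all(A_bar @ h <= b_bar + tol))
```

Such a pointwise feasibility test is all that the subsequent gradient-based search needs from the inverted network.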
2.3. Step 3: Optimization Algorithm
The optimization step needs a feasible starting point, namely one fulfilling (7); the objective function is then optimized by means of an iterative procedure. Further constraints could be defined in the input space to prevent unreasonable design parameters, such as negative values. The cost c of the structure is typically assumed as the objective function. In general, c can be related to the design parameters through non-linear relationships. The optimization problem, which is non-linear both in the constraints and in the objective function, can then be written as:

minimize c(x)  subject to  Ā σ(W1 x + b1) ≤ b̄.    (8)
Among the training examples, there is a subset of feasible solutions, which can be assumed as starting points. Generally, the gradient of the objective function is evaluated at each starting point, and the search direction is determined accordingly. When the objective function is monotonic with respect to each parameter (as occurs in the case study considered in this paper), the criterion “the smaller, the better” can be adopted. This leads to the concept of a “utopia point”, where all the design parameters assume their minimum values. If such a minimum value cannot be defined in advance, the null value is assumed instead.
The straight line connecting each feasible point to the utopia point may intersect one or more constraints. Among the intersection points, the nearest to the utopia point is assumed to be the optimum point (i.e., the local optimal solution).
The process is performed by starting from each feasible point in turn. In this way, a set of optimum points is eventually found. The final optimal solution is the one which corresponds to the minimum value of the objective function among the collected optimal points (i.e., the minimum of minima).
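The whole search can be sketched as follows, under the assumption that feasibility can be checked pointwise (e.g., via the inverted-MLP constraints); `feasible` and `cost` are placeholder callables, not part of the paper:

```python
import numpy as np

def line_search_to_utopia(x_start, utopia, feasible, n_steps=1000):
    """March from a feasible starting point toward the utopia point and
    return the last feasible point before a constraint is crossed
    (the local optimum along this seeking direction)."""
    best = np.asarray(x_start, dtype=float)
    for t in np.linspace(0.0, 1.0, n_steps):
        x = (1.0 - t) * np.asarray(x_start) + t * np.asarray(utopia)
        if not feasible(x):
            break  # first constraint met along the segment
        best = x
    return best

def optimize(starts, utopia, feasible, cost):
    """Repeat the line search from every feasible starting point and
    keep the candidate with the minimum cost (minimum of minima)."""
    candidates = [line_search_to_utopia(x0, utopia, feasible)
                  for x0 in starts]
    return min(candidates, key=cost)
```

The discretization step of the segment controls how closely the returned point approaches the constraint frontier.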
5. Three-Step Procedure (Training–Inversion–Optimization)
Preprocessing of data, detailed in
Section 4, led us to obtain a database made of a 4 × 168 input design matrix and a 31 × 168 output performance matrix. To manage the complexity of potentially interdependent outputs, a specific strategy was adopted: rather than training a single MLP with 31 output neurons, a separate MLP was trained for each individual target output. This resulted in a total of 31 distinct MLPs, each dedicated to predicting one specific performance parameter. This architectural choice, while increasing the number of networks, offers significant advantages. Firstly, it simplifies the learning task for each individual network, as each MLP focuses solely on the relationship between the input design parameters and a single, isolated output performance check. This often leads to faster convergence during training and potentially higher accuracy for each specific prediction. Secondly, it enhances the interpretability and troubleshooting of the models: if the prediction for a particular performance check is inaccurate, attention can be directed to retraining only the specific MLP responsible for that output, rather than re-evaluating a complex, monolithic network.
5.1. Step 1: MLPs Training
A preliminary training set was created by regularly sampling the four-dimensional input space (design parameters), ensuring that a reasonable range was covered for each parameter. Each sample represents a design scenario, for which the performance parameters can be obtained through a numerical analysis under any given earthquake. Due to the computational cost of each scenario, we minimized their number in the first stage, assuming that further cases could be added as required in subsequent stages. Since the number of cases considered was comparatively small, the leave-one-out method [
49] was adopted. It involves performing as many training runs as there are cases in the set, using a single case at a time as the validation set. In each run, training was stopped as soon as the error on the single validation case began to rise (early stopping). The minimum number of epochs over all the runs (100) was then adopted for the final training, which used the entire dataset. Proceeding in this way allowed the training set to focus more on the problem at hand: if the trained network cannot approximate the validation case, it implies that the case is not well represented by the rest of the set.
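This epoch-selection loop can be sketched as follows; `train_fn` is a hypothetical routine (standing in for the MATLAB training code) that returns the per-epoch validation-error history of one training run:

```python
def leave_one_out_epochs(dataset, train_fn, max_epochs=500):
    """Leave-one-out epoch selection: for each case, train on all the
    others, monitor the error on the single held-out case, and record
    the epoch at which that error starts to rise (early stopping).
    The minimum over all runs is the epoch count for the final training.
    `train_fn(train_set, val_case, max_epochs)` is a hypothetical
    routine returning the per-epoch validation-error history."""
    stop_epochs = []
    for i in range(len(dataset)):
        train_set = dataset[:i] + dataset[i + 1:]
        history = train_fn(train_set, dataset[i], max_epochs)
        # first epoch where the validation error starts to rise
        rises = [e for e in range(1, len(history))
                 if history[e] > history[e - 1]]
        stop_epochs.append(rises[0] if rises else len(history))
    return min(stop_epochs)
```

Taking the minimum over all runs is a conservative choice that keeps the final training from overfitting any of the held-out cases.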
A different strategy is adopted depending on the location of each outlier. As the region of interest is close to the frontier (see
Section 2.1), outliers that fall far from the frontier are dropped from the training set, regardless of their feasibility. Conversely, an outlier that falls close to the frontier indicates that the region of interest is not adequately sampled, and new examples are required around it. To generate new examples, a convex combination is calculated between each such outlier and the nearest examples in the dataset. Each new example represents a different design scenario, and a FEM analysis is performed for each one. The dataset is progressively adapted to the problem at hand through the two actions of pruning and growing. The dataset adaptation procedure ends when the trained network provides a good approximation of all the examples when they are used for validation.
In the study, we found that 10 out of 168 examples were difficult for the MLP to approximate. Subsequent checks revealed that these examples were far from the region of interest, justifying their removal from the database.
A dedicated MLP was trained for each of the 31 outputs over the entire dataset. This choice was made purely to simplify the training phase. In fact, the output of the network represents constraints for the training problem; therefore, the greater the number of outputs, the more difficult the training. Splitting the training problem into multiple training runs, each with a different output, greatly reduces the total complexity.
The dataset for the MLP training was made up of a 4 × 158 input design matrix and a 31 × 158 output target matrix (
Figure 10). Reducing the training set enabled good performance to be achieved despite the low number of epochs, with a mean squared error (MSE) of less than 1.00 × 10⁻⁴ for all the trained MLPs. As an example, the regression curve for one of the 31 trained MLPs is provided in
Figure 11. The plot reveals a strong correlation between predicted outputs and relevant targets. This is because training focuses on the transition region (see
Section 2.1), which is a small subspace where the input–output function can be considered approximately linear.
Incidentally, it can be noted that a dataset of just 168 cases (then reduced to 158) might seem too small for training purposes. Although it is difficult to estimate a priori the number of cases required to describe a distribution, there are checks that can help to determine whether the number of cases is adequate. One of them concerns the ratio between the number of cases and the size of the search space, which is quite reasonable in the present study. Another involves assessing the neural network’s ability to generalize during training; to improve this ability, the leave-one-out method was exploited, as discussed above.
5.2. Step 2: Inversion of the Trained MLPs
The trained MLPs were then inverted according to the procedure described in
Section 2.2. Each inverted MLP is a constraint in the input design space; thus, the union of all constraints gives the frontier of the search domain. The optimal design parameters are sought within the search domain, starting from feasible points of the input design space, by minimizing the objective function.
5.3. Step 3: Optimization
The cost of the structure was assumed to be the objective function. As the structure is made entirely of steel, the cost is proportional to the total structural mass, which in turn depends on the length and cross-sectional area of the structural elements. As the length of the structural elements is constant for all the examples considered, the mass of the prototype is directly proportional to the cross-sectional areas of the four categories of elements. Finally, as the section profile of each of the four categories of structural elements is assumed to be the same for all the considered cases, the cost of the structure depends on the design parameters, i.e., the moments of inertia about the horizontal local axis of the cross-sections of the four element categories.
To relate the objective function c to the input design parameters, the relationship between the moment of inertia and the cross-sectional area is needed for each of the three standard profiles considered for the archetype (namely, HEB, UPN, and IPE). A closed-form relationship being hard to obtain, an interpolation was derived instead. By plotting the values of the moments of inertia versus the corresponding cross-sectional areas taken from look-up-table datasheets, the diagrams in
Figure 12 were derived. Based on such diagrams and on cubic splines, approximate non-linear relationships A_i = f_i(I_i) between the cross-sectional areas and the moments of inertia were derived for the four categories of elements of the archetypal building. Since the moments of inertia I_col, I_bX, I_bY and I_d are the input design parameters, the objective function can eventually be defined as follows:

c = γ [ L_col f_col(I_col) + L_bX f_bX(I_bX) + L_bY f_bY(I_bY) + L_d f_d(I_d) ],

where γ is the steel mass density, while L_col, L_bX, L_bY and L_d denote the total lengths of columns, X-beams, Y-beams and diagonals, respectively.
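This spline-based objective function can be sketched with SciPy; the inertia–area pairs below are illustrative placeholders, not actual datasheet values, and a single spline is reused for all element categories for brevity:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Illustrative (not datasheet) pairs of moment of inertia [cm^4] vs
# cross-sectional area [cm^2] for one profile family; the paper builds
# one such spline per element category from HEB/UPN/IPE tables.
I_tab = np.array([450.0, 864.0, 2492.0, 5696.0, 14920.0])
A_tab = np.array([26.0, 34.0, 65.3, 106.0, 161.0])
area_of_inertia = CubicSpline(I_tab, A_tab)

def total_mass(I_params, lengths, gamma=7.85e-3):
    """Objective function: total mass = sum over element categories of
    steel density [kg/cm^3] * total length [cm] * interpolated
    cross-sectional area [cm^2]."""
    return sum(gamma * L * float(area_of_inertia(I))
               for I, L in zip(I_params, lengths))
```

Because the spline passes through the tabulated points, the objective function is exact at commercial sections and smoothly interpolated in between, which suits a gradient-based search.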
The inverted MLPs obtained in
Section 5.2 are exploited to seek, in the input search domain, optimal design parameters for the archetypal steel building. A schematic representation of the optimization process, based on the utopia point (see
Section 2.3), is sketched in
Figure 13. The process starts with a set of feasible points in the input space (blue points in
Figure 13). Such points are design solutions (retrieved from the training set) that are relevant to safe performance checks, according to code requirements. From each feasible point, a seeking segment is traced that connects that point with the utopia point (red point in
Figure 13). The local optimal solution in each seeking direction is found at the intersection with the first met constraint (green point in
Figure 13).
Note that when the margin between a point of the seeking segment and the constraint is negative or null (intersection point), the constraint is fulfilled, in agreement with Equation (2). On the contrary, positive margins correspond to unfeasible solutions. Accordingly, the green part of the seeking segment corresponds to feasible solutions, while the red part corresponds to unfeasible ones.
It is worth noting that, rather than guaranteeing convergence to the global optimum, this procedure provides a sub-optimal solution at a reasonable computational cost. The quality of the solution depends on the number of feasible points that belong to the training set. Ultimately, the greater the number of feasible starting points, the greater the chance of approaching the global optimum. However, this is not the only possible strategy. For instance, once the frontier point has been reached, further searches could be conducted along the frontier. This method is beyond the scope of the present work and will be the subject of future research.
As an illustrative example, the envelope of the margin between the points of the seeking segment and the nearest constraint, for case no. 148, is plotted in Figure 14. The side where the envelope takes positive values corresponds to unfeasible points (i.e., at least one constraint is violated), while feasible solutions lie on the side where the envelope takes negative or null values. On the feasible side, the value closest to zero, i.e., the point nearest to the utopia point (green square), is eventually chosen as the optimal solution. This aligns with the criterion of "the smaller, the better", which can be adopted here because the objective function is monotonic with respect to each input design parameter.
The values of the objective function c (which corresponds to the total mass) for the different local optimal solutions, obtained starting from different feasible points of the input domain, are plotted in Figure 15. In the diagram, the minimum value of the total mass identifies the optimal solution (point highlighted in red).
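The final selection step reduces to picking the smallest objective value among the local optima. The snippet below sketches this with made-up case identifiers and mass values (only case no. 148 is mentioned in the text; the numbers are illustrative).

```python
# Among the local optima found from the different feasible starting
# points, keep the design with the smallest total mass.
# (case id, total mass [kg]) pairs -- illustrative values only.
local_optima = [
    (12, 48500.0),
    (57, 47200.0),
    (148, 46100.0),
    (163, 47900.0),
]

best_case, best_mass = min(local_optima, key=lambda pair: pair[1])
# best_case / best_mass identify the final optimal solution
```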
Table 8 provides the design values corresponding to the final optimal solution, together with the commercial sections that best match the moments of inertia given by the optimization process. The results of the non-linear time-history analyses carried out under the three considered earthquakes show that the archetypal structure with these optimal design parameters satisfies the code performance checks of Table 6.
This confirms that the proposed three-step procedure effectively identifies the optimal design parameters for a given archetypal building by ensuring that the capacity-design performance constraints are met.
6. Conclusions
The aim of this study was to investigate whether artificial neural networks (ANNs) can gain design-versus-performance experience for earthquake-resistant buildings and use such experience to solve the structural inverse problem of optimizing the design variables for assigned code-based performance constraints. To this end, a three-step procedure was proposed. The first step is to train multi-layer perceptron (MLP) ANNs on a database of input design parameters and their corresponding output performance variables for a given archetypal building. In the second step, the trained MLPs are inverted under code-based capacity-demand constraints, so that the feasibility domain of the output variables is reflected into the input design space, thus defining a search domain for the subsequent step. The third step exploits a gradient-based search algorithm to seek optimal design solutions, starting from feasible input points lying in the input search domain.
The procedure was tested on an archetypal multi-story building, namely a concentrically braced steel-framed building with active tension diagonal bracing, assumed to be built in Chile. The moments of inertia of the cross-sections of the four classes of structural elements were chosen as the four design parameters, producing 168 distinct cases, each characterized by a specific arrangement of cross-section profiles. Five performance parameters were assumed as the most representative outputs for checking the capacity of the building to withstand seismic actions, and their values were calculated for all the involved elements by carrying out non-linear dynamic analyses for all the considered cases. Three earthquakes, consistent with the Chilean design response spectrum and suitably cut using an energy-based process, were used for the analyses. According to the regulations, three earthquakes are in fact sufficient to assess the seismic performance of buildings via non-linear dynamic analysis, since the additional information provided by a fourth earthquake is statistically negligible. For optimization purposes, each earthquake is considered as an external excitation that can provide a wealth of information about structural behavior, highlighting potential issues. Thus, the training dataset was built by considering three earthquakes consistent with code requirements.
To reduce the training complexity, the input and output matrices were resized during the training phase by dropping both the outliers that fall far from the frontier (where the optimal solution is likely to be found) and the outputs that always meet the code constraints. Finally, 31 output parameters were identified that capture critical aspects of the building's seismic performance (i.e., parameters that take unsafe values in at least one of the considered cases or under at least one of the considered earthquakes). Because of the comparatively low number of cases, the leave-one-out method was adopted. An MLP was trained for each of the 31 outputs, which simplified the training phase: splitting the training problem into multiple runs, each with a different output, greatly reduced the total complexity of the problem without compromising the effectiveness of the solution. The trained MLPs were then inverted, using the code-based performance checks as constraints, to solve the inverse problem of defining a feasible design-solution domain.
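The leave-one-out scheme used for the per-output networks can be sketched as follows: each of the N cases serves once as the validation set while the remaining N−1 cases form the training set. In this illustrative snippet, a trivial mean predictor stands in for the MLP of each output; it is not the model used in the paper.

```python
# Leave-one-out cross-validation sketch: one split per case.
def leave_one_out_splits(n_cases):
    """Yield (train_indices, validation_index) pairs, one per case."""
    for i in range(n_cases):
        train = [j for j in range(n_cases) if j != i]
        yield train, i

def mean_predictor(train_targets):
    # Placeholder model: predicts the mean of the training targets
    # (stands in for the per-output MLP, purely for illustration).
    m = sum(train_targets) / len(train_targets)
    return lambda _x: m

targets = [1.0, 2.0, 3.0, 4.0]   # made-up output values, one per case
errors = []
for train_idx, val_idx in leave_one_out_splits(len(targets)):
    model = mean_predictor([targets[j] for j in train_idx])
    errors.append(abs(model(None) - targets[val_idx]))
# 'errors' collects one validation error per held-out case
```

With this scheme, every case contributes once to validation, which is why each case of the training set ends up being checked against the rest of the set.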
Concerning the size of the dataset, it is worth noting that the validation process establishes whether the training set is suitable in terms of both the number and the distribution of cases, i.e., whether it is representative of the entire distribution that the network is tasked with mapping. In the present case, the validation process is even more rigorous than usual since, in the leave-one-out method, each case acts once as the validation set. Therefore, each case of the training set is well represented by the rest of the training set, which avoids overfitting. On the other hand, the strong correlation between target and calculated values is just a consequence of the small size of the sub-space (transition region) spanned by the training set. Such a strong correlation between inputs and outputs is beneficial during the training phase.
The structural weight of the building (which is directly related to the initial cost of the structure) was assumed as the objective function. An optimum search was performed starting from each feasible design solution within the search domain, yielding a set of local optima, the minimum of which eventually gave the final optimal solution. A double check through numerical analyses confirmed the good performance of the optimal solution found, thus confirming the success of the procedure. It should be noted that this procedure provides a sub-optimal solution at a reasonable cost: in general, the greater the number of feasible starting points, the greater the chance of approaching the global optimum.
The significance of the results obtained in this study depends on the specific example considered, so the outcome in other cases cannot be predicted. Nevertheless, this example demonstrates the effectiveness of the proposed methodology for solving complex optimization problems, suggesting that it could be applied successfully to the optimal seismic design of any type of building. Future work will focus on testing the three-step procedure on other kinds of structures (reinforced concrete buildings with shear walls, X-lam timber multistory buildings) and on making it faster and more effective.