Learning the Link between Architectural Form and Structural Efficiency: A Supervised Machine Learning Approach

Abstract: In this work, we exploit supervised machine learning (ML) to investigate the relationship between architectural form and structural efficiency under seismic excitations. We inspect a small dataset of simulated responses of tall buildings, differing in terms of base and top plans, within which a vertical transformation method is adopted (tapered forms). A diagrid structure with members having a tubular cross-section is mapped onto the architectural forms, and static loads equivalent to the seismic excitation are applied. Different ML algorithms, such as kNN, SVM, decision tree, ensemble methods, discriminant analysis, and Naïve Bayes, are trained to classify the seismic response of each form on the basis of a specific label. The presented results rely upon the drift of the building at its top floor, though the same procedure can be generalized to adopt any performance characteristic of the considered structure, like e.g., the drift ratio, the total mass, or the expected design weight. The classification algorithms are all tested within a Bayesian optimization approach; it is found that the decision tree classifier provides the highest accuracy, together with the lowest computing time. This research activity puts forward a promising perspective for the use of ML algorithms to help architectural and structural designers during the early stages of the conception and control of tall buildings.


Introduction
Architects and designers have always been curious about building novel forms, though many restrictions have limited the exploration of complex ones. While in some cases the design is less constrained, on other occasions engineering considerations dominate, for instance in the case of tall buildings, whose design process is among the most complicated [1]. Tall buildings are an outstanding architectural production and require huge resources and immense expenses due to their large scale. Since they have become ever more sophisticated, it is essential to adopt suitable and efficient structural configurations. Design teams currently look for specialists with knowledge of efficient, or optimal, structural design [2]. The entire design process also requires a close collaboration between architects and engineers, who look for software that provides clear requirements for the architectural form, identifies the alternative with the highest structural efficiency and, at the same time, offers a portfolio of different options. Such software might help contractors and clients reduce the total construction costs [2], since about one third of the total expenses are related to the structure; accordingly, structural considerations should be allowed for in the early stage of the design process [3]. Although the early-stage design phase is a small part of the whole design process, it plays a relevant role in the whole procedure, see [4]. In our modern, smart-city age, the design of tall buildings has become the outcome of the close teamwork of architects and structural and mechanical engineers, resulting in considerable efficiency. With the advance of technology and the constructability of complex forms, structural efficiency may fade and, even more regrettably, an inefficient use of the material load-bearing capacity can lead to depleting the Earth's resources [5].
In the contemporary tall building design approach, structural considerations are not fully taken into account until the architectural form has been generated; this procedure compels the structural intervention to fix single, individual problems rather than really integrating the structural model into the initial architectural form. Since the early-stage design process is so critical, a workflow should be adopted to ensure that all aspects are considered simultaneously, see e.g., [6,7]. Moreover, architects should simultaneously consider various design objectives, including structural efficiency, since 80% of the consumption of construction materials is defined at this stage.
Some researchers have investigated the relation between architectural form and structural efficiency for tall buildings [8]. On some occasions, the hyperboloid form was claimed to have a better structural efficiency in comparison to the cylindrical one. Others also considered the effect of the architectural form on the structural efficiency, with investigations resting on different, alternate geometries [9][10][11][12].
Within the frame of a parametric design paradigm, all the design features, such as the form, can be modified at any time during the design process [13]. In the so-called computational design approach, a steady capability to create complex models in terms of form is pursued; with this approach, ordinary structural modeling tasks are handled through a simulation tool, to cope with the problem complexity and speed up requirement compliance: several alternatives can thus be tried in the current model of a building at a glance [13]. It turns out that the parameters defining the architectural and structural components of the model are flexible and can be adjusted on the fly.
There is currently a lack of research activities regarding the application of ML in the field of architecture. For example, in [14] ML was adopted to generate non-conventional structural forms, which can be classified as objective or subjective features of the design product. ML indeed provides the designer with an insight into the structural efficiency of the solutions [15]. Alternatively, artificial intelligence has been exploited to add more creativity to the design process, e.g., by using a variational autoencoder in a design framework: the algorithm generates some samples, and then the autoencoder can start training the model.
In this work, we focus on the use of ML tools to learn the link between the outer shape of tall buildings, their load-bearing frame, and the overall capacity to resist earthquake excitations. Different algorithms are trained by exploiting a rather small dataset of results regarding the response of buildings of different shapes excited by a seismic-like loading, and a comparison is provided in terms of their efficiency to get trained and their capability to provide accurate surrogates of the real structures.

Proposed Methodology
The present approach consists of three stages: (i) architectural form generation; (ii) structural analysis; and (iii) supervised ML. Out of these three stages, only the last one provides novelties, since it addresses the question as to whether it is possible to obtain accurate surrogates within a ML-based process. It was indeed time consuming to generate all the architectural forms, then build the structural model for all the considered 144 forms, apply the loads, and finally carry out the structural analysis. If, instead, only a part of the 144 forms were used for modelling and training the ML tool, the results for the remaining part of the dataset could be generated automatically. The main goal of this research is thus to apply ML to the aforementioned problem or, more precisely, to find an optimal case-dependent classification algorithm.

Architectural Form Generation
The architectural form of a tall building has been interpreted here as a top and a bottom plan, and a vertical transformation method consisting of a morph, twist, or curvilinear transformation, see [16]. A set of 144 different architectural forms of tall buildings has been generated. Top and base plans could be varied among polygons with 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 or 24 sides. This process exploited Rhinoceros™ and Grasshopper™, thanks to their powerful parametric tools. In the next step, a diagrid (tubular) structure has been designed, sharing pinned joints with intermediate concrete floor slabs carrying only dead load. A seismic load has been then applied to the center of mass of each concrete floor, according to an equivalent static method, see [17] for details. Finally, a structural analysis was carried out in Karamba™, a parametric structural analysis plug-in for Grasshopper™. In Figure 1, a part of the 144 mentioned tall buildings is shown, including both the architectural and the structural models.
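The combinatorics of the dataset can be sketched in a few lines: with twelve polygon options for both the base and the top plan, the enumeration below (an illustrative reconstruction in Python, not the actual Grasshopper™ definition) yields exactly the 144 forms.

```python
from itertools import product

# Polygon side counts available for the base and top plans (from the paper).
SIDES = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 24]

# One architectural form per (base, top) pairing; with a single vertical
# transformation per pair this reproduces the 144-form dataset.
forms = [{"base": b, "top": t} for b, t in product(SIDES, SIDES)]
print(len(forms))  # 144
```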



Structural Results
After having analyzed all the considered building forms, a spreadsheet summarizing the structural behavior of the models has been filled in. Parameters characterizing the structural response, such as drift, total weight, maximum normal forces, and maximum utilization, have been investigated to compare all the models. A graph showing the top and base plan of each form, with a color representing the range of the structural parameter of interest, turns out insightful to compare the outcomes at a glance. Figure 2 shows such a graph in relation to the drift, for all the generated models. According to it, the green color qualitatively represents the tall buildings characterized by a lower drift, while the red color shows the tall buildings featuring a higher drift. It can be seen that, by increasing the number of sides of the plans, the structural efficiency is improved, and forms located along the diagonal blue lines in the figure mostly have a similar structural behavior [18]. It is however difficult to foresee the behavior of a single (variant) form without retracing all the mentioned stages of the analysis. Hence, ML could help in recognizing the patterns in such a representation of the results.

Supervised Machine Learning-A Classification Approach
While it is possible to explore the structural outcomes for all the models manually, it is profitable to do it automatically by means of a supervised ML approach. In this work, a small dataset has been considered: out of the 144 architectural forms, 75% (108 forms) have been used for training, and 25% (36 forms) have been used for testing. First, a randomization algorithm has been applied to split the dataset into the training and testing sets, without any bias. The next step has been to define a label for data classification: as already mentioned, in tall buildings an important factor is represented by the drift, i.e., the horizontal displacement of the top floor [19]; several standards define a limit for it, like e.g., 1/500 of the building height. A qualitative label has been defined for the drift, exploiting its values ranging from a minimum of 34 cm to a maximum of 158 cm within the dataset. Tall buildings whose drift was near 34 cm have been considered "very good" in their structural behavior; a drift increase would be linked to a diminished structural efficiency. Five classes have been then defined for data classification (0: very bad, 1: bad, 2: not bad not good, 3: good, 4: very good). In Figure 3, all the five classes are shown for the whole dataset, with the drift plotted against the total weight of the structure; similar representations can be obtained with all the other indices, though the trend might not show up so clearly in the graph. We anticipate that good classification results are obtained if this label is chosen. It can be also understood that, by increasing the total weight, the drift decreases, as a heavier structure is stiffer and, accordingly, undergoes a smaller displacement or drift under the selected excitation.
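The paper does not report the class bin edges; a minimal Python sketch, assuming equal-width bins over the observed 34-158 cm drift range, together with the 75/25 randomized split, could read:

```python
import random

DRIFT_MIN, DRIFT_MAX = 34.0, 158.0   # cm, dataset extremes reported in the text
N_CLASSES = 5                        # 0: very bad ... 4: very good

def drift_class(drift_cm):
    """Map a top-floor drift to a qualitative class. Equal-width bins are an
    assumption (the paper does not state the edges); lower drift is better."""
    width = (DRIFT_MAX - DRIFT_MIN) / N_CLASSES
    bin_idx = min(int((drift_cm - DRIFT_MIN) / width), N_CLASSES - 1)
    return (N_CLASSES - 1) - bin_idx   # invert: small drift -> "very good"

# Randomized, unbiased 75/25 split of the 144 forms (108 training, 36 testing).
rng = random.Random(0)
indices = list(range(144))
rng.shuffle(indices)
train_idx, test_idx = indices[:108], indices[108:]
print(drift_class(34.0), drift_class(158.0), len(train_idx), len(test_idx))  # 4 0 108 36
```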


Another possible strategy would be to categorize the forms according to the base plan geometry. In this additional case, twelve labels have been defined by considering the sides or vertices of the polygons (i.e., polygons with 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 or 24 sides). With this label, the classification algorithms have performed rather badly, with no remarkable results.

Classification Algorithms and Hyperparameter Optimization
5-fold cross validation has been applied to guard against overfitting, and eight predictors have been considered. After assigning the label, the following six classification algorithms have been adopted within the MATLAB Classification Learner toolbox: k-nearest neighbors; support vector machine; decision tree; ensemble method; discriminant analysis; and Naïve Bayes. Instead of tuning each classification algorithm parameter manually, it is better to define them within an optimization process. We have inspected three types of optimization [20]: grid search, random search, and Bayesian optimization.
Each of these optimization approaches has specific properties, see e.g., [20] for further details. The Bayesian optimization approach has been used because it can lead to better results in a shorter time and through fewer iterations; moreover, it is the only approach that efficiently exploits the iteration results according to the Bayes rule. In Figure 4, the Bayesian optimization is shown for the kNN algorithm, over 50 iterations: at iteration 35 the optimum has already been attained, with a minimum classification error of about 12.5%, and thus an accuracy of 87.5% on the training dataset. The four tuned hyperparameters of kNN are also reported in the graph.
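As a library-agnostic illustration of the tuning loop, the sketch below runs a 5-fold cross-validated search over two kNN hyperparameters on synthetic two-class data; it uses a plain exhaustive grid, whereas the paper relies on MATLAB's Bayesian optimization to explore the space adaptively. All names and data here are illustrative.

```python
import math, random
from collections import Counter

def knn_predict(train, x, k, weighted=False):
    """Classify x by its k nearest training points (Euclidean metric)."""
    nearest = sorted((math.dist(p, x), y) for p, y in train)[:k]
    votes = Counter()
    for d, y in nearest:
        votes[y] += 1.0 / (d + 1e-9) if weighted else 1.0
    return votes.most_common(1)[0][0]

def cv_accuracy(data, k, weighted, n_folds=5):
    """Mean accuracy over n_folds cross-validation folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    accs = []
    for i in range(n_folds):
        test = folds[i]
        train = [s for j, f in enumerate(folds) if j != i for s in f]
        hits = sum(knn_predict(train, x, k, weighted) == y for x, y in test)
        accs.append(hits / len(test))
    return sum(accs) / n_folds

# Synthetic stand-in for the 108 training forms: two noisy 2D clusters.
rng = random.Random(1)
data = [((rng.gauss(c, 0.8), rng.gauss(c, 0.8)), c) for c in (0, 3) for _ in range(54)]
rng.shuffle(data)

# Exhaustive search over a small hyperparameter grid (the paper instead
# lets Bayesian optimization pick the next candidate from past results).
best = max((cv_accuracy(data, k, w), k, w)
           for k in (1, 3, 5, 9) for w in (False, True))
print(f"best mean CV accuracy {best[0]:.2f} with k={best[1]}, weighted={best[2]}")
```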



Results of the ML Classification
First, it has been tested whether supervised ML classification can be used in this case study. By means of a very simple implementation of the kNN algorithm, the accuracy for training has turned out to be 91.7%, while the accuracy for testing has been 83.3%. It has thus been proved that the classification algorithm can correctly predict the structural response of tall buildings, provided the label is appropriately chosen. According to the confusion matrices for training and testing depicted in Figure 5, it can be understood that the classes do not have the same number of observations (represented by the numbers in the matrices). Via the kNN classifier, 4 observations have been misclassified in the training dataset, and 11 in the testing dataset. Another important note is that all observations related to class 1 are completely misclassified; this is due to the fact that, in the training dataset, there are no data associated to this class, and the ML model cannot be trained appropriately. Such results occurred for this specific randomization, and may vary from one randomization to another of the same set. This is claimed to be a drawback of the procedure, mainly linked to the small dataset. In what follows, a brief account of the results achieved with the six different classification algorithms is provided.
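The quantities read off Figure 5 (per-class counts, misclassifications, overall accuracy) come from a plain confusion matrix; a self-contained sketch with made-up labels:

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows: true class; columns: predicted class."""
    m = {c: {d: 0 for d in classes} for c in classes}
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def accuracy(m):
    """Fraction of observations on the matrix diagonal."""
    correct = sum(row[c] for c, row in m.items())
    total = sum(v for row in m.values() for v in row.values())
    return correct / total

# Hypothetical labels for six forms; class 1, absent from training, is missed.
y_true = [4, 4, 3, 2, 0, 1]
y_pred = [4, 3, 3, 2, 0, 0]
m = confusion_matrix(y_true, y_pred, classes=range(5))
print(f"accuracy = {accuracy(m):.2f}")  # accuracy = 0.67
```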


k-Nearest Neighbors
kNN [21] results depend on (i) the number of neighbors allowed for in the state space, (ii) the metric to measure the distance between neighbors, and (iii) a weight for the measured distances. In this research, an optimization method was adopted to reach the maximum accuracy by changing the hyperparameters, by enabling or disabling a principal component analysis (PCA) of the data [22], and by using random search and grid search instead of Bayesian optimization. A range 1-54 was defined for k, and a variety of distance metrics has been adopted. The accuracy ranged from 80% to 91.7% for training, and from 94.4% to 97.2% for testing; the computing time ranged from 16.3 s to 64.6 s.

Support Vector Machine
In comparison to kNN, the support vector machine (SVM) consumes a considerable amount of computing time, as it originally works with binary classes; multiple classes are treated as several combinations of binary ones [23]. Four kernel functions have been adopted, namely the Gaussian, linear, quadratic, and cubic ones, which are related to the kind of support vector classifiers. There is also a kernel scale feature, and the multiclass method can be either one-vs-one or one-vs-all; the one-vs-one method has turned out to provide more accurate results, though it can be very time consuming. The accuracy has finally ranged from 94.4% to 97.2% for training, and from 94.4% to 97.2% for testing; the computing time has varied from 134 s to 250 s.

Decision Tree
The decision tree classifier depends on the number of splits and on a criterion for them [24]. The number of splits has been varied from 1 to 107; the criterion for the split has been selected among Gini's diversity index, the twoing rule, and the maximum deviance reduction. The computing time has varied from 17.4 s to 45 s, the accuracy from 86.1% to 93.5% for training, and from 77.8% to 100% for testing. The accuracy for testing of four models out of the five considered has attained 100%. It has thus turned out to be the best classification algorithm.
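Gini's diversity index, the first split criterion mentioned above, is straightforward to compute; the hypothetical three-class example below shows the impurity decrease a tree would assign to a candidate split:

```python
from collections import Counter

def gini_index(labels):
    """Gini's diversity index: 1 - sum_i p_i^2; zero for a pure node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_gain(parent, left, right):
    """Impurity decrease of a candidate split (the quantity a tree maximizes)."""
    n = len(parent)
    weighted = (len(left) / n) * gini_index(left) + (len(right) / n) * gini_index(right)
    return gini_index(parent) - weighted

parent = [0, 0, 1, 1, 2, 2]
print(split_gain(parent, [0, 0], [1, 1, 2, 2]))  # ~0.33: the split isolates class 0
```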

Ensemble Classifier
The ensemble classifier exploits several learning algorithms to reach a final prediction [25]. One of the most famous ensemble classifiers is the bootstrap aggregating (Bagging) one. In this work, the ensemble method has been selected among Bag, AdaBoost, and RUSBoost, and the maximum number of splits has been varied from 1 to 107. The number of learners has been changed from 10 to 500, the learning rate has ranged from 0.001 to 1, and the number of predictor samples from 1 to 8. The computing time varied from 72 s to 129.4 s, with an accuracy from 85.2% to 98.1% for training, and from 94.4% to 100% for testing. After the decision tree, the ensemble classifier turns out to be the best classification algorithm.
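The Bagging idea can be sketched with a toy base learner (a one-dimensional threshold stump, an illustrative stand-in for the full trees used in practice): each learner is trained on a bootstrap resample of the data, and predictions are aggregated by majority vote.

```python
import random
from collections import Counter

def majority_vote(labels):
    return Counter(labels).most_common(1)[0][0]

def bootstrap_sample(data, rng):
    """Resample the training set with replacement, keeping its size."""
    return [rng.choice(data) for _ in data]

def train_stump(sample):
    """Toy base learner: threshold at the feature mean, majority label per side."""
    thr = sum(x for x, _ in sample) / len(sample)
    left = [y for x, y in sample if x <= thr] or [0]
    right = [y for x, y in sample if x > thr] or [1]
    yl, yr = majority_vote(left), majority_vote(right)
    return lambda x, t=thr, a=yl, b=yr: a if x <= t else b

def bag_predict(learners, x):
    """Bagging: aggregate the base learners' votes on x."""
    return majority_vote([predict(x) for predict in learners])

rng = random.Random(0)
data = [(x, 0) for x in (1.0, 1.2, 1.5)] + [(x, 1) for x in (4.0, 4.2, 4.5)]
learners = [train_stump(bootstrap_sample(data, rng)) for _ in range(25)]
print(bag_predict(learners, 1.1), bag_predict(learners, 4.4))  # 0 1
```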

Naïve Bayes
The Naïve Bayes classifier works within a probabilistic frame [26], by applying the Bayes theorem. This algorithm features only two hyperparameters: the distribution type and the kernel type. Specifically, the kernel can be one out of the Gaussian, Box, Epanechnikov, and Triangular ones. The computing time varied from 13.7 s to 109.7 s, with an accuracy from 82.4% to 92.6% for training, and from 83.3% to 91.7% for testing.
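A Gaussian Naïve Bayes classifier (the Gaussian option above) can be written from scratch in a few lines: fit per-class feature means and variances, then pick the class maximizing the log-posterior. The toy data below are illustrative.

```python
import math
from collections import defaultdict

def fit_gnb(data):
    """Per-class feature means/variances plus log class priors."""
    by_class = defaultdict(list)
    for x, y in data:
        by_class[y].append(x)
    model = {}
    for y, xs in by_class.items():
        n = len(xs)
        means = [sum(col) / n for col in zip(*xs)]
        varis = [sum((v - m) ** 2 for v in col) / n + 1e-9
                 for col, m in zip(zip(*xs), means)]
        model[y] = (math.log(n / len(data)), means, varis)
    return model

def predict_gnb(model, x):
    """argmax over y of log p(y) + sum_i log N(x_i | mean_yi, var_yi)."""
    def log_post(y):
        prior, means, varis = model[y]
        return prior + sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                           for xi, m, v in zip(x, means, varis))
    return max(model, key=log_post)

data = [((1.0, 1.1), 0), ((1.2, 0.9), 0), ((4.0, 4.2), 1), ((4.1, 3.9), 1)]
model = fit_gnb(data)
print(predict_gnb(model, (1.1, 1.0)), predict_gnb(model, (4.0, 4.0)))  # 0 1
```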

Discriminant Analysis
A discriminant classifier assumes that different classes produce data according to different Gaussian distributions [27]. The only model hyperparameter to select is the discriminant type, which can be linear, quadratic, diagonal linear, or diagonal quadratic. In this case, the computing time varied from 45.3 s to 48.7 s, and the accuracy from 85.2% to 94.4% and from 83.3% to 91.7%, for training and testing, respectively.
For the sake of brevity, the results are not directly compared here. A detailed analysis in terms of accuracy and computational costs, going beyond the brief account provided above, is given in the conference presentation. Readers are therefore directed to it for a thorough discussion of the adopted ML tools.

Conclusions
In this work, the relation between architectural form and structural efficiency of tall buildings has been studied via a data-driven approach. Several architectural and structural model generation methods could be used to get insights into which architectural detail or modification may increase the structural efficiency, moving in the direction of morphing or smart structures. A novel view has been provided by adopting machine learning tools to learn the links between shape and structural response under seismic excitations, while also reducing the computing time: a sample dataset has been used to predict the performance of new architectural forms of tall buildings.
It has been proven that supervised machine learning can be successfully applied to this case study. Moreover, among the six investigated classification algorithms, even though each of them provides advantages and disadvantages, the ensemble and the decision tree classifiers have attained the best results.

Figure 1. Sketch of 36 out of 144 generated architectural forms, with the diagrid structural model visible on the building skin.

Figure 2. Colored diagram of the drift response computed for all the analyzed models.




Figure 3. Representation of the structural response in terms of drift against the total mass, and of the five classes defined on the basis of the drift as a label.

Figure 4. Example of Bayesian optimization of the ML hyperparameters, and relevant results.


Figure 5. Confusion matrix relevant to the kNN classification algorithm: (a) training dataset, (b) testing dataset.
