Novel ANOVA-Statistic-Reduced Deep Fully Connected Neural Network for the Damage Grade Prediction of Post-Earthquake Buildings

Earthquakes are cataclysmic events that can damage structures and endanger human life. Estimating seismic damage to buildings remains challenging because of several environmental uncertainties, and categorizing the damage grade of a building takes a significant amount of time and work. Early analysis of the damage rate of concrete building structures is essential for prioritizing repairs and avoiding accidents. With this motivation, an ANOVA-Statistic-Reduced Deep Fully Connected Neural Network (ASR-DFCNN) model is proposed that grades damage accurately by considering significant damage features. A dataset containing 26 attributes from 762,106 damaged buildings was used to build the model. This work focused on analyzing the importance of feature selection and on enhancing the accuracy of damage grade categorization. First, the dataset without feature selection was used for damage grade categorization with various machine learning (ML) classifiers, and the performance was recorded. Second, ANOVA was applied to the original dataset to eliminate the attributes that were insignificant for determining the damage grade. The selected features were subjected to 10-component principal component analysis (PCA) to extract the ten top-ranked significant features that contributed to grading the building damage. The 10-component ANOVA PCA-reduced (ASR) dataset was applied to the classifiers for damage grade prediction. The results showed that the Bagging classifier with the reduced dataset produced the highest accuracy, 83%, among all the classifiers with an 80:20 split of data for training and testing. To enhance prediction performance, a deep fully connected neural network (DFCNN) was implemented with the reduced (ASR) dataset. The proposed ASR-DFCNN model was designed as a Keras Sequential model with four dense layers; the first three dense layers use the ReLU activation function, and the final dense layer uses a tanh activation function with a dropout of 0.2. The model was compiled with the NADAM optimizer with L2-regularization weight decay. Its damage grade categorization performance was compared with that of other ML classifiers using precision, recall, F-score, and accuracy. The results show that the ASR-DFCNN model performed better, with 98% accuracy.


Introduction
Earthquakes are calamitous events that can damage structures and endanger human life. The severity of earthquake-induced structural damage depends on factors such as magnitude, distance from the epicenter, ground conditions, and the seismic performance of the structure. The rapid assessment of the distribution pattern and severity of damage to buildings is critical for post-event emergency response and recovery [1]. According to Samuel Roeslin et al., data from the seismic observation division show that small earthquakes often occur in Northern Thailand. On 5 May 2014, a magnitude 6.3 earthquake struck Chiang Rai Province [2]. It was a major recorded earthquake in Thailand and damaged structures over a vast region. The time of a potential earthquake is unpredictable. Consequently, seismic hazard assessment is vital for preparing proper earthquake relief activities. A performance-based earthquake engineering assessment technique was developed to provide an essential basis for seismic planning. According to Rao, M.V.V. et al., a detailed nonlinear displacement-based method is a complex computation for improving a structure's seismic performance. Machine learning strategies are computationally demanding when evaluating large geographical regions [3]. Tianyu Ci et al. stated that, after an earthquake, it is crucial to assess the level of damage sustained by buildings so that people can avoid unsafe structures. A post-earthquake building must be restored on the basis of a proper damage analysis, and appropriate technology must be used to categorize the damage to the building facilities. Building damage is typically observed visually and recorded. Manual assessment and categorization are labor-intensive and time-consuming and may continue for months after the disaster.
Machine learning (ML) and deep learning (DL) techniques could be applied as a tool for the faster assessment of damage and early restoration, thus preventing loss of life and property [4].
The main objective of this research was to investigate how much the performance of a deep, fully connected neural network could be improved by tuning the input features and the parameters of the model compilation. The remainder of this paper is organized as follows. Section 2 outlines past studies related to damage grade classification using ML and DL techniques. Section 3 describes the research workflow, the steps to frame the proposed ASR-DFCNN model, and the module workflow with the architecture. Section 4 explains the implementation steps of data preprocessing and model training. Section 5 discusses the experimental results at various stages of development. Section 6 concludes by summarizing the findings and the future scope for enhancements.

Literature Review
In recent years, ML and DL algorithms have been widely used in various domains. Specifically, in applications in which rapid and accurate analysis is required for large data, ML and DL methods are necessary. ML and DL algorithms significantly improve automation and accuracy enhancement in structural analysis. Based on the severity of the earthquake, there are five grades of damage to building structures. The damage level of the buildings can be assessed by the data gathered using regression methods [5]. Xiao-Wei Zheng et al. stated that ML algorithms are instrumental when the damage-sensitive features obtained from the seismic response are influenced by operation, environmental variability, and damage. ML techniques generalize the normal structural behavior to detect deviations from the baseline period associated with destruction. To assess the risk of damage, high-rise building seismic data can be studied [6,7]. Although the researchers have worked on incorporating various types of ML and DL to assess the damage risk of earthquake-affected buildings, there is a vast need for a new methodology that effectively assesses the damage grade of a facility by considering all the parameters of the building damage details, according to the analysis of Kim, B et al. [8]. According to Mangalathu S. et al., after an earthquake, it is crucial to assess the level of endangerment sustained by the buildings so that people can avoid being in unsafe structures after the calamity. The potential advantage of using ML methods, such as discriminant analysis, k-nearest neighbors, decision trees, and random forests, is that they support the faster prediction of earthquake-induced damage to structures. Using a subset of Napa earthquake data, a predictive model was developed and assessed using various ML methods, among which random forest achieved 66% accuracy in predicting the building damage [9]. 
Methods for assessing a structure's seismic vulnerability require prior knowledge of the capabilities and applications of ANN-based methods [22]. Rashid, M. et al. determined the damage to internal structures for each intensity level and integrated it over the structure, together with the cost of the necessary repairs, to obtain the structure repair cost ratio (RCR). The seismic vulnerability curves can be used to estimate the economic loss (direct repairability cost) of SMRF structures and the structural RCR correlated with the seismic intensity [23].
Rao, M.V.V. et al. noted that analyzing the state of a structure and determining its potential risk are two critical goals of structural strength-monitoring frameworks. Investigating, identifying, and characterizing risk in complicated structures is crucial to further strength monitoring. The basis functions are explored as scaled variations of a fundamental Gaussian function and a dictionary of time-frequency shifts. Characterization is then completed by matching the extracted damage components in the time-frequency domain. Signals collected from sensors are decomposed into linear combinations of Gaussian functions using the matching pursuit decomposition algorithm. The baseline correction and high-pass filtering procedures are suitably combined to address challenges in numerical integration. Compared with earlier numerical integrators, the accuracy of the combination was improved. In contrast to fuzzy set methods, rough set analysis uses internal knowledge and does not assume any prior models [24].
The structural building health damage monitoring system (SBHDMS) is a powerful technology for predicting the health of civil building structures. Buildings in SBHDMS have undergone unusual alterations in terms of damage levels. Earthquakes, floods, and cyclones are natural disasters that have an extraordinary impact on structures. The sensors record the vibration data and are used to alter the construction of the building in the event of a natural disaster. The peculiar variations were examined according to the vibration data. The RAS approach forecasts the vibration data levels recorded by the sensors in the damaged structure [25].
Gemma Cremen et al. claimed that the early detection of earthquakes reduces the risk factor and environmental hazards. Earthquake early warning (EEW) systems make real-time information about active earthquakes available, allowing people in far-off communities, governments, companies, and other settings to take prompt action to reduce the likelihood of harm or loss before the earthquake-induced ground shaking reaches them. The limitation of the existing methods is the need for engineering-related metrics to make early detection decisions [26]. Natt, L. et al. showed that the outcomes of significant damage analysis parameters provide engineers, architects, and construction and disaster recovery management personnel with a practical understanding of the degree of damage experienced by buildings [27]. According to Chaparala A. et al., performing feature reduction or choosing features from the dataset is crucial before classifying data to ensure accuracy. Examining the dataset using the rough set theory helped to streamline the classification process by reducing the complexity of the feature selection procedure. Rough set theory analyzes the essential parameters without added data [28]. In order to forecast the flexural strength of ultra-high-strength concrete, Wang et al. [29] examined different supervised ML techniques such as decision tree bagging, decision tree gradient boosting, decision tree AdaBoost, and decision tree XG boost. The findings proved that DT bagging is the method that most closely approximates experimental outcomes. In order to show how seismically vulnerable existing structures are, Ruggieri et al. [30] published a vulnerability study utilizing machine learning (VULMA). The procedure comprised four modules that each offer specific and specialized services. Street VULMA begins by gathering raw data. Data VULMA offers a way to classify and store data. 
After using the gathered data to train multiple machine learning models for image classification, Bi VULMA rates and examines the images and calculates the vulnerability index. In a separate study, the five most typical defects in RC bridges were classified and detected using convolutional neural network (CNN)-based Xception and Vanilla models for image classification. The two models were developed, tested, and compared using concrete fault data. The results demonstrated the possible use of the Xception and Vanilla models to classify defects in concrete bridges and the superiority of the Xception model in highly accurate defect classification [31].
The literature outlined that varied methodologies were used to study the damage grade caused by earthquakes. Though ML and DL technologies can be utilized to assess the damage risk of earthquake-affected buildings, several challenges exist during implementation. The research challenges lie in considering the recorded dataset's significant features, encoding the feature values, outlier analysis, sampling the target feature, and freezing the number of hidden layers with the appropriate activation function and optimizers. The present work focuses on building an efficient neural-network-based model with a competent dataset. For this purpose, varied techniques such as the ANOVA test and 10-component PCA methods are utilized.

Research Methodology
The proposed work aims to design an effective machine-learning-based model for categorizing post-earthquake building damage. The workflow of the proposed method is divided into four stages, as depicted in Figure 1. Stage one involves collecting earthquake-affected building damage data from open-source repositories. Stage two involves data analysis, preprocessing, and constructing the ASR-DFCNN framework for implementation. The exploratory data analysis visualizes the statistics and distribution of damage grade with respect to the plinth area, foundation type, other floor types, ground floor type, land surface type, position, and roof type. During data preprocessing, missing data are imputed, and feature scaling, categorical encoding, and normalization are conducted. Damage grade classification with existing classifiers was applied to the imputed data using training/testing splits of 60:40, 70:30, and 80:20. Then, the proposed ASR-DFCNN model was built by analyzing the cross-validation and ANOVA results. In stage three, the dataset was fitted to the proposed ASR-DFCNN model, which was validated for efficiency in predicting the building damage grade by comparison with existing classifiers. In stage four, the proposed model was evaluated using performance metrics such as precision, accuracy, recall, F-score, and R-squared error.
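The evaluation metrics named in stage four can be illustrated with a minimal sketch. The function name `grade_metrics` and the use of macro averaging over the damage grades are assumptions made here for illustration; the paper does not state which averaging scheme was used.

```python
from collections import Counter


def grade_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F-score.

    Macro averaging is an assumption of this sketch, not a detail
    taken from the paper.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted grade was wrong
            fn[truth] += 1  # true grade was missed

    def ratio(num, den):
        return num / den if den else 0.0

    precision = [ratio(tp[c], tp[c] + fp[c]) for c in labels]
    recall = [ratio(tp[c], tp[c] + fn[c]) for c in labels]
    f_score = [ratio(2 * p * r, p + r) for p, r in zip(precision, recall)]
    n = len(labels)
    return {
        "accuracy": ratio(sum(tp.values()), len(y_true)),
        "precision": sum(precision) / n,
        "recall": sum(recall) / n,
        "f_score": sum(f_score) / n,
    }


if __name__ == "__main__":
    # Toy labels, purely illustrative.
    print(grade_metrics([1, 1, 2, 2, 3], [1, 2, 2, 2, 3]))
```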

Development of ASR-DFCNN Framework
The development steps of the proposed ASR-DFCNN framework model are depicted in Figure 2. The earthquake dataset collected from the Kaggle repository was preprocessed for missing values, and categorical encoding was conducted to generate the normalized dataset. The processed dataset was segregated into 25 independent variables and 'damage_grade' as the dependent variable. The dataset was then split for training and testing in the 80:20, 70:30, and 60:40 ratios, and then fitted with various existing ML classifiers. The performance of the classifiers was analyzed using precision, recall, accuracy, F-score, and run time. The results showed that classification with the classifiers exhibited better performance with a data split ratio of 80:20. The ANOVA test was applied to the preprocessed dataset to find the significance of the input variables in deciding the damage grade. The insignificant attributes were identified and removed from the dataset. Further, the most significant features that decide the damage grade were extracted from the dataset using a 10-component PCA model. After the ANOVA and 10-component PCA methods, the reduced dataset was used to create a deep, fully connected neural network for predicting building damage grade.

The module workflow of the ANOVA-statistic-reduced deep fully connected neural network is shown in Figure 3. The dataset with all 26 features was applied to all the classifiers before and after component scaling of the data to grade the damage. The performance was analyzed by splitting the data into training and testing sets in 80:20, 70:30, and 60:40 ratios. The ANOVA test was applied to the dataset to identify the features that did not contribute to the damage grade prediction. Based on the results, insignificant variables were removed. Using a 10-component PCA method, the dataset was further reduced to 10 necessary variables to estimate the damage grade. The resultant dataset was applied to all the classifiers to grade the damage. The performance was analyzed in the presence and absence of component scaling at every step of dataset reduction, and the best-fit dataset was used to build the proposed model.
The ASR-DFCNN module initiated an ANOVA test that eliminated the insignificant variables that produced values below the threshold. Features selected through the 10-component PCA-reduced dataset were applied with all the classifiers to analyze the performance. Based on this analysis, the DFCNN model and the 80:20 data split were selected to construct the proposed ASR-DFCNN model.
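The three training/testing ratios compared above can be sketched as a simple shuffled split. The helper `split_indices`, the row count of 1000, and the random seed are hypothetical; the paper does not describe how the rows were shuffled or whether the split was stratified.

```python
import numpy as np


def split_indices(n_rows, test_fraction, seed=0):
    """Return (train_idx, test_idx) for a shuffled, non-stratified split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_fraction))
    return order[n_test:], order[:n_test]


if __name__ == "__main__":
    # Compare the three ratios on a hypothetical 1000-row dataset.
    for train_pct, test_pct in [(80, 20), (70, 30), (60, 40)]:
        train_idx, test_idx = split_indices(1000, test_pct / 100)
        print(f"{train_pct}:{test_pct} -> {len(train_idx)} train, {len(test_idx)} test")
```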

Architecture of ASR-DFCNN Framework
The layered framework of the fully connected ASR-DFCNN and model compilation flow is illustrated in Figure 4. The proposed ASR-DFCNN model was designed with a sequential keras model consisting of four dense layers, where the first three dense layers were fitted with the ReLU activation function, and the final dense layer was fitted with the tanh activation function with a dropout of 0.2. The ASR-DFCNN model was compiled with the NADAM optimizer with a weight decay of L2 regularization. The ASR-DFCNN model was trained with a learning rate of L2 regularized gradient descent function.

The ASR-DFCNN module network architecture, with the input and dense layer framework, is shown in Figure 5, where the green circles represent each layer's components. The model was fed with 10 input components at the input layer; these were passed through four dense layers with 14 nodes and a single output layer that categorized five classes of damage grade. The performance of the proposed model was analyzed, and its accuracy was compared with that of other classifier models.
After modeling the proposed framework, data were preprocessed as follows. The earthquake damage dataset, with 25 independent features and one dependent feature, "damage grade", is represented in Equation (1).
Here, E represents the dataset, D the target variable, and [e1, e2, e3, ..., e25] the 25 independent variables. The feature encoding of categorical attributes in the dataset is shown in Equation (2), where ei is the encoded variable and U denotes the integer values for each category.
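The integer encoding of categorical attributes can be sketched as below. The helper `encode_categories`, the choice of assigning codes in sorted label order, and the example labels are assumptions for illustration; the paper only states that each category is mapped to an integer value.

```python
def encode_categories(values):
    """Map each distinct category to an integer code.

    Codes follow the sorted order of the labels, which is an assumption
    of this sketch rather than a detail from the paper.
    """
    mapping = {label: code for code, label in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


if __name__ == "__main__":
    # Hypothetical foundation-type column.
    codes, mapping = encode_categories(["mud", "brick", "mud", "rc", "bamboo"])
    print(mapping)
    print(codes)
```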
After data encoding, the missing values in the dataset were filled with the mean values of the feature column using Equation (3), where eR is the estimated mean of each feature column in the dataset and d is the total number of features, here 25. The resultant imputed data were subjected to feature scaling to maintain uniformity in values using Equation (4), where E is the processed dataset over attributes e = 1, 2, ..., 25, the independent features of the original dataset. The variance between the original and the imputed feature is estimated using Equation (5), where EW is the variance computed on the completed data. The imputed dataset was generated by applying the feature variance calculated from Equation (5) to each column of the dataset, as shown in Equation (6).
The final preprocessed data, 'FinalData', with no missing values and with the total variance, were generated using Equation (7).
The min-max normalization method was applied to normalize the encoded, imputed final data using the formula shown in Equation (8), where ei is the normalized feature value, which lies in the range from 0 to 1.
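The mean imputation and min-max normalization steps can be sketched with NumPy as follows. This is a simplified illustration: the function names are hypothetical, missing values are assumed to be stored as NaN, and the variance-based adjustment of Equations (5)-(7) is not reproduced because its exact form is not recoverable from the text.

```python
import numpy as np


def impute_mean(X):
    """Fill NaN entries with the mean of their feature column."""
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X


def min_max_normalize(X):
    """Rescale every column to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # constant columns map to 0
    return (X - lo) / span


if __name__ == "__main__":
    # Toy data with two missing entries.
    raw = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan]])
    print(min_max_normalize(impute_mean(raw)))
```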
The preprocessed dataset was fitted to all classifier models before and after component scaling. The performance was assessed by splitting the data into training and testing sets in 80:20, 70:30, and 60:40 ratios. Let eR denote the number of rows in the dataset, d the number of features, and ij the row and column indices of an instance. The principal component analysis was conducted by finding the covariance of two feature variables, COV(eR_ij, eR_pq), in the dataset, as shown in Equation (9).
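A covariance-based PCA reduction to 10 components can be sketched as below. The function `pca_reduce` is a hypothetical helper using the standard eigendecomposition of the feature covariance matrix; the paper does not specify which PCA implementation was used.

```python
import numpy as np


def pca_reduce(X, n_components=10):
    """Project X onto the top principal components of its covariance matrix.

    Returns the projected data and the variance explained by each
    retained component, in descending order.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(centered, rowvar=False)        # feature-by-feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1][:n_components]
    return centered @ eigvecs[:, order], eigvals[order]


if __name__ == "__main__":
    # Random stand-in for a 25-feature dataset.
    rng = np.random.default_rng(0)
    reduced, variances = pca_reduce(rng.normal(size=(200, 25)))
    print(reduced.shape)
```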
The ANOVA test was conducted to find the essential features in the dataset for determining the building damage grade using Equations (10)-(12).
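The feature screening can be illustrated with the textbook one-way ANOVA F-statistic, computed per feature across the damage grade classes. This is a sketch under the assumption that Equations (10)-(12) correspond to the standard between-group and within-group sums of squares; the function name is hypothetical, and the elimination threshold used in the paper is not stated, so none is applied here.

```python
import numpy as np


def anova_f_scores(X, y):
    """One-way ANOVA F-statistic of every feature column against class labels."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n_samples, k = X.shape[0], len(classes)
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for c in classes:
        group = X[y == c]
        group_mean = group.mean(axis=0)
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        ss_within += ((group - group_mean) ** 2).sum(axis=0)
    # Mean square between groups divided by mean square within groups.
    return (ss_between / (k - 1)) / (ss_within / (n_samples - k))


if __name__ == "__main__":
    # Feature 0 separates the two classes; feature 1 does not.
    X = np.array([[1.0, 1.0], [2.0, 2.0], [5.0, 1.0], [6.0, 2.0]])
    print(anova_f_scores(X, np.array([0, 0, 1, 1])))
```

Features with a low F-score carry little information about the damage grade and would be candidates for removal.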
The proposed ASR-DFCNN, containing one input layer 'InLayer' with 10 components, four dense layers 'DenseLayer' with 14 nodes, and an output layer 'OutputLayer' with five classes, is represented in Equations (13)-(16). The ASR-DFCNN was implemented using a Keras Sequential model with four dense layers, in which the first three dense layers were fitted with the ReLU activation function, and the final dense layer was fitted with the tanh activation function with a dropout of 0.2. The model was compiled with the NADAM optimizer with L2-regularization weight decay and trained using the L2-regularized gradient descent function. The weight initialization W(t) of the first three dense layers, with t as the weight, is shown in Equation (17).
The final dense layer is fitted with the tanh activation function and the weight initialization W(t) of the final dense layer is given by Equation (18).
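The layer structure described above can be sketched as a NumPy forward pass, used here as a dependency-light stand-in for the Keras model. Several details are assumptions of this sketch and are not stated in the paper: each dense layer is taken to have 14 nodes, the output layer uses softmax, the initializers are He-normal for ReLU layers and Glorot-normal otherwise, and dropout is applied after the tanh layer during training only.

```python
import numpy as np


def init_layer(n_in, n_out, activation, rng):
    """Assumed initializers: He-normal for ReLU, Glorot-normal otherwise."""
    if activation == "relu":
        scale = np.sqrt(2.0 / n_in)
    else:
        scale = np.sqrt(2.0 / (n_in + n_out))
    return rng.normal(0.0, scale, size=(n_in, n_out)), np.zeros(n_out)


def build_network(seed=0):
    """10 inputs -> four dense layers of 14 nodes -> 5 output classes."""
    rng = np.random.default_rng(seed)
    sizes = [10, 14, 14, 14, 14, 5]
    activations = ["relu", "relu", "relu", "tanh", "softmax"]
    layers = []
    for n_in, n_out, act in zip(sizes[:-1], sizes[1:], activations):
        W, b = init_layer(n_in, n_out, act, rng)
        layers.append((W, b, act))
    return layers


def forward(x, layers, rng=None, dropout=0.2):
    """Forward pass; pass an rng to enable dropout (training mode)."""
    h = x
    for W, b, activation in layers:
        z = h @ W + b
        if activation == "relu":
            h = np.maximum(z, 0.0)
        elif activation == "tanh":
            h = np.tanh(z)
            if rng is not None:
                mask = rng.random(h.shape) >= dropout
                h = h * mask / (1.0 - dropout)  # inverted dropout
        else:  # softmax output over the five damage grades
            z = z - z.max(axis=1, keepdims=True)
            e = np.exp(z)
            h = e / e.sum(axis=1, keepdims=True)
    return h


if __name__ == "__main__":
    probs = forward(np.ones((3, 10)), build_network())
    print(probs.shape, probs.sum(axis=1))
```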
After designing the dense layers, the ASR-DFCNN model was compiled with weight decay L2RGD_i using the L2-regularized gradient descent function RG, as shown in Equations (19)-(21).
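An L2-regularized gradient descent step can be sketched as follows, assuming the common formulation in which a penalty of (weight_decay / 2) times the squared weight norm is added to the loss. The function name and the default hyperparameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def l2_regularized_step(weights, grad, learning_rate=0.01, weight_decay=1e-4):
    """One gradient-descent step on loss + (weight_decay / 2) * ||w||^2.

    The penalty contributes weight_decay * w to the gradient, which
    shrinks every weight toward zero at each step.
    """
    weights = np.asarray(weights, dtype=float)
    grad = np.asarray(grad, dtype=float)
    return weights - learning_rate * (grad + weight_decay * weights)


if __name__ == "__main__":
    # With a zero data gradient, only the decay term acts.
    print(l2_regularized_step([1.0, -2.0], [0.0, 0.0], learning_rate=0.1, weight_decay=0.5))
```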
The ASR-DFCNN model was optimized using the NADAM optimizer, as shown in Equations (22)-(25), where GD_t is the gradient descent function at step t, MGT_t is the sum of the previous gradient vectors, µ is the decay constant, BCN_t is the initialization bias correction term, MV is the momentum vector, LeR_i is the learning rate, η is the learning rate decay, and λ is the loss error. The model performance was measured using accuracy, which denotes the ratio of the number of correct predictions to the total number of predictions, as given in Equation (26).
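The optimizer can be illustrated with a commonly cited simplified form of the NADAM update (Adam with a Nesterov-style look-ahead on the first moment). This sketch is not the paper's Equations (22)-(25), whose exact form is not recoverable from the text; the function name, the hyperparameter defaults, and the toy quadratic objective are all assumptions.

```python
import numpy as np


def nadam_minimize(grad_fn, theta, steps=1000, lr=0.05,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a function from its gradient with a simplified NADAM update."""
    theta = np.asarray(theta, dtype=float).copy()
    m = np.zeros_like(theta)  # running mean of gradients (momentum)
    v = np.zeros_like(theta)  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
        # Nesterov look-ahead: blend corrected momentum with current gradient.
        nesterov = beta1 * m_hat + (1 - beta1) * g / (1 - beta1 ** t)
        theta -= lr * nesterov / (np.sqrt(v_hat) + eps)
    return theta


if __name__ == "__main__":
    # Toy objective ||theta - 3||^2 with gradient 2 * (theta - 3).
    print(nadam_minimize(lambda th: 2.0 * (th - 3.0), [0.0, 10.0]))
```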

Implementation
The building earthquake damage dataset used for implementation was taken from the Kaggle repository. Twenty-six attributes collected from 762,106 buildings were used for building damage grade prediction. The attributes describe the structure of the building, the nature of the material used, and the land and surface conditions. Among the 26 listed features, damage_grade was the target variable, and the remaining 25 features were considered independent variables. The details of the data are tabulated in Table 1. The target variable had five damage grade labels, namely, 'Minor', 'Moderate', 'Heavy', 'Extremely severe', and 'Disaster'. The dataset was subjected to data preprocessing, including filling in the missing values, categorical encoding, feature scaling, and normalization. The model was implemented in Python with the pandas and Keras libraries.

Prescriptive Data Analysis of Building Damage
The exploratory data analysis of the target variable 'damage_grade' is shown in Figures 6 and 7. From the distribution, it was observed that each grade occurs more frequently. Grade 5 appears more extensively in the samples, and Grade 1 occurs the least often. Evidently, this classification problem with various grades accounts for drastically        The statistical distribution of damage grade to foundation type is shown in Figures 9  and 10. The foundation_type is a categorical variable with six materials used in building foundations. It was observed that ʹGrade 5ʹ damages were less likely across all foundations. ʹGrade 1ʹ damage occurred in almost all foundation types. The buildings made of mud and brick showed disastrous effects, whereas bamboo and cement types exhibited an inverse relation to damage grade. The RC and other floor types showed similar patterns of damage, which might be due to a smaller sample size. Figure 10 shows the distribution of damage grades across the foundation types in percentage.   The statistical distribution of damage grade to foundation type is shown in Figures 9  and 10. The foundation_type is a categorical variable with six materials used in building foundations. It was observed that ʹGrade 5ʹ damages were less likely across all foundations. ʹGrade 1ʹ damage occurred in almost all foundation types. The buildings made of mud and brick showed disastrous effects, whereas bamboo and cement types exhibited an inverse relation to damage grade. The RC and other floor types showed similar patterns of damage, which might be due to a smaller sample size. Figure 10 shows the distribution of damage grades across the foundation types in percentage.  The statistical distribution of damage grade to floor types other than the ground floor is depicted in Figures 11 and 12. The data illustrated that RCC, brick, and plank were widely used as flooring materials. 
Also, the graph demonstrated that the catastrophic damage was minor in RCC floors, whereas it was high in timber-based materials. Almostdamaged Grades 2, 3, and 4 were likely to appear on all types of floors. The floor type feature revealed a strong association with damage grades and thus was considered for classification. The statistical distribution of damage grade to floor types other than the ground floor is depicted in Figures 11 and 12. The data illustrated that RCC, brick, and plank were widely used as flooring materials. Also, the graph demonstrated that the catastrophic damage was minor in RCC floors, whereas it was high in timber-based materials. Almostdamaged Grades 2, 3, and 4 were likely to appear on all types of floors. The floor type feature revealed a strong association with damage grades and thus was considered for classification.   The statistical distribution of damage grade to floor types other than the ground floor is depicted in Figures 11 and 12. The data illustrated that RCC, brick, and plank were widely used as flooring materials. Also, the graph demonstrated that the catastrophic damage was minor in RCC floors, whereas it was high in timber-based materials. Almostdamaged Grades 2, 3, and 4 were likely to appear on all types of floors. The floor type feature revealed a strong association with damage grades and thus was considered for classification.   Devasting damages were prominent among buildings with timber and brick, whereas stone floors were prone to Grade 1 damages. Depending on the materials used to make the ground, the relationship between ground floor type and damage grade varies significantly. The observations imply that ground_floor_type has significant forecasting potential.      The position attribute denotes the attachment of other buildings at their sides. The building may be attached to between one and three sides of its location. Most buildings in the dataset were represented by the ʺNot attachedʺ position. 
Figures 17 and 18 show that the buildings with an attachment on one side and with no attachment show a similar relationship between position and damage grade. The damages are severe in the one-side and no-attachment types of buildings. The patterns of the buildings with two-and threesided attachments were less vulnerable to devasting damages.  The position attribute denotes the attachment of other buildings at their sides. The building may be attached to between one and three sides of its location. Most buildings in the dataset were represented by the ʺNot attachedʺ position. Figures 17 and 18 show that the buildings with an attachment on one side and with no attachment show a similar relationship between position and damage grade. The damages are severe in the one-side and no-attachment types of buildings. The patterns of the buildings with two-and threesided attachments were less vulnerable to devasting damages. The position attribute denotes the attachment of other buildings at their sides. The building may be attached to between one and three sides of its location. Most buildings in the dataset were represented by the ʺNot attachedʺ position. Figures 17 and 18 show that the buildings with an attachment on one side and with no attachment show a similar relationship between position and damage grade. The damages are severe in the one-side and no-attachment types of buildings. The patterns of the buildings with two-and threesided attachments were less vulnerable to devasting damages.  The data distribution among roof type and damage grade summarizes that building roofs made of timber are prone to damage of all grades in increasing order from simple to disaster. The building roofs made of RCC were less affected by the earthquake. The RCC roof might be considered an alternative roofing material to prevent heavy damage during earthquakes. The data distribution of roof material and damage grade is illustrated in Figures 19 and 20.  
The statistical distribution of damage grade by foundation type is shown in Figures 9 and 10. The foundation_type feature is a categorical variable with six materials used in building foundations. 'Grade 5' damage was less likely across all foundations, while 'Grade 1' damage occurred in almost all foundation types. Buildings with mud and brick foundations showed disastrous effects, whereas bamboo and cement types exhibited an inverse relation to damage grade. The RC and other foundation types showed similar patterns of damage, which might be due to a smaller sample size. Figure 10 shows the percentage distribution of damage grades across the foundation types.
The statistical distribution of damage grade by floor type, for floors other than the ground floor, is depicted in Figures 11 and 12. The data illustrated that RCC, brick, and plank were widely used as flooring materials. The graphs also demonstrated that catastrophic damage was rare in RCC floors, whereas it was high in timber-based materials. Damage Grades 2, 3, and 4 were likely to appear on almost all types of floors. The floor type feature revealed a strong association with damage grade and thus was considered for classification.
Figures 13 and 14 illustrate the statistical distribution of damage grade by ground floor type. Mud was the most used ground floor material, followed by brick and RCC. The damage grade distribution across ground floor types differed from that of the other floors. Devastating damage was prominent among buildings with timber and brick, whereas stone floors were prone to Grade 1 damage. The relationship between ground floor type and damage grade varies significantly with the material used for the ground floor. The observations imply that ground_floor_type has significant predictive potential.
The distribution of building damage by the nature of the land surface is shown in Figures 15 and 16. The instances show that buildings on moderate and steep slopes had significant damage, whereas those on flat surfaces had lower damage grades. The distribution pattern indicates a strong relationship between the land surface condition and the damage grade.
The position attribute denotes the attachment of other buildings at the sides of a building. A building may be attached on between one and three sides. Most buildings in the dataset were represented by the "Not attached" position. Figures 17 and 18 show that buildings with an attachment on one side and buildings with no attachment have a similar relationship between position and damage grade. The damage is severe in the one-side and no-attachment types of buildings. Buildings with two- and three-sided attachments were less vulnerable to devastating damage.
The data distribution of roof type and damage grade shows that building roofs made of timber are prone to damage of all grades, in increasing order from simple to disastrous. Building roofs made of RCC were less affected by the earthquake. The RCC roof might be considered an alternative roofing material to prevent heavy damage during earthquakes. The data distribution of roof material and damage grade is illustrated in Figures 19 and 20.
The relationship between damage grade and the number of rooms in a building before and after the earthquake is illustrated in Figure 21. The data distribution indicates that damage grades were observed in an average of one to four rooms. There were some outliers where damage occurred in all rooms of a building, indicating that the damage depended on the center of seismic activity. The number of rooms therefore appears unimportant for grading earthquake damage.
The distribution of building heights by damage grade before and after the earthquake was analyzed using the graph shown in Figure 22. The general distribution of building heights for damage Grades 1-3 remained the same before and after the earthquake. For damage Grade 5, the height dropped to zero for most buildings, implying that these buildings completely collapsed, whereas damage Grade 4 showed a modest height loss after the earthquake.
The descriptive exploratory analysis of the features is shown in Table 2. From this analysis, metrics such as the mean, standard deviation, and minimum and maximum values of the dataset features were extracted. The statistics show that the plinth area has the highest mean, standard deviation, and minimum and maximum values of all the features.

Results and Discussion
The dataset with 26 features was fitted to all the classifiers to grade the damage, both before and after feature scaling. The ML classifier models implemented were logistic regression (LReg), K nearest neighbors (KNN), kernel support vector machine (KSVM), Gaussian naive Bayes (GNB), decision tree (Dtree), extra tree (Etree), random forest (RFor), ridge classifier (Ridge), RidgeClassifierCV (RCV), stochastic gradient descent (SGD), passive aggressive (PAg), and bagging (Bagg). The exploratory data analysis aided in understanding the pattern of data distribution. Further, an adequate dataset was needed to construct an efficient damage classification model. Thus, the performance of the classifiers was analyzed at various train-test split ratios (60:40, 70:30, and 80:20), with and without feature scaling, and with feature selection using the ANOVA test and PCA. Initially, the performance analysis of the models was conducted on the original data with the training and testing datasets in the 60:40, 70:30, and 80:20 ratios. The evaluation metrics obtained are tabulated in Tables 3-5, and the graphical plots are shown in Figures 23-25. The results showed that the classification accuracy and precision of the models improved significantly after feature scaling. This indicates that feature scaling removes the bias toward features with large numeric ranges and enables the machine learning models to interpret the features on a similar scale. A comparison of the train-test split ratios showed that the models performed best when the dataset was divided in an 80:20 ratio, that is, the accuracy of the models improved as the amount of training data increased.
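The comparison described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses a synthetic dataset from `make_classification` in place of the building data, and takes logistic regression as the only classifier. The helper name `evaluate` and the rescaling of one column are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the building dataset (assumption): 25 inputs, 5 grades.
X, y = make_classification(
    n_samples=2000, n_features=25, n_informative=12, n_redundant=6,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)
# Put one column on a much larger scale, loosely mimicking a feature
# such as plinth area that dominates the raw value ranges.
X[:, 0] *= 1000.0


def evaluate(test_size, scale):
    """Return test accuracy for one split ratio, with or without scaling."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=0
    )
    clf = LogisticRegression(max_iter=200)
    # Inside the pipeline the scaler is fitted on the training split only.
    model = make_pipeline(StandardScaler(), clf) if scale else clf
    model.fit(X_tr, y_tr)
    return accuracy_score(y_te, model.predict(X_te))


results = {
    (ratio, scale): evaluate(test_size, scale)
    for ratio, test_size in [("60:40", 0.4), ("70:30", 0.3), ("80:20", 0.2)]
    for scale in (False, True)
}
for (ratio, scale), acc in results.items():
    print(f"split={ratio} scaled={scale} accuracy={acc:.3f}")
```

Fitting the scaler inside the pipeline keeps the test split out of the scaling statistics, which is one common way to make the scaled and unscaled runs comparable.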
The effectiveness of the dataset was improved by selecting the most prominent features that contributed to categorizing the building damage. A 10-component PCA was applied to the raw dataset, before and after feature scaling, to select the essential features. The resultant dataset was fitted to the classifiers, and the performances were compared. Evaluation metrics were obtained by splitting the data into training and testing sets with 80:20, 70:30, and 60:40 ratios, as shown in Tables 6-8 and Figures 26-28, respectively. The results showed that model performance improved further when the models were fitted with the significant features rather than the entire dataset.
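A 10-component PCA of this kind might look like the sketch below. It assumes scikit-learn and uses random placeholder data rather than the building dataset. In scikit-learn, each principal component is a linear combination of the input columns rather than a subset of them, so the loadings in `components_` are what indicate how strongly each original feature contributes.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 25))  # placeholder for the 25 input features
X[:, 0] *= 1000.0               # one column on a much larger scale

# Standardise first so that no single column dominates the variance.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                      # samples x 10 components
print(pca.explained_variance_ratio_.sum())  # variance retained by 10 components
print(pca.components_.shape)                # loadings: 10 components x 25 inputs
```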

ANOVA Test Analysis
The ANOVA test was used to determine the influence of the independent features on the target variable. The test analyzes each feature of the dataset by comparing the null and alternate hypotheses. The PR(>F) value is the probability of obtaining the observed value assuming that the null hypothesis is true. F denotes the ratio of the variability between groups to the variability within groups. A PR(>F) value less than 0.05 indicates that the variable significantly influenced the target variable. The degrees of freedom (df) represent the number of independent values that are free to vary when checking the variability among the groups. The preprocessed dataset with 25 input variables was subjected to an ANOVA test, and the results are tabulated in Table 9. The features plinth_area_sq_ft, position, has_superstructure_cement_mortar_stone, has_superstructure_1_mortar_brick, has_super_4_non_engineered, has_superstructure_stone_flag, and has_superstructure_4_engineered have a PR(>F) > 0.05 and were therefore deemed insignificant. Thus, the dataset was refined by eliminating these irrelevant features to form an ANOVA-reduced building damage dataset.
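The filtering rule can be illustrated with a one-way ANOVA per feature. The sketch below assumes SciPy's `f_oneway` as the test and uses two synthetic features, one that depends on the grade and one that does not. The paper's PR(>F) column corresponds to the p-value returned here. The names `anova_p_values`, `anova_table`, and `ALPHA` are hypothetical.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
n = 1000
grade = rng.integers(1, 6, size=n)  # synthetic damage grades 1..5
features = {
    "informative": grade * 2.0 + rng.normal(size=n),  # depends on grade
    "noise": rng.normal(size=n),                      # independent of grade
}


def anova_p_values(features, target):
    """Run a one-way ANOVA per feature, grouping its values by target class."""
    table = {}
    for name, values in features.items():
        groups = [values[target == g] for g in np.unique(target)]
        f_stat, p_value = f_oneway(*groups)
        table[name] = (float(f_stat), float(p_value))
    return table


ALPHA = 0.05  # significance threshold used in the text
anova_table = anova_p_values(features, grade)
# Keep only the features whose p-value falls below the threshold.
selected = [name for name, (_, p) in anova_table.items() if p < ALPHA]

for name, (f_stat, p_value) in anova_table.items():
    print(f"{name}: F={f_stat:.2f}, p={p_value:.4f}")
print("kept:", selected)
```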

PCA ANOVA-Reduced Predictive Analysis
The ANOVA-reduced building damage dataset contains 18 input features. The effectiveness of the dataset was further improved by applying a 10-component PCA, which filtered the top 10 significant features for predicting the building damage grade. The 10-component PCA ANOVA-reduced dataset was fitted to all the classifiers, and the performances were recorded. The performance of the classifiers with feature scaling and with varied split ratios is listed in Tables 10-12.
The extensive experimentation provided the following observations:
1. An ML model can perform a task more efficiently when a considerable amount of data is used for training. The 80:20 split ratio worked well for dividing the dataset into training and testing sets.
2. The model's interpretation of the input features is unbiased if similar scaling is adopted across all the features.
3. The model's performance can be improved if it is trained with relevant and significant features that determine the output.
Thus, a 10-component PCA ANOVA-reduced dataset was created to build an earthquake damage prediction model.
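The three observations can be combined into one pipeline, sketched below under several assumptions. It uses scikit-learn and synthetic data. `SelectKBest` with `k=18` stands in for the PR(>F) threshold so that exactly 18 features survive. Placing scaling after selection and giving the bagging classifier 10 estimators are illustrative choices that the text does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed dataset: 25 inputs, 5 damage grades.
X, y = make_classification(
    n_samples=2000, n_features=25, n_informative=12, n_redundant=6,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0  # 80:20 split
)

asr_bagging = Pipeline([
    # ANOVA F-test ranking; k=18 is a stand-in for the p-value cut-off.
    ("anova", SelectKBest(score_func=f_classif, k=18)),
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=10)),
    ("clf", BaggingClassifier(n_estimators=10, random_state=0)),
])
asr_bagging.fit(X_tr, y_tr)

test_accuracy = asr_bagging.score(X_te, y_te)
n_kept = int(asr_bagging.named_steps["anova"].get_support().sum())
n_components = asr_bagging.named_steps["pca"].n_components_
print(f"features kept={n_kept}, components={n_components}, "
      f"accuracy={test_accuracy:.3f}")
```

Because every step sits inside the pipeline, the selection, scaling, and projection are all fitted on the training split only and then applied unchanged to the test split.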

Proposed ANOVA-Statistic-Reduced Deep Fully Connected Neural Network
A deep, fully connected neural network model that accepts 10 input components and outputs one of the five damage grade classes was implemented in Python on an NVIDIA Tesla V100 GPU workstation. The proposed ASR-DFCNN was trained and tested with 80% and 20% of the data, respectively. The model was run with a batch size of 64 for 30 epochs. The proposed model was validated by comparing its accuracy and R-squared values with those of all other classifiers on the reduced dataset. The designed DFCNN model was also implemented with the raw dataset. The observations in Table 13 and Figure 32 showed that the proposed ASR-DFCNN model attained a higher accuracy (98%) and R-squared value (97%) than the other models. The DFCNN model, when trained with raw data, attained only 87% accuracy, which underlines the importance of feature selection for building an efficient model.
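To make the data flow through such a network concrete, the NumPy sketch below runs an inference-time forward pass through four dense layers, with ReLU on the first three and tanh on the last, as the paper reports for the ASR-DFCNN. It is not the authors' Keras model. The hidden widths (64, 32, 16), the random untrained weights, and the argmax mapping from outputs to grades are assumptions, and training (NADAM, L2 weight decay, dropout) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
LAYER_SIZES = [10, 64, 32, 16, 5]  # hidden widths 64/32/16 are hypothetical

# Randomly initialised, untrained weights: this only illustrates the data flow.
weights = [
    rng.normal(scale=0.1, size=(n_in, n_out))
    for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])
]
biases = [np.zeros(n_out) for n_out in LAYER_SIZES[1:]]


def forward(x):
    """Inference-time pass: three dense ReLU layers, then a dense tanh layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W + b, 0.0)  # dense layer followed by ReLU
    # Dropout (rate 0.2) acts only during training, so it is omitted here.
    return np.tanh(h @ weights[-1] + biases[-1])  # outputs lie in [-1, 1]


def predict_grade(x):
    """Map the largest of the five outputs to a damage grade in 1..5."""
    return np.argmax(forward(x), axis=1) + 1


batch = rng.normal(size=(64, 10))  # one batch of 64 samples, 10 components each
scores = forward(batch)
grades = predict_grade(batch)
print(scores.shape, grades[:10])
```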

Conclusions
The presented research attempted to grade the building damage caused by an earthquake by analyzing the essential features. The main objective of this research was to investigate how far the performance of the deep, fully connected neural network can be improved by tuning the input features and the model compilation parameters. The contribution of the research was two-fold. The first was identifying essential data processing techniques on input features to build a competent dataset. The second was designing the ASR-DFCNN architecture, which classified the damage grade of earthquake-affected buildings more effectively than existing ML classifiers. The challenges in building the proposed ASR-DFCNN were selecting the input features and choosing the best activation and optimization functions to improve model accuracy. The earthquake damage dataset with 26 attributes describing 762,106 buildings was used to train the proposed ASR-DFCNN model, which fulfilled the requirements of this research work and outperformed the existing DFCNN and classifier models. Initially, the raw dataset without feature scaling and selection was directly fitted to the classifiers in data split ratios of 60:40, 70:30, and 80:20 for training and testing. The performance analysis showed that the accuracy of the models without feature scaling lay between 50% and 70%. The performance increased with feature scaling and with an 80:20 split of the data for training and testing. When the same classifiers were applied to a 10-component PCA-reduced dataset, the accuracy of the models improved further. After the ANOVA test, the features with PR(>F) > 0.05 were considered insignificant and eliminated from the dataset. Thus, the dataset was reduced from 25 to 18 input features. A 10-component PCA was then applied to the ANOVA-reduced dataset to select the top 10 input features contributing significantly to damage prediction.
The results showed that the bagging classifier with the reduced dataset produced the greatest accuracy, 83%, among all the classifiers with an 80:20 ratio of data for the training and testing phases. To enhance the prediction performance, a deep fully connected convolutional neural network (DFCNN) was implemented with the reduced dataset (ASR).
The proposed ASR-DFCNN model was designed as a sequential Keras model with four dense layers, with the first three dense layers fitted with the ReLU activation function and the final dense layer fitted with a tanh activation function with a dropout of 0.2. The ASR-DFCNN model was compiled with a NADAM optimizer with the weight decay of L2 regularization. Fitting the dense hidden layers with appropriate activation functions and choosing a suitable optimizer reduced the loss and improved the accuracy of damage grade classification. The ASR-DFCNN model was trained and tested with the resultant dataset and validated by comparing its performance with that of all other classifier models. The results showed that the ASR-DFCNN model outperformed the other models, achieving 98% accuracy and an R-squared value of 97%. Despite the remarkable performance of the proposed ASR-DFCNN model, it remains challenging to fine-tune the sampling ratios of the data features by experimenting with various oversampling or under-sampling methods. This research work could also be further enriched by extending the outlier analysis and the extraction of significant data features.