SAR and LIDAR Datasets for Building Damage Evaluation Based on Support Vector Machine and Random Forest Algorithms—A Case Study of Kumamoto Earthquake, Japan

Abstract: The evaluation of building damage following disasters from natural hazards is a crucial step in determining the extent of the damage and measuring renovation needs. In this study, synthetic aperture radar (SAR) and light detection and ranging (LIDAR) data acquired before and after the earthquake were combined to assess the damage to buildings caused by the Kumamoto earthquake. For damage assessment, three variables were considered: the elevation difference (ELD) and texture difference (TD) between pre- and post-event LIDAR images, and the coherence difference (CD) between SAR images before and after the event. Machine learning algorithms, namely random forest (RDF) and the support vector machine (SVM), were used to classify and predict the rate of damage. The results showed that the ELD parameter played a key role in identifying the damaged buildings. The SVM algorithm using the ELD parameter and considering three damage rates, namely D0 and D1 (negligible to slight damage), D2, D3 and D4 (moderate to heavy damage) and D5 and D6 (collapsed buildings), provided an overall accuracy of about 87.1%. For four damage rates, the overall accuracy was about 78.1%.


Introduction
Immediate response and planning for rescue and reconstruction operations are essential after an earthquake. Field surveys are time-consuming and costly and, in some cases, impossible due to road closures [1][2][3][4]. Hence, several remote sensing datasets and methods have been proposed to accelerate this work. One of the best ways to assess earthquake damage is to use spaceborne and airborne information acquired before and after the event [5]. Synthetic aperture radar (SAR) is one of the most powerful tools for monitoring natural and man-made events on Earth, because it can collect information day and night without being affected by weather conditions. Another advantage of this monitoring system is fast observation on a large scale [6]. Airborne light detection and ranging (LIDAR) is another efficient remote sensing technology; it measures distance by sending a pulsed laser at a target and analyzing the reflected light. LIDAR sensors collect data in the form of three-dimensional point clouds. Landslide detection and damage assessment of buildings are the main capabilities of this method [7][8][9][10][11]. LIDAR

Methodology
In this study, the three most common methods of damage assessment were used: elevation difference, coherence difference and texture difference. Machine-learning algorithms, namely RDF and SVM, were then used to classify the information obtained from these methods. Applying these methods to LIDAR and SAR images yields valuable information. Three computer programs, ENVI v.5.3, ArcGIS v.10.7.1 and XLSTAT v.2020, were used for the building damage assessment. ENVI and ArcGIS, which are powerful tools for analyzing maps and other geographic information, were used to analyze the SAR and LIDAR data. XLSTAT, which is an efficient tool for statistical data analysis, was used for machine learning.
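Of the three variables, the elevation difference is the simplest to illustrate. The following is a minimal sketch, not the ENVI/ArcGIS workflow used in the study: it assumes two co-registered DSM grids held as nested lists, a hypothetical building footprint given as a list of (row, column) cells, and a pre-minus-post sign convention so that a loss of height is positive. The function names and the per-building mean are illustrative assumptions.

```python
def elevation_difference(pre_dsm, post_dsm):
    """Pixel-wise difference of two co-registered DSM grids (pre minus post)."""
    return [[a - b for a, b in zip(row_pre, row_post)]
            for row_pre, row_post in zip(pre_dsm, post_dsm)]


def building_mean(diff, footprint):
    """Mean difference over the cells of one (hypothetical) building footprint."""
    values = [diff[r][c] for r, c in footprint]
    return sum(values) / len(values)


# Hypothetical heights in metres: a 2 x 2 building next to a strip of ground.
pre_dsm = [[12.0, 12.0, 3.0],
           [12.0, 12.0, 3.0]]
post_dsm = [[7.0, 8.0, 3.0],
            [6.0, 7.0, 3.0]]
footprint = [(0, 0), (0, 1), (1, 0), (1, 1)]

eld = elevation_difference(pre_dsm, post_dsm)
print(building_mean(eld, footprint))  # mean height loss over the footprint
```

A single value per building, such as this mean, is one plausible way to turn a difference raster into a classifier input; other statistics could be used instead.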

Coherence
The coherence method has been used for damage assessment in several studies with successful results [1,13,22]. One of its applications is to measure the deformation and displacement of plates caused by earthquakes. In this method, the coherence values extracted from the images before and after the event are subtracted. A high value of the result can indicate destruction in the area. Either a simple or a normalized difference can be used to calculate the CD before and after the event:
$$\mathrm{CD} = \gamma_{pre} - \gamma_{co}$$
$$\mathrm{CD}_{norm} = \frac{\gamma_{pre} - \gamma_{co}}{\gamma_{pre} + \gamma_{co}}$$
where $\gamma_{pre}$ and $\gamma_{co}$ are the pre- and co-event coherence images, respectively [1].
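As a rough illustration of the simple and normalized differences, the sketch below works on single coherence values. It assumes the pre-minus-co ordering (so that a loss of coherence gives a positive value) and the usual normalized-difference form (difference divided by sum); the function names and sample values are hypothetical.

```python
def coherence_difference(gamma_pre, gamma_co):
    """Simple difference: positive when coherence drops across the event."""
    return gamma_pre - gamma_co


def normalized_coherence_difference(gamma_pre, gamma_co):
    """Normalized difference (assumed form): difference divided by sum."""
    total = gamma_pre + gamma_co
    if total == 0:
        return 0.0  # both coherences are zero, so no change can be measured
    return (gamma_pre - gamma_co) / total


# Hypothetical coherence values for one building footprint.
intact = (0.80, 0.75)     # coherence barely changes
collapsed = (0.80, 0.20)  # coherence drops sharply

for name, (pre, co) in {"intact": intact, "collapsed": collapsed}.items():
    print(name,
          round(coherence_difference(pre, co), 3),
          round(normalized_coherence_difference(pre, co), 3))
```

The normalized form is bounded between -1 and 1 for non-negative coherence values, which makes values from different areas easier to compare.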

Texture Analysis
The textural properties of an area can change over time due to natural or man-made events, and these changes affect the texture of the images. By using the TD between the images before and after the event, different levels of damage can be identified. This method is used for land cover classification, damage assessment and other applications [23,24]. Seven second-order texture features were used in this study: Mean, Variance, Homogeneity, Dissimilarity, Contrast, Entropy and Correlation. These features were calculated separately in five window sizes (3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11) in the pre- and post-event DSM images, and the values obtained in the two images were then subtracted.
The gray-level co-occurrence matrix (GLCM) is a statistical method for evaluating texture that considers the spatial relations between pixels. To calculate the second-order statistics, the pixel values of the image must first be converted to a GLCM (Figure 3).

Mean (GLCM): The mean measures the average gray level in the specific window on the GLCM, showing the location of the distribution [24,25].
Variance (GLCM): The variance measures the gray-level variance and is a measure of heterogeneity. It shows how the data spread around the mean [25].
Homogeneity (GLCM): The homogeneity measures how homogeneous the pixel values are in a selected window of an image. If an image is homogeneous, the co-occurrence matrix is formed with a combination of high and low values of P[i, j]; otherwise, a matrix with uniform values is created [24,25].
Contrast (GLCM): The contrast indicates the local difference between the values of neighboring pixels. If the local variation is high, P[i, j] is concentrated away from the main diagonal and the contrast is greater [24,25].
Entropy (GLCM): The entropy indicates the information content. If all entries in P[i, j] are of the same size, the entropy value is higher; if the entries do not have equal values, the entropy value is lower [24,25,27].
Correlation (GLCM): The correlation shows the degree of image linearity. If an image has a significant linear structure, its value is high [25].
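The GLCM features described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the ENVI implementation used in the study: it uses the commonly cited Haralick-style definitions, a symmetric GLCM for a single horizontal offset, one small window instead of a moving window, and a post-minus-pre sign for the texture difference. All names and the two toy windows are hypothetical.

```python
import math


def glcm(window, levels, d_row=0, d_col=1):
    """Symmetric, normalized co-occurrence matrix P[i][j] for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Second-order texture features from a symmetric, normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, _, v in cells)
    variance = sum(v * (i - mean) ** 2 for i, _, v in cells)
    covariance = sum(v * (i - mean) * (j - mean) for i, j, v in cells)
    return {
        "mean": mean,
        "variance": variance,
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        # Correlation is undefined for a constant window; 0.0 is a placeholder.
        "correlation": covariance / variance if variance > 0 else 0.0,
    }


# Hypothetical quantized DSM windows: a flat roof before, a rough surface after.
pre_window = [[1, 1], [1, 1]]
post_window = [[0, 1], [1, 0]]

pre = glcm_features(glcm(pre_window, levels=2))
post = glcm_features(glcm(post_window, levels=2))
texture_difference = {name: post[name] - pre[name] for name in pre}
print(texture_difference)
```

In this toy case the contrast rises and the homogeneity falls after the event, which is the kind of change the TD variable is meant to capture.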

Machine Learning
Machine learning is a computational tool whose algorithms learn numerical and visual patterns in data and images and use them to predict and categorize new data. The core of machine learning is the training data, in which the algorithms receive two types of data from the user: the input (the data extracted from the analyses) and the output (the damage rates extracted from the field survey). The algorithms learn the patterns and make predictions about new data. Today, machine learning is used in medical sciences, robotics, damage assessment and so forth [14,16,28]. Conventional algorithms include random forest (RDF), the support vector machine (SVM) and K-nearest neighbor (KNN). In this study, we used RDF and SVM.

Random Forest (RDF)
A random forest consists of a group of decision trees, each of which depends on a random vector sampled independently and with the same distribution for all trees in the forest. The error of the forest depends on the strength of each tree and the correlation between the trees. It is a supervised algorithm that builds the forest randomly and uses several decision trees to create accurate predictions. Classification and regression are its main applications. Any decision tree can easily work with complex data and make decisions from it [29,30]. In a regression problem, the result is the average over all trees (Equation (10)); in a classification problem, the final answer is obtained by voting among the trees (Figure 4).
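The two aggregation rules just described (voting for classification, averaging for regression) can be sketched as follows. The sketch assumes the individual trees have already produced their outputs; the function names, class labels and values are hypothetical, and tree training is not shown.

```python
from collections import Counter


def forest_classify(tree_votes):
    """Majority vote: the class predicted by the largest number of trees.

    Ties go to the class that appears first among the votes.
    """
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_outputs):
    """Regression output: the plain average of the individual tree outputs."""
    return sum(tree_outputs) / len(tree_outputs)


# Hypothetical outputs of five trees for one building.
votes = ["D5-D6", "D0-D1", "D5-D6", "D2-D4", "D5-D6"]
print(forest_classify(votes))

# Hypothetical outputs of three regression trees.
print(forest_regress([1.0, 2.0, 3.0]))
```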

Support Vector Machine (SVM)
SVM is a non-parametric supervised statistical learning method used for classification and regression. It can be used wherever there is a need to identify patterns or to assign objects to specific classes. In this method, each data sample is represented as a point in n-dimensional space, and the value of each feature determines one coordinate of that point. The basis of SVM classification is a linear separation of the data, and among the possible separating lines the most reliable one is chosen [16,31]. Figure 4 illustrates this: H3 does not separate the two classes, H1 separates them with a small margin and H2 separates them with the maximum margin. Figure 5 shows the flowchart of the methods used for damage assessment using machine-learning algorithms.
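The difference between the three hyperplanes can be made concrete by computing their margins on a toy dataset. This is only a sketch: the points and the coefficients chosen for H1, H2 and H3 are hypothetical, and the code compares three fixed candidate lines rather than solving the SVM optimization problem.

```python
import math


def margin(w, b, points, labels):
    """Smallest signed distance from the line w.x + b = 0 to any sample.

    A negative value means at least one sample lies on the wrong side.
    """
    norm = math.hypot(w[0], w[1])
    return min(y * (w[0] * x[0] + w[1] * x[1] + b) / norm
               for x, y in zip(points, labels))


# Hypothetical two-class data in a two-dimensional feature space.
points = [(3, 3), (4, 3), (3, 4), (0, 0), (1, 0), (0, 1)]
labels = [1, 1, 1, -1, -1, -1]

# Three candidate separating lines, each given as (w, b).
candidates = {
    "H1": ((1.0, 1.0), -1.5),  # separates, but passes close to one class
    "H2": ((1.0, 1.0), -3.5),  # separates midway between the classes
    "H3": ((1.0, 0.0), -3.5),  # cuts through one class
}

margins = {name: margin(w, b, points, labels)
           for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
print(margins, best)
```

Among the candidates, the line with the largest positive margin is preferred, because small perturbations of the samples are least likely to push them across it.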

Damage Classification
In this study, we considered the seven damage rates of the Architectural Institute of Japan (AIJ): D0, D1, D2, D3, D4, D5 and D6. Due to the limited ability of the methods, the number of rates was reduced to four and three levels. In the first case, four damage rates (4DR) were considered: D0 and D1 (negligible to slight damage), D2 and D3 (moderate damage), D4 and D5 (very heavy damage) and D6 (collapsed buildings). In the second case, three damage rates (3DR) were considered: D0 and D1 (negligible to slight damage), D2, D3 and D4 (moderate to heavy damage) and D5 and D6 (collapsed buildings). The truth data (damage rates of structures) used in this study were extracted from the research of Goto et al. [32]. A total of 18,445 buildings were studied in the two cases; they are shown separately in Table 2.
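The two groupings can be written down as simple lookup tables. The sketch below only encodes the 4DR and 3DR schemes described above; the dictionary and function names and the sample grades are hypothetical.

```python
# Grouping of the seven AIJ grades into four damage rates (4DR).
GROUPS_4DR = {
    "D0": "negligible to slight", "D1": "negligible to slight",
    "D2": "moderate", "D3": "moderate",
    "D4": "very heavy", "D5": "very heavy",
    "D6": "collapsed",
}

# Grouping of the seven AIJ grades into three damage rates (3DR).
GROUPS_3DR = {
    "D0": "negligible to slight", "D1": "negligible to slight",
    "D2": "moderate to heavy", "D3": "moderate to heavy",
    "D4": "moderate to heavy",
    "D5": "collapsed", "D6": "collapsed",
}


def regroup(grades, scheme):
    """Map surveyed AIJ grades to the class labels of one grouping scheme."""
    return [scheme[grade] for grade in grades]


# Hypothetical surveyed grades for four buildings.
surveyed = ["D0", "D4", "D5", "D6"]
print(regroup(surveyed, GROUPS_4DR))
print(regroup(surveyed, GROUPS_3DR))
```

Note that D4 and D5 change class between the two schemes, which is one reason the 3DR and 4DR accuracies are not directly comparable class by class.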

Appl. Sci. 2020, 10, 8932


Training and Prediction Dataset
In machine learning, part of the data must be allocated for training the algorithm. The more training data the algorithm is given, the better its analysis and the more accurate its estimates. In this study, however, due to the low number of buildings in group D6 (1128 buildings), the amount of data had to be chosen in proportion to this group. Fewer buildings were therefore selected for training, which in turn reduces the accuracy of the assessment. The numbers of buildings selected for training and estimation in the 3DR and 4DR cases are shown in Table 2, and Figure 6 shows the position of the buildings in the 3DR case.
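One way to select the same number of training buildings from every group, with the size dictated by the smallest group, is sketched below. It is not the XLSTAT procedure used in the study: the function name, the random selection and the counts of the two larger groups are hypothetical, while the 1128 buildings and the 1000 training samples per group follow the figures reported in the text.

```python
import random


def balanced_split(labels, n_train_per_class, seed=0):
    """Pick the same number of training samples from every class.

    Returns (train_indices, predict_indices); the remaining samples of each
    class are kept for prediction.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, predict = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        train.extend(indices[:n_train_per_class])
        predict.extend(indices[n_train_per_class:])
    return train, predict


# Hypothetical class sizes, except for the 1128 collapsed buildings.
labels = (["negligible to slight"] * 3000
          + ["moderate to heavy"] * 2000
          + ["collapsed"] * 1128)

train, predict = balanced_split(labels, n_train_per_class=1000)
print(len(train), len(predict))
```

With 1000 training samples per group, only 128 collapsed buildings remain for prediction, which shows why the training set could not be enlarged further.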

Results
To better understand the effectiveness of the ELD, TD and CD methods, each was first analyzed separately and then the three methods were combined. Because the window size is variable in texture analysis, this method was first examined in five window sizes. The results showed that the 3 × 3 window size had the best performance (Figure 7), so we chose it as the final result of this method according to the damage states explained in Figure 8. Among the seven parameters of this method, the mean and variance provided the highest overall accuracy. In the next stage of analysis, the results of the DSMs were extracted; they showed that this method is able to assess the damage, providing an overall accuracy of 78.1% for the 4DR state (Figure 9) and 87.1% for the 3DR state. In damage detection using remote sensing images that are viewed vertically, direct information about damage to columns, walls and internal components of the building is not available, so the roof of the structure plays a key role in classifying the damage. In some moderately damaged buildings, residents may cover the roof with a blue plastic tarp to prevent water penetration, which makes it difficult to detect moderate damage on tarp-covered roofs in post-event images [15]. Considering the other factors that reduce the accuracy of the results, such as the horizontal offset between the images caused by landslides and the great imbalance among the three damage groups, the evaluation can be said to have provided acceptable accuracy (Figure 10). According to the experimental results of this study, if the number of response variables (here, the number of damage rates) is increased, the amount of training data should be increased accordingly; otherwise, the algorithm cannot analyze and estimate correctly.
In this study, due to the low number of buildings with damage rates of D2 to D6, it was not possible to increase the amount of training data: if it had been increased, no data would have remained for estimation. For example, the collapsed group contained 1128 structures, of which 1000 were used for training and 128 for prediction. In principle, the amount of training data in each group should be the same, so in the other groups the amount was selected in proportion to the D6 group. Nevertheless, after earthquakes or other catastrophes the data are usually unbalanced, and this problem must be solved in another way. On the other hand, a large number of structures in the study area and a multiplicity of variables reduce the efficiency of the algorithms and the accuracy of the results. To reduce the impact of these problems, better results can be obtained by dividing large areas into several small areas and performing several separate analyses.
In the following part of the study, the pre-processed pre- and post-event coherence data were subtracted from each other. After the machine learning algorithms were applied, and despite the significant advantage of this method emphasized in various studies, the accuracy was very low. The results showed that the CD method is not able to detect minor to moderate damage. For example, in the 3DR case, the algorithm underestimated the majority of the D2 data, placing them in the almost intact group (negligible to slight damage), and overestimated the majority of the D4 data, placing them in the collapsed group. In addition, group D3 was divided between groups depending on the extent of damage.
In the last part of the study, the values of the above three methods were combined. The results showed that the TD and CD, despite their good capabilities, led to poorer results in this evaluation when combined with the ELD (Table 3).
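The overall accuracy quoted throughout this section, together with the producer's accuracy, is conventionally derived from a confusion matrix. The sketch below uses the standard remote sensing definitions and includes the user's accuracy as the usual companion measure; the matrix values are hypothetical and do not reproduce the results of this study.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def producer_accuracy(matrix):
    """Per class: correctly classified samples divided by reference samples."""
    return [matrix[k][k] / sum(matrix[k]) for k in range(len(matrix))]


def user_accuracy(matrix):
    """Per class: correctly classified samples divided by predicted samples."""
    size = len(matrix)
    return [matrix[k][k] / sum(matrix[r][k] for r in range(size))
            for k in range(size)]


# Hypothetical 3DR confusion matrix: rows are reference classes (field
# survey), columns are predicted classes, in the order negligible to slight,
# moderate to heavy, collapsed.
confusion = [
    [50, 5, 0],
    [10, 30, 5],
    [0, 5, 20],
]

print(overall_accuracy(confusion))
print(producer_accuracy(confusion))
print(user_accuracy(confusion))
```

With strongly unbalanced groups, the overall accuracy is dominated by the largest class, so the per-class measures are worth reporting alongside it.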

[Table: type of accuracy (producer) for SVM (%) and RDF (%)]

Examining the results of this study together with various studies conducted with the SVM and RDF algorithms, it can be concluded empirically that the SVM algorithm using LIDAR images leads to higher accuracy than the RDF algorithm, whereas the RDF algorithm performs better than the SVM algorithm in evaluating SAR images (Figure 10). Although the RDF is known as a strong classifier due to its "bagged decision tree" nature, which can split data on a subset of features, the main reasons for its low overall accuracy were, as mentioned above, the increase in the number of damage rates (response variables) and the lack of sufficient training data. In general, when several different datasets are used, its performance and accuracy may decrease [35]. The SVM, on the other hand, provides better results when multiple datasets and a smaller training set are available [35]. Table 4 lists some previous studies in which the SVM and RDF algorithms were used for damage or land classification. Figures 12 and 13 show the estimated results of the SVM and RDF algorithms.

Discussion
The ELD method showed that calculating the changes in building height between pre- and post-event LIDAR images provides valuable information. With this method, a quick and acceptable assessment can be made after earthquakes and other disasters. TD is another method that can be applied to both SAR and LIDAR images. In this study, it was used to assess changes in textural features between pre- and post-event LIDAR images (in five window sizes). The results showed that this method has a good ability to assess building damage. Despite the several advantages of the CD method, it produced poor results. One of the main reasons for its very low overall accuracy is the overestimation or underestimation by the algorithms, which eliminates the low to moderate rates of damage from the results. The fusion of the three methods, that is, the combination of data from the two sensors, did not improve the overall accuracy of the evaluation. The results of the SVM algorithm demonstrated that it could provide an acceptable estimate using LIDAR images with the ELD and TD methods. In the present study, the RDF algorithm provided low accuracy. The main reason is that, as the number of response variables (here, the number of damage rates) increases, this algorithm requires more training data for accurate analysis and estimation.

Conclusions
In this study, a combination of SAR and LIDAR images was used for damage evaluation. The CD parameter in pre- and co-event SAR images, and the TD (in five window sizes) and ELD parameters in pre- and post-event LIDAR images, were analyzed. In the first step, the LIDAR images were preprocessed and the pre- and post-event DSMs were extracted; the variation between the two images was then calculated by simple differencing. In the second step, the seven second-order texture parameters (Mean, Variance, Homogeneity, Dissimilarity, Contrast, Entropy and Correlation) were calculated separately for the pre- and post-event images in five window sizes (3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11), and the differences of the extracted parameters between the two images were computed. In the third step, after the SAR data had been analyzed and the coherence images created, the difference between the two coherence images was calculated. To estimate and categorize the results, machine-learning algorithms, namely RDF and SVM, were used. Part of the data was allocated for training the algorithms and the rest was used to predict the rate of damage. The study considered two cases: three damage rates in the first and four damage rates in the second. The SVM algorithm using the ELD parameter provided an overall accuracy of about 87.1% for three damage rates and about 78.1% for four damage rates. The results showed that the methods based on LIDAR images are more efficient. Regarding the methods used to analyze these images, we conclude that:

1. In the future, the ELD method can be a good alternative to field surveys, which are very time consuming and costly.
2. Among the seven texture properties mentioned in the previous sections, mean and variance played a more effective role in the results. Based on its results, the TD method can be considered a complement to other methods. In a separate experiment with two damage rates, the overall accuracy of this method increased by about 10%.
3. The CD method based on SAR data provided poor results in identifying the three damage rates. When only two groups of damage were evaluated, namely intact buildings (D0, D1, D2) and collapsed buildings (D3, D4, D5, D6), the overall accuracy of this method increased to about 60%. Dividing the study area, which made the algorithms' decisions easier, increased the accuracy of this method by a further 8% or so. In general, this method cannot yet replace field study, because it has a low ability to evaluate buildings with low to moderate damage.

Regarding the current capability of machine learning algorithms, for future work it is recommended that building damage assessments be performed in several separate sections with less data, rather than as an integrated analysis of a large area. In addition, the effect of landslides in reducing the accuracy of the results can be investigated.