Detection of Collapsed Bridges from Multi-Temporal SAR Intensity Images by Machine Learning Techniques

Abstract: Bridges are an important part of road networks in an emergency period, as well as in ordinary times. Bridge collapses have occurred as a result of many recent disasters. Synthetic aperture radar (SAR), which can acquire images under any weather or sunlight conditions, has been shown to be effective in assessing the damage situation of structures in the emergency response phase. We investigate the backscattering characteristics of washed-away or collapsed bridges from the multi-temporal high-resolution SAR intensity imagery introduced in our previous studies. In this study, we address the challenge of building a model to identify collapsed bridges using five change features obtained from multi-temporal SAR intensity images. Forty-four bridges affected by the 2011 Tohoku-oki earthquake, in Japan, and forty-four bridges affected by the 2020 July floods, also in Japan, including a total of 21 collapsed bridges, were divided into training, test, and validation sets. Twelve models were trained, using different numbers of features as input in random forest and logistic regression methods. Comparing the accuracies of the validation sets, the random forest model trained with the two mixed events using all the features showed the highest capability to extract collapsed bridges. After improvement by introducing an oversampling technique, the F-score for collapsed bridges was 0.87 and the kappa coefficient was 0.82, showing highly accurate agreement.


Introduction
As a key component of transportation systems, bridges are important infrastructural elements in both normal and disaster periods. In the recent decade, mega natural disasters have hit Japan many times. In the 2011 Tohoku-oki earthquake (Mw 9.0), more than 90 bridges collapsed or were washed away in Iwate, Miyagi, and Fukushima Prefectures, due to the consequent huge tsunamis [1]. In 2016, a series of earthquakes occurred in Kumamoto Prefecture, which caused damage to 83 bridges [2]. The Aso Ohashi bridge had a big arch structure, 206 m in length and 9 m in width, which collapsed completely due to a large landslide on Aso mountain [3]. The Hagibis typhoon in 2019 brought heavy rainfall over half of Japan. Affected by rising river waters, eight bridges in Nagano Prefecture collapsed. In the 2020 East Asian rainy season, record-breaking heavy rain hit the Kyushu and central regions of Japan. Ten road bridges and three railway bridges over the Kuma River in Kumamoto Prefecture either collapsed or were washed away. Even in the year 2021, one bridge in Shizuoka Prefecture was reported to have partially collapsed, due to strong river flow after a heavy rainfall. The effective and quick damage assessment of bridges is an essential issue after natural disasters. However, the collapse of a bridge causes the blockage of traffic; thus, it is often difficult to assess on site at an early stage.
Remote sensing, as an effective tool to collect information without reaching affected sites, has been widely used for damage assessment after disasters in recent decades [4][5][6][7]. With the improvement of the spatial resolution of optical sensors, high-resolution optical limited to 78%. Detections using only the post-event TSX intensity in the same area have been conducted, based on the low backscatter intensity [29]. Using the average values of backscatter intensity within the outlines of bridges, 7 out of 9 severely damaged bridges were identified; thus, the accuracy increased to 84%. Using the percentage of water region within the bridge outline, 4 out of 7 washed-away bridges were detected successfully. The overall accuracy was 91%. Although the method using single post-event images obtained higher accuracy than that using the multi-temporal pair, it requires accurate bridge outlines and bridge clearances from the water.
The second event was the 2020 July floods in Japan (July floods). Record-breaking heavy rainfall hit the Kyushu and central regions of Japan from July 3 to 31, 2020. In Kumamoto Prefecture, 513 mm of rainfall was recorded in Kuma Town in the period from July 3 to 4 [38]. Due to the flooding, the state-managed class-A Kuma River overran its banks at eleven different locations. Ten road bridges and three railway bridges over the Kuma River were washed away or collapsed [39]. We investigated the differences and correlation coefficients of 25 bridges in the downstream part of the Kuma River, using the pre- and post-event ALOS-2 intensity pair [40]. The characteristics of collapsed bridges in this event were similar to those observed in the Tohoku earthquake. The collapsed bridges showed low correlation coefficients and had significant backscatter differences.
Seven washed-away bridges and 37 surviving bridges located in the inundation area of Miyagi Prefecture and 14 collapsed bridges and 30 surviving bridges over the Kuma River in Kumamoto Prefecture were selected as targets. Five features of each bridge were obtained from pre- and post-event SAR intensity pairs. Then, those features were divided into training, test, and validation sets and were used to build the model for detecting collapsed bridges.

Satellite Images and Pre-Processing
For the 44 target bridges in Miyagi Prefecture, the same pre- and post-event TSX pair used in the previous study was adopted [28,30]. The pre-event TSX image was acquired on October 21, 2010, and the post-event image was acquired on March 13, 2011; that is, two days after the earthquake. The TSX images were acquired in HH polarization from the descending path with a right look. The incident angle at the center was 37.3°. The spatial resolution was around 3 m. The images were provided as enhanced ellipsoid corrected (EEC) products with a 1.25 m/pixel spacing.
For the 44 target bridges in Kumamoto Prefecture, the pre- and post-event ALOS-2 pair was adopted. This pair was also used in our previous study [40]. The PALSAR-2 sensor onboard the ALOS-2 satellite took an emergency observation at 13:12 on July 4, 2020, soon after the floods. The pre-event image was observed on April 16, 2016, under the same acquisition condition. The images were acquired in HH polarization from the descending path with a left look. The incident angle was 58.5°. For the emergency response, they were provided as processing level 1.5 products with a 2.5 m/pixel spacing.
As the EEC products of TSX have very high spatial accuracy, registration was conducted only for the ALOS-2 pair. Then, each of the two intensity pairs was calibrated to sigma naught values. Since speckle noise reduces the correlation between multi-temporal SAR images, an enhanced Lee filter with a 3 × 3 pixel window was applied to the SAR images. Color composite images of the two intensity pairs after the abovementioned pre-processing steps are shown in Figure 1. The locations of the 21 collapsed and 67 surviving bridges are also shown in Figure 1. The numbers of the collapsed bridges are summarized in Table 1. Eleven target bridges in Miyagi Prefecture have been investigated by the National Institute for Land and Infrastructure Management (NILM) and were labeled from 159 to 169 [1]. The same labeling numbers were used in this study. The other 33 bridges, in the tsunami-inundated areas, were labeled in ascending order, from North to South and from the downstream to the upstream of the rivers. For the bridges in Kumamoto Prefecture, all bridges were labeled in the same order. Nos. 1 to 42 are the bridges over the mainstream of the Kuma River, while Nos. 43 and 44 are two bridges over its branch, the Kawabe River.
To separate the bridge numbers in the two events, we labeled the bridges in Miyagi Prefecture with an initial "M", whereas the bridges in Kumamoto Prefecture were labeled with the initial "K".

Backscatter Model of Bridges
As multiple radar bounces occur between the water and the bridge, bridges over water have complicated backscatter patterns [22]. Backscatter models for bridges can be classified into two types, according to their heights [23,29]. Generally, a large-scale bridge with a high clearance over water is sturdy and unlikely to collapse due to floods or tsunamis. Thus, the damaged bridges were mainly small bridges with low clearance. Their backscatter model is shown in Figure 2. Layover (single-bounce), double-bounce, and triple-bounce patterns can be observed from the near-range to the far-range. When the height of the deck is not high enough, compared with the width of the deck, the triple-bounce overlaps the layover. The length of the layover (L) in front of the outline in the ground range can be calculated using Equation (1), whereas the length of the triple-bounce (T) behind the outline can be calculated by Equation (2), where h is the clearance between the deck and water, and θ is the SAR incidence angle.
To observe these signals clearly, an example of a low bridge in a high-resolution X-band airborne SAR intensity image is shown in Figure 3 [41]. It is an arch bridge over the Sumida River in Tokyo, Japan. The X-band airborne SAR sensor (Pi-SAR2) has 0.3 m resolution with full polarizations [42]. From the color composite of the HH, HV, and VV polarizations, we can observe the layover clearly as the white outline of the deck and arch. The purple signals consist of strong backscatter in the HH and VV polarizations, observed around the outline in the near-range, which are mainly the double-bounce backscattering from the side of the piers and the deck. Green signals with strong backscatter in the HV polarization can be observed from the middle of the bridge outline to the far-range, which are mainly due to triple-bounce backscattering.
When the water level changes, the backscatter model of a bridge changes accordingly. Figure 2 also shows the model for an increased water level. As the distance between the sensor and the bridge does not change, the layover remains at the same location. However, the locations of the double- and triple-bounces move to the near-range. The movement of the double-bounce (M_L) can be described by Equation (3), where d is the increase in the water level, and the movement of the triple-bounce is twice that of M_L. Thus, the backscatter of the bridge will be seen differently after the water level rises.

Extraction of Change Features
Five features obtained by the change detection method were used to build a classification model of collapsed bridges. These features were estimated within the outlines of bridges. According to the backscatter model in the previous section, we expanded the original outlines of bridges to include all of the backscatter signals. First, the original outlines were created from the GIS data of roads and rivers, which are available from the Geospatial Information Authority of Japan [44]. The roads over a water surface were extracted as the original outlines of bridges. The lengths of the bridges were defined as from bank to bank or nearest road to road. Then, we shifted the original outlines 10 m toward and away from the sensor direction. The expanded outlines were generated as the minimum rectangles including the original outline and two shifted outlines. According to the incident angles of the used SAR satellite images and Equations (1) and (2), the expanded outlines included the layover, double-bounce, and triple-bounce signals of the bridges lower than 7.6 m in both the TSX pairs and ALOS-2 pairs. Two examples of the original and expanded outlines in the pre-event TSX and ALOS-2 images are shown in Figure 4a,c.
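As a rough check of the 10 m expansion, the following sketch assumes the standard ground-range layover relation L = h / tan θ for a deck at clearance h. This relation is an assumption made here for illustration and may differ in form from Equation (1); the function names are hypothetical. It covers the layover only, not the triple-bounce.

```python
import math


def layover_length(clearance_m: float, incidence_deg: float) -> float:
    """Ground-range layover length, assuming L = h / tan(theta)."""
    return clearance_m / math.tan(math.radians(incidence_deg))


def max_clearance_for_shift(shift_m: float, incidence_deg: float) -> float:
    """Largest clearance whose layover still fits within the given shift."""
    return shift_m * math.tan(math.radians(incidence_deg))


# Incidence angles of the two image pairs, as stated in the text.
for sensor, theta in (("TSX", 37.3), ("ALOS-2", 58.5)):
    limit = max_clearance_for_shift(10.0, theta)
    print(f"{sensor}: layover within 10 m for clearances up to {limit:.1f} m")
```

Under this assumption, the 10 m shift covers the layover of bridges up to about 7.6 m in clearance at the TSX incidence angle, which is the more restrictive of the two geometries for the layover.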
According to the backscatter model shown in Figure 3, the double-bounce and triple-bounce of the bridge moved in the sensor's direction after the water level rose. If we conducted change detection directly, undamaged bridges would show features similar to those of damaged bridges. Thus, we shifted the expanded outlines of bridges in the post-event image to estimate the new location in the post-event SAR image [40]. In the Tohoku earthquake scenario, the flow observation stations located in the inundated area were all damaged by tsunamis. The increase in water level when the post-event SAR images were acquired was unknown, but it should be negligible two days after the tsunami. Furthermore, significant crustal movements occurred in the target area [45,46]. We shifted the expanded outlines in the eastern and southern directions, respectively. To increase accuracy, the SAR images were re-sampled to 0.25 m/pixel (i.e., one-fifth of the original products). In each shift of the sub-pixel step, the correlation of the backscatter intensity within the outlines was calculated. The location with the highest correlation was selected as the new outline in the post-event image.
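The shift search described above can be sketched as an exhaustive search for the offset with the highest correlation. This is a minimal illustration under several assumptions, not the processing chain actually used: the outline is simplified to an axis-aligned window, both images are assumed to be already re-sampled (so one array step corresponds to one sub-pixel step of the original product), offsets are searched in both directions, and the function name `best_shift` is hypothetical.

```python
import numpy as np


def best_shift(pre, post, row0, col0, height, width, max_shift):
    """Return (correlation, d_row, d_col) for the post-event window that
    best matches the pre-event window at (row0, col0)."""
    ref = pre[row0:row0 + height, col0:col0 + width].ravel()
    best = (-2.0, 0, 0)  # any real correlation is >= -1
    for d_row in range(-max_shift, max_shift + 1):
        for d_col in range(-max_shift, max_shift + 1):
            r, c = row0 + d_row, col0 + d_col
            # Skip candidate windows that fall outside the post-event image.
            if r < 0 or c < 0 or r + height > post.shape[0] or c + width > post.shape[1]:
                continue
            win = post[r:r + height, c:c + width].ravel()
            corr = np.corrcoef(ref, win)[0, 1]
            if corr > best[0]:
                best = (float(corr), d_row, d_col)
    return best


# Synthetic stand-in data: the "post-event" image is the pre-event image
# displaced by 2 rows and -3 columns.
rng = np.random.default_rng(0)
pre_img = rng.normal(size=(60, 60))
post_img = np.roll(pre_img, shift=(2, -3), axis=(0, 1))
corr, d_row, d_col = best_shift(pre_img, post_img, 20, 20, 10, 20, max_shift=5)
print(d_row, d_col, round(corr, 3))
```

In practice the search range would be limited to the directions and magnitudes that are physically plausible for the event, such as the range direction for a water-level rise.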
In the July floods, three out of the six flow observation stations recorded an increase in water level at the peak that was more than 10 m. We also re-sampled the ALOS-2 images to 0.5 m/pixel (i.e., one-fifth of the original products). Then, shifting of the outline was conducted in the range direction, in a sub-pixel by sub-pixel manner. Two examples of the shifted outlines are shown in Figure 4b,d. Figure 4a,b show the pre- and post-event TSX images of an undamaged bridge (no. M30). Before the shifts, the correlation within the outline was 0.55. The correlation between the expanded outline in the pre-event image and the shifted outline in the post-event image increased to 0.85. Figure 4c,d are the pre- and post-event ALOS-2 images of an undamaged bridge (no. K8). After the shift of the outline, the correlation increased from 0.66 to 0.69. As the high water level at this location changed the backscatter pattern, the increase in the correlation was not as significant as in the Tohoku earthquake scenario.
After modification of the outlines of bridges in the post-event SAR images, five features were calculated within the outlines from the multi-temporal SAR images. Correlation and differences are the most common features for change detection. In a previous study, the correlation (r) was used to classify the bridges damaged in the Tohoku earthquake [28]. In this study, the correlation was also adopted as one of the change features. In the study [28], the correlation coefficient was calculated using a 3 × 3 pixel window. Then, the average value of the correlation within the outline was used for the classification step. In this study, however, we calculated the correlation using all the pixels within the outline. Thus, only one correlation value was obtained for each bridge.
Remote Sens. 2021, 13, 3508
The difference was also adopted as an effective feature. We calculated the differences for each pixel within the outline of bridges. Then, the average value of the difference (µ_d), the standard deviation (σ_d), and the minimum value (d_m) were selected as three change features for the models. When the deck of a bridge has been completely washed away, the backscatter within the outline might decrease significantly; however, the average value of the difference sometimes does not change much when only a part of the deck is washed away. In addition, the remaining piers show high backscatter, reducing the change in the difference. Thus, we added the standard deviation and the minimum value of the difference, in order to improve the detection of partly washed-away bridges.
The last change feature was the percentage of the significantly changed area within the outline (p). The purpose of introducing this feature was similar to the selection of the standard deviation and the minimum value of difference. Thresholding of the significantly changed area (v) was defined by subtracting twice the standard deviation from the average value of the difference for the whole scene. This thresholding method has been used to extract flooding areas where positive results were obtained in our previous studies [47,48].
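The five change features described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the exact formulation of Table 2: the difference is taken as post-event minus pre-event (so a loss of backscatter is negative), the correlation is the Pearson coefficient over all pixels in the outline, the standard deviation is the population form, and a pixel counts as significantly changed when its difference falls below the threshold. The function and key names are hypothetical.

```python
import numpy as np


def scene_threshold(scene_diff):
    """Mean minus twice the standard deviation of the whole-scene difference."""
    scene_diff = np.asarray(scene_diff, dtype=float)
    return float(scene_diff.mean() - 2.0 * scene_diff.std())


def change_features(pre, post, threshold):
    """Five change features for the pixels inside one bridge outline.

    pre, post: backscatter values of the same pixels before and after the event.
    threshold: scene-level value below which a difference is 'significant'.
    """
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    diff = post - pre  # assumed sign convention: post minus pre
    return {
        "r": float(np.corrcoef(pre, post)[0, 1]),         # correlation
        "mu_d": float(diff.mean()),                       # average difference
        "sigma_d": float(diff.std()),                     # std (ddof=0, assumed)
        "d_m": float(diff.min()),                         # minimum difference
        "p": float(100.0 * np.mean(diff < threshold)),    # changed area in %
    }


# Toy values for a partly washed-away deck: two of four pixels lose backscatter.
feats = change_features([-5, -6, -4, -5], [-5, -15, -14, -6], threshold=-8.0)
print(feats)
```

In the toy example the average difference is moderate, while the minimum value and the percentage of changed pixels still flag the partial loss, which is the motivation given above for adding those features.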
A summary of these five change features is shown in Table 2, and the scatter plots for the two events are shown in Figure 5. Although the features of the target bridges in the two events were obtained from different SAR images with different wavelengths, their values and scattering patterns were similar. In the Tohoku earthquake, the surrounding environments of the target bridges changed significantly, due to the tsunami and its associated debris. The features of the surviving and collapsed bridges showed high variability. In contrast, the features of the surviving bridges in the July floods had less variability. According to Figure 5, the collapsed bridges in the July floods could be identified more easily than those after the Tohoku earthquake.

Table 2. List of the five change features, where x_i and y_i are the pixels within the bridge outline in the pre- and post-event SAR images, respectively, * means the average value, and T is the thresholding value for significant changes.
Symbol    Feature
r         Correlation between the pre- and post-event pixels within the outline
µ_d       Average value of the difference
σ_d       Standard deviation of the difference
d_m       Minimum value of the difference
p         Percentage of the significantly changed area within the outline

Figure 5. Scatter plots of the change features for the two events: (a) 2011 Tohoku earthquake; (b) 2020 July floods.

Generation of Data Sets
Two different data sets were generated for this study. The first data set used the features of bridges in one event, in order to train and test the model. Then, the model was applied to the features of the other event for verification. As the ratio of collapsed bridges in the July floods was larger than that after the Tohoku earthquake, we adopted this event to train the models. Fourteen collapsed bridges and 30 surviving bridges from the July floods were divided into training and test data randomly, with a ratio of 7:3. As a result, 11 collapsed bridges and 19 surviving bridges were used for training, while three collapsed bridges and 11 surviving bridges were used for testing. The seven collapsed bridges and 37 surviving bridges after the Tohoku earthquake were used as the validation set.
The second data set mixed all the features of the two events. First, the 88 target bridges were divided to two parts randomly, with a ratio of 7:3. The 30% was used for validation, including six collapsed bridges and 21 surviving bridges, while the 70% was divided again into training and test data randomly, with a ratio of 8:2. As a result, 10 collapsed bridges and 38 surviving bridges were used for training, while five collapsed bridges and eight surviving bridges were used for testing. The remaining six collapsed bridges and 21 surviving bridges were used as the validation set. The breakdown of the two data sets is shown in Table 3.
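The two-stage random division can be sketched as below. The rounding rule (the held-out part is rounded up) is an assumption, chosen because it reproduces the set sizes reported above; the bridge identifiers are placeholders and the helper name `random_split` is hypothetical. Because the split is purely random, the number of collapsed bridges in each part depends on the seed.

```python
import math
import random


def random_split(items, held_out_fraction, seed):
    """Shuffle the items and split off a held-out part (size rounded up)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_held = math.ceil(held_out_fraction * len(items))
    return items[n_held:], items[:n_held]


# Data set 2: 88 bridges, 7:3 into (training + test) and validation,
# then 8:2 into training and test.
bridges = list(range(88))  # placeholder identifiers
rest, validation = random_split(bridges, 0.3, seed=0)
train, test = random_split(rest, 0.2, seed=0)
print(len(train), len(test), len(validation))

# Data set 1: the 44 July-flood bridges, 7:3 into training and test.
train1, test1 = random_split(range(44), 0.3, seed=0)
print(len(train1), len(test1))
```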

Machine Learning Models
For this study, we adopted two common supervised algorithms to train and test the models. One is random forest (RF), an ensemble learning method for classification and regression [49]. Random forest builds multiple decision trees and merges them together to obtain an accurate and stable prediction. In this study, the input change features are selected randomly at each node to grow a decision tree. Then each tree casts a unit vote for the most popular class to classify bridges. The other one is logistic regression (LR), a reliable procedure to solve binary classification problems [36]. The core of logistic regression is the "S"-shaped logistic function. The curve of the logistic function indicates the likelihood of the target class. Using the two data sets and the two machine learning methods, twelve models were generated, with different numbers of input features.
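The two decision rules described above can be illustrated with a minimal sketch: a unit vote per tree for the random forest, and a thresholded logistic function for logistic regression. The weights, bias, and class names are hypothetical placeholders, and the 0.5 cut-off is used here as a default assumption.

```python
import math
from collections import Counter


def majority_vote(tree_predictions):
    """Class that receives the most unit votes from the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def logistic(z):
    """S-shaped logistic function mapping any real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lr_predict(features, weights, bias, threshold=0.5):
    """Classify one bridge from a linear score passed through the logistic."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return "collapsed" if logistic(z) >= threshold else "surviving"


print(majority_vote(["collapsed", "surviving", "collapsed"]))
print(lr_predict([0.2, -6.0], weights=[-3.0, -0.5], bias=-1.0))
```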

Models Using Data Set 1
First, RF and LR models were trained and tested using data set 1. To determine the optimal values for the models, hyperparameter tuning was conducted, using the GridSearchCV function of the Python package scikit-learn [50]. For the RF models, the best combination of three hyperparameters was examined: n_estimators, max_depth, and random_state. "n_estimators" is the number of trees in the model, which was pre-defined as 5, 10, 30, and 50. "max_depth" is the maximum depth of the tree, which was pre-defined as 3, 5, 10, 30, and 50. "random_state" controls both the randomness of bootstrapping for the samples to build trees and the sampling of the features to consider the best split at each node, which was pre-defined as 0, 7, and 42. As the number of bridges in the training data was limited, we performed three-fold cross-validation, in order to obtain the best combination of hyperparameters.
For the first try, the five change features obtained in the previous section were all used for fitting. Fitting of the RF model was conducted using the abovementioned function of scikit-learn [50]. As a result, 10 out of 11 collapsed bridges in the training data and 2 out of 3 collapsed bridges in the test data were detected successfully. In addition, no surviving bridge was detected as a commission error. According to the impurity-based feature importance obtained after fitting, we deleted the feature p, as it had the lowest importance value. Then, fitting was conducted again, using the remaining four features. As a result, all the collapsed bridges in the training data and two collapsed bridges in the test data were detected without a commission error. Finally, the feature σ_d, with the lowest importance value, was also removed from the input. Fitting was then conducted using the features r, µ_d, and d_m. The result was the same as that with the model using all five features.
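A minimal sketch of this grid search with scikit-learn is given below. The hyperparameter values are those listed above; everything else is an assumption made for illustration: the feature matrix is synthetic stand-in data with the same shape as the training set (30 bridges, five features), and the scoring metric is left at the GridSearchCV default (accuracy), since the metric used for tuning is not stated.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Hyperparameter values as pre-defined in the text.
param_grid = {
    "n_estimators": [5, 10, 30, 50],
    "max_depth": [3, 5, 10, 30, 50],
    "random_state": [0, 7, 42],
}

# Synthetic stand-in for the training set: 11 collapsed (1), 19 surviving (0).
rng = np.random.default_rng(0)
y_train = np.array([1] * 11 + [0] * 19)
X_train = rng.normal(size=(30, 5))
X_train[y_train == 1] += 1.5  # make the two classes roughly separable

# Three-fold cross-validation over all 4 x 5 x 3 = 60 combinations.
search = GridSearchCV(RandomForestClassifier(), param_grid, cv=3)
search.fit(X_train, y_train)
print(search.best_params_)

# Impurity-based importance of each input feature in the refitted best model.
importances = search.best_estimator_.feature_importances_
print(importances)
```

The `feature_importances_` attribute of the refitted estimator gives the impurity-based importance, which is the kind of ranking used to decide which feature to drop in the next fitting round.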
A comparison of the three RF models using different numbers of features is shown in Table 4. Four indices were used to evaluate the performance of the models: the recall and precision of the collapsed bridges, the accuracy, and the kappa coefficient. In this study, the collapsed class was set as positive. A high recall means the collapsed bridges were detected successfully, and a high precision means the surviving bridges were not misidentified as collapsed. Since there were about three times more surviving bridges than collapsed bridges, we adopted the kappa coefficient to describe the performance of the models. According to Table 4, the RF models using different numbers of features obtained the same results on the test data. As the model using four features had higher accuracy on the training set, this model was applied to the Tohoku earthquake validation set. The confusion matrix is shown in Table 5. Five out of seven collapsed bridges were detected successfully, but 12 surviving bridges were misclassified as collapsed. The kappa coefficient was 0.25, showing only fair agreement.
For the LR models, the best combination of two hyperparameters was examined: C and random_state. "C" is the inverse of regularization strength, wherein smaller values specify stronger regularization. "random_state" is the same as that in the RF models. "C" was pre-defined as 10^n, where n is an integer from −5 to 6. "random_state" was pre-defined as an integer from 0 to 101. The hyperparameters were tuned by three-fold cross-validation. The threshold value of the logistic function was set as 0.5. First, the LR model was fed all five features. Seven out of 11 collapsed bridges in the training set and two out of three collapsed bridges in the test set were detected successfully, without a commission error. The feature d_m was deleted, due to its high p-value. The new LR model, using four features, detected nine collapsed bridges in the training set and three collapsed bridges in the test set; meanwhile, one surviving bridge in the training set and one in the test set were misclassified. The third LR model, using the three features r, µ_d, and p with the lowest p-values of the five features, was then considered. The result was the same as that using the four features. Considering that less processing time is taken when using fewer features, the LR model with three features was adopted for verification. A comparison of the three LR models is shown in Table 6. The confusion matrix for the validation set is shown in Table 7. Five out of seven collapsed bridges were detected, whereas 10 surviving bridges were misclassified. This result was better than that obtained with the RF model, although the kappa coefficient was still low (0.3).
Comparing the results of the RF and LR models using data set 1, the RF models obtained a better accuracy for the training set, whereas the LR models obtained better results on the test set. Although all the models obtained substantial agreement with the test set, they had difficulty with the validation set, which consisted of the damage data set for the other event. The LR model using the three features r, µ_d, and p was the best model when fitted with data set 1. Using this model, 12 out of 14 collapsed bridges were detected for the July floods, and five out of seven collapsed bridges were detected for the Tohoku earthquake. The omitted collapsed bridges were K23, K30, M163, and M164. For all of the target bridges, the recall of the collapsed bridges was 0.81 and the precision was 0.59. The accuracy was 0.82 and the kappa coefficient was 0.56, showing only a moderate level of agreement.
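The four evaluation indices follow directly from the confusion matrix, as the sketch below illustrates. The counts are derived from the figures reported above for the three-feature LR model over all 88 target bridges: 17 of 21 collapsed bridges detected (12 + 5), and 12 of 67 surviving bridges misclassified (one in the training set, one in the test set, and 10 in the validation set). The function name `evaluate` is hypothetical.

```python
def evaluate(tp, fn, fp, tn):
    """Recall, precision, accuracy, and Cohen's kappa, with the collapsed
    class as positive."""
    n = tp + fn + fp + tn
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    kappa = (accuracy - expected) / (1.0 - expected)
    return recall, precision, accuracy, kappa


recall, precision, accuracy, kappa = evaluate(tp=17, fn=4, fp=12, tn=55)
print(round(recall, 2), round(precision, 2), round(accuracy, 2), round(kappa, 2))
# prints: 0.81 0.59 0.82 0.56
```

The kappa coefficient stays well below the accuracy because most bridges survived, so a large share of the raw agreement would be expected by chance alone.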

Models Using Data Set 2
Next, the fitting of the RF and LR models using data set 2 was conducted. The hyperparameters were pre-defined using the same values as those for data set 1. Hyperparameter tuning was carried out using the training and test sets. Then, the best combination of hyperparameters was applied to each model. Examination of the features was also carried out, by reducing the number of input features. Comparisons of the RF and LR models are shown in Tables 8 and 9, respectively. For the LR models, the threshold value of the logistic function was also set as 0.5.
For the training set, the RF model using all the features and the one using four features obtained the same results, wherein seven out of 10 collapsed bridges were successfully identified. The model using three features detected six collapsed bridges, which was worse than the other models. For the test set, the model using all the features classified three out of five collapsed bridges correctly, which was better than the other two models. Thus, the RF model using all five features showed the best discrimination ability for data set 2. These models were applied to the validation set. Five out of six collapsed bridges were detected by the best RF model and the model using the three features (r, d_m, and p). Only one surviving bridge (K31) was misclassified as collapsed.
As the validation set was selected randomly from the target bridges, we redrew the validation set 10 times to improve the reliability of the evaluation. The RF model using all the features was then applied to each of those validation sets. The best discrimination result was obtained for a validation set that included four collapsed bridges and 23 surviving bridges, wherein all the bridges were successfully classified. The worst result identified two out of five collapsed bridges and all 22 surviving bridges. In that case, the recall of the collapsed bridges was 0.40, and the accuracy was 0.89. The kappa coefficient of the worst result was 0.52, showing moderate agreement. For the other nine validation sets, the RF model had kappa coefficients higher than 0.60, indicating substantial agreement. The average values of the evaluation indices over the 10 validation sets were used as the final accuracy of the best model. The recall of the collapsed bridges was 0.71, and the precision was 0.94. The accuracy was 0.91, and the kappa coefficient was 0.74.
For the LR models, the three models using different numbers of features obtained the same results. Five out of 10 collapsed bridges in the training set and three out of five collapsed bridges in the test set were detected. No surviving bridges were misclassified as collapsed. For the validation set, these models also obtained the same results, wherein four out of six collapsed bridges were detected successfully and one surviving bridge was misclassified. As the results were identical, we chose the model with the fewest features and applied the LR model using the three features r, µ d , and p to the 10 random validation sets. The best discrimination result was obtained for the validation set including four collapsed bridges and 23 surviving bridges, wherein all the bridges were successfully classified. The worst result identified two out of five collapsed bridges and 21 out of 22 surviving bridges. In that case, the recall of the collapsed bridges was 0.40, and the accuracy was 0.85. The kappa coefficient of the worst result was 0.42, showing moderate agreement. For four out of the 10 validation sets, the LR model had a kappa coefficient lower than 0.6, and it exceeded 0.8 for only one validation set, the one that gave the best accuracy. The average value of the recall was 0.70, and the precision was 0.89. The accuracy was 0.88, and the kappa coefficient was 0.62. This accuracy was worse than that of the RF model using all the features. Thus, the RF model using the five features was the best model when fitted using data set 2.
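The agreement labels attached to the kappa coefficients (moderate, substantial, almost perfect) match the widely used Landis and Koch bands. The short sketch below assumes those bands; the function name `kappa_agreement` is hypothetical and only illustrates how the reported values map to the labels.

```python
def kappa_agreement(kappa):
    """Map a kappa coefficient to an agreement label.

    Assumes the Landis and Koch bands; the text does not state
    which scale was used, so this is an illustrative assumption.
    """
    if kappa < 0.0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1.0")


# Kappa values reported for the validation sets of data set 2.
for name, value in [("RF worst", 0.52), ("RF average", 0.74),
                    ("LR worst", 0.42), ("LR average", 0.62)]:
    print(name, value, kappa_agreement(value))
```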
Using the best RF model, 11 out of 14 collapsed bridges were detected for the July floods, and four out of seven collapsed bridges were detected for the Tohoku earthquake. The omitted collapsed bridges were K21, K23, K30, M160, M163, and M168. The only misclassified surviving bridge was K31. For all the target bridges, the recall of the collapsed bridges was 0.71, and the precision was 0.94. The accuracy was 0.92, and the kappa coefficient was 0.76, indicating substantial agreement.
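The indices quoted for all the target bridges can be reproduced from the counts given in the text (88 bridges, 21 collapsed, 15 detected, one surviving bridge misclassified). The sketch below is a minimal illustration; the helper name `binary_metrics` is hypothetical, and kappa is computed with the standard Cohen formula.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return recall, precision, accuracy, and Cohen's kappa for a 2x2 table."""
    total = tp + fn + fp + tn
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / total
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total**2
    kappa = (accuracy - p_chance) / (1.0 - p_chance)
    return recall, precision, accuracy, kappa


# Counts taken from the text: 88 target bridges, 21 of them collapsed.
tp = 11 + 4          # detected collapsed bridges (July floods + Tohoku)
fn = 21 - tp         # omitted collapsed bridges (K21, K23, K30, M160, M163, M168)
fp = 1               # surviving bridge K31 classified as collapsed
tn = 88 - 21 - fp    # correctly classified surviving bridges

recall, precision, accuracy, kappa = binary_metrics(tp, fn, fp, tn)
print(round(recall, 2), round(precision, 2), round(accuracy, 2), round(kappa, 2))
# prints: 0.71 0.94 0.92 0.76
```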

Comparison and Improvement of the Models
According to a comparison of the RF models and LR models using different numbers of features, the LR model using the features r, µ d , and p was the best model when fitted with data set 1, whereas the RF model using all the features was the best when fitted with data set 2. The LR model with three features obtained high accuracy for the test data, where the recall was 1.00 and the kappa coefficient was 0.81; however, this model had difficulty in classifying the collapsed bridges in the other event. The RF model trained using the collapsed bridges from both events identified more than 60% of the collapsed bridges in the training and test data, and it obtained high accuracy in the validation data. Considering the high kappa coefficient in the validation, the RF model using all the change features (i.e., r, µ d , σ d , d m , and p) was the best of the twelve proposed models.
The pre- and post-event TSX intensity images and the post-event aerial photos taken by the GSI [43] for the collapsed bridges M160, M163, and M168 are shown in Figure 6. These bridges could not be identified, even by the best RF model. From the aerial photos of the bridges M160 and M163, we recognized that only part of the deck was washed away. The changed regions were small compared with the whole bridges, which made them difficult to identify. The bridge M168 was a small bridge, 16.8 m in length and 3.7 m in width [1]. As the spatial resolution of the TSX images is about 3 m, the reflection region from the bridge was very small. Although the whole deck was washed away, it was still difficult to identify.
Figure 6. Three partly collapsed or washed-away bridges that could not be identified by the best RF model using five features for the data set 2 [43].
The pre- and post-event ALOS-2 intensity images and the post-event ground photos of the collapsed bridges K23 and K30 are shown in Figure 7. They were not identified by the best RF model. One surviving bridge, K31, which was misclassified as collapsed, is also shown in Figure 7. The ground photos were taken by one of the co-authors on 17 September 2020. Bridge K23, which was located in the radar shadow of the mountains, had weak backscatter in the pre-event ALOS-2 image. Although the deck was completely washed away, the change in the intensity images was slight. Thus, it was classified as surviving. The collapsed bridge K21 was omitted for the same reason. Bridge K30 was a partly collapsed old bridge. One span of the deck, connecting to the embankment, had failed. The collapsed part was narrow and close to the land; hence, it was missed by the model. Bridge K31 was the only surviving bridge classified as collapsed. According to the ground photo, there was no significant damage. However, the backscatter patterns of the pre- and post-event intensity images were different. This change had two causes, both related to the conditions at the time of the post-event ALOS-2 image acquisition. One was the flooding over the embankment, which reduced the backscatter from the east part of the outline. The other was the high water level, which rose close to the deck and reduced the double- and triple-bounce backscatter. It is quite rare for a SAR image to be acquired under flooding conditions.

Figure 7. Two collapsed bridges, K23 and K30, that could not be identified by the RF model using five features for data set 2, and one surviving bridge, K31, which was misclassified as collapsed.
The RF model using the five change features of data set 2 showed good capability for the detection of collapsed bridges, with a kappa coefficient of 0.76. However, as our objective is the detection of collapsed bridges, a recall of 0.71 is still insufficient. Thus, we introduced an oversampling technique to improve the model. The synthetic minority over-sampling technique (SMOTE) is an oversampling approach that creates synthetic examples for the minority class [51]. There were 10 collapsed bridges and 38 surviving bridges in the training set of data set 2. We used the SMOTE function in the Imbalanced-learn library to increase the number of collapsed bridges to 38, the same as the number of surviving bridges in the training set [52]. Then, the RF model was trained using the five features from the 76 samples.
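The idea behind SMOTE can be sketched in a few lines: each synthetic sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. The code below is a simplified, standard-library illustration and not the Imbalanced-learn implementation (there the operation is exposed as `SMOTE` in `imblearn.over_sampling`, used through `fit_resample`). The feature values are random placeholders, and the function name `smote_like`, the neighbour count `k=5`, and the seeds are assumptions, as the settings used in this study are not stated.

```python
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours by squared Euclidean distance.
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, mate)))
    return synthetic


# Placeholder data: 10 collapsed bridges with five change features each.
data_rng = random.Random(42)
collapsed = [tuple(data_rng.uniform(0.0, 1.0) for _ in range(5)) for _ in range(10)]
n_surviving = 38

# Oversample the collapsed class until it matches the surviving class.
synthetic = smote_like(collapsed, n_new=n_surviving - len(collapsed))
balanced_collapsed = collapsed + synthetic
print(len(balanced_collapsed), len(balanced_collapsed) + n_surviving)  # prints: 38 76
```

Because every synthetic sample is an interpolation between two real collapsed bridges, oversampling adds no new information about damage patterns; it only rebalances the class weights seen by the classifier.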
The accuracies of the results are shown in Table 10. The kappa coefficient for the training set decreased from 0.79 to 0.75, whereas it increased from 0.65 to 0.68 for the test set. For the training set, eight out of 10 collapsed bridges were identified successfully, with two misclassified surviving bridges. For the test set, four out of five collapsed bridges were detected, whereas one surviving bridge was misclassified. For the validation set, all six collapsed bridges were identified, and two surviving ones were misclassified as collapsed. The recall of the collapsed bridges reached 1.0. The obtained model was applied to 10 validation sets, which were divided randomly. The minimum recall was 0.80, and the minimum kappa coefficient was 0.70, representing better accuracy than the original RF model. The final recall of the validation set was 0.96, and the precision was 0.80. The accuracy was 0.93, and the kappa coefficient was 0.82, indicating almost perfect agreement.
Using the modified RF model, 13 out of 14 collapsed bridges were detected in the July floods event, and five out of seven collapsed bridges were detected in the Tohoku earthquake. The omission errors for collapsed bridges were K30, M160, and M163. The number of misclassified surviving bridges increased to five: K3, K31, K38, M10, and M16. For all the target bridges, the recall of the collapsed bridges was 0.95, and the precision was 0.80. The accuracy was 0.93, and the kappa coefficient was 0.82, indicating almost perfect agreement. Therefore, this model showed sufficient capability for the detection of collapsed bridges.
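The F-score for collapsed bridges follows from the precision and recall reported for all the target bridges. As a worked check, assuming the usual harmonic-mean definition with precision P = 0.80 and recall R = 0.95:

```latex
F = \frac{2PR}{P + R}
  = \frac{2 \times 0.80 \times 0.95}{0.80 + 0.95}
  = \frac{1.52}{1.75}
  \approx 0.87
```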
We applied the random forest and logistic regression methods to two different data sets. When fitting with data set 1, the RF models using four features and the LR models using three features obtained the best results. The RF models showed better accuracy in the training step; however, their accuracies in the test step were lower than those of the LR models. When fitting with data set 2, the RF model using five features obtained the best accuracy, whereas all the LR models obtained the same results, regardless of input features. The accuracy of the test data using data set 1 was better than that using data set 2. Conversely, the accuracy of validation using data set 2 was better than that using data set 1. The training and test data of data set 1 came from a single event, the July floods. The variation in damage patterns and surrounding environment was small, which led to good accuracy in the training and test steps. When those models were applied to the other event, the Tohoku earthquake, their accuracy decreased significantly. For the models trained using the mixed data from the two events (data set 2), the accuracies in the training and test steps were moderate. However, these models were more versatile, which led to better validation accuracy. According to these comparisons, we can summarize the characteristics of the models as follows: (1) when the training data were limited, the LR model with a few features showed a good capability to describe the data; (2) the RF models were better able to handle varied data sets; and (3) increasing the diversity of the training data improved the performance of the model.
After applying the oversampling approach to the training data for data set 2, the recall, accuracy, and kappa coefficient of the RF model increased. Only the partially collapsed bridge, K30, which was affected by the July floods, could not be identified. However, the commission error increased from one to five bridges, including two bridges affected by the July floods and three bridges affected by the Tohoku earthquake. Comparing the importance of the five features in the RF model and the modified RF model, the third most important feature changed from the correlation in the RF model to the average of difference in the modified RF model. This might be the reason for the improvement in the recall. The modified RF model showed very high capability for detecting collapsed bridges.
In the previous study [28], thresholding of the correlation coefficient was used to detect severely damaged bridges for the Tohoku earthquake. The threshold value was defined by a statistical analysis. The recall of the severely damaged bridges was 89%, but the kappa coefficient was only 0.43. Compared with the result in [28], the best model in this study achieved 95% recall and a kappa coefficient of 0.82, showing significant improvement. In addition, the thresholding method needs manual definition of the threshold value, whereas the proposed model classifies bridges automatically.
The proposed models have two limitations. First, the outlines of bridges are necessary. In this study, we created the original outlines using GIS data of roads and rivers. When GIS data are not available, it is difficult to generate bridge outlines. The detection of bridges using a deep learning method could provide a solution to this problem [26]. The second limitation is that both pre- and post-event SAR images acquired under the same conditions are required; however, pre-event SAR images are not always available. On the other hand, the proposed method can be applied to different SAR pairs, even those with different wavelengths. As the features were obtained from the changes between the two images, the influence of the acquisition conditions and sensors was reduced. Thus, we could improve our model by combining more damaged bridges from different events using different SAR sensors.

Conclusions
In this study, we used random forest and logistic regression methods to identify collapsed bridges. Two data sets were generated, using the change features from 88 affected bridges in the 2011 Tohoku-oki earthquake and the 2020 July floods, both in Japan. The first data set included training and test data from one event, and validation data from another event. The second data set included mixed data from the two events, which were divided into training, test, and validation data randomly. The input data of the models were five features determined by the change detection step. After a comparison of different numbers of input features, the random forest model using five features for data set 2 showed the best validation accuracy. Fifteen out of 21 collapsed bridges were identified successfully, with one misclassified surviving bridge. Then, an oversampling approach was introduced to the training data to balance the number of samples for the two classes. The new model, trained by the modified training data, could identify more collapsed bridges, showing almost perfect agreement with the true data. In the future, we intend to keep collecting the data of damaged bridges to improve the versatility of the RF model.