Towards Automatic Landslide-Quake Identification Using a Random Forest Classifier

Abstract: Landslide-generated seismic waves (landslide-quakes), exhibiting distinctive waveforms and frequency characteristics, can be recorded by nearby seismometers. Implementing an automatic classifier for landslide-quakes could help provide objective and accurate initiation times of landslides efficiently. This study collected and analyzed the time information of 214 landslide seismic records due to 33 documented landslide events, from the Broadband Array in Taiwan for Seismology (BATS). In addition, equal numbers of earthquake and noise signals were also incorporated. The 642 seismic signals and time information were carefully examined using the random forest algorithm to create an automatic landslide-quake classifier. By validating the signal attributes of the landslide, earthquake, and noise events, specifically in the time and frequency domains, it was shown that the proposed classifier can reach an accuracy (the proportion of all correctly classified events to the total number of events) of 91.3%. To further evaluate the applicability of the automatic classifier, landslide-quakes generated during the devastating Typhoon Morakot (2009) and Typhoon Soudelor (2015) were also verified, showing that the sensitivity of the classifier is higher than 98%.


Introduction
Landslides are one of the most frequent geohazards in mountain regions, often causing severe damage to facilities and threatening human lives [1,2]. There are two critical landslide triggers, earthquakes and heavy rains, which occur frequently in Taiwan due to the subtropical monsoon climate and the location on the convergent boundary between the Eurasian plate and the Philippine Sea plate [3][4][5]. In previous studies of landslides, whether focused on the collapse mechanism or the rainfall conditions for landslides, the occurrence time of a landslide event was a critical, but hardly accessible, piece of information [6]. Nevertheless, it was indicated that seismic activities causing ground vibrations, such as earthquakes, landslides, and volcano-tectonics, could be recorded as seismic signals by a seismograph, and the features of a seismic signal originating from different seismic sources are distinctive and discriminable for seismic event classification [7][8][9]. Note that the occurrence time of a landslide event has generally been estimated by interviewing local witnesses in a field survey [6]. If the characteristics of a landslide event could be detected and identified from its seismic signal, a more precise and objective estimation of the occurrence time could be acquired [10], and uninhabited areas would have a higher possibility of being investigated.
Given the large volume of seismic signals, manual interpretation and classification of seismic records can be time-consuming and highly subjective. Consequently, an intelligent alternative, such as an automatic classifier utilizing machine learning algorithms for seismic event classification, would be helpful in characterizing the seismic signals and understanding the seismic activities.
In this study, the Broadband Array in Taiwan for Seismology (BATS) was utilized to distinguish landslide events from cases of earthquakes and noise. First, three kinds of seismograms, those containing signals of landslides, earthquakes, and noise, were collected for a training dataset. Then the attributes of the training dataset were calculated from the waveforms or frequency spectra to construct and to train an automatic classification model. Finally, seismic network data collected during Typhoon Morakot in 2009 and Typhoon Soudelor in 2015 were analyzed to examine the model, and the attribute performance was evaluated to determine which attributes are the most discriminant for the classification.

Seismic Data
The seismic data used were recorded by the Broadband Array in Taiwan for Seismology (BATS), which has been deployed by Academia Sinica and Taiwan's Central Weather Bureau (CWB) since 1992 [11]. BATS provides high-precision continuous seismic records to document the ground vibrations of a landslide. The array features 39 seismic stations comprising a dense seismic network with a mean inter-station distance of 30 km and a coverage area of approximately 350 km by 400 km (Figure 1). All permanent BATS stations are equipped with state-of-the-art very-broadband sensors with a frequency band of 0.00833-8 Hz. Data streams with a sampling rate of 20 samples per s were recorded by the Q330 recorder. All stations have an internet connection for immediate retrieval of real-time data. In this study, we used the vertical component of the seismic records. The seismic sensor in each station has its own instrumental response, and the BATS website provides the response information for the sensors in each station. Since this study focused on massive landslides having a disturbed area larger than 10 ha on the island of Taiwan, the off-island stations were excluded and suitable stations with qualifying records were selected.
Landslide-generated seismic waves (landslide-quakes) can be attributed to the shear force and loading on the ground surface as the mass moves downslope. The seismic amplitude gradually rises from the ambient noise level to the peak, exhibiting a cigar-shaped envelope. After the peak amplitude, most landslide-generated seismic signals experience relatively long decay times. In the frequency domain, landslide-quake energy is mainly distributed below 10 Hz, and the signature in the spectrogram appears triangular due to an increase in high-frequency constituents over time [7][8][9]. This distinctive triangular signature helps differentiate landslide-induced signals from those produced by earthquakes and ambient noise. Figure 2 presents the waveforms and spectrograms recorded by four seismic stations from the catastrophic Xiaolin landslide, with an area of 252 ha, at 22:16 on 8 August 2009 during Typhoon Morakot. Similar seismograms and spectrograms of the Xiaolin landslide-induced seismic signals were recorded by the YULB and YHNB stations, located 70 km and 185 km, respectively, from the landslide, as shown in Figure 2a,b. However, the signals recorded by the RLNB (85 km) and TWKB (135 km) stations, whose distances lie between those of the above-mentioned stations, show indistinguishable fluctuations in association with this landslide event. These indistinguishable fluctuations can be explained by seismic site effects or by the disturbance of ambient noise from sea waves or human activities. The short-period (<10 s) ambient noise generated by sea waves has been shown to influence all seismic records on the island of Taiwan [12]. The TWKB station is located on the coastline and close to a famous sightseeing spot, so the ambient noise from sea waves and the city is more significant there.
By examining whether each seismic station could clearly observe the characteristics of the seismic signals generated by the known massive landslides, eleven seismic stations, shown in Figure 1, were selected for the investigation.

Random Forest Algorithm and Training Sample Collection
Monitoring landslide-induced seismic signals to provide successful early warnings of landslide occurrences has been a challenge in the scientific community for many years. The random forest algorithm is one of the useful machine learning algorithms for discriminating various kinds of seismicity (e.g., volcanic earthquakes, tectonic earthquakes, rockfalls, landslides, and artificial noise) [13][14][15]. The idea of the random forest algorithm is to build a large collection of de-correlated decision trees and take a majority vote in order to reduce the variance of the classification results [16,17]. An illustration is shown in Figure 3. First, bagging, which means randomly sampling with replacement and outputting the result of a majority vote, was used to generate k training subsets from a known training dataset containing N records, wherein the size of every subset was the same as that of the dataset. Because of bagging, some samples in a subset might be repeated, while some samples might never be selected. This ensured some discrepancy among the training subsets, none of which was identical to the dataset. To produce de-correlated decision trees, only m of the M attributes were chosen randomly without replacement for each decision tree (m < M). Through bagging and the random, partial selection of attributes, every decision tree was trained on a different subset of records and attributes, which ensured decorrelation among the trees in the forest. Each decision tree output a predicted result, and the majority of all predicted results determined the final result of the classification. A total of 100 decision trees were adopted for the random forest algorithm in the present study.
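The bagging, attribute sampling, and voting steps described above can be sketched with the Python standard library alone. This is only an illustration, not the implementation used in the study: the function names are hypothetical, the decision trees themselves are omitted, and the value m = 5 is an assumed placeholder because the text only requires m < M.

```python
import random
from collections import Counter

def bootstrap_indices(n_records, rng):
    """Draw n_records indices with replacement (one bagging subset)."""
    return [rng.randrange(n_records) for _ in range(n_records)]

def random_attributes(n_total, n_selected, rng):
    """Pick n_selected of n_total attribute indices without replacement."""
    return sorted(rng.sample(range(n_total), n_selected))

def majority_vote(predictions):
    """Return the most common label among the per-tree predictions."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)            # fixed seed, for reproducibility only
N, M, k = 642, 24, 100            # records, attributes, trees (from the text)
m = 5                             # ASSUMPTION: any value with m < M

subsets = [bootstrap_indices(N, rng) for _ in range(k)]
attribute_sets = [random_attributes(M, m, rng) for _ in range(k)]

# Sampling with replacement leaves part of the dataset out of each subset.
unique_fraction = len(set(subsets[0])) / N

# Hypothetical votes from 100 trees for a single seismic record.
votes = ["landslide"] * 58 + ["earthquake"] * 30 + ["noise"] * 12
print(majority_vote(votes))       # landslide
```

In expectation, a bootstrap subset of size N contains roughly 63% of the distinct records, which is why no subset is identical to the full dataset.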
To build an automatic classifier, the samples in a training dataset should be sufficiently representative of the event to be classified; therefore, we collected documented landslide events in the previous literature and interpreted those seismic signals from 11 seismic stations. The ground motions caused by the landslides may not be observed in all seismic stations. Therefore, the seismic traces for training were carefully collected only if the seismic signals exhibit distinct waveforms and spectrogram features induced by landslides. The seismic traces are excluded if the unique waveform characteristics cannot be observed in the records of the seismic stations. In the end, 214 seismic traces of 33 landslide events were collected for training the class of landslide.
Since similarly sized training samples for different classes can yield better classification results, 214 seismic traces of 45 earthquake events, documented by the CWB, were chosen from the 11 seismic stations. The magnitudes of the 45 earthquakes ranged from ML 1.4 to 5.1. All of the earthquake events used are regional earthquakes in or around the island of Taiwan. The farthest epicentral distance was 320 km, and the mean epicentral distance was 112 km. Among the 214 seismic traces of earthquakes, only 19 (~9%) were collected from stations with an epicentral distance larger than 200 km. In addition, 214 seismic traces of noise were selected as training samples. For background noise, we randomly collected seismic traces without apparent spikes.
A total of 642 seismic records comprised the training dataset with three equally sized classes, and each seismic record was preprocessed and cut into a 5-min signal. Preprocessing included removing the mean, detrending, and removing the instrumental response of the data. Then the attributes in the time domain or in the frequency domain were calculated from the seismic signals. The flowchart of the methodology is presented in Figure 4. We first collected a training dataset consisting of three classes, namely, landslides, earthquakes, and noise, and then the attributes of the preprocessed data were extracted to build an automatic classifier using the random forest algorithm. Finally, the seismic records collected during Typhoon Morakot in 2009 and Typhoon Soudelor in 2015 were classified automatically by the trained classifier to acquire the occurrence times of the landslides. The classification results of the automatic classifier validation are discussed in the following sections.
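A minimal sketch of the mean removal, linear detrending, and 5-min cutting steps is given below, assuming the record is a plain list of samples at 20 samples per s. The function names are hypothetical, and the removal of the instrumental response is deliberately left out because it depends on station metadata that is not reproduced here.

```python
def remove_mean(x):
    """Subtract the arithmetic mean from every sample."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]

def detrend_linear(x):
    """Subtract the least-squares straight line fitted to the samples."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    sxx = sum((i - t_mean) ** 2 for i in range(n))
    sxy = sum((i - t_mean) * (v - x_mean) for i, v in enumerate(x))
    slope = sxy / sxx
    return [v - (x_mean + slope * (i - t_mean)) for i, v in enumerate(x)]

def cut_window(x, start, fs=20, length_s=300):
    """Return a 5-min (300 s) segment beginning at sample index `start`."""
    return x[start:start + fs * length_s]

# Synthetic record with an offset and a drift, used only for illustration.
record = [3.0 + 0.5 * i for i in range(10000)]
segment = cut_window(detrend_linear(remove_mean(record)), start=100)
print(len(segment))   # 6000
```

A 5-min window at 20 samples per s holds 6000 samples, which is the length every training record would have after cutting.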

Time-Domain Attributes
Kao et al. [18] developed an algorithm to automatically detect and characterize seismic waveforms associated with episodic tremor and slip events in northern Cascadia. Two significant attributes in the time domain were utilized to determine the pattern of a waveform: the moving average (MA) and the scintillation index (SI). MA is a common smoothing process for spotting trends in seismic signals. Before calculating the MA, a fourth-order band-pass Butterworth filter with a frequency band of 1-5 Hz was applied to best manifest landslide signals [19]. The filtered seismic records were normalized by the mean of the eight largest amplitudes to limit the influence of anomalies. In this study, the mean of the absolute amplitudes of all samples within a specific time window was calculated and assigned to the center point of the time window. The moving average can be expressed as:

$$\mu_{|y|}(i) = \frac{1}{N} \sum_{j=i-t/2}^{i+t/2} |y(j)|$$

where $\mu_{|y|}(i)$ is the value of the MA for a time window centered at point i, |y(j)| is the absolute value of the normalized amplitude at point j, N is the total number of samples within the time window, and t is the length of the time window. Conventionally, the time window has to be long enough to catch the trend of the variation in the waveform. In our case, the length of the time window was 60 s, so each window contained 1200 data points at a sampling rate of 20 samples per s. Figure 5 shows the moving average functions for the three classes of seismic events. The moving average functions are noticeably smooth. The background noise case shows only small fluctuations, and its mean value is greater than that of the two other cases. In contrast, the earthquake and landslide functions are significantly higher during the event and remain low outside the event duration.
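The normalization and moving-average steps can be sketched as follows. This is an illustrative reading of the description rather than the study's code: the band-pass filtering is omitted, the function names are hypothetical, and the sketch assumes that MA values are produced only where a full window fits inside the record.

```python
def normalize_by_largest(y, n_largest=8):
    """Scale y by the mean of its n_largest absolute amplitudes."""
    largest = sorted((abs(v) for v in y), reverse=True)[:n_largest]
    scale = sum(largest) / len(largest)
    return [v / scale for v in y]

def moving_average_abs(y, window):
    """Mean of |y| over every full window of `window` samples.

    Element k of the result covers samples k .. k + window - 1, so it
    belongs to the window centre at sample k + window // 2.
    """
    prefix = [0.0]
    for v in y:
        prefix.append(prefix[-1] + abs(v))
    return [(prefix[k + window] - prefix[k]) / window
            for k in range(len(y) - window + 1)]

fs = 20                      # samples per s (from the text)
window = 60 * fs             # 60 s window -> 1200 samples
signal = [(-1) ** i * 0.5 for i in range(6000)]   # synthetic 5-min trace
ma = moving_average_abs(normalize_by_largest(signal), window)
print(len(ma))               # 4801
```

The prefix-sum form keeps the cost linear in the record length, which matters little for one 5-min trace but helps when many records are processed.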
For earthquakes, the noise level in the moving average function is below 0.1, and the MA value during the event is higher than 1.0. For landslides, the noise level in the moving average function ranges between 0.5 and 1.0, and the MA value during the event can be higher than 1.5. The fluctuation of the moving average function of the earthquake is significantly larger than that of the landslide. Following Kao et al. [18], four attributes were found to be important in characterizing the MA function: the mean (μMA), the standard deviation (σMA), the moving average ratio (MAR), and the ratio of the standard deviation to the mean (σMA/μMA), which can be expressed as:

$$\mu_{MA} = \frac{1}{K} \sum_{i=1}^{K} \mu_{|y|}(i), \qquad \sigma_{MA} = \sqrt{\frac{1}{K} \sum_{i=1}^{K} \left( \mu_{|y|}(i) - \mu_{MA} \right)^2}, \qquad MAR = \frac{MA_{max}}{\mu_{MA}}$$

where K is the total number of data points, and $MA_{max}$ is the maximum of the MA function. The scintillation index (SI), a parameter proposed by Yeh and Liu [20] to measure the intensity of energy bursts of radio waves in the ionosphere, was used to characterize the intensity of the seismic signals. SI is based on the square root of the variance of the absolute amplitude within the time window, normalized by the moving average, and it increases sensitively with variations of the signal intensity during earthquake or landslide events. SI can be written as:

$$SI(i) = \frac{\sqrt{\mu_{|y|^2}(i) - \mu_{|y|}(i)^2}}{\mu_{|y|}(i)}$$

where $\mu_{|y|^2}(i)$ is the MA value of the square of the absolute amplitude, and $\mu_{|y|}(i)$ is the MA value of the absolute amplitude. Four attributes are calculated from the SI function: the mean of SI (μSI), the standard deviation of SI (σSI), the scintillation index ratio (SIR), and the ratio of the standard deviation to the mean of SI (σSI/μSI). In Figure 5, the variations of MA and SI for the noise case are smaller than those for the earthquake and landslide cases, so the attribute values, including σMA, σSI, MAR, and SIR, are lower. Due to the normalization, the ranking of μMA for the different cases follows the rule μMA(noise) > μMA(LS) > μMA(EQ). The ranking of μSI is μSI(EQ) > μSI(LS) > μSI(noise), since SI is associated with the signal magnitude.
In addition to the above eight attributes, the mean of the absolute amplitude of the seismic signal itself serves as an additional time-domain attribute to improve the classification.
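The four summary attributes can be computed from any sampled MA or SI function with a few lines of Python. The sketch below is hedged: the function and key names are hypothetical, the population form of the standard deviation is an assumption, and the ratio attribute is assumed to be the maximum of the function divided by its mean.

```python
import math

def summary_attributes(values):
    """Mean, standard deviation, max/mean ratio and std/mean ratio."""
    k = len(values)
    mean = sum(values) / k
    # ASSUMPTION: population standard deviation (divide by K, not K - 1).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / k)
    return {
        "mean": mean,
        "std": std,
        # ASSUMPTION: the ratio attribute (MAR or SIR) is max / mean.
        "max_ratio": max(values) / mean,
        "std_over_mean": std / mean,
    }

# Synthetic MA function: flat background with one short burst.
ma_function = [1.0, 1.0, 1.0, 5.0]
print(summary_attributes(ma_function)["max_ratio"])   # 2.5
```

A flat, noise-like function gives a ratio close to 1 and a small standard deviation, whereas a function with a pronounced event gives larger values of both, in line with the ranking described for Figure 5.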

Frequency-Domain Attributes
We collected the seismic signals of ten earthquakes (three in M3.0-3.9, four in M4.0-4.9, and three in M5.0-5.9) from the MASB station to calculate their frequency spectra. Since these seismic signals reflect different earthquake magnitudes, we normalized each spectrum to the range of 0-1.0 by its maximum amplitude. Subsequently, the normalized spectra were averaged to obtain the average spectrum for earthquakes. The same approach was used to obtain the spectra for landslide-quakes and background noise. Through this normalization, the average frequency spectra display the energy distribution in the frequency domain for the different classes of seismic events and eliminate the influence of magnitude/size and distance. Figure 6 shows the frequency spectra for the earthquake, landslide, and noise cases, each of which is the average of 10 samples normalized by its maximum. At higher frequencies (> 1 Hz), the energy of an earthquake is greater than that of a landslide, but the situation is reversed at lower frequencies (< 1 Hz). Spectral energies in different frequency ranges were calculated as frequency-domain attributes to characterize the distribution of energy related to a seismic event. For the seismic data of the BATS, the frequency passband is from 0.00833 Hz to 8 Hz. Any records beyond that frequency limit might be distorted, so the upper frequency limit is 8 Hz for the calculation of attributes in the frequency domain.
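The normalize-then-average procedure is simple enough to sketch directly. The function names are hypothetical, the spectra are assumed to share one frequency axis, and the two short spectra in the usage lines are made-up numbers for illustration only.

```python
def normalize_spectrum(spectrum):
    """Scale a spectrum to the range 0-1 by its maximum amplitude."""
    peak = max(spectrum)
    return [v / peak for v in spectrum]

def average_spectra(spectra):
    """Average several spectra after normalizing each by its own maximum."""
    normalized = [normalize_spectrum(s) for s in spectra]
    return [sum(column) / len(column) for column in zip(*normalized)]

# Two made-up spectra with very different absolute amplitudes.
small_event = [1.0, 2.0, 4.0]
large_event = [10.0, 5.0, 0.0]
print(average_spectra([small_event, large_event]))   # [0.625, 0.5, 0.5]
```

Because each spectrum is scaled before averaging, the large event does not dominate the average, which is the purpose of the normalization described above.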
Welch [21] proposed a method for the estimation of power spectral density (PSD). With this method, the energy distribution over frequency is obtained from a continuous waveform through the fast Fourier transform of overlapping time windows. In this study, the frequency interval was 0.01 Hz, the length of the time window was 5 s, and the overlap was 50%. To eliminate the effect of geometric spreading between the epicenter and the seismic station, the PSD was normalized by its maximum.
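To make the Welch procedure concrete, the sketch below averages Hann-windowed periodograms of 5 s segments with 50% overlap and then normalizes by the maximum. It is an illustration only: it uses a direct discrete Fourier transform from the standard library instead of an FFT routine, the Hann taper is an assumption, the one-sided doubling is omitted because the result is normalized anyway, and no zero-padding is applied, so the frequency spacing here is 0.2 Hz and not the 0.01 Hz quoted in the text.

```python
import cmath
import math

def welch_psd(x, fs, nperseg, overlap=0.5):
    """Average the periodograms of overlapping, Hann-tapered segments."""
    if len(x) < nperseg:
        raise ValueError("record is shorter than one segment")
    step = max(1, int(nperseg * (1 - overlap)))
    taper = [0.5 - 0.5 * math.cos(2 * math.pi * n / nperseg)
             for n in range(nperseg)]
    scale = fs * sum(w * w for w in taper)
    n_freqs = nperseg // 2 + 1
    psd = [0.0] * n_freqs
    n_segments = 0
    for start in range(0, len(x) - nperseg + 1, step):
        segment = x[start:start + nperseg]
        mean = sum(segment) / nperseg
        tapered = [(v - mean) * w for v, w in zip(segment, taper)]
        for k in range(n_freqs):
            coeff = sum(v * cmath.exp(-2j * math.pi * k * n / nperseg)
                        for n, v in enumerate(tapered))
            psd[k] += abs(coeff) ** 2 / scale
        n_segments += 1
    freqs = [k * fs / nperseg for k in range(n_freqs)]
    return freqs, [p / n_segments for p in psd]

fs = 20                                   # samples per s (from the text)
x = [math.sin(2 * math.pi * 2.0 * n / fs) for n in range(600)]  # 2 Hz test tone
freqs, psd = welch_psd(x, fs, nperseg=5 * fs)
peak = max(psd)
normalized = [p / peak for p in psd]      # PSD normalized by its maximum
print(freqs[psd.index(peak)])             # 2.0
```

For real records an FFT-based routine from a numerical library would be the practical choice; the direct transform is used here only to keep the example self-contained.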
For landslide analysis, Lin et al. [22] and Kao et al. [23] proposed frequency ranges of 0.02-0.05 Hz and 1-5 Hz as suitable; therefore, seven frequency ranges were chosen for the BATS data according to the limits of the passband (the corresponding attributes are listed in Table 1). As shown in Figure 5, the energy distribution patterns of seismic signals for the three classes are quite different. In addition to the PSD values for the different frequency ranges, the PSD ratios were also used to reflect the difference in the energy distribution of seismic signals in the frequency domain.
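A band attribute and a band ratio can be sketched as below. Only the two ranges named in the text (0.02-0.05 Hz and 1-5 Hz) are used; the function names are hypothetical, and taking the mean PSD inside each band is an assumption, since the text does not state whether the values are averaged or summed.

```python
def band_mean(freqs, psd, f_low, f_high):
    """Mean PSD over the bins with f_low <= frequency <= f_high."""
    values = [p for f, p in zip(freqs, psd) if f_low <= f <= f_high]
    return sum(values) / len(values)

def band_ratio(freqs, psd, band_a, band_b):
    """Ratio of the mean PSD in band_a to the mean PSD in band_b."""
    return band_mean(freqs, psd, *band_a) / band_mean(freqs, psd, *band_b)

# Made-up normalized PSD on a coarse frequency axis, for illustration only.
freqs = [0.02, 0.03, 0.05, 0.5, 1.0, 2.0, 5.0, 8.0]
psd = [0.9, 1.0, 0.8, 0.3, 0.2, 0.1, 0.1, 0.05]

low_band = (0.02, 0.05)     # range named in the text
high_band = (1.0, 5.0)      # range named in the text
ratio = band_ratio(freqs, psd, low_band, high_band)
```

A ratio well above 1 indicates that the energy is concentrated at low frequencies, which is the kind of contrast the PSD ratios are meant to capture.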
Provost et al. [14] used frequencies in the range of a spectral peak as attributes of classification, including the peak frequency (F_max), the highest frequency (F_high), and the lowest frequency (F_low). An example of how these frequencies are determined is shown in Figure 6. First, the peak frequency of a frequency spectrum, calculated from a waveform through the fast Fourier transform, is the frequency of the spectral peak. A threshold value is then determined by multiplying the peak value of the spectrum by 0.2; the higher and lower frequencies at which the spectrum intersects the threshold value are the highest and lowest frequencies, respectively. For example, the lowest frequency can be written as: F_low = minF(PSD(F) < 0.2 × max(PSD)), where PSD(F) is the spectral value at the frequency F. In summary, the 24 attributes listed in Table 1 were considered for the signal characteristics of the different classes of seismic events, including 9 attributes in the time domain and 15 in the frequency domain. Those attributes were calculated from all 642 training records, and a classifier was built using the attributes. The trained classifier could then automatically identify the class of a new seismic signal through its calculated attributes.
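One possible reading of the peak-frequency attributes is sketched below. The function name is hypothetical, and the sketch assumes that the lowest and highest frequencies are the extreme frequencies at which the PSD is at or above 0.2 times its maximum; other readings of the threshold crossing are possible.

```python
def spectral_peak_band(freqs, psd, fraction=0.2):
    """Return (F_max, F_low, F_high) for a sampled spectrum."""
    peak = max(psd)
    f_max = freqs[psd.index(peak)]
    # ASSUMPTION: the band is every bin at or above the threshold.
    in_band = [f for f, p in zip(freqs, psd) if p >= fraction * peak]
    return f_max, min(in_band), max(in_band)

# Made-up spectrum with a single peak at 3 Hz, for illustration only.
freqs = [0, 1, 2, 3, 4, 5, 6]
psd = [0.1, 0.1, 0.5, 1.0, 0.6, 0.1, 0.05]
print(spectral_peak_band(freqs, psd))   # (3, 2, 4)
```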

Classifier Performance
The confusion matrix is the most commonly used tool for presenting the performance of supervised machine learning algorithms. It can effectively and clearly represent the accuracy of a classifier and its sensitivity to various types of events [24]. Given a classifier and a targeted event, there are four possible outcomes. If the event is positive and it is classified as positive, it is counted as a true positive (TP); if it is classified as negative, it is counted as a false negative (FN). If the event is negative and it is classified as negative, it is counted as a true negative (TN); if it is classified as positive, it is counted as a false positive (FP). Given a classifier and a set of validation events, a two-by-two confusion matrix can be created to represent the classification results of the set of events (Table 2). Three indicators (i.e., sensitivity, precision, and accuracy) are often derived from the confusion matrix to evaluate the performance of the classifier. The sensitivity (also called recall) of the classifier is estimated as:

Sensitivity = TP/(TP + FN)

The precision of the classifier is:

Precision = TP/(TP + FP)

The accuracy of the classifier represents the proportion of all correctly classified events to the total number of events, and is calculated as:

Accuracy = (TP + TN)/(TP + TN + FP + FN)

Table 2. Two-by-two confusion matrix.

Manual class    Classified as A    Classified as B
A               True Positive      False Negative
B               False Positive     True Negative
Since the training database used in this study was not large (<1000 samples), K-fold cross validation was employed to use every sample effectively. K-fold cross validation divides the database into K subsets, with the samples of each class having similar proportions in each subset. During the construction of the classifier, one subset is used as the validation set each time, and the remaining K-1 subsets are used as training sets (Figure 7). This procedure is repeated K times; each test yields one classification accuracy, and the K results are averaged to obtain the final classification performance [25]. The K value used in this study was five. After five-fold cross validation, the values in the mean confusion matrix were obtained by summing the confusion matrices of the five tests. The mean sensitivity, precision, and accuracy were the averages of the five tests.
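The stratified five-fold split can be sketched as follows. The function name and the fixed seed are hypothetical choices, and the round-robin assignment is just one simple way to keep the class proportions similar in every fold.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=5, seed=0):
    """Split record indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)   # deal out round-robin
    return folds

labels = ["landslide"] * 214 + ["earthquake"] * 214 + ["noise"] * 214
folds = stratified_folds(labels, k=5)

# Each fold serves once as the validation set; the rest form the training set.
splits = []
for i, validation in enumerate(folds):
    training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
    splits.append((training, validation))

print(sorted(len(fold) for fold in folds))   # [126, 129, 129, 129, 129]
```

With 214 records per class, each fold receives 42 or 43 records of every class, so no validation set is dominated by one type of event.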

Performance of the Automatic Classifier Using 24 Attributes
It is important to note that a confusion matrix may be affected by the number of samples in the various classes. The classification performance may suffer from an unbalanced proportion between noise and the other classes, potentially leading to a biased result. In this study, when calculating the confusion matrix, the numbers of samples in the three classes (earthquake, landslide, and noise) were deliberately kept the same. Table 3 presents the classification result of the training dataset from the trained classifier containing 24 attributes. Of the 214 seismic records of landslide events, 186 records were correctly classified, and 28 records were misclassified as earthquakes or noise. The sensitivity for landslide events was 86.9%. Of the 209 records classified as landslide events by the presented classifier, 186 were real landslide events; however, 19 earthquake events and four noise events were misclassified as landslides. The precision for landslide events was 89.0%. In addition, the sensitivity and precision for earthquake events were 88.8% and 88.4%, respectively; the sensitivity and precision for noise events were 98.1% and 96.3%, respectively. The accuracy of the automatic classifier for the training dataset was 91.3%, and the sensitivity and precision for all three classes exceeded 85%. It is evident that the present automatic classifier succeeded in classifying most of the seismic events in the training dataset. Figure 8 shows the distributions of individual attribute values for the three classes of events calculated from all 642 training records. No single attribute could perfectly discriminate the three classes of events, but to some extent, the differences among the distributions of individual attribute values were associated with the different classes.
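The landslide figures quoted above follow directly from the definitions of sensitivity and precision, treating landslide as the positive class. The short check below uses only counts stated in the text; the function names are hypothetical.

```python
def sensitivity(tp, fn):
    """Proportion of true events of a class that were classified as such."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Proportion of records assigned to a class that truly belong to it."""
    return tp / (tp + fp)

tp = 186        # landslide records classified as landslides
fn = 28         # landslide records classified as earthquakes or noise
fp = 19 + 4     # earthquake and noise records classified as landslides

print(f"{100 * sensitivity(tp, fn):.1f}%")   # 86.9%
print(f"{100 * precision(tp, fp):.1f}%")     # 89.0%
```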
The difference in attribute values between earthquake and noise events was rather large, whereas the attribute values of landslide events tended to lie in between. Thus, a classifier integrating many attributes is necessary for desirable classification results. Two kinds of classifiers were built to evaluate the importance of the attributes. The first was single-attribute classifiers, and the second was 23-attribute classifiers, which were composed of all possible combinations of 23 of the 24 attributes. The relative importance of the attributes in the classification is presented in Figure 9. The attribute numbers refer to Table 1; the first nine are time-domain attributes, and the rest belong to the frequency domain. In Figure 9a, the black line represents the result of the proposed classifier using all attributes, with an accuracy of 91.3%; the blue line indicates the result of the 23-attribute classifier, which omitted the numbered attribute and relied on the remaining 23 attributes; and the red line indicates the result of the single-attribute classifier. In other words, the blue line demonstrates the variation of the accuracy if a single attribute is omitted, and the red line illustrates the accuracy of a single attribute. Through testing of single-attribute classifiers, we observed that the accuracy varied widely among attributes, and the most discriminant attribute was the moving average ratio (MAR, T3), with an accuracy of 83.7%. The least discriminant attribute was the PSD between 0.02 and 0.05 Hz (LF1, F10), with an accuracy of 53.8%. The mean accuracy of the classifiers created from time-domain attributes was higher than that of the classifiers created from frequency-domain attributes.
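The two families of test classifiers correspond to two families of attribute subsets, which can be enumerated as below. The labels T1-T9 and F10-F24 are assumed here from the numbering used in the text (e.g., T3 and F10); the training and scoring of each classifier are not shown.

```python
from itertools import combinations

# ASSUMPTION: attribute labels follow the T1-T9 / F10-F24 numbering.
attributes = ([f"T{i}" for i in range(1, 10)]
              + [f"F{i}" for i in range(10, 25)])

# One subset per single-attribute classifier.
single_attribute_sets = [(name,) for name in attributes]

# All combinations of 23 of the 24 attributes (one attribute omitted each).
leave_one_out_sets = list(combinations(attributes, 23))

print(len(single_attribute_sets), len(leave_one_out_sets))   # 24 24
```

Choosing 23 of 24 attributes is the same as choosing the single attribute to omit, so there are exactly 24 such classifiers, one per point on the blue line in Figure 9a.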
Although some of the single-attribute classifiers had lower accuracy, omitting any one of the attributes decreased the accuracy by at least 2%-3%, which implies that every attribute had a certain amount of influence on the classification result. An attribute contributes an improvement to a classifier if its accuracy is beyond 50%. To clarify the role of an attribute in a classifier, the single-attribute classifiers with the lowest accuracy in the time domain and in the frequency domain are discussed as follows. The accuracy of the σMA single-attribute classifier was 65.6%, as shown in Table 4, but its sensitivity and precision for noise events were higher than 90%. In the confusion matrix, for identifying landslides, the sensitivity value is obtained by dividing the number of correctly identified landslide signals by the number of manually identified landslide signals, whereas the precision value is obtained by dividing the number of correctly identified landslide signals by the number of all signals classified as landslides. The σMA attribute is not very effective for discriminating between earthquake and landslide events, but it does work for noise detection. Comparing the three classes of MA function in Figure 5, the similar hump-shaped evolutions of the earthquake and the landslide result in close standard deviation values, which represent similar levels of dispersion. On the other hand, the MA function for the noise case is a nearly horizontal line with small disturbances, which means a lower level of dispersion and lower standard deviation values. Table 4 shows that the accuracy of the LF1 single-attribute classifier was 53.9%, and its sensitivity for noise events was over 70%. In Figure 6, it can be observed that the noise case had lower energy in the frequency range of 0.02-0.05 Hz than did the earthquake or landslide cases, so the LF1 attribute efficiently separates noise events from landslide and earthquake events.

Influence of Signal Attributes
To examine whether every attribute improves the classification accuracy, all attributes were sorted in descending order of the accuracy of their single-attribute classifiers and added cumulatively to build new classifiers, whose accuracy was then calculated. The accuracy generally increased with the cumulative number of attributes, as shown in Figure 9b. When the attributes F23, F21, F18, F17, and F11 were introduced into the training set, the accuracy decreased slightly (< 0.2%). The accuracy of a classifier is expected to increase as the number of attributes increases; the very slight decreases observed when several attributes (e.g., F16, F11) were included (Figure 9b) were likely due to the grouping of the training dataset (the grouping effect [16]). The small drop in accuracy caused by the grouping effect can be ignored. This means that once some attributes enter the training, their positive effect on the classifier is very small, but they do not have a meaningful negative effect.
Classifiers built from only the time-domain attributes (TD classifier) and only the frequency-domain attributes (FD classifier) were also compared, as shown in Table 4. The TD classifier had a higher accuracy (87.7%) than the FD classifier (84.7%), in agreement with the test of single-attribute classifiers. Comparing the results of the TD and FD classifiers, the sensitivities for earthquakes showed no significant difference. However, the misclassification of landslide and noise cases increased noticeably with the FD classifier. For landslide cases, the sensitivity of the FD classifier was 6.1% lower than that of the TD classifier, and for noise cases, the sensitivity was 3.7% lower.
Since the TD classifiers were better for discriminating the cases of landslides from noise and the FD classifier was better for discriminating earthquakes from noise, a classifier combining attributes in the time and frequency domains together could provide better classification of all three cases.

Typhoon Morakot in 2009
To test the applicability of the present classifier, seismic data collected during Typhoon Morakot on 7-10 August 2009 were adopted. First, manual classification was conducted in accordance with the earthquake catalog of Taiwan's CWB to construct a seismic event catalog. The four days of data contained 193 landslide events and 52 earthquake events. Of the landslide events, 67 were measured and identified by at least three seismic stations, 30 by two seismic stations, and the remaining 96 by single stations. The seismic data during Typhoon Morakot were automatically classified by the TD-FD classifier, and the results had an accuracy of 87.3%, as shown in Table 4. Of a total of 193 landslide events, the TD-FD classifier correctly interpreted 191 landslide events and failed on two events: one was misidentified as an earthquake event, and the other was not detected. The sensitivity for landslides was 98.9%. Of 52 earthquake events, the classifier correctly identified 23 earthquake events and misclassified 29 as landslide events. The sensitivity for earthquake cases was 44.2%. Of a total of 220 records classified as landslide events, 191 were actually landslide cases, and the other 29 were in fact earthquake cases. The precision for landslides was 86.8%. Of 24 records classified as earthquake events, 23 were actually earthquake cases and the remaining one was a landslide case. The precision for earthquakes was 95.8%. It was concluded that the present automatic classifier was more likely to misclassify an earthquake as a landslide than to misclassify a landslide as an earthquake. Figure 10a is an example of a correct classification of a landslide event. The features of a landslide case are obvious in both the waveform and the spectrogram, and the signal is over 20 s in length. Figure 10b is a successful classification of an earthquake event.
The characteristics of an earthquake can be observed clearly in the waveform and in the frequency content. Figure 10c is a landslide event that was not detected by the present classifier because its features were discernible only by visual inspection. Some ambiguous patterns resembling landslide signals can be seen in the waveform and the spectrogram. However, the event could not be recognized by the present classifier because of its short duration and small amplitude. The signal-to-noise ratio of this record is 6.54, which is comparable to the normal values of 4-4.5, indicating that the vibration magnitude of this record is very close to that of a noisy signal. The attributes of this record are similar to the features of a noise case, and the classifier was unable to discriminate it from a noise event. Figure 10d is an example of misclassifying an earthquake as a landslide. This seismic event is recorded in the earthquake catalog of the CWB. The location of the ML 3.1 earthquake is 22.88° N/121.29° E, and the epicentral distance to the SSLB station is 106 km. It is easy to confuse this record with a landslide case on the basis of its waveform and frequency content, since the attributes of this misidentified earthquake event are very close to those of a landslide case.

Typhoon Soudelor in 2015
Since the rainfall and damage caused by Typhoon Morakot occurred in southern Taiwan, the seismic data collected during Typhoon Soudelor on 6-10 August 2015 were used to examine the present classifier for a storm event that occurred in northern Taiwan. Visual detection of events in the earthquake catalog of the CWB indicated 17 landslide events and 70 earthquake events during Typhoon Soudelor. Of the landslide events, 15 records were measured and identified by at least three seismic stations, and the remaining two were detected by single stations. The accuracy of classification was 62.1%, as shown in Table 4. The present classifier succeeded in identifying all 17 landslide events, so the sensitivity for landslide cases was 100%. Of the 70 earthquake events, 37 records were classified correctly, and the remaining 33 records were identified as landslide events. The sensitivity for earthquake cases was 52.9%. Of the 50 records classified as landslide events, 17 were landslide events, and the other 33 were earthquake events. The precision for landslide cases was 34.0%. All 37 records classified as earthquake events were confirmed by visual detection, so the precision for earthquake cases was 100%. Compared with Typhoon Morakot, the results for Typhoon Soudelor had lower accuracy (62.1% versus 87.3%) and lower precision for landslides (34.0% versus 86.8%), but the sensitivities for landslides and earthquakes and the precision for earthquakes were comparable.
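The Typhoon Soudelor figures follow directly from the counts above. The worked arithmetic below is added as a check; the subscript notation is introduced here for illustration and is not taken from the original tables.

```latex
\begin{align*}
\text{Accuracy} &= \frac{17 + 37}{87} = \frac{54}{87} \approx 62.1\% \\
\text{Sensitivity}_{\text{landslide}} &= \frac{17}{17 + 0} = 100\% \\
\text{Sensitivity}_{\text{earthquake}} &= \frac{37}{37 + 33} = \frac{37}{70} \approx 52.9\% \\
\text{Precision}_{\text{landslide}} &= \frac{17}{17 + 33} = \frac{17}{50} = 34.0\% \\
\text{Precision}_{\text{earthquake}} &= \frac{37}{37 + 0} = 100\%
\end{align*}
```

Because landslides made up only 17 of the 87 events in this dataset, the 33 earthquakes labeled as landslides outnumber the true landslide detections. This lowers the landslide precision and the overall accuracy, even though the earthquake sensitivity (52.9%) is somewhat higher than in the Typhoon Morakot test (44.2%).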

Conclusions
In this study, an automatic classifier using a random forest supervised algorithm with a training dataset of 642 seismic signals was built to classify three kinds of events: landslides, earthquakes, and noise. To train the model, 24 attributes in the time and frequency domains were computed and integrated. The present classifier was validated on the training dataset using five-fold cross validation, and the accuracy of 91.3% indicated that this methodology can discriminate the characteristics of the three kinds of events. The sensitivity and precision for landslide cases were 86.9% and 89.0%, respectively. Tests of single-attribute classifiers were performed to clarify the importance of each attribute. We concluded that the influence on the classification result was positive if the accuracy of a single-attribute classifier exceeded 50%, and that the time-domain attributes as a whole were more important than the frequency-domain attributes. Generally, the time-domain attributes had better discriminant ability for the cases of landslides and noise, and the frequency-domain attributes were better for discriminating cases of earthquakes from noise. An integrated classifier combining attributes in both the time and frequency domains achieved better classification results for all three kinds of events. To test the practical application, two datasets collected during different typhoon events were processed by the established automatic classifier. For data collected during Typhoon Morakot, of a total of 193 landslide events identified by human operators, 191 landslide events were correctly classified, one was misclassified as an earthquake event, and one was not detected. For data collected during Typhoon Soudelor, all 17 landslide events were classified successfully, in agreement with the human operator results.
The very high landslide sensitivity of over 98% in both tests demonstrated that the present automatic classifier using the random forest supervised algorithm is feasible: it can provide more precise occurrence times of landslides for historical events, and it shows potential for implementation in a nearly real-time alarm system for large landslides.
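The validation procedure summarized above (a three-class random forest evaluated with five-fold cross validation) can be sketched with scikit-learn. This is an illustrative sketch only, not the authors' implementation: the attribute values are synthetic placeholders drawn from a random generator, and the number of trees, the random seeds, and the stratified shuffling of folds are assumptions that the text does not specify.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)

# Dataset shape from the text: 214 signals per class, 24 attributes each.
n_per_class, n_attributes = 214, 24
labels = np.repeat(["landslide", "earthquake", "noise"], n_per_class)

# Synthetic placeholder attributes: each class gets a different mean offset.
# Real inputs would be the time- and frequency-domain attributes of the signals.
class_offset = {"landslide": 0.0, "earthquake": 1.0, "noise": -1.0}
offsets = np.array([class_offset[c] for c in labels])[:, None]
attributes = rng.normal(size=(labels.size, n_attributes)) + offsets

# Assumed hyperparameters; the text does not state them.
classifier = RandomForestClassifier(n_estimators=100, random_state=0)
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Every signal is predicted once, by a model trained on the other four folds.
predicted = cross_val_predict(classifier, attributes, labels, cv=folds)

accuracy = accuracy_score(labels, predicted)
landslide_sensitivity = recall_score(
    labels, predicted, labels=["landslide"], average=None)[0]
landslide_precision = precision_score(
    labels, predicted, labels=["landslide"], average=None)[0]

print(f"accuracy: {accuracy:.3f}")
print(f"landslide sensitivity: {landslide_sensitivity:.3f}")
print(f"landslide precision: {landslide_precision:.3f}")
```

The scores printed by this sketch reflect only the synthetic data and are not comparable to the 91.3% accuracy reported for the real seismic attributes.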

Conflicts of Interest:
The authors declare no conflict of interest.