Automated Landslides Detection for Mountain Cities Using Multi-Temporal Remote Sensing Imagery

Landslides that take place in mountain cities tend to cause huge casualties and economic losses, and a precise survey of landslide areas is a critical task for disaster emergency. However, because of the complicated appearance of the nature, it is difficult to find a spatial regularity that only relates to landslides, thus landslides detection based on only spatial information or artificial features usually performs poorly. In this paper, an automated landslides detection approach that is aiming at mountain cities has been proposed based on pre- and post-event remote sensing images, it mainly utilizes the knowledge of landslide-related surface covering changes, and makes full use of the temporal and spatial information. A change detection method using Deep Convolution Neural Network (DCNN) was introduced to extract the areas where drastic alterations have taken place; then, focusing on the changed areas, the Spatial Temporal Context Learning (STCL) was conducted to identify the landslides areas; finally, we use slope degree which is derived from digital elevation model (DEM) to make the result more reliable, and the change of DEM is used for making the detected areas more complete. The approach was applied to detecting the landslides in Shenzhen, Zhouqu County and Beichuan County in China, and a quantitative accuracy assessment has been taken. The assessment indicates that this approach can guarantee less commission error of landslide areal extent which is below 17.6% and achieves a quality percentage above 61.1%, and for landslide areas, the detection percentage is also competitive, the experimental results proves the feasibility and accuracy of the proposed approach for the detection landslides in mountain cities.


Introduction
Landslides are a major hazard in lots of countries, causing huge property losses and many casualties every year [1]. Especially in the areas that are close to both cities and mountains, landslides that are triggered by earthquakes or heavy rain happen frequently and cause even more damage, just like those that happened in Wenchuan [2]. Against this background, quick landslides hazard emergency response and accurate risk assessment based on landslide inventories are of great importance and the key of them is to detect critical areas [3,4]. If a landslide area is detected, its specific analysis can be carried out for emergency response and the precautions can be done.
Remote sensing is an ideal tool for landslides detection because it offers a wide area of territorial data with various degrees of both spatial and temporal resolution. It has been widely used in landslides detection with the help of image analysis technology, which can save human resources costs and make the process more automatic. In the past, there are two main approaches that use remote sensing images in landslides detection, one of them only makes use of the post-event data and the landslides areas are identified by classification, such as scene classification method [5], object-oriented analysis and

Study Area and Data
Landslides in Shenzhen, Zhouqu Country, and Beichuan County are chosen to validate the proposed method. These three Chinese cities are mountainous and they suffered significant losses because of landslides. The landslides of Shenzhen dated on 20 December 2015 led to the loss of 69 lives and dozens of buildings in ruins, and the massive landslides coupled with debris flow in Zhouqu County where earthquakes, landslides, and debris flow often occur, happened on 7 August 2010, leaving 1765 people dead or missing, and smashed thousands of houses. Beichuan County was devastated because of the Wenchuan earthquake happened on 12 May 2008, the earthquake triggered lots of landslides that destroyed the city and buried thousands of people. Figure 1 shows the experimental images for landslides detection. The three image sets are processed with registration [24], and then histogram matching is applied to removing the influence of discrepant radiation intensity [25]. Landslides that had taken place in the three sites led to drastic alteration in land surface and obvious sediment transport. In Figure 2a-d, we can see the sediment transport bury vegetation and buildings. Additionally, the landslides detection method uses a digital elevation model (DEM) of 5 m spatial resolution to derive slope information, which can make the landslides detection more reliable [17], and the change of DEM is also used for recognizing the residual part of landslide areas where there are no change of the cover type.

Methodology
The proposed method aims at the automated landslides detection in mountain cities using multi-temporal remote sensing data. Thus, the changes of the earth surface during the analyzed time span should be taken in consideration and the landslide-related phenomena in the study area should be used for extracting the landslide areas. To meet these requirements, the method uses a patch-oriented change detection and conducts a spatial-temporal analysis of the NDVI and PANTEX in the changed patches of the image. The Integral process of the proposed method is showed in Figure 3.
Procedures of the proposed approach for landslides detection are as following: 1. Cloud-covered regions ought to be abandoned in all images. Unconcerned regions with vegetation, water and building are also supposed to be discarded in the post-event image, then the possible landslide areas are obtained (Section 3.1). 2. Extract changed areas by applying change detection with Deep Convolution Neural Network to pre-and post-event images (Section 3.2). 3. STCL aiming at binary temporal NDVI and PANTEX in changed areas is used and the suspicious landslide areas is extracted (Section 3.3). 4. The post process is added to get more reliable and complete landslide areas (Section 3.4).

Methodology
The proposed method aims at the automated landslides detection in mountain cities using multi-temporal remote sensing data. Thus, the changes of the earth surface during the analyzed time span should be taken in consideration and the landslide-related phenomena in the study area should be used for extracting the landslide areas. To meet these requirements, the method uses a patch-oriented change detection and conducts a spatial-temporal analysis of the NDVI and PANTEX in the changed patches of the image. The Integral process of the proposed method is showed in Figure 3.
Procedures of the proposed approach for landslides detection are as following: 1.
Cloud-covered regions ought to be abandoned in all images. Unconcerned regions with vegetation, water and building are also supposed to be discarded in the post-event image, then the possible landslide areas are obtained (Section 3.1).

2.
Extract changed areas by applying change detection with Deep Convolution Neural Network to pre-and post-event images (Section 3.2). 3.
STCL aiming at binary temporal NDVI and PANTEX in changed areas is used and the suspicious landslide areas is extracted (Section 3.3).

4.
The post process is added to get more reliable and complete landslide areas (Section 3.4).

Remove the Irrelevant Areas
According to the terrain transformation analysis [26] that is related to landslides, regions that are covered by vegetation, water, and buildings in post-event images can be ignored to reduce the area of terrain to be detected, and the computational cost of the whole method. Besides, areas with cloud in pre-and post-landslide images should be abandoned to guarantee finding real changed area in change detection.
The built-up areas are extracted by calculating PANTEX, the index is based on fuzzy rule-based composition of anisotropic textural co-occurrence measures that are derived from the gray-level co-occurrence matrix (GLCM) [23]. Calculating NDVI, Normalized Difference Water Index (NDWI), for each pixel is a simple and quick method to obtain vegetation covering and water covering from remote sensing data [27,28]. The two indexes are defined as where IR N stands for near infrared band, R stands for red band and G is the green band.
In the experiment, the threshold of NDVI and NDWI is 0.32 and 0, respectively, the pixel whose value is higher than the threshold means that there is vegetation or water, and the pixel whose PANTEX is higher than 0.3 would be determined as a built-up area. For masking clouds, a threshold (threshold = 180) has been applied to the average value of four bands, the threshold is selected

Remove the Irrelevant Areas
According to the terrain transformation analysis [26] that is related to landslides, regions that are covered by vegetation, water, and buildings in post-event images can be ignored to reduce the area of terrain to be detected, and the computational cost of the whole method. Besides, areas with cloud in pre-and post-landslide images should be abandoned to guarantee finding real changed area in change detection.
The built-up areas are extracted by calculating PANTEX, the index is based on fuzzy rule-based composition of anisotropic textural co-occurrence measures that are derived from the gray-level co-occurrence matrix (GLCM) [23]. Calculating NDVI, Normalized Difference Water Index (NDWI), for each pixel is a simple and quick method to obtain vegetation covering and water covering from remote sensing data [27,28]. The two indexes are defined as where N IR stands for near infrared band, R stands for red band and G is the green band.
In the experiment, the threshold of NDVI and NDWI is 0.32 and 0, respectively, the pixel whose value is higher than the threshold means that there is vegetation or water, and the pixel whose PANTEX is higher than 0.3 would be determined as a built-up area. For masking clouds, a threshold (threshold = 180) has been applied to the average value of four bands, the threshold is selected thinking of separating temporal bright objects (e.g., clouds) from permanent ones (e.g., buildings and sand).
After marking the vegetation, water, and built-up areas in post-event images and clouds covering in all of the images, binary image a and b in Figure 3 can be generated. The values of pixels in binary image a would be 0 if the pixel in the same position in pre-landslide image have been marked; other pixels in binary image a would take 1 as their value. For binary image b, the values of pixels which have been marked in post-landslide images would be 0, others would be 1. Binary image that shows possible landslide regions is made of above images a and b, it is the result of "and" operation from two images, so its pixel whose value is 1 represents the possible landslide region.

Change Detection Using DCNN
In this section, change detection is applied to the possible landslide areas that have been found in Section 3.1. As shown in Figure 4, the key ideas are leveraging the high capacity of Deep Convolutional Neural Network for feature extraction [29,30] to learn robust representations of images, and calculating the Manhattan distance between representations of corresponding possible landslide area in two images, the Manhattan distance is used for measuring the change degree, and then a threshold is used to determine whether the change have occurred in this area. thinking of separating temporal bright objects (e.g., clouds) from permanent ones (e.g., buildings and sand). After marking the vegetation, water, and built-up areas in post-event images and clouds covering in all of the images, binary image a and b in Figure 3 can be generated. The values of pixels in binary image a would be 0 if the pixel in the same position in pre-landslide image have been marked; other pixels in binary image a would take 1 as their value. For binary image b, the values of pixels which have been marked in post-landslide images would be 0, others would be 1. Binary image that shows possible landslide regions is made of above images a and b, it is the result of "and" operation from two images, so its pixel whose value is 1 represents the possible landslide region.

Change Detection Using DCNN
In this section, change detection is applied to the possible landslide areas that have been found in Section 3.1. As shown in Figure 4, the key ideas are leveraging the high capacity of Deep Convolutional Neural Network for feature extraction [29,30] to learn robust representations of images, and calculating the Manhattan distance between representations of corresponding possible landslide area in two images, the Manhattan distance is used for measuring the change degree, and then a threshold is used to determine whether the change have occurred in this area.

Feature Extraction
The architecture of our DCNN model is summarized in Figure 5. It contains five learned layers-two convolutional, two pooling (average pooling and max pooling), and a fully connected. In convolutional process, the size of kernels is 5 × 5 and the stride parameter is 1; in pooling process, the size of kernels is 3 × 3 and the stride parameter is 2. The DCNN model is trained by SAT-4 [31] dataset, and an extra fully connected layer is added for classification in the training process, this layer is not shown in Figure 5 because it would be abandoned in the feature extraction process.
After the DCNN model has been learnt, the possible landslide pixels with their 100 × 100 neighbor would be resized to 28 × 28 image patches that are the input of the model, and the 64-D output vector is the feature of corresponding image patch. Due to the multilayer deeply abstraction of DCNN, the feature consists of spectrum, texture, and shape information that can describe the nature of the input image patch.

Feature Extraction
The architecture of our DCNN model is summarized in Figure 5. It contains five learned layers-two convolutional, two pooling (average pooling and max pooling), and a fully connected. In convolutional process, the size of kernels is 5 × 5 and the stride parameter is 1; in pooling process, the size of kernels is 3 × 3 and the stride parameter is 2. The DCNN model is trained by SAT-4 [31] dataset, and an extra fully connected layer is added for classification in the training process, this layer is not shown in Figure 5 because it would be abandoned in the feature extraction process.
After the DCNN model has been learnt, the possible landslide pixels with their 100 × 100 neighbor would be resized to 28 × 28 image patches that are the input of the model, and the 64-D output vector is the feature of corresponding image patch. Due to the multilayer deeply abstraction of DCNN, the feature consists of spectrum, texture, and shape information that can describe the nature of the input image patch.

Change Detection
Since the feature extracted by DCNN can describe the nature of the image, if changes happen in an area, the feature of the corresponding part of the image would change accordingly. Therefore, after extracting the two features of the corresponding 100 × 100 neighbor that center on the same possible landslide pixel in pre-and post-landslide image, the Manhattan distance is calculated between these two features. Manhattan distance between feature ( 1 , 2 , … , ) and feature ( 1 , 2 , … , ) can be expressed as follows: If the Manhattan distance is bigger than the threshold, the pixel and its 100 × 100 neighbor would be determined as a changed possible landslide area, and the value of pixels in this area can be retained based on the binary image that shows possible landslide regions in Section 3.1; but when the Manhattan distance does not reach the threshold, these pixels would be modified to 0. Though the threshold should be modified when the experimental area or data is changed, it is not sensitive and it can be set to 1.5 times the average Manhattan distance between features of all the corresponding patches in pre-and post-landslide image. The process is visualized in Figure 6. It obviously shows that the landslide-related surface covering change causes larger variation of features than that caused by common artificial or natural surface covering change.
After this procedure, a binary image that indicates changed regions have been formed. Because the change detection is only applied to possible landslide pixels and their neighbor, the pixels whose values are still 1 represent the areas where the landslide may occur and the striking changes of the earth surface have happened between the date of pre-and post-landslide image.

Change Detection
Since the feature extracted by DCNN can describe the nature of the image, if changes happen in an area, the feature of the corresponding part of the image would change accordingly. Therefore, after extracting the two features of the corresponding 100 × 100 neighbor that center on the same possible landslide pixel in pre-and post-landslide image, the Manhattan distance is calculated between these two features. Manhattan distance between feature (x 1 , x 2 , . . . , x n ) and feature (y 1 , y 2 , . . . , y n ) can be expressed as follows: If the Manhattan distance is bigger than the threshold, the pixel and its 100 × 100 neighbor would be determined as a changed possible landslide area, and the value of pixels in this area can be retained based on the binary image that shows possible landslide regions in Section 3.1; but when the Manhattan distance does not reach the threshold, these pixels would be modified to 0. Though the threshold should be modified when the experimental area or data is changed, it is not sensitive and it can be set to 1.5 times the average Manhattan distance between features of all the corresponding patches in pre-and post-landslide image. The process is visualized in Figure 6. It obviously shows that the landslide-related surface covering change causes larger variation of features than that caused by common artificial or natural surface covering change.
After this procedure, a binary image that indicates changed regions have been formed. Because the change detection is only applied to possible landslide pixels and their neighbor, the pixels whose values are still 1 represent the areas where the landslide may occur and the striking changes of the earth surface have happened between the date of pre-and post-landslide image.

Spatial Temporal Context Analysis
In this section, the exact landslide areas are detected based on spatial and temporary information, and the process is pixel-oriented for higher extraction accuracy. The STCL algorithm is used in this process, it is based on formulating the spatial and temporal relationships between the target and its local context. A Bayesian framework is used to express the statistical correlation of low-level features (i.e., intensity and position). In our research, changed pixels in Section 3.2 are regarded as the target and the two type spatial context models of these pixels are built based on NVDI and PANTEX information, respectively, in pre-landslide image, then the NDVI models and PANTEX models are used to identify the same position in the post-landslide image to determine

Spatial Temporal Context Analysis
In this section, the exact landslide areas are detected based on spatial and temporary information, and the process is pixel-oriented for higher extraction accuracy. The STCL algorithm is used in this process, it is based on formulating the spatial and temporal relationships between the target and its local context. A Bayesian framework is used to express the statistical correlation of low-level features (i.e., intensity and position). In our research, changed pixels in Section 3.2 are regarded as the target and the two type spatial context models of these pixels are built based on NVDI and PANTEX information, respectively, in pre-landslide image, then the NDVI models and PANTEX models are used to identify the same position in the post-landslide image to determine weather the vegetation or built-up area have disappeared, in consideration of the landslide-related disappearance of vegetation and buildings in mountain cities.

Spatial Context Modeling
In STCL, a confidence map is computed as a following equation to estimate the target location likelihood: where x ∈ R 2 is the location of target and o indicates the target present in the scene. The spatial context set is defined as X c = {c(z) = (I(z), z)|z ∈ Ω c (x * )} and I(z) denotes the intensity of location z in image, Ω c (x * ) denotes the 21 × 21 neighborhood of the object location x * . Location likelihood is based on condition probability P(x|c(z), o) and prior probability P(c(z)|o) . The condition probability is spatial context model that describes the spatial relationship between the target and its context information, it is defined as: where h sc (x − z) is related to the distance and direction between the location x of the target and its neighbor context location z. P(c(z)|o) denotes a context prior probability, it is computed as follows: with a weighted function w σ (x) which is defined by where parameter a is used for normalization and σ is a scale parameter. If the target location x * is known, the confidence map can be calculated as the following equation: where b is a normalization constant, β and α is, respectively, scale parameter and shape parameter. Based on (5)-(8), the likelihood (4) can be written as: where ⊗ represents the operator of convolution.
In consideration of that convolution can be replaced by Fast Fourier Transform (FFT), (8) can be expressed as: where F denotes FFT and is the element-wise product. Then, the spatial context model can be calculated by where F −1 is the inverse FFT. In our experiment, every changed pixel in possible landslide area is regarded as target for which two models (h sc NDV I (x) and h sc PANTEX (x)) are built, h sc NDV I (x) is based on NVDI image computed as (1), h sc PANTEX (x) is based on PANTEX image computed as [23], the parameters of likelihood function are set to α = 2.25 and β = 1, the scale parameter σ = 1 and the normalization constant a and b is respectively 0.075 and 0.5 according to [22].

Temporary Confidence Learning
After getting the spatial context model of the changed possible landslide pixel in pre-landslide image, a confidence is computed for the same location in post-landslide image. At first, the NDVI image and PANTEX image are calculated based on post-landslide image, and then for every changed possible landslide pixel, convolution process using NDVI spatial model and PANTEX spatial model is conducted to compute the confidence maps, as follows: where I NDV I (x) and I PANTEX (x) denotes the intensity of locating x in NDVI image and PANTEX image calculated based on post-landslide image, x ∈ Ω c (x * ), where x * is the location of the changed possible landslide pixel. The confidence map is a 21 × 21 matrix, and the likelihood L of the location in post-landslide image is the value of matrix center, L is represented as L NDV I = c NDV I (11,11), and L PANTEX = c PANTEX (11,11). if L NDV I or L PANTEX is smaller than 0, it seems that changes have happened in the vegetation or built-up area, and the corresponding pixel is determined as suspicious landslide pixel because landslides can lead to disappearing of vegetation and buildings in mountain cities.

Post Processing
Suspicious landslide areas are further evaluated with relief-based plausibility using the slope parameter which is derived from DEM and has been generally used for landslide validation [32][33][34], it considers the fact that the occurrence of landslides needs an initial relief contrast that enable the downward flow of material. In the experiments, the suspicious landslide areas whose average slope is greater than 7 • are confirmed to be landslide areas. Then, minute and scattered detected regions are supposed to be filtered out according to their size, because of the spatial concentration of landslides, and the size threshold is 50 (pixels) in the experiments. Finally, because there are some areas without cover type changes but with topography changes should also be part of landslide areas, and they are usually connected to landslide areas that are with cover type changes, thus for all of the pixels that are connected to the detected landslide areas, the change of DEM data in corresponding position should be taken into account. If the change of DEM data is higher than 4m, then the corresponding pixel would be taken as part of landslide areas, and the pixels that are connected to this new part of landslide areas should also be checked in the same way. Figure 7 shows the results of Shenzhen's landslide areas being detected by the proposed method. The binary image of possible landslide areas is shown in Figure 7a, from which we can see the possible landslide areas, which are represented by white pixels, occupy a small number of pixels and have very high concentration in landslide area because the main landslide-unrelated land covering types (i.e., vegetation, water, and built-up area) have been extracted and removed. Figure 7b shows changed possible landslide areas that are marked as white pixels, when compared to Figure 7a, this process only retains the white pixels that are in changed area, and removes others to prevent the confusing causing by land covering types that are visually similar to landslides in remote sensing images (e.g., bare land). In Figure 7c, we can see some white regions were abandoned by STCL because in the changed area there are still many areas that are visually similar to landslide. With the help of STCL that is based on NDVI and PANTEX, only the areas where vegetation or buildings have disappeared in the given period can be left, in keeping with the knowledge about landslide-related surface covering changes in mountain cities. In consideration of the concentrative and spatial pattern of landslides, regions that are smaller than 50 are filtered out, and some pixels which are connected to the detected landslide areas are included as part of landslide areas because of the great change of DEM in corresponding areas, the final result is shown in Figure 7d. Results of the landslides detection in Zhouqu County are shown in Figure 8. The binary image of possible landslide areas is shown in Figure 8a. The Binary image of changed areas is shown in Figure 8b, we can find the river was incorrectly extracted, but it was removed by STCL whose result is shown in Figure 8c. Figure 8d is the result image and it shows the location and shape of the landslides. Results of the landslides detection in Zhouqu County are shown in Figure 8. The binary image of possible landslide areas is shown in Figure 8a. The Binary image of changed areas is shown in Figure 8b, we can find the river was incorrectly extracted, but it was removed by STCL whose result is shown in Figure 8c. Figure 8d is the result image and it shows the location and shape of the landslides. Results of the landslides detection in Beichuan County are shown in Figure 9. There are many landslides that are widely distributed and part of them are close to others, and the changed regions Results of the landslides detection in Beichuan County are shown in Figure 9. There are many landslides that are widely distributed and part of them are close to others, and the changed regions Results of the landslides detection in Beichuan County are shown in Figure 9. There are many landslides that are widely distributed and part of them are close to others, and the changed regions

Accuracy Assessment
To acquire a synthetical assessment of the landslides detection method, consistency analysis between the automated detected landslides and the real ones should be done. The comparison puts the experimental areas into three classes: true positive (TP), false positive (FP) and false negative (FN). TPs contains the real landslide pixels which have been found by the method. FNs and FPs denotes two different error, the former represents the missing landslide pixels, and the latter represents the wrongly extracted pixels, which are not landslides.
Further, several evaluation indicators, DP, QP, and CE [6,14] are calculated using the following equations to implement quantitative evaluation: 100% 100% DP denotes the proportion of correctly detected landslides, with respect to real landslides. QP takes FN into account and measures the result synthetically. CE is error rate of detected landslides, it is the proportion of falsely detected.
The experimental results of proposed method are concluded in Table 1 based on landslide areal extent. For the case of Shenzhen, DP and CE are, respectively, 79.9% and 9.9%. Furthermore, the QP reaches 73.4%. For the landslides in Zhouqu County, DP and CE take the value of 70.3% and 17.6%, and the QP is 61.1%, less than that in Shenzhen's landslide because of the varied topography, disperse coverage and the wispy shape of the landslide in Zhouqu County. For the landslides in Beichuan County, DP and QP reach, respectively, 88.5% and 80.1%, while CE is 9.6%. For the number of landslide areas, the result of accuracy assessment is shown in Table 2.

Accuracy Assessment
To acquire a synthetical assessment of the landslides detection method, consistency analysis between the automated detected landslides and the real ones should be done. The comparison puts the experimental areas into three classes: true positive (TP), false positive (FP) and false negative (FN). TPs contains the real landslide pixels which have been found by the method. FNs and FPs denotes two different error, the former represents the missing landslide pixels, and the latter represents the wrongly extracted pixels, which are not landslides.
Further, several evaluation indicators, DP, QP, and CE [6,14] are calculated using the following equations to implement quantitative evaluation: DP denotes the proportion of correctly detected landslides, with respect to real landslides. QP takes FN into account and measures the result synthetically. CE is error rate of detected landslides, it is the proportion of falsely detected.
The experimental results of proposed method are concluded in Table 1 based on landslide areal extent. For the case of Shenzhen, DP and CE are, respectively, 79.9% and 9.9%. Furthermore, the QP reaches 73.4%. For the landslides in Zhouqu County, DP and CE take the value of 70.3% and 17.6%, and the QP is 61.1%, less than that in Shenzhen's landslide because of the varied topography, disperse coverage and the wispy shape of the landslide in Zhouqu County. For the landslides in Beichuan County, DP and QP reach, respectively, 88.5% and 80.1%, while CE is 9.6%. For the number of landslide areas, the result of accuracy assessment is shown in Table 2. To make the experimental results more intuitive, TPs, FNs, and FPs of Shenzhen, Zhouqu County, and Beichuan County are showed in Figure 10c, Figure 11c, and Figure 12c respectively, they are generated through comparison between the detected landslides and the real ones. As shown in Figure 10a,b, five landslide regions in Shenzhen have been extracted by proposed method with no one missing, achieving a recall rate of 100%. QP in region 1 to 5 reach 76.1%, 77.6%, 65.2%, 45.7% and 82.1% severally, region 4 only have a QP of 45.7% because of the cloud coverage. For the landslide in Zhouqu County, as Figure 11a,b show, the recall rate of landslide areas reaches 100%, despite some areas being wispy and tiny, and QP of areas 1-4 are, respectively, 70.6%, 60.0%, 58.5%, and 55.2%. Besides, there are several wrongly detected landslides areas depicted by isolated red region just like area 6-7 in Figure 10c and area 5-6 in Figure 11c, in these areas, the earth surface changes that are similar to landslide have happened artificially or naturally. For the landslides in Beichuan County, as Figure 12 shows, the method can identify large landslides well, just like landslide 1-5 in Figure 12c whose QP are, respectively, 95.5%, 94.0%, 92.9%, 85.3%, and 84.5%. But, there are still isolated and tiny landslides, like landslide 6-10, which have not been identified because the change degree of the surface is small.  To make the experimental results more intuitive, TPs, FNs, and FPs of Shenzhen, Zhouqu County, and Beichuan County are showed in Figure 10c, 11c, and 12crespectively, they are generated through comparison between the detected landslides and the real ones. As shown in Figure 10a,b, five landslide regions in Shenzhen have been extracted by proposed method with no one missing, achieving a recall rate of 100%. QP in region 1 to 5 reach 76.1%, 77.6%, 65.2%, 45.7% and 82.1% severally, region 4 only have a QP of 45.7% because of the cloud coverage. For the landslide in Zhouqu County, as Figure 11a,b show, the recall rate of landslide areas reaches 100%, despite some areas being wispy and tiny, and QP of areas 1-4 are, respectively, 70.6%, 60.0%, 58.5%, and 55.2%. Besides, there are several wrongly detected landslides areas depicted by isolated red region just like area 6-7 in Figure 10c and area 5-6 in Figure 11c, in these areas, the earth surface changes that are similar to landslide have happened artificially or naturally. For the landslides in Beichuan County, as Figure 12 shows, the method can identify large landslides well, just like landslide 1-5 in Figure 12c whose QP are, respectively, 95.5%, 94.0%, 92.9%, 85.3%, and 84.5%. But, there are still isolated and tiny landslides, like landslide 6-10, which have not been identified because the change degree of the surface is small.

Discussion
The proposed method automatically detects landslides in mountain cities based on the knowledge of the earth surface change, without other artificial spatial or spectral features, just like texture. To obtain robust and precise results in areas that cover various land surface changes caused by human or natural behaviors, regions that contain irrelevant land cover types in post-event image are filtered out in first, then change detection using DCNN is implemented for merely extracting regions where the change degree of land surface is high, therefore those scattered little regions where the land surface change are similar to landslide are removed, and the CE for landslides areal extent is effectively reduced.
Next, The STCL based on NDVI and PANTEX achieves fully utilizing the temporal and spatial information and the landslides-related surface change knowledge-disappearing of vegetation and built-up areas. Because the first two steps do not take account of the land covering type changing that is caused by landslides, the result image of them indicates the areas where changes have intensively taken place on large scale, the changes may be triggered by landslides or other events. With the help of STCL, regions of landslides are identified and other areas can be ruled out. Besides, the land covering models learned by STCL is based on spatial context instead of single pixel, making the changing measurement robust and stable, and can overcome the disturbance of illumination variation and salt & pepper noise.
Finally, slope parameter which is derived from post-landslide DEM is used to evaluate the detected areas and makes the results more reliable. Meanwhile, taking the size of landslides in account, tiny areas are filtered out for reducing the false alarm rate, which can also be depicted by

Discussion
The proposed method automatically detects landslides in mountain cities based on the knowledge of the earth surface change, without other artificial spatial or spectral features, just like texture. To obtain robust and precise results in areas that cover various land surface changes caused by human or natural behaviors, regions that contain irrelevant land cover types in post-event image are filtered out in first, then change detection using DCNN is implemented for merely extracting regions where the change degree of land surface is high, therefore those scattered little regions where the land surface change are similar to landslide are removed, and the CE for landslides areal extent is effectively reduced.
Next, The STCL based on NDVI and PANTEX achieves fully utilizing the temporal and spatial information and the landslides-related surface change knowledge-disappearing of vegetation and built-up areas. Because the first two steps do not take account of the land covering type changing that is caused by landslides, the result image of them indicates the areas where changes have intensively taken place on large scale, the changes may be triggered by landslides or other events. With the help of STCL, regions of landslides are identified and other areas can be ruled out. Besides, the land covering models learned by STCL is based on spatial context instead of single pixel, making the changing measurement robust and stable, and can overcome the disturbance of illumination variation and salt & pepper noise.
Finally, slope parameter which is derived from post-landslide DEM is used to evaluate the detected areas and makes the results more reliable. Meanwhile, taking the size of landslides in account, tiny areas are filtered out for reducing the false alarm rate, which can also be depicted by commission error. In addition, in view of there are part of landslide areas which are without cover type changes, the change of DEM is used for detecting these landslide areas, making the shapes and outlines of the detected landslides more consistent with the ground truth.
In order to compare the proposed approach with other landslides detection approaches, two recent studies which employ the same accuracy assessment metrics have been selected. Behling et al. [17] proposed a method based on NDVI-trajectories and achieved QP of 50-89% and CE of 10-21% for landslide areal extent. Rau et al. [6] achieved QP of 58-81.7% and CE of 11-32%. In our experiment in three sites, the QP for landslide areal extent are between 61.1-80.1%, the CE for landslide areal extent is below 17.6%. For the accuracy assessment on the number of landslide areas, the QP and CE of Behling's method is, respectively, 34.6-44.1% and 30-70%, and our approach, whose QP and CE are between 66.7-77.4% and 5.9-33.3%, performs better.
Overall, in terms of the detection accuracy, the proposed method reaches a lower commission error due to filtering the unnecessary areas based on land covering type, change degree, knowledge of land covering change, and size. The quality percentages of landslide areal extent are lower than the maximum quality percentages of Behling's and Rau's studies, but higher than their minimum quality percentages. For the assessment against the number of landslide areas, our approach performs better with higher quality percentages. In terms of practicality, the proposed automated approach can detect landslides in mountain cities and Behling's method cannot identify landslides where there used to be built-up areas. Rau's method is semiautomatic and need landslides samples which should be extracted by experts in target zone, if the detection is conduct in a new area, the proposed automated approach which need not samples and saves labor cost is more appropriate.

Conclusions
In this paper we presented an automated approach of landslides detection using multi-temporal remote sensing imagery, aiming at mountain cities. The patch-oriented change detection strategy based on DCNN model was introduced for extracting area with high change degree, and the usage of STCL makes the most of temporal and spatial information of remote sensing images for the accurate identification of landslide areas. This multi-level extraction and filtering frame achieves removing the unrelated areas and identifying the landslides gradually, and ensuring reliable detection percentage while reducing the false alarm rate.
The approach was conducted on the areas of landslide in Shenzhen, Zhouqu County and Beichuan County where both the cities and mountains existed, and the results showed that more than 70.3% landslide areal extent was extracted, meanwhile, the CE for landslide areal extent was below 17.6%. The experiment demonstrates that the proposed approach has a high capacity of landslide detection in mountain cities. Besides, the detection is absolutely based on the knowledge of land covering change related to landslide without extracting samples or other manual operation, thus the approach is automated and saves a lot of labor resources and time cost. However, there are still landslide areas that may not conform to the landslides-related surface change knowledge of the proposed method, like the source area of debris flow where erosion occurred, and rivers buried by landslides. Although these areas usually occupy small part of the landslides in mountain cities, more comprehensive models would be studied to solve the problem and enhance the robustness and accuracy in the next step. Since the spatial resolutions of remote sensing images in this experiment are 8 m and 30 m, in future work, the approach will be developed for images that cover larger area and has more complex natural environment.