Article

Gully Erosion Monitoring Based on Semi-Supervised Semantic Segmentation with Boundary-Guided Pseudo-Label Generation Strategy and Adaptive Loss Function

1 College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China
2 Key Laboratory of Advanced Marine Communication and Information Technology, Ministry of Industry and Information Technology, Harbin Engineering University, Harbin 150001, China
3 Heilongjiang Province Hydraulic Research Institute, Harbin 150001, China
* Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(20), 5110; https://doi.org/10.3390/rs14205110
Submission received: 2 September 2022 / Revised: 3 October 2022 / Accepted: 5 October 2022 / Published: 13 October 2022

Abstract: Gully erosion is a major threat to ecosystems, potentially leading to desertification, land degradation, and crop loss. Developing viable gully erosion prevention and remediation strategies requires regular monitoring of the gullies. Nevertheless, it is highly challenging to automatically obtain monitoring results for the latest monitoring data when training on historical data acquired by different sensors at different times. To this end, this paper presents a novel semi-supervised semantic segmentation method with a boundary-guided pseudo-label generation strategy and an adaptive loss function. This method takes full advantage of the labeled historical data and the unlabeled latest monitoring data to obtain the latest monitoring results of the gullies. The boundary-guided pseudo-label generation strategy (BPGS), guided by the inherent boundary maps of real geographic objects, fuses multiple evidence data to generate reliable pseudo-labels. Additionally, we propose an adaptive loss function based on centroid similarity (CSIM) to further alleviate the impact of pseudo-label noise. To verify the proposed method, two datasets for gully erosion monitoring are constructed from satellite data acquired in northeastern China. Extensive experiments demonstrate that the proposed method is more appropriate for automatic gully erosion monitoring than four state-of-the-art methods, including supervised and semi-supervised methods.

1. Introduction

As a form of soil degradation, gully erosion not only incises the land surface and encroaches on fields but also washes away fertile soil and reduces the efficiency of large-scale cultivation [1,2,3]. Regular investigation of gullies is conducive to understanding their evolution patterns and formulating appropriate restoration strategies [4,5,6]. At present, gullies have been investigated in many countries around the world [7,8,9,10]. Nevertheless, these investigations mainly rely on field surveys or manual interpretation of aerial/satellite images, which require enormous human, material, and financial resources. In the past few years, artificial intelligence (AI) in the form of deep learning (DL) has proven able to automatically extract representative abstract features, achieving remarkable results in many fields [11,12]. Therefore, there is an urgent need for a DL-based monitoring method that can automatically produce the latest monitoring results of the gullies from the existing historical data.
As shown in Figure 1, the automatic monitoring process of gully erosion is illustrated by taking northeastern China as an example. In this figure, D_L is the historical data with labels, in which the labels come from a previous manual investigation, whereas D_U is the latest monitoring data, which lacks labels and for which monitoring results must be obtained automatically. D_L and D_U were acquired over the same area at different times. Because of the different acquisition times, it is also difficult to ensure that the same sensors collected the two types of data. As a result, the same geographical object in D_L and D_U can exhibit significant feature differences. Under these conditions, the automatic monitoring method aims to accurately attain the latest monitoring results (the monitoring results of D_U).
In the field of DL, supervised semantic segmentation methods based on convolutional neural networks (CNNs) have attracted extensive research and application [13,14,15]. For instance, Wang et al. [16] proposed a segmentation method based on object context and a boundary-enhanced loss for earthquake-damaged buildings, aiming to enhance feature representation and refine the segmented boundaries of damaged buildings; Yu et al. [17] studied the impact of the attention mechanism on segmentation models and designed an attention-gates U-Network (AGs-Unet) for building segmentation; Zhu et al. [18] introduced multiscale-aware and segmentation-prior conditional random fields (MSCRF) to address the excessive smoothing of building edges in segmentation results. In practice, however, for the latest monitoring data (test data), it is difficult to achieve the desired monitoring results relying only on historical data (training data) acquired by different sensors at different times.
To address this issue, we shift our focus to semi-supervised semantic segmentation methods. Compared with supervised methods, semi-supervised methods can provide pseudo-labels for the latest monitoring data to reduce the impact of feature differences [19,20]. Nevertheless, one inherent problem remains: pseudo-labels inevitably contain noise. Therefore, some methods filter predictions by confidence [21,22,23,24]. In other words, only highly confident predictions are used as pseudo-labels, whereas ambiguous ones are discarded. Obviously, these methods ignore numerous correct predictions with low confidence. Thus, many improved methods have been proposed. For example, Zhang et al. [25] exploited virtual adversarial perturbation and density-aware entropy to find valuable low-confidence predictions as candidate samples; Wang et al. [26] argued that every pixel matters to model training and proposed a category-wise queue of negative samples to filter unreliable pixels; Yao et al. [27] measured the variance of pseudo-labels and regularized the network to learn from more confident pseudo-labels; He et al. [28] presented an effective distribution alignment and random sampling method to create unbiased pseudo-labels that match the true class distribution estimated from the labeled data.
Although many achievements have been made in semi-supervised methods, several limitations still remain for high-precision gully erosion monitoring. (1) Due to the enormous differences between historical data and the latest monitoring data, it is a challenge to provide credible pseudo-labels for the latest monitoring data. Thus, how to mine more valuable information to improve the reliability of pseudo-labels is a limitation that needs to be addressed. (2) The pseudo-labels inevitably contain noise, and even state-of-the-art methods cannot solve this problem. Therefore, it is necessary to further reduce the impact of pseudo-label noise. In response to these limitations, a novel semi-supervised semantic segmentation with boundary-guided pseudo-label generation strategy and adaptive loss function method is proposed. The main contributions of this paper are summarized as follows:
(1)
To improve the reliability of pseudo-labels, we propose a boundary-guided pseudo-label generation strategy (BPGS), which is composed of an object boundary generator and a multi-evidence fusion strategy. First, an object boundary generator based on CNNs and superpixel segmentation is proposed to output boundary maps of geographic objects, aiming to exploit the structural prior information and neighborhood correlation of pixels. On this basis, guided by these boundary maps, a multi-evidence fusion strategy is designed to fully utilize historical labels as well as the output of the student model and the teacher model to generate high-quality pseudo-labels.
(2)
To alleviate the impact of pseudo-label noise, an adaptive loss function based on centroid similarity is developed. In this loss function, centroid similarity (CSIM) is designed to measure the reliability of pseudo-labels and adjust the loss value: the lower the reliability, the lower the weight of the loss. Consequently, the training process can tolerate pseudo-label noise through this loss function.
(3)
To validate our method, two benchmark datasets for gully erosion monitoring are constructed according to the satellite data acquired in northeastern China.
The rest of the paper is organized as follows: Section 2 introduces the proposed method in detail; Section 3 displays the datasets, experiments, and results; Section 4 presents a discussion on this paper; Section 5 draws the conclusions and offers some prospects.

2. Method

For gully erosion monitoring, we propose a semi-supervised semantic segmentation method with a boundary-guided pseudo-label generation strategy and an adaptive loss function, illustrated in Figure 2. Our method contains two models with the same backbone: a student model (Ψ_A in Figure 2) and a teacher model (Ψ_B in Figure 2). The historical data with labels (D_L) are fed directly into Ψ_A for supervised training. For the latest monitoring data (D_U), the training process consists of four steps. First, we employ the designed object boundary generator to obtain the boundary maps of D_U. Next, by using Ψ_A and Ψ_B to predict D_U, we obtain the predictions and confidences of the two models for D_U. Then, guided by the boundary maps, a multi-evidence fusion strategy combines the historical labels and the output of the two models to generate reliable pseudo-labels. Finally, the proposed centroid similarity automatically adjusts the loss value by measuring the reliability of the pseudo-labels during training. These operations complete the training of Ψ_A. Furthermore, the method is converted into an iterative training framework for better performance; a detailed introduction can be found in Section 2.3.

2.1. Boundary-Guided Pseudo-Label Generation Strategy

This section describes the process of generating high-quality pseudo-labels through the proposed BPGS. The BPGS consists of an object boundary generator and a multi-evidence fusion strategy, corresponding to serial numbers 1–3 in Figure 2. The two components are described in detail below.

2.1.1. Object Boundary Generator

Most semi-supervised methods ignore structural prior information and pixel neighborhood information when generating pseudo-labels, thus often producing noisy predictions around the object boundaries. To address this, we propose the object boundary generator to extract the boundary maps of geographic objects, thus providing guidance for the subsequent pseudo-label generation.
An object instance often exhibits similar color or texture features [29,30], so grouping spatially continuous pixels with similar features into the same object is a reasonable strategy. Based on this principle, we first designed a dual attention network based on CNNs to extract pixels with similar features, in which two attention modules enhance the ability to focus on the gullies. On this basis, a superpixel segmentation method, SLIC [31,32], is applied to further identify spatially continuous pixels. The specific flowchart of the object boundary generator is shown in Figure 3.
As shown in Figure 3, the input is an image of gully erosion, and the output is the boundary map of this image. The image is first fed into the dual attention network, which aggregates pixels with similar features into objects. Then, the SLIC method is employed to refine the output of the dual attention network by mining the spatial relationships between pixels. On this basis, the refined results are used as labels for updating the parameters of the dual attention network. These operations are repeated until the termination condition is reached (see Section 3.2.2). Finally, we extract the boundary pixels in the final refined result as the boundary map. The dual attention network and the SLIC refinement are introduced in more detail below.
(1)
Dual attention network: As shown in Figure 3, the dual attention network consists of three parts, taking an image as input and producing predictions for this image as output. The first part is feature encoding, consisting of three convolution blocks; each block includes a convolution layer, a batch normalization layer, and a ReLU activation layer. The second part is dual attention enhancement, which helps the network focus on information of interest and suppress irrelevant background information; here, we adopt a channel attention module and a spatial attention module, as shown in Figure 4. In the third part, a convolution layer, a batch normalization layer, and an argmax operation decode the features. With these three parts, pixels with similar features can be aggregated into an object.
(2)
SLIC refinement: The dual attention network aims to aggregate pixels with similar features into one object; however, objects should also be spatially continuous [29]. The SLIC method considers the spatial correlation of pixels when generating irregular pixel regions (superpixels), so we adopt it to refine the output of the dual attention network. Specifically, we first extract the superpixel set S = {S_o}_{o=1}^{O} from the input image, where O is the total number of superpixels and S_o denotes the set of indices of pixels belonging to the o-th superpixel. Then, from the predictions of the dual attention network and S_o, we obtain L_o (the predictions for the pixels belonging to S_o). Next, we assign the most frequent category in L_o to all pixels of S_o. This step is repeated until all superpixels are updated, so that continuous pixels with similar features are aggregated into the same object; a sketch of this refinement follows this list.
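The following is a minimal sketch of this majority-vote refinement, assuming scikit-image's SLIC implementation; the function name and parameter defaults are illustrative, not the authors' implementation.

```python
import numpy as np
from skimage.segmentation import slic

def slic_refine(image, predictions, n_segments=100, compactness=10.0):
    """Refine the per-pixel class map from the dual attention network by
    majority vote within SLIC superpixels.

    image:       H x W x 3 RGB array.
    predictions: H x W integer class map.
    """
    # Extract the superpixel set S = {S_o}; each pixel receives a superpixel id.
    superpixels = slic(image, n_segments=n_segments,
                       compactness=compactness, start_label=0)
    refined = predictions.copy()
    for o in np.unique(superpixels):
        mask = superpixels == o              # pixels of the o-th superpixel S_o
        # Majority vote over L_o, assigned back to every pixel of S_o.
        refined[mask] = np.bincount(predictions[mask]).argmax()
    return refined
```

Each call replaces the network's predictions inside every superpixel with that superpixel's majority class; the refined map then serves as the label for the next update of the dual attention network.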

2.1.2. Multi-Evidence Fusion Strategy

Although the object boundary generator exploits the structural prior information and neighborhood correlations of pixels, more useful information should be mined to improve the reliability of pseudo-labels. Compared with other disasters, such as earthquakes and floods, most gullies do not change significantly in the short term. As shown in Figure 1, although D_L and D_U were acquired over the same area at different times, the gullies in them had not altered dramatically. This means that the historical labels contain a wealth of valuable information for generating pseudo-labels. Meanwhile, some scholars argue that different model initializations help models describe the same data from different perspectives and significantly improve performance [33]. Inspired by this view, Ψ_A and Ψ_B were pre-trained on the historical data from two different initializations, thus providing different support for the pseudo-labels. In summary, we take the historical labels and the output of the two differently initialized models as evidence for generating pseudo-labels. However, introducing more evidence also brings evidence conflict: as a basis for determining whether a pixel belongs to gully erosion, the three evidence sources may give conflicting conclusions, and the experimental results in Section 4.1 verify that simply taking the intersection of all evidence data does not achieve reliable results. Therefore, a multi-evidence fusion strategy is designed to combine the three evidence sources holistically.
The multi-evidence fusion strategy is based on Dempster–Shafer (D–S) evidence theory [34,35,36] and generates reliable labels under the guidance of the boundary maps. Specifically, for an input image, we first denote its boundary map as R = {R_i}_{i=1}^{I}, where I is the total number of objects in the boundary map and R_i is the i-th object. According to D–S evidence theory, the decision fusion frame can be defined as Θ = {GE, NG} for any object R_i, where GE and NG denote gully erosion and non-gully erosion, respectively. Thus, the non-empty subsets ψ of Θ are {GE}, {NG}, and {GE, NG}.
Then, we define the basic probability assignment formula (BPAF) of ψ as m(ψ), which follows Dempster's rule of combination:

$$m(\psi) = \frac{\sum_{\cap \psi_g = \psi} \prod_{g=1}^{G} m_g(\psi_g)}{1 - \sum_{\cap \psi_g = \emptyset} \prod_{g=1}^{G} m_g(\psi_g)} \tag{1}$$

where G is the total number of evidence sources and m_g(·) is the BPAF of the g-th evidence source. On this basis, the corresponding BPAF of R_i is established through the following equations:
$$m_i^g[\{GE\}] = C_i^g \times P_i^g \tag{2}$$

$$m_i^g[\{NG\}] = (1 - C_i^g) \times (1 - P_i^g) \tag{3}$$

$$m_i^g[\{GE, NG\}] = (C_i^g + P_i^g) - 2\,C_i^g P_i^g \tag{4}$$

where C_i^g and P_i^g are the overall confidence indicators that R_i belongs to GE in the g-th evidence source, represented as

$$C_i^g = \frac{1}{N(R_i)} \sum_{q=1}^{N(R_i)} E_{i,q}^g \tag{5}$$

$$P_i^g = \frac{\phi_i^g}{N(R_i)} \tag{6}$$
Here, N(R_i) is the total number of pixels of R_i, E_{i,q}^g is the confidence that the q-th pixel of R_i belongs to GE in the projection of R_i on the confidences of the g-th evidence source, and φ_i^g is the number of pixels belonging to GE in the projection of R_i on the predictions of the g-th evidence source. When adopting the historical labels to calculate E_{i,q}^g, we set the confidence to 1 for pixels belonging to GE and to 0 for pixels belonging to NG.
Finally, m_i[{GE}], m_i[{NG}], and m_i[{GE, NG}] are calculated using Equations (1)–(6). If both m_i[{GE}] > m_i[{NG}] and m_i[{GE}] > m_i[{GE, NG}] are satisfied, R_i belongs to gully erosion; otherwise, R_i belongs to non-gully erosion. Iterating over all objects of R yields the pseudo-label of the input image.
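To illustrate the fusion, the following is a minimal NumPy sketch for a single object, assuming each evidence source is summarized by its (C_i^g, P_i^g) pair from Equations (5) and (6); the function names are illustrative, not the authors' implementation.

```python
import numpy as np

def bpaf(C, P):
    """Basic probability assignment for one object under one evidence source.
    C: mean confidence that the object's pixels belong to GE (Eq. (5)).
    P: fraction of the object's pixels predicted as GE (Eq. (6)).
    Returns the masses of {GE}, {NG}, and {GE, NG} (Eqs. (2)-(4))."""
    m_ge = C * P
    m_ng = (1 - C) * (1 - P)
    m_un = C + P - 2 * C * P   # uncertainty mass; the three masses sum to 1
    return np.array([m_ge, m_ng, m_un])

def dempster(m1, m2):
    """Dempster's rule of combination (Eq. (1)) on the frame {GE, NG}."""
    conflict = m1[0] * m2[1] + m1[1] * m2[0]
    norm = 1.0 - conflict
    m_ge = (m1[0] * m2[0] + m1[0] * m2[2] + m1[2] * m2[0]) / norm
    m_ng = (m1[1] * m2[1] + m1[1] * m2[2] + m1[2] * m2[1]) / norm
    m_un = (m1[2] * m2[2]) / norm
    return np.array([m_ge, m_ng, m_un])

def fuse_object(evidence):
    """Fuse the BPAFs of all evidence sources and decide the object's class.
    evidence: list of (C, P) pairs, one per source (student, teacher,
    historical label). Returns True if the object is gully erosion."""
    m = bpaf(*evidence[0])
    for C, P in evidence[1:]:
        m = dempster(m, bpaf(C, P))
    # Decision rule: {GE} must dominate both {NG} and {GE, NG}.
    return m[0] > m[1] and m[0] > m[2]
```

For example, fuse_object([(0.9, 0.8), (0.7, 0.6), (1.0, 1.0)]) fuses a confident student prediction, a weaker teacher prediction, and a historical label that fully supports GE, and classifies the object as gully erosion.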

2.2. The Adaptive Loss Function Based on Centroid Similarity

Since it is unlikely that all evidence sources make mistakes at the same time, their similarity also reflects the reliability of the pseudo-labels. Moreover, the centroid reflects the mass distribution of an object, so the closer a pixel lies to the centroid, the more reliable it is as evidence for monitoring [37,38]. The similarity measure should therefore focus more on the representative pixels of each object. Based on the above analysis, the adaptive loss function based on CSIM is proposed. The adaptive loss ζ includes a supervised loss ζ_sl and a pseudo-supervised loss ζ_pl:

$$\zeta = \zeta_{sl} + \zeta_{pl} \tag{7}$$

The supervised loss ζ_sl on D_L is the standard binary cross-entropy loss:

$$\zeta_{sl} = -\frac{1}{|D_L|} \sum_{Z \in D_L} \frac{1}{W \times H} \sum_{r=1}^{W \times H} \left[ \upsilon_r^Z \log(F_r^Z) + (1 - \upsilon_r^Z) \log(1 - F_r^Z) \right] \tag{8}$$

where Z is an image in D_L, and W and H represent the width and height of Z, respectively. F_r^Z ∈ [0, 1] and υ_r^Z ∈ {0, 1} represent the prediction and label for the r-th pixel of Z, respectively; if a pixel belongs to a gully, its label is set to 1, and otherwise to 0. Similarly, the pseudo-supervised loss ζ_pl on D_U can be represented as

$$\zeta_{pl} = -\frac{1}{|D_U|} \sum_{X \in D_U} \frac{1}{W \times H} \sum_{r=1}^{W \times H} CSIM_r^X \left[ \upsilon_r^X \log(F_r^X) + (1 - \upsilon_r^X) \log(1 - F_r^X) \right] \tag{9}$$

where X is an image in D_U, and F_r^X ∈ [0, 1] and υ_r^X ∈ {0, 1} represent the prediction and pseudo-label for the r-th pixel of X, respectively. CSIM_r^X is the CSIM for the r-th pixel of X, i.e., the reliability of its pseudo-label, and is computed as follows.
For ease of introduction, let the r-th pixel of X be pixel j, and let its projection on the boundary map belong to object R_i. Meanwhile, R_i^X and R_i^T denote the projections of R_i on X and on the historical labels, respectively, while R_i^A and R_i^B denote the projections of R_i on the predictions of Ψ_A and Ψ_B for X, respectively. On this basis, CSIM_r^X is calculated in the following steps.
Step 1: Define (ε, μ) as the centroid coordinates of R_i^X, calculated as

$$\varepsilon = \frac{\sum_{q=1}^{N(R_i^X)} b_q \varepsilon_q}{\sum_{q=1}^{N(R_i^X)} b_q} \tag{10}$$

$$\mu = \frac{\sum_{q=1}^{N(R_i^X)} b_q \mu_q}{\sum_{q=1}^{N(R_i^X)} b_q} \tag{11}$$

where N(R_i^X) is the total number of pixels of R_i^X, b_q is the value of the q-th pixel in R_i^X, and ε_q and μ_q are the X-axis and Y-axis coordinates of the q-th pixel, respectively.
Step 2: Compute the Euclidean distance of each pixel in R_i^X from the centroid, yielding the centroid distance set ED.
Step 3: Calculate the pixel values after applying the centroid distance constraint:

$$R_i^{A*}(j) = R_i^A(j) \times e^{\,1 - ED(j)/\max(ED)} \tag{12}$$

where e^(·) is the exponential function, R_i^A(j) is the value of pixel j in R_i^A, R_i^{A*}(j) is the constrained value, and ED(j) is the distance from pixel j to the centroid. The centroid distance constraint e^{1 − ED(j)/max(ED)} varies with the distance to the centroid and reaches its maximum at the centroid. It therefore emphasizes predictions near the centroid (the more representative ones) and correspondingly attenuates the influence of predictions far from the centroid.
Step 4: Repeating Step 3 for each pixel of R_i^A yields the constrained object R_i^{A*}; similarly, R_i^{B*} and R_i^{T*} are obtained.
Step 5: Calculate the reliability of the r-th pixel as

$$CSIM_r^X = (1 - \kappa) \times SM_{iA,iB}^{*} + \kappa \times \left[ \left( SM_{iA,iT}^{*} + SM_{iB,iT}^{*} \right) / 2 \right] \tag{13}$$

where SM*_{iA,iB} is the structural similarity (SSIM) between R_i^{A*} and R_i^{B*}; the calculation of SSIM is detailed in [39,40]. SM*_{iA,iT} and SM*_{iB,iT} are calculated in the same way. Furthermore, κ is the similarity regulation indicator: when κ is small, the student model focuses more on the predictions of Ψ_A and Ψ_B; otherwise, it pays more attention to the historical labels.
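To make Steps 1–5 and the weighted loss concrete, the following is a minimal sketch, assuming each object arrives as a cropped 2-D confidence patch of at least 7 × 7 pixels (the default SSIM window in scikit-image); the function names and the small epsilon guards are illustrative, not the authors' implementation.

```python
import numpy as np
import torch
from skimage.metrics import structural_similarity as ssim

def centroid_constrain(patch):
    """Apply the centroid distance constraint (Eq. (12)) to one object's
    projection, given as a 2-D array of per-pixel confidences."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = patch.sum() + 1e-8
    cy = (patch * ys).sum() / total   # centroid weighted by pixel values (Eqs. (10)-(11))
    cx = (patch * xs).sum() / total
    ed = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)   # centroid distance set ED
    # e^(1 - ED/max(ED)) peaks at the centroid and decays toward the object rim.
    return patch * np.exp(1.0 - ed / (ed.max() + 1e-8))

def csim(patch_a, patch_b, patch_t, kappa=0.6):
    """CSIM of one object (Eq. (13)): SSIM between the constrained projections
    of the student (A), the teacher (B), and the historical label (T)."""
    a, b, t = (centroid_constrain(p) for p in (patch_a, patch_b, patch_t))
    rng = max(x.max() for x in (a, b, t)) - min(x.min() for x in (a, b, t)) + 1e-8
    sm_ab = ssim(a, b, data_range=rng)
    sm_at = ssim(a, t, data_range=rng)
    sm_bt = ssim(b, t, data_range=rng)
    return (1 - kappa) * sm_ab + kappa * (sm_at + sm_bt) / 2

def pseudo_supervised_loss(pred, pseudo, csim_map):
    """CSIM-weighted binary cross-entropy on pseudo-labels (Eq. (9)).
    pred: sigmoid outputs in (0, 1); pseudo: {0, 1} pseudo-label tensor;
    csim_map: per-pixel CSIM values broadcast from the objects they belong to."""
    bce = -(pseudo * torch.log(pred + 1e-8) +
            (1 - pseudo) * torch.log(1 - pred + 1e-8))
    return (bce * csim_map).mean()
```

The per-object CSIM value is broadcast to all pixels of the object to build csim_map, so unreliable objects contribute less to ζ_pl.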

2.3. Model Training and Testing

2.3.1. Iterative Training Framework

In this section, our method is converted into an iterative training framework for better performance. Either Ψ_A or Ψ_B can serve as the student model first; the pseudo-code using Ψ_A first as the student model is shown in Algorithm 1.
Algorithm 1 Iterative Training Framework
Input: Historical data (labeled data) D_L, the latest monitoring data (unlabeled data) D_U.
Output: Trained models Ψ_A^V and Ψ_B^V.
1  Initialize Ψ_A^0 and Ψ_B^0 with different pre-trained weights.
2  Train Ψ_A^0 on D_L.
3  Train Ψ_B^0 on D_L.
4  for n ∈ {1, …, V} do
5      Predict on D_U with Ψ_A^(n−1).
6      Predict on D_U with Ψ_B^(n−1).
7      Use BPGS to fuse the historical labels and the output of Ψ_A^(n−1) and Ψ_B^(n−1) into pseudo-labeled data D_PL.
8      Fine-tune Ψ_A^n from Ψ_A^(n−1) on both D_L and the latest D_PL with the adaptive loss ζ.
9      Predict on D_U with Ψ_A^n.
10     Use BPGS to fuse the historical labels and the output of Ψ_A^n and Ψ_B^(n−1) into pseudo-labeled data D_PL.
11     Fine-tune Ψ_B^n from Ψ_B^(n−1) on both D_L and the latest D_PL with the adaptive loss ζ.
12 end for

2.3.2. Testing Process

In the testing phase, we first employ Ψ_A^V and Ψ_B^V to predict the test images. Then, the fusion results of the two models are obtained using Equations (1)–(6). Furthermore, since the test labels are pixel-wise and contain boundary pixels, a morphological closing operation is applied to the fusion results to obtain the final monitoring results; a minimal sketch of this step is given below.
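The following sketches the closing operation, assuming SciPy's binary morphology and a 3 × 3 structuring element (the element size is an assumption, as the paper does not state it):

```python
import numpy as np
from scipy import ndimage

def close_fusion_result(mask, size=3):
    """Morphological closing (dilation followed by erosion) of a binary
    fusion result; fills the thin gaps left along object boundaries."""
    structure = np.ones((size, size), dtype=bool)
    return ndimage.binary_closing(mask, structure=structure)
```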

3. Experiments and Results

3.1. Datasets

3.1.1. Dataset Description

Due to the lack of publicly available datasets for gully erosion monitoring, we constructed two benchmark datasets (HC2012 and HC2020) to verify the performance of the proposed method. The HC2012 dataset was built from remote-sensing images of Huachuan County, Heilongjiang Province, China, collected by the ZY-3 satellite in 2012. The images include a panchromatic band and multispectral (blue, green, red, and near-infrared) bands, with spatial resolutions of 2.1 m and 5.8 m, respectively. Using ENVI software, the images were fused into pan-sharpened RGB images with a spatial resolution of 2.1 m, as shown in Figure 5a. The HC2020 dataset was built from same-site (Huachuan County) remote-sensing images collected by the GF-2 satellite in 2020. These images also include panchromatic and multispectral bands, with spatial resolutions of 1 m and 4 m, respectively; they were fused into pan-sharpened RGB images with a spatial resolution of 1 m, as shown in Figure 5b. In addition, the pan-sharpened RGB images of the two datasets were rigorously registered to facilitate the implementation of the proposed method.

3.1.2. Label of Dataset

Considering the experimental hardware constraints while preserving as much of the gullies' spatial context as possible, we segmented the original pan-sharpened RGB images into sub-images of 384 × 384 pixels (a sketch of this tiling is given below). As a result, the HC2012 dataset comprises 981 samples, and since the two datasets were registered, the HC2020 dataset contains samples of the same number and size. We annotated these samples with the assistance of the water department. Figure 6 shows a sample from the HC2020 dataset and its corresponding label.
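A minimal sketch of the tiling follows; the paper does not state how border remainders were handled, so this sketch simply discards incomplete tiles at the right and bottom edges.

```python
def tile_image(image, tile=384):
    """Split an H x W x C array into non-overlapping tile x tile sub-images,
    discarding any incomplete tiles at the right and bottom edges."""
    h, w = image.shape[:2]
    return [image[r:r + tile, c:c + tile]
            for r in range(0, h - tile + 1, tile)
            for c in range(0, w - tile + 1, tile)]
```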

3.2. Experimental Design and Implementation Details

3.2.1. Experimental Design

To evaluate our proposed method, we designed two experiments. In Experiment 1 (HC2012 to HC2020), the HC2012 dataset was employed as the training set (historical data), while the HC2020 dataset was used as the test set (the latest monitoring data). In Experiment 2 (HC2020 to HC2012), we reversed Experiment 1; that is, the HC2020 dataset and the HC2012 dataset were treated as the training set and the test set, respectively. Moreover, reflecting practical monitoring needs, the images of the test set were provided as unlabeled data to the semi-supervised methods, whereas their labels were used only in the test phase to calculate the evaluation metrics. Carrying out these two experiments validates both the effectiveness and the generality of our method.
In these two experiments, our method was compared with four state-of-the-art methods: two supervised methods, Bisenetv2 [41] and Segformer [42], and two semi-supervised methods, U2PL [26] and DMT [33]. All comparison methods were reproduced using the original authors' open code. Due to different data sources and hardware conditions, we made some necessary modifications, such as image size, batch size, and number of categories. To ensure fairness, the same data augmentation, learning rate, iterations, and optimizer were used, whereas the other hyperparameters followed the settings recommended by the original authors. Moreover, we evaluated the methods with four indicators: precision, recall, F1 score, and intersection over union (IoU); their calculation formulae can be found in Refs. [43,44,45,46], and a sketch is given below. Compared with precision and recall, the F1 score and IoU are comprehensive indicators, so we pay more attention to them in the experiments.
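As a reference, the following is a standard formulation of the four indicators computed from a binary prediction and its ground truth; it is not code from the compared implementations.

```python
import numpy as np

def segmentation_metrics(pred, label):
    """Precision, recall, F1 score, and IoU for a binary segmentation result.
    pred, label: boolean arrays of the same shape (True = gully erosion)."""
    tp = np.logical_and(pred, label).sum()
    fp = np.logical_and(pred, ~label).sum()
    fn = np.logical_and(~pred, label).sum()
    precision = tp / (tp + fp + 1e-8)
    recall = tp / (tp + fn + 1e-8)
    f1 = 2 * precision * recall / (precision + recall + 1e-8)
    iou = tp / (tp + fp + fn + 1e-8)
    return precision, recall, f1, iou
```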

3.2.2. Implementation Details

The proposed method was implemented with the PyTorch 1.7.0 framework under Ubuntu 18.04. All experiments were performed on a Dell Precision T7920 workstation with an Intel Xeon Silver 4210R CPU, 40 GB of RAM, and an Nvidia GeForce RTX 3090 GPU with 24 GB of memory. In the experiments, Ψ_A^0 and Ψ_B^0 both used DeepLab-v3 with a ResNet-101 backbone but were initialized with pre-trained weights from COCO [47] and ImageNet [48], respectively. In the training phase, Ψ_A was first used as the student model, V was set to 5, each iteration comprised 20 epochs, and the SGD optimizer with the poly learning-rate schedule (sketched below) was used to optimize the network. Common data augmentation methods were adopted, including horizontal flipping, random rotation within [−15°, 15°], and random scaling in the range [0.75, 1.5]. The termination condition of the object boundary generator in Section 2.1.1 was 128 iterations. Furthermore, the similarity regulation indicator κ in Equation (13) was set to 0.6; a detailed discussion can be found in Section 4.2.
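For completeness, the following is a minimal sketch of the poly learning-rate schedule, assuming the common DeepLab form lr = base_lr × (1 − iter/max_iter)^power with power = 0.9; the exact power used in the paper is not stated, so treat it as an assumption.

```python
def poly_lr(optimizer, base_lr, cur_iter, max_iter, power=0.9):
    """Poly learning-rate schedule; called once per iteration and compatible
    with any torch.optim optimizer (e.g., the SGD optimizer used here)."""
    lr = base_lr * (1 - cur_iter / max_iter) ** power
    for group in optimizer.param_groups:
        group['lr'] = lr
    return lr
```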

3.3. Experiment 1: HC2012 to HC2020

In Experiment 1, the HC2012 dataset was employed as the historical data (training set), while the HC2020 dataset was treated as the latest monitoring data (test set). Figure 7 displays the latest monitoring data and the corresponding labels. Among them, some representative lands are marked with orange, yellow, blue, and red rectangles. On this basis, the monitoring results of five different methods for representative lands are shown in Figure 8.
As shown in Figure 8, two supervised methods (Bisenetv2 and Segformer) manifest poor performance in all representative lands, which is consistent with the previous analysis of supervised methods. Compared with supervised methods, semi-supervised methods can provide pseudo-labels for the latest monitoring data to reduce the impact of feature differences. Thus, three semi-supervised methods (U2PL, DMT, and the proposed method) monitor most of the gully erosion pixels. However, there are plenty of false positive pixels in the results of U2PL and DMT. For example, in the field roads (the golden rectangles in the second and third rows of Figure 8), wasteland (the golden rectangle in the first row), and the edge regions of the gullies (the golden rectangle in the fourth row), U2PL and DMT all present large areas of false positive pixels. These results suggest that it is difficult to acquire reliable monitoring results through the teacher model alone. With the introduction of BPGS and the adaptive loss function, the proposed method can better handle the impact of feature differences on automatic monitoring, thus presenting the best visual results.
To further evaluate the performance of the five methods, precision, recall, F1 score, and IoU were selected as the evaluation indicators for gully erosion monitoring. The quantitative evaluation results of the five methods, averaged over three runs, are shown in Table 1.
As shown in Table 1, the proposed method outperforms the other methods in terms of the IoU, precision, and F1 score. In particular, it raises the IoU by 22.1% compared with the second-ranked method (DMT). As state-of-the-art supervised methods, Bisenetv2 and Segformer manifest poor performance in all evaluation indicators, which is consistent with the visual analysis. Moreover, the recalls of U2PL and DMT are higher than 83%, while their precisions are lower than 45%, suggesting that these two methods improve recall by sacrificing precision. Thus, they perform worse than the proposed method in terms of the more balanced indicators (F1 score and IoU). In contrast, BPGS generates reliable pseudo-labels and the adaptive loss function reduces the impact of pseudo-label noise, so the proposed method acquires the best quantitative evaluation results. In summary, our method is more competent for complex gully erosion monitoring than the other methods.

3.4. Experiment 2: HC2020 to HC2012

In Experiment 2, the HC2020 dataset and the HC2012 dataset were employed as the historical data and the latest monitoring data, respectively. Figure 9 displays the latest monitoring data and the corresponding labels. On this basis, the monitoring results of five different methods for representative lands are shown in Figure 10.
As shown in Figure 10, the supervised methods display massive false negative results, whereas the three semi-supervised methods monitor most of the gully erosion pixels, similar to Experiment 1. In the golden rectangle in the first row of Figure 10, the three semi-supervised methods achieve satisfactory results, while the supervised ones still produce a mass of false negative pixels. These results again justify the choice of semi-supervised methods for gully monitoring. When monitoring field roads with characteristics similar to gullies (the golden rectangles in the second and third rows of Figure 10), U2PL and DMT show false positive results, while the proposed method monitors correctly. Meanwhile, compared with Experiment 1, the edge regions of the gullies are more likely to show errors due to the reduced image resolution. For example, in the golden rectangle in the fourth row of Figure 10, the three semi-supervised methods display different degrees of false positive and false negative results. Nevertheless, the proposed method still achieves the best visual results in this region. To further evaluate the performance of the methods, the quantitative evaluation results of the five methods, averaged over three runs, are shown in Table 2.
As shown in Table 2, the proposed method outperforms the other methods in terms of the IoU, precision, and F1 score. It attains an IoU improvement of 18.3% and an F1 score improvement of 6% in comparison with the second-ranked method (DMT). Due to the feature differences between the historical data and the latest monitoring data, Bisenetv2 and Segformer once again perform poorly on all evaluation indicators. U2PL and DMT exhibit evaluation results similar to Experiment 1, i.e., higher recall and lower precision. In addition, compared with Experiment 1, the IoUs of all methods degrade to different degrees due to the reduced image resolution. Nevertheless, because BPGS merges more valuable information to generate reliable pseudo-labels and the adaptive loss further reduces the impact of pseudo-label noise, the IoU of the proposed method remains higher than 60%. As a result, the proposed method better addresses the impact of feature differences on automatic monitoring and possesses a higher superiority in complex gully erosion monitoring tasks.

4. Discussion

4.1. Ablation Study

To analyze the effectiveness of BPGS and the adaptive loss function in the proposed method, an ablation study was conducted. First, we constructed a baseline method: based on the iterative framework in Section 2.3, it used the intersection of the evidence data as the pseudo-labels of the latest monitoring data and applied the standard binary cross-entropy loss. Taking the IoU as the evaluation criterion, the evaluation results of the ablation study, averaged over three runs, are shown in Table 3.
As shown in Table 3, Baseline + BPGS improves the IoU by 11.4% and 9.7% over the Baseline in Experiments 1 and 2, respectively. This demonstrates that the proposed BPGS can effectively improve the reliability of pseudo-labels by mining more useful information. Additionally, compared with Baseline + BPGS, the proposed method (Baseline + BPGS + Adaptive Loss) attains better results. Hence, it is feasible to adjust the loss according to the similarity of the evidence data, thereby effectively mitigating the impact of pseudo-label noise. Based on the above analysis, it can be concluded that BPGS and the adaptive loss function are effective and necessary.

4.2. Analysis of the Setting of Similarity Regulation Indicator

In the adaptive loss function, the similarity regulation indicator κ in Equation (13) determines the level of attention paid to the different evidence sources. To clarify how κ is set, the relationship between κ and the IoU is analyzed, as shown in Figure 11.
In Figure 11, the horizontal axis is the similarity regulation indicator κ (at an interval of 0.05), the vertical axis is the IoU, and the results of the two experiments are drawn as two curves with different styles; the highest IoU of each experiment is also marked. As κ increases, the IoU curves of both experiments first rise gradually and then fall after reaching their highest points: κ = 0.6 and κ = 0.55 correspond to the highest IoUs, 64.5% in Experiment 1 and 60.9% in Experiment 2, respectively. The detailed κ–IoU values of the two experiments are shown in Table 4.
By analyzing Table 4, we discover that when κ was set at 0.6, the IoU could reach 60.4%, which is only 0.5% lower than the corresponding highest IoU in Experiment 2. That is to say, the ideal monitoring results could be obtained in the two experiments by setting κ at 0.6. Therefore, to avoid excessive hyperparameter tuning, the similarity regulation indicator κ is advised to be set directly at 0.6 in practical applications.

5. Conclusions

In this paper, we propose a semi-supervised semantic segmentation method with a boundary-guided pseudo-label generation strategy and an adaptive loss function. To the best of our knowledge, this is the first paper to implement automatic gully erosion monitoring from data acquired by different sensors at different times. In this method, the boundary-guided pseudo-label generation strategy (BPGS), composed of the object boundary generator and the multi-evidence fusion strategy, is designed to enhance the reliability of pseudo-labels. Meanwhile, the adaptive loss function based on centroid similarity (CSIM) is proposed to further alleviate the impact of pseudo-label noise. Two experiments carried out on the HC2012 and HC2020 datasets show that the proposed method copes with the impact of feature differences on automatic monitoring better than other state-of-the-art methods, achieving IoUs above 64% and 60%, respectively. Therefore, our method is more suitable for complex gully erosion monitoring. Furthermore, the ablation study demonstrates that BPGS and the adaptive loss function are effective and necessary. In future work, we will further improve the performance and reliability of gully erosion monitoring, thereby continuing to contribute to research on automatic gully erosion monitoring.

Author Contributions

Conceptualization, C.Z. and N.S.; methodology, Y.S. and N.S.; software, Y.S. and Y.Y.; validation, Y.S., Y.Y. and Y.L.; formal analysis, C.Z. and Y.S.; data curation, C.Z. and Y.L.; writing—original draft preparation, Y.S. and N.S.; writing—review and editing, C.Z. and Y.Y.; supervision, C.Z., N.S. and Y.L.; project administration, C.Z. and N.S.; funding acquisition, C.Z. and N.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 62071136, No. 62271159, No. 62002083, No. 61971153); Heilongjiang Outstanding Youth Foundation (YQ2022F002); Heilongjiang Postdoctoral Foundation (LBH-Q20085 and LBH-Z20051); Fundamental Research Funds for the Central Universities Grant (3072022QBZ0805, 3072021CFT0801 and 3072022CF0808).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

All authors have reviewed the manuscript and approved submission to this journal. The authors declare that there are no conflicts of interest regarding the publication of this article and no self-citations included in the manuscript.

References

1. Valentin, C.; Poesen, J.; Li, Y. Gully Erosion: Impacts, Factors and Control. Catena 2005, 63, 132–153.
2. Hitouri, S.; Varasano, A.; Mohajane, M.; Ijlil, S.; Essahlaoui, N.; Ali, S.A.; Essahlaoui, A.; Pham, Q.B.; Waleed, M.; Palateerdham, S.K.; et al. Hybrid Machine Learning Approach for Gully Erosion Mapping Susceptibility at a Watershed Scale. ISPRS Int. J. Geo-Inf. 2022, 11, 401.
3. Wang, Z.; Zhang, G.; Wang, C.; Xing, S. Assessment of the Gully Erosion Susceptibility Using Three Hybrid Models in One Small Watershed on the Loess Plateau. Soil Tillage Res. 2022, 223, 105481.
4. Kong, H.; Wu, D.; Yang, L. Quantification of Soil Erosion in Small Watersheds on the Loess Plateau Based on a Modified Soil Loss Model. Water Supply 2022, 22, 6308–6320.
5. Rafique, N.; Bhat, M.S.; Muntazari, T.H. Identification and Mapping of Land Degradation through Remote Sensing in Budgam District of Jammu and Kashmir, India. Indian J. Ecol. 2022, 49, 602–606.
6. Wang, R.; Sun, H.; Yang, J.; Zhang, S.; Fu, H.; Wang, N.; Liu, Q. Quantitative Evaluation of Gully Erosion Using Multitemporal UAV Data in the Southern Black Soil Region of Northeast China: A Case Study. Remote Sens. 2022, 14, 1479.
7. Liu, G.; Zheng, F.; Wilson, G.V.; Xu, X.; Liu, C. Three decades of ephemeral gully erosion studies. Soil Tillage Res. 2021, 212, 105046.
8. Slimane, A.B.; Raclot, D.; Rebai, H.; Le Bissonnais, Y.; Planchon, O.; Bouksila, F. Combining field monitoring and aerial imagery to evaluate the role of gully erosion in a Mediterranean catchment (Tunisia). Catena 2018, 170, 73–83.
9. Evans, M.; Lindsay, J. High resolution quantification of gully erosion in upland peatlands at the landscape scale. Earth Surf. Proc. Land. 2010, 35, 876–886.
10. Li, H.; Cruse, R.M.; Bingner, R.L.; Gesch, K.R.; Zhang, X. Evaluating ephemeral gully erosion impact on Zea mays L. yield and economics using AnnAGNPS. Soil Till. Res. 2016, 155, 157–165.
11. Guo, Y.; Liu, Y.; Georgiou, T.; Lew, M.S. A review of semantic segmentation using deep neural networks. Int. J. Multimed. Inf. Retr. 2018, 7, 87–93.
12. Shivappriya, S.N.; Priyadarsini, M.J.P.; Stateczny, A.; Puttamadappa, C.; Parameshachari, B.D. Cascade object detection and remote sensing object detection method based on trainable activation function. Remote Sens. 2021, 13, 200.
13. Xie, S.; Hu, H. Facial Expression Recognition Using Hierarchical Features with Deep Comprehensive Multipatches Aggregation Convolutional Neural Networks. IEEE Trans. Multimed. 2018, 21, 211–220.
14. Song, J.; Gao, S.; Zhu, Y.; Ma, C. A Survey of Remote Sensing Image Classification Based on CNNs. Big Earth Data 2019, 3, 232–254.
15. Wang, G.; Fan, B.; Xiang, S.; Pan, C. Aggregating rich hierarchical features for scene classification in remote sensing imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 4104–4115.
16. Wang, C.; Qiu, X.; Huan, H.; Wang, S.; Zhang, Y.; Chen, X.; He, W. Earthquake-Damaged Buildings Detection in Very High-Resolution Remote Sensing Images Based on Object Context and Boundary Enhanced Loss. Remote Sens. 2021, 13, 3119.
17. Yu, M.; Chen, X.; Zhang, W.; Liu, Y. AGs-Unet: Building Extraction Model for High Resolution Remote Sensing Images Based on Attention Gates U Network. Sensors 2022, 22, 2932.
18. Zhu, Q.; Li, Z.; Zhang, Y.; Guan, Q. Building Extraction from High Spatial Resolution Remote Sensing Images via Multiscale-Aware and Segmentation-Prior Conditional Random Fields. Remote Sens. 2020, 12, 3983.
19. Zhang, P.; Zhang, B.; Zhang, T.; Chen, D.; Wang, Y.; Wen, F. Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Online, 19–25 June 2021; pp. 12414–12424.
20. Zhang, Q.; Zhang, J.; Liu, W.; Tao, D. Category anchor-guided unsupervised domain adaptation for semantic segmentation. In Proceedings of the Conference and Workshop on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; p. 32.
21. Yang, L.; Zhuo, W.; Qi, L.; Shi, Y.; Gao, Y. ST++: Make Self-Training Work Better for Semi-Supervised Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–23 June 2022; pp. 4268–4277.
22. Xu, Y.; Shang, L.; Ye, J.; Qian, Q.; Li, Y.F.; Sun, B.; Jin, R. Dash: Semi-supervised learning with dynamic thresholding. In Proceedings of the International Conference on Machine Learning, Online, 18–24 July 2021; pp. 11525–11536.
23. Zou, Y.; Zhang, Z.; Zhang, H.; Li, C.L.; Bian, X.; Huang, J.B.; Pfister, T. PseudoSeg: Designing pseudo labels for semantic segmentation. arXiv 2020, arXiv:2010.09713.
24. Zuo, S.; Yu, Y.; Liang, C.; Jiang, H.; Er, S.; Zhang, C.; Zha, H. Self-training with differentiable teacher. arXiv 2021, arXiv:2109.07049.
25. Zhang, W.; Zhu, L.; Hallinan, J.; Zhang, S.; Makmur, A.; Cai, Q.; Ooi, B.C. BoostMIS: Boosting Medical Image Semi-Supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–23 June 2022; pp. 20666–20676.
26. Wang, Y.; Wang, H.; Shen, Y.; Fei, J.; Li, W.; Jin, G.; Le, X. Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–23 June 2022; pp. 4248–4257.
27. Yao, H.; Hu, X.; Li, X. Enhancing Pseudo Label Quality for Semi-Supervised Domain-Generalized Medical Image Segmentation. arXiv 2022, arXiv:2201.08657.
28. He, R.; Yang, J.; Qi, X. Re-Distributing Biased Pseudo Labels for Semi-Supervised Semantic Segmentation: A Baseline Investigation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Online, 19–25 June 2021; pp. 6930–6940.
29. Kanezaki, A. Unsupervised Image Segmentation by Backpropagation. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 1543–1547.
30. Wang, C.; Shi, A.; Wang, X.; Wu, F.; Huang, F.; Xu, L. A novel multi-scale segmentation algorithm for high resolution remote sensing images based on wavelet transform and improved JSEG algorithm. Light Electron. Opt. 2014, 125, 5588–5595.
31. Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC Superpixels Compared to State-of-the-art Superpixel Methods. IEEE Trans. Pattern Anal. 2012, 34, 2274–2282.
32. Csillik, O. Fast segmentation and classification of very high resolution remote sensing data using SLIC superpixels. Remote Sens. 2017, 9, 243.
33. Feng, Z.; Zhou, Q.; Gu, Q.; Tan, X.; Cheng, G.; Lu, X.; Ma, L. DMT: Dynamic Mutual Training for Semi-Supervised Learning. Pattern Recogn. 2022, 2022, 108777.
34. Wang, C.; Liu, H.; Shen, Y.; Zhao, K.; Xing, H.; Wu, H. High-Resolution Remote-Sensing Image-Change Detection Based on Morphological Attribute Profiles and Decision Fusion. Complexity 2020, 171, 8360361.
35. Shi, A.; Gao, G.; Shen, S. Change detection of bitemporal multispectral images based on FCM and DS theory. EURASIP J. Adv. Sig. Process. 2016, 2016, 96.
36. Dempster, A.P. Upper and lower probabilities induced by a multivalued mapping. Ann. Math. Stat. 1967, 38, 325–339.
37. Wang, C.; Zhang, Y.; Chen, X.; Jiang, H.; Mukherjee, M.; Wang, S. Automatic Building Detection from High-Resolution Remote Sensing Images Based on Joint Optimization and Decision Fusion of Morphological Attribute Profiles. Remote Sens. 2021, 13, 357.
38. Trivedi, M.M.; Mills, J.K. Centroid calculation of the blastomere from 3D Z-Stack image data of a 2-cell mouse embryo. Biomed. Signal Proces. 2020, 57, 101726.
39. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
40. Brunet, D.; Vrscay, E.R.; Wang, Z. On the mathematical properties of the structural similarity index. IEEE Trans. Image Process. 2011, 21, 1488–1499.
41. Yu, C.; Gao, C.; Wang, J.; Yu, G.; Shen, C.; Sang, N. BiSeNet V2: Bilateral Network with Guided Aggregation for Real-Time Semantic Segmentation. Int. J. Comput. Vision 2021, 129, 3051–3068.
42. Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. In Proceedings of the Conference and Workshop on Neural Information Processing Systems, Vancouver, BC, Canada, 6–14 December 2021; pp. 12077–12090.
43. Pozzer, S.; Rezazadeh Azar, E.; Dalla Rosa, F.; Chamberlain Pravia, Z.M. Semantic segmentation of defects in infrared thermographic images of highly damaged concrete structures. J. Perform. Constr. Facil. 2021, 35, 04020131.
44. Xia, L.; Zhang, R.; Chen, L.; Li, L.; Yi, T.; Wen, Y.; Xie, C. Evaluation of Deep Learning Segmentation Models for Detection of Pine Wilt Disease in Unmanned Aerial Vehicle Images. Remote Sens. 2021, 13, 3594.
45. Peng, X.; Zhong, R.; Li, Z.; Li, Q. Optical remote sensing image change detection based on attention mechanism and image difference. IEEE Trans. Geosci. Remote 2020, 59, 7296–7307.
46. He, N.; Fang, L.; Plaza, A. Hybrid first and second order attention Unet for building segmentation in remote sensing images. Sci. China Inform. Sci. 2020, 63, 140305.
47. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, 5–12 September 2014; pp. 740–755.
48. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. ImageNet: A Large-Scale Hierarchical Image Database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 248–255.
Figure 1. The automatic monitoring process of gully erosion. The input contains the historical data with labels (D_L) and the latest monitoring data without labels (D_U). The output is the monitoring results of D_U.
Figure 2. The training flowchart of the proposed method. In the flowchart, our method contains a student model and a teacher model, and the training of the latest monitoring data is conducted sequentially from 1 to 4.
Figure 3. Flowchart of the object boundary generator. Boundary extraction is only performed after iteration has stopped.
Figure 4. The structure of channel attention module and spatial attention module. (a) The channel attention module. (b) The spatial attention module. The channel attention module helps focus on the meaningful channels, while the spatial attention module contributes to extracting features at the location of interest.
Figure 5. The pan-sharpened RGB images of two datasets. (a) HC2012 dataset. (b) HC2020 dataset. The HC2012 dataset was constructed from the remote-sensing image of Huachuan County, Heilongjiang Province, China, collected by the ZY-3 satellite in 2012, whereas the HC2020 dataset was based on the same-site remote-sensing image collected by the GF-2 satellite in 2020.
Figure 6. The sample from the HC2020 dataset and its corresponding label. (a) Sample. (b) Corresponding label. In (b), white and black pixels stand for gully erosion and non-gully erosion, respectively.
Figure 7. The latest monitoring data of Experiment 1 and corresponding labels. Among them, the colored rectangles are the representative lands, and the white as well as black pixels in the labels separately represent gully erosion and non-gully erosion.
Figure 8. The monitoring results of five different methods for representative lands of Experiment 1: (a) original image; (b) Bisenetv2; (c) Segformer; (d) U2PL; (e) DMT; (f) the proposed method. The white, blue, and green pixels in (b–f) separately represent true positive, false positive, and false negative.
Figure 9. The latest monitoring data of Experiment 2 and corresponding labels. Among them, the colored rectangles are the representative lands, and the white as well as black pixels in the labels separately represent gully erosion and non-gully erosion.
Figure 10. The monitoring results of five different methods for representative lands of Experiment 2: (a) original image; (b) Bisenetv2; (c) Segformer; (d) U2PL; (e) DMT; (f) the proposed method. The white, blue, and green pixels in (b–f) separately represent true positive, false positive, and false negative.
Figure 11. The relationship between similarity regulation indicator κ and IoU.
Table 1. Quantitative evaluation results of Experiment 1. The entries in bold denote the best results in Experiment 1.

| Methods | Precision | Recall | F1 score | IoU |
|---|---|---|---|---|
| Bisenetv2 [41] | 50.9% | 40.9% | 45.4% | 18.9% |
| Segformer [42] | 52.7% | 39.4% | 45.1% | 18.6% |
| U2PL [26] | 43.2% | 83.8% | 57.0% | 41.1% |
| DMT [33] | 44.8% | **85.0%** | 58.7% | 42.4% |
| The proposed method | **58.7%** | 60.4% | **59.5%** | **64.5%** |
Table 2. Quantitative evaluation results of Experiment 2. The entries in bold denote the best results in Experiment 2.

| Methods | Precision | Recall | F1 score | IoU |
|---|---|---|---|---|
| Bisenetv2 | 58.8% | 26.6% | 36.6% | 18.1% |
| Segformer | 57.4% | 25.9% | 35.7% | 16.2% |
| U2PL | 41.9% | 82.8% | 55.6% | 39.3% |
| DMT | 44.5% | **85.5%** | 58.5% | 42.1% |
| The proposed method | **63.9%** | 65.2% | **64.5%** | **60.4%** |
Table 3. The results of the ablation study. √ and — separately represent used and not used; the entries in bold denote the best results in the corresponding experiment; and Δ(IoU) is the IoU change compared with the baseline.

| Experiment | Method | BPGS | Adaptive Loss | IoU | Δ(IoU) |
|---|---|---|---|---|---|
| Experiment 1 | Baseline | — | — | 44.7% | +0.0% |
| Experiment 1 | Baseline | √ | — | 56.1% | +11.4% |
| Experiment 1 | Baseline | √ | √ | **64.5%** | **+19.8%** |
| Experiment 2 | Baseline | — | — | 43.6% | +0.0% |
| Experiment 2 | Baseline | √ | — | 53.3% | +9.7% |
| Experiment 2 | Baseline | √ | √ | **60.4%** | **+16.8%** |
Table 4. Detailed κ–IoU values in the two experiments. The entries in bold denote the best results in the corresponding experiment.

Experiment 1:

| κ | 0 | 0.05 | 0.1 | 0.15 | 0.2 | 0.25 | 0.3 | 0.35 | 0.4 | 0.45 | 0.5 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| IoU (%) | 51.4 | 52.1 | 53.5 | 54.7 | 56.3 | 57.5 | 59.8 | 60.7 | 62.2 | 63.4 | 62.7 |

| κ | 0.55 | 0.6 | 0.65 | 0.7 | 0.75 | 0.8 | 0.85 | 0.9 | 0.95 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|
| IoU (%) | 63.3 | **64.5** | 62.9 | 60.3 | 58.7 | 56.8 | 55.4 | 54.8 | 54.3 | 53.6 |

Experiment 2:

| κ | 0 | 0.05 | 0.1 | 0.15 | 0.2 | 0.25 | 0.3 | 0.35 | 0.4 | 0.45 | 0.5 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| IoU (%) | 49.8 | 49.6 | 50.7 | 51.2 | 53.9 | 55.3 | 57.1 | 59.5 | 58.3 | 59.4 | 60.1 |

| κ | 0.55 | 0.6 | 0.65 | 0.7 | 0.75 | 0.8 | 0.85 | 0.9 | 0.95 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|
| IoU (%) | **60.9** | 60.4 | 59.7 | 58.1 | 57.8 | 55.9 | 53.4 | 53.6 | 52.5 | 52.1 |