
Remote Sens. 2019, 11(9), 1116; https://doi.org/10.3390/rs11091116

Article
Label Noise Cleansing with Sparse Graph for Hyperspectral Image Classification
1 School of Information Science and Technology, Jiujiang University, Jiujiang 332005, China
2 School of Tourism and Territorial Resources, Jiujiang University, Jiujiang 332005, China
3 School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
* Correspondence: [email protected]
Haiou Yang is the co-first author.
Received: 20 April 2019 / Accepted: 6 May 2019 / Published: 10 May 2019

Abstract: In a real hyperspectral image classification task, label noise inevitably exists in the training samples. To deal with label noise, current methods assume that the noise obeys a Gaussian distribution, which is not the case in practice, because we are more likely to mislabel training samples at the boundaries between different classes. In this paper, we propose a spectral–spatial sparse graph-based adaptive label propagation (SALP) algorithm to address the more practical case in which the label information is contaminated by both random noise and boundary noise. Specifically, SALP mainly includes two steps: First, a spectral–spatial sparse graph is constructed to depict the contextual correlations between pixels within the same superpixel homogeneous region, generated by superpixel image segmentation, and a transfer matrix is then produced to describe the transition probability between pixels. Second, after randomly splitting the training pixels into “clean” and “polluted,” we iteratively propagate the label information from “clean” to “polluted” based on the transfer matrix, and the relabeling strategy for each pixel is adaptively adjusted according to its spatial position in the corresponding homogeneous region. Experimental results on two standard hyperspectral image datasets show that the proposed SALP over four major classifiers can significantly decrease the influence of noisy labels, and our method achieves better performance than the baselines.
Keywords:
hyperspectral image classification; label noise cleansing; spectral–spatial sparse graph; adaptive label propagation

1. Introduction

Hyperspectral images (HSIs), which include numerous contiguous bands of spectral information of the observed scene, have been easily and cheaply acquired by various remote sensing sensors in recent years. HSI classification [1,2], which aims at high-precision, pixel-level categorization of land use/land cover types, has been significantly promoted for many practical applications [3,4], such as ecological monitoring, precision agriculture, and mineral exploration. As a result of the enduring and increasing demand for classification accuracy, the supervised paradigm of HSI classification, which benefits from the powerful knowledge of manual labels, remains one of the main topics of interest in the remote sensing community [5,6]. In the past decades, a number of effective HSI classification methods have been proposed based on multifarious supervised models, e.g., the Bayesian model [7], neural networks [8], the support vector machine (SVM) [9], sparse representation learning [10], collaborative representation learning [11,12], the extreme learning machine (ELM) [13,14], and the generative adversarial network [15,16]. They can generally be well trained and provide satisfying results.
Generally speaking, the availability of supervised models highly depends on the validity of manual labels. However, some unpredictable mislabeled samples (namely, label noise) naturally exist in real-world hyperspectral datasets when human labelers are involved, and the noise-polluted pixels evidently decrease the performance of supervised classifiers [17]. To deal with the unavoidable label noise problem, researchers have proposed many useful methods [18,19,20,21,22,23,24] that mainly fall into three categories [25]: label noise-robust classification, label noise-tolerant classification, and label noise cleansing. Label noise-robust classification focuses on discriminative classifiers that are insensitive to label noise, whereas label noise-tolerant classification is concerned with modeling the label noise while simultaneously learning the classifiers. Both approaches can be regarded as a style called “learning with label noise,” which is always designed for specific classification models with sophisticated learning algorithms and thus has less generalization ability and scalability. By contrast, label noise cleansing is a preprocessing step that learns a universal noise filter to improve the label quality of training data for any classifier. Therefore, this study focuses on the label noise cleansing style, which is more general for HSI classification with noisy labels.
HSI classification in the presence of label noise is an exigent and challenging issue. However, few label cleansing methods are found even when considering the literature on generic image classification [26,27,28]. For example, Pelletier et al. [17] evaluated the influence of land cover label noise on classification performance. They demonstrated that the accuracies of two classic models, namely, SVM and random forest (RF), were sensitive to mislabeled training data. Unfortunately, they did not provide a solution to handle the label noise. Jiang et al. [29] proposed a random label propagation algorithm (RLPA), which appears to be the only directly relevant effort. They first constructed a probability transfer matrix to depict the affinity of hyperspectral pixels and then relabeled the training pixels by exploiting the label information from neighbors via random label propagation. The performance of several typical classifiers trained on relabeled samples showed significant improvements over that of classifiers trained on initial samples with label noise. This approach assumed that noise obeys the Gaussian distribution, and noisy labels were generated by employing only random noise. However, random noise cannot simulate the actual situation, because in most cases, labelers probably misclassify hyperspectral pixels at the boundaries between different classes. Two kinds of label noise are more likely to occur during the practical labeling procedure [25]: random label noise and boundary label noise. On the one hand, hyperspectral pixels are so subtle that labelers may mislabel samples unintentionally, which increases the probability and randomness of noisy labels. On the other hand, diverse land use/land cover types are often spatially staggered, and the labels of boundary pixels between adjacent surface types are harder to determine than those of other pixels. The neighborhood affinity is probably not robust when many noisy labels gather around the borders. Thus, a universal solution is needed to cope with the two types of label noise.
In this study, we propose a spectral–spatial sparse graph-based adaptive label propagation (SALP) algorithm to clean random label noise and boundary label noise for HSI classification. The framework shown in Figure 1 mainly includes two steps: First, we construct a spectral–spatial sparse graph to depict the affinity between pixels. Both spectral information and spatial constraints are important for calculating the similarity of different pixels [30,31,32,33], and a sparse graph that explicitly considers the influence of data noise is insensitive to gathered noise [34,35,36,37,38,39]. Therefore, a spectral–spatial sparse graph is adopted to describe the contextual correlations between pixels within the same superpixel homogeneous region, which is obtained by entropy rate superpixel segmentation (ESR) [40], and then a transfer matrix is produced to describe the transition probability between pixels. Second, we employ an adaptive label propagation algorithm to relabel the training samples. The spectral–spatial superpixel homogeneous region contains useful prior knowledge: pixels with similar land use/land cover types are often adjacent, and the central pixels in a superpixel homogeneous region are usually farther from the true border between different surface types than other pixels (refer to the mapping results in Figure 1). We can therefore adaptively relabel the pixels based on their spatial position in the corresponding homogeneous region. Hence, after randomly splitting the training pixels into “clean” and “polluted,” an adaptive label propagation process is utilized to iteratively propagate the label information from “clean” to “polluted” based on the transfer matrix. The centered “polluted” pixels are relabeled by exploiting the label information from neighboring “clean” samples, whereas the labels of other “polluted” pixels are adjusted by sparse “clean” pixels located in the same superpixel homogeneous region.
Finally, the label propagation is carried out several times, and the final label for each pixel is calculated based on a majority vote algorithm (MVA) [41]. The proposed SALP is tested on two real HSI databases, namely, Indian Pines and Salinas Scene; over four typical classifiers it can significantly decrease the influence of noisy labels and exceed the baselines.
The main contributions of this study can be summarized as the following three aspects:
(1)
We carefully analyzed and examined the core issue and the influence of label noise for HSI classification, which provides a useful guideline for presenting label noise-polluted classification work.
(2)
A novel label noise cleansing method, namely, SALP algorithm, is proposed to deal with random label noise and boundary label noise.
(3)
Experimental results on two public real-world datasets show that the proposed SALP over four major classifiers can obviously reduce the impact of noisy labels, and its performance surpasses the baselines in terms of overall accuracy (OA), average accuracy (AA), the Kappa coefficient, and the visual classification maps.
This paper is organized as follows: Section 2 formulates and discusses the problem statement; Section 3 presents the methodology of the proposed SALP; Section 4 draws the experimental results and discussions; and Section 5 concludes this article.

2. Problem Statement

This section first presents the mathematical definitions of HSI classification in the presence of label noise, and then quantitatively discusses the impact of random label noise and boundary label noise.
Given an HSI composed of hundreds of contiguous spectral bands, some pixels with labels are regarded as training data, and the other pixels without labels are treated as testing data. When HSI classification meets label noise, a certain proportion of noisy labels, including random label noise and boundary label noise, exists in the training pixels. To deal with the noisy labels, label noise cleansing relabels the training data so that the performance of supervised classifiers trained on the relabeled data is better than that of classifiers trained on the initial data.
Specifically, a set of training pixels in a $D$-dimensional spectral feature space is denoted as $X = \{x_1, x_2, \ldots, x_N\}$, $x_i \in \mathbb{R}^D$, where $i$ is an integer in the interval $[1, N]$. $\gamma = \{1, 2, \ldots, C\}$ denotes the label set, and the class labels of $x_1, x_2, \ldots, x_N$ are denoted as $y_1, y_2, \ldots, y_N$, with $y_i \in \gamma$. A matrix $Y \in \mathbb{R}^{N \times C}$ encodes the ground-truth labels, where $Y_{ij} = 1$ if $j$ is the label of $x_i$. To model the label noise process, another matrix $\tilde{Y} \in \mathbb{R}^{N \times C}$ represents the noisy labels of the training pixels, and our goal is to forecast the label of $x_l \in X$ from $\tilde{Y}$ at a certain noise rate [42] $\rho$. The probability $\rho_{jk}$ of a sample of class $j$ being marked with the incorrect label $k$ is:

$$\rho_{jk} = P\big(\tilde{Y}_{ik} = 1 \mid Y_{ij} = 1\big), \quad \forall j, k \in \{1, 2, \ldots, C\},\ j \neq k,\ i \in \{1, 2, \ldots, N\}. \tag{1}$$
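As a concrete illustration, the random-noise half of this model can be sketched in a few lines of NumPy. This is a minimal sketch, not the paper's generator: `inject_random_label_noise` is a hypothetical helper name, and uniform reassignment over the other $C - 1$ classes is one simple instance of Equation (1).

```python
import numpy as np

def inject_random_label_noise(y, rho, num_classes, seed=None):
    """Corrupt labels under the model of Eq. (1): each sample keeps its
    label with probability 1 - rho and is otherwise reassigned uniformly
    to one of the other C - 1 classes. (Hypothetical helper, for illustration.)"""
    rng = np.random.default_rng(seed)
    y_noisy = np.asarray(y).copy()
    flip = rng.random(len(y_noisy)) < rho
    # Adding a nonzero offset modulo C guarantees a *different* class.
    offsets = rng.integers(1, num_classes, size=int(flip.sum()))
    y_noisy[flip] = (y_noisy[flip] + offsets) % num_classes
    return y_noisy
```

The boundary-noise component of the “both” setting additionally targets pixels near class borders, as described in Section 4.2.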
The influence of random label noise on HSI classification was examined in Jiang’s study [29], and the results showed that random label noise degenerates the performance of supervised classifiers. To further analyze the combined impact of random label noise and boundary label noise, we test the noisy label-based algorithm (NLA), in which training samples and their corresponding noisy labels are directly used to train the classifiers. Four typical classifiers are utilized, namely, nearest neighbor (NN), SVM, RF, and ELM. Two standard hyperspectral datasets, i.e., Indian Pines and Salinas, are adopted, and the ground truth samples are randomly split into a training set and a testing set. The label noise uses the “both” setting, which is a fusion of random label noise and boundary label noise. The noise rate is set as $\rho = \{0, 0.1, 0.2, \ldots, 0.9\}$ with an interval of 0.1. The OA is employed to measure the classification performance. More details about the sample setting, the label noise setting, and the other experimental setup can be found in Section 4.2.
As shown in Figure 2, three phenomena occur: First, all classifiers perform well when learning without label noise, whereas their performance declines quickly as $\rho$ increases. Second, some classifiers may achieve good results at low noise rates. For example, ELM, which pertains to single-hidden-layer feed-forward neural networks, can randomly generate the input weights and compute the output weights with a least squares solution; it is more computationally efficient and usually obtains similar or better generalization performance than traditional neural networks and SVM [13,43]. However, all classifiers are still sensitive to label noise, especially when $\rho$ is high. Third, the performance of all classifiers is poor at a high noise rate and converges to a similarly low value. Above all, random label noise and boundary label noise can seriously and negatively affect the availability of supervised classifiers. Thus, studying a general and effective label noise cleansing method for HSI classification is urgent.

3. Proposed Method

3.1. Overview of the Proposed Method

Label noise cleansing is performed to relabel the initially labeled training samples rather than the unlabeled samples adopted for unsupervised models [44,45]. Thus, the use of the original labels is the key to the cleansing methodology. Although the process of pixel-level labeling is difficult and complicated, most labels are still credible because they are always generated by experts. Therefore, an intuitive idea is to relabel a pixel based on the majority of label information from its correlated pixels (e.g., neighbors [29]). The neighborhood affinity may be useful for cleansing random label noise, which is randomly distributed. However, different surface types are often so spatially staggered that the labels of boundary pixels are easily mislabeled, and the neighborhood affinity is probably less discriminative for eliminating boundary label noise.
To solve this problem, this study proposes the SALP algorithm, whose core idea can be described as follows: First, a sparse graph that explicitly considers the influence of data noise is insensitive to gathered label noise, and its sparsity is datum-adaptive instead of requiring the neighborhood size to be determined manually [46,47,48]. Second, spatial information is important for measuring the similarity of different pixels. A superpixel homogeneous region produced by superpixel image segmentation may provide a useful spatial constraint for graph construction [49,50]. Third, the centered pixels in a superpixel homogeneous region are often farther from the true border between different surface types than other pixels. A good strategy is to adaptively adjust the label propagation for each pixel according to its spatial position in the corresponding homogeneous region. Therefore, we utilize a spectral–spatial sparse graph to depict the affinity between pixels within the same superpixel homogeneous region, and then iteratively relabel the training samples by exploiting the graph through an adaptive label propagation algorithm.

3.2. Spectral–Spatial Sparse Graph Construction

The core issue of graph construction is weight measurement between nodes, and a straightforward way to achieve this is to compute the spectral similarity between pixels. However, constructing a full graph over all pixels is complicated, and the commonly used similarities based on, e.g., Euclidean distance and the spectral angle mapper are not robust enough to handle the low inter-class and high intra-class spectral differences [51,52]. We note that spatial information is helpful for differentiating the spectral features [43]. Thus, the initial HSI is segmented into many non-overlapping superpixel homogeneous regions, and then a sparse graph is generated based on the sparse spectral similarity with a spatial constraint. A graphical illustration of the spectral–spatial sparse graph is shown in Figure 3.
Firstly, following the superpixel segmentation procedure [29], $I_f$ is the first principal component of an HSI that contains hundreds of contiguous spectral bands; it is acquired through principal component analysis [53] to reduce the computational complexity. Then, a set of superpixel homogeneous regions $S = \{s_1, s_2, \ldots, s_T\}$ is produced via ESR:

$$I_f = \bigcup_{k=1}^{T} s_k, \quad \mathrm{s.t.}\ s_k \cap s_g = \emptyset,\ \forall k, g \in \{1, 2, \ldots, T\},\ k \neq g. \tag{2}$$
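The base image $I_f$ can be obtained with a few lines of NumPy. The sketch below computes only the first principal component (the superpixel step would then run ESR, or any stand-in segmentation method, on the result); `first_principal_component` is a hypothetical helper name.

```python
import numpy as np

def first_principal_component(hsi):
    """Project an H x W x D hyperspectral cube onto its first principal
    component, giving the single-band base image I_f that is subsequently
    partitioned into superpixels (the paper uses ESR for that step)."""
    h, w, d = hsi.shape
    X = hsi.reshape(-1, d).astype(float)
    X -= X.mean(axis=0)               # center each spectral band
    # Leading right singular vector of the centered data = first PC direction.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return (X @ vt[0]).reshape(h, w)  # per-pixel scores along the leading PC
```

By construction, the variance of $I_f$ is at least that of any single centered band, which is why a single-band proxy suffices for segmentation.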
Secondly, a sparse graph is adopted to depict the contextual correlations between hyperspectral pixels. Instead of constructing a graph over the whole image, we employ a spatial constraint, in which the sparse graph is generated for the pixels in the same superpixel homogeneous region. Specifically, the training data $X = \{x_1, x_2, \ldots, x_N\}$, $x_i \in \mathbb{R}^D$, are coded sparsely by $\ell_0$-norm optimization:

$$\arg\min_{\alpha_i} \|\alpha_i\|_0, \quad \mathrm{s.t.}\ x_i = B_i \alpha_i, \tag{3}$$
where $\|\cdot\|_0$ is the $\ell_0$-norm of a vector, which counts its nonzero elements, $B_i = [x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_N, I] \in \mathbb{R}^{D \times (D + N - 1)}$, and $\alpha_i \in \mathbb{R}^{D + N - 1}$. Although this sparsity optimization is a nonconvex NP-hard problem, if the solution $\alpha_i$ is sparse enough [54], Equation (3) can be relaxed into the following convex problem:
$$\arg\min_{\alpha_i} \|x_i - B_i \alpha_i\|_2^2 + \lambda \|\alpha_i\|_1, \tag{4}$$

where $\|\cdot\|_1$ is the $\ell_1$-norm, which sums the absolute values of a vector's entries; the relaxed problem can be solved as a linear program. $\lambda$ weights the importance of minimizing $\|\alpha_i\|_1$.
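For illustration, the $\ell_1$-relaxed problem in Equation (4) can be solved with a plain proximal-gradient (ISTA) loop. This is a minimal sketch under simplified assumptions, not the solver used in the paper, and `sparse_code_ista` is a hypothetical name.

```python
import numpy as np

def sparse_code_ista(x, B, lam=0.001, n_iter=2000):
    """Approximately solve  argmin_a 0.5*||x - B a||_2^2 + lam*||a||_1
    (the l1 relaxation of Eq. (4)) with plain ISTA: a gradient step on
    the quadratic term followed by soft thresholding."""
    step = 1.0 / (np.linalg.norm(B, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    a = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ a - x)
        z = a - step * grad
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return a
```

In practice, faster variants (FISTA, coordinate descent as in scikit-learn's `Lasso`) solve the same problem; ISTA is shown here only because it makes the proximal structure of Equation (4) explicit.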
Let the sparse graph be $G = (X, W)$, with $X$ as the vertex set and $W$ as the weight matrix; the edge weight $W_{ij}$ from $x_i$ to $x_j$ is:

$$W_{ij} = \begin{cases} \alpha_i^j, & x_i, x_j \in s_k, \\ 0, & x_i \in s_k,\ x_j \in s_g,\ k \neq g, \end{cases} \tag{5}$$
where $\alpha_i^j$ is the $j$-th coefficient of $\alpha_i$. Generally, a larger $W_{ij}$ corresponds to a greater effect of $x_i$ on $x_j$ in the label propagation procedure. Notably, correlations may also exist between $x_i$ and other pixels, so the overall affinity between associated pixels should be aggregated to calculate the transfer matrix $T$, whose entry $T_{ij}$ indicates the probability of label information traveling from $x_i$ to $x_j$:

$$T_{ij} = P_{ij} = \frac{W_{ij}}{\sum_{k=1}^{N} W_{ik}}. \tag{6}$$
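Equation (6) is simple row normalization of the weight matrix; a minimal NumPy sketch (with a hypothetical helper name) could look like:

```python
import numpy as np

def transfer_matrix(W, eps=1e-12):
    """Row-normalize sparse-graph weights into transition probabilities,
    T_ij = W_ij / sum_k W_ik (Eq. (6)). Rows with no edges stay all-zero."""
    row_sums = W.sum(axis=1, keepdims=True)
    return np.where(row_sums > 0, W / np.maximum(row_sums, eps), 0.0)
```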

3.3. Adaptive Label Propagation

Label propagation is performed to deliver the label information from correctly labeled pixels to incorrect ones [55]. Although the ground truth is unknown, we can randomly divide the training data several times and then fuse all the results of label propagation based on the MVA. Each time, some training pixels randomly set as “clean” preserve their labels, and the others are treated as “polluted” pixels that are unlabeled. To propagate the labels of the “clean” samples, a common strategy in which the “polluted” pixels absorb the label information from their neighbors [29] based on the transfer matrix $T$ is sometimes useful for coping with random label noise. However, boundary label noise is concentrated around the borders, and neighborhood affinity may produce wrong labels for boundary pixels. Therefore, we update the label propagation adaptively for each pixel according to its spatial position in the corresponding homogeneous region.
Compared with the centered pixels in a superpixel homogeneous region, the pixels close to the homogeneous region border are more likely to be near the true boundary between different surface types. Thus, the labels of centered “polluted” pixels are altered based on the information from their neighborhood “clean” samples, whereas other “polluted” pixels are relabeled by exploiting the label information from sparse “clean” pixels located in the same superpixel homogeneous region. Specifically, the set of training pixels $X$ is stochastically split into two parts: the “clean” subset $X_L = \{x_1, x_2, \ldots, x_l\}$ with its label matrix $\tilde{Y}_L = \tilde{Y}_{1:l,:} \in \mathbb{R}^{l \times C}$, where $l$, the number of labeled pixels, is determined by the noise rate, $l = \mathrm{round}(N \times \rho)$; and the “polluted” subset $X_U = \{x_{l+1}, x_{l+2}, \ldots, x_N\}$ without labels. The objective of adaptive label propagation is then to iteratively forecast the label matrix $\tilde{Y}_U$ of $X_U$ by exploiting the transfer matrix $T$.
Assuming that the label prediction matrix of all the training pixels is $F = [f_1, f_2, \ldots, f_N]^{\top} \in \mathbb{R}^{N \times C}$, the label of $x_i$ at time $t + 1$ is

$$f_i^{t+1} = \begin{cases} \theta \sum_{x_j \in s_k^{i,\mathrm{neigh}}} T_{ji} f_j^t + (1 - \theta)\, \tilde{Y}_{LU}^i, & \text{if } \|x_i - s_k^{\mathrm{mean}}\| < \sigma_k, \\[4pt] \theta \sum_{x_j \in s_k} T_{ji} f_j^t + (1 - \theta)\, \tilde{Y}_{LU}^i, & \text{otherwise}, \end{cases} \tag{7}$$
where $\tilde{Y}_{LU}^i$ is the $i$-th row of $\tilde{Y}_{LU} = [\tilde{Y}_L; \tilde{Y}_U]$. $\theta$ balances the influence between the present label and the label information acquired from the referential pixels; we fix $\theta$ to 0.9 in all the experiments. $f_j^t$ is the reference label $f_j$ at time $t$; the references $s_k^{i,\mathrm{neigh}}$ are the neighbors of $x_i$ when $x_i$ is close to the mean of the superpixel homogeneous region, $s_k^{\mathrm{mean}}$; otherwise, $f_j^t$ can come from all other pixels within $s_k$. The parameter $\sigma_k$ is the variance of $s_k$. Following the optimization in [56], Equation (7) converges to:
$$F^* = \lim_{t \to \infty} F^t = \lim_{t \to \infty} \big(\theta T F^{t-1} + (1 - \theta) \tilde{Y}_{LU}\big) = (1 - \theta)(I - \theta T)^{-1} \tilde{Y}_{LU}, \tag{8}$$
and the cleaned label of $x_i$ can be denoted as:

$$y_i^* = \arg\max_{j} F_{ij}^*. \tag{9}$$
To reduce the error of random selection, the above process is repeated many times, and the final propagated label is calculated by the MVA. The pseudocode of the proposed SALP is summarized in Algorithm 1.
Algorithm 1 The proposed SALP algorithm.
Input: A hyperspectral image I f ; The training pixels X = x 1 , x 2 , , x N with their labels y 1 , y 2 , , y N ; Parameters ρ and θ .
Output: The cleaned label y 1 * , y 2 * , , y N * .
  • Generate the superpixel homogeneous region set S of I f by Equation (2);
  • Construct the sparse graph by Equation (5);
  • Calculate the transfer matrix T by Equation (6);
  • Update the label of x i at time t + 1 based on T by Equation (7);
  • Acquire the cleaned label of x i by Equation (9);
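A minimal sketch of the closed-form propagation of Equation (8), the argmax relabeling of Equation (9), and the MVA fusion might look like the following. The function names are hypothetical, the toy transfer matrix is illustrative, and the adaptive neighborhood switch of Equation (7) is omitted for brevity.

```python
import numpy as np

def propagate_labels(T, Y_lu, theta=0.9):
    """Closed-form limit of the propagation iteration (Eq. (8)):
    F* = (1 - theta) (I - theta T)^{-1} Y_LU, then pick the cleaned
    label of each pixel as the argmax over classes (Eq. (9))."""
    n = T.shape[0]
    F = (1.0 - theta) * np.linalg.solve(np.eye(n) - theta * T, Y_lu)
    return F.argmax(axis=1)

def majority_vote(label_runs):
    """Fuse several random clean/polluted splits: for each pixel, keep
    the label that occurs most often across the propagation runs (MVA)."""
    runs = np.asarray(label_runs)  # shape (n_runs, n_pixels)
    return np.array([np.bincount(runs[:, i]).argmax()
                     for i in range(runs.shape[1])])
```

On a toy 4-pixel graph with two disjoint pairs, where pixel 0 carries class 0 and pixel 2 carries class 1, the unlabeled pixels 1 and 3 inherit the labels of their partners.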

4. Results and Discussions

4.1. Datasets

Two publicly available standard HSI datasets are adopted to evaluate our method. The Indian Pines dataset was captured by the AVIRIS sensor in northwestern Indiana; the scene comprises two-thirds agriculture and one-third forest or other natural perennial vegetation. The dataset consists of 145 × 145 pixels with 20 m per pixel and 220 bands in the wavelength range of 0.4–2.45 μm. This study utilizes 200 bands for classification after removing 24 water absorption bands [15,29,57], and 10,249 labeled pixels with 16 different land cover types from the ground truth map. The gray image and the false-color composite image of the corresponding ground truth map are shown in Figure 4.
The Salinas dataset was collected by the 224-band AVIRIS sensor over Salinas Valley, California. Similar to the Indian Pines dataset, 20 water absorption bands were removed from this scene. This study employs the remaining 204 bands over 0.4–2.5 μm of 512 × 217 pixels with a spatial resolution of 3.7 m, and 54,129 labeled pixels with 16 classes sampled from the ground truth map. The gray image and the false-color composite image of the corresponding ground truth map are shown in Figure 5.

4.2. Experimental Setup

This section introduces the setup of our experiments, which mainly contains data setting, label noise generation, classification baselines, and evaluation metrics.
For the Indian Pines and Salinas datasets, pixels from the ground truth map are randomly divided into the training and testing samples. Specifically, on the Indian Pines dataset, 10 percent of pixels are randomly selected as training samples, and the others are regarded as testing samples. On the Salinas dataset, 50 training pixels from each class are randomly chosen for training, and the rest is for testing.
To comprehensively evaluate our method, we generate two kinds of label noise settings on the training data as follows. The first setting is random label noise (abbr. “random”). Although we focus on the more practical case in which some noisy labels gather around the borders between different surface classes while others are distributed arbitrarily, existing methods employ only random label noise. Therefore, hyperspectral pixels from the training data are randomly mislabeled at a certain rate in the “random” setting. The second setting is the fusion of random label noise and boundary label noise (abbr. “both”). The boundary pixels are sought out in the ground truth label matrix, their k-nearest neighbors (KNN) are mislabeled to some extent, and then the random noisy labels are generated from the other pixels. The number of boundary noisy labels is the same as that of random noisy labels in the “both” setting. The noise rate is set as $\rho = \{0.1, 0.2, 0.3, 0.4, 0.5\}$. We do not show comparison results for $\rho > 0.5$, because the labels of HSIs are provided by experts, and most labels are correct even though a large amount of label noise probably exists in practical applications.
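The boundary candidates for the “both” setting can be located directly in the ground-truth label map. The sketch below marks pixels whose 4-neighborhood contains a different class, which is one plausible definition of a boundary pixel; the exact search and KNN mislabeling procedure used in the paper may differ, and `boundary_pixel_mask` is a hypothetical helper name.

```python
import numpy as np

def boundary_pixel_mask(label_map):
    """Mark ground-truth pixels whose 4-neighborhood contains a different
    class label -- the candidate positions for boundary label noise in
    the "both" setting (their k-nearest neighbors are then mislabeled)."""
    lm = np.asarray(label_map)
    mask = np.zeros(lm.shape, dtype=bool)
    mask[:-1, :] |= lm[:-1, :] != lm[1:, :]   # neighbor below differs
    mask[1:, :]  |= lm[1:, :]  != lm[:-1, :]  # neighbor above differs
    mask[:, :-1] |= lm[:, :-1] != lm[:, 1:]   # neighbor to the right differs
    mask[:, 1:]  |= lm[:, 1:]  != lm[:, :-1]  # neighbor to the left differs
    return mask
```

For a map split into a left class and a right class, only the two columns touching the class border are marked.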
Label noise cleansing is a preprocessing step for HSI classification. We use four typical classifiers, namely, NN, SVM, RF, and ELM, to verify our method; that is, we evaluate the effectiveness of SALP by comparing the performance of each classifier trained on the initial training data with that trained on the relabeled data. Three commonly used metrics, i.e., the OA, AA, and the Kappa coefficient, are used to measure the performance [57,58].
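The three metrics can be computed from the confusion matrix as follows; this is a self-contained sketch (library implementations such as scikit-learn's `accuracy_score` and `cohen_kappa_score` would serve equally well).

```python
import numpy as np

def classification_metrics(y_true, y_pred, num_classes):
    """Compute OA, AA, and the Kappa coefficient from a confusion matrix.
    OA = overall fraction correct; AA = mean of per-class recalls;
    Kappa = (OA - chance agreement) / (1 - chance agreement)."""
    cm = np.zeros((num_classes, num_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    total = cm.sum()
    oa = np.trace(cm) / total                                   # overall accuracy
    aa = np.mean(np.diag(cm) / np.maximum(cm.sum(axis=1), 1))   # average accuracy
    pe = (cm.sum(axis=1) @ cm.sum(axis=0)) / total ** 2         # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa
```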

4.3. Results Comparison and Analysis with the “Random” Setting

Existing methods usually adopt only random label noise. Thus, this section tests our method with the “random” setting, in which label noise is distributed randomly over the whole HSI. Three label noise cleansing baselines are compared: NLA, Isolation Forest (iForest) [59], and RLPA [29]. NLA is a basic baseline in which pixels with their corresponding noisy labels are directly used to train the classifiers. iForest is an anomaly detection algorithm that can be used to detect noisy labels through the following steps: many isolation trees are produced based on sub-samples of the training samples; afterwards, the anomaly score for each sample is computed by analyzing the isolation trees, and the samples with scores that surpass a predefined threshold are removed. RLPA is the state of the art in label noise cleansing, and it is the first label cleansing solution for dealing with random label noise in HSI classification. The OA, AA, and Kappa coefficient of the classification results over four typical classifiers on the Indian Pines and Salinas datasets are shown in Table 1 and Table 2.
On the basis of the results, three observations can be summarized as follows: (1) Compared with NLA, which directly learns with noisy data, SALP can significantly decrease the influence of random label noise. Our OA, AA, and Kappa coefficient outperform those of NLA at most noise rates, especially when high rates of label noise are present. For example, on the Indian Pines dataset at $\rho = 0.5$, the improvements of OA are 34.62%, 22.26%, and 9.71%, those of AA are 25.63%, 30.06%, 6.51%, and 9.66%, and those of Kappa coefficient are 37.47%, 16.36%, 13.18%, and 2.9% for NN, SVM, RF, and ELM, respectively, and greater improvements are found on the Salinas dataset. (2) Compared with the label noise cleansing baselines, i.e., iForest and RLPA, SALP achieves better performance in most cases. The OA and Kappa coefficient of RLPA are slightly higher than those of SALP at low noise rates on the Indian Pines dataset, because the neighborhood constraint of RLPA still works well when the arbitrary noisy labels are dispersively distributed. However, our results exceed those of RLPA on Indian Pines at high noise rates, probably because higher rates induce more clustered noisy labels even under the “random” setting. Moreover, the AA of SALP is evidently better than that of RLPA on both the Indian Pines and Salinas datasets, and almost all of our results surpass the baselines on the Salinas dataset. (3) Compared with ELM and RF, which pertain to the label-robust learning paradigm and are less influenced by noisy labels, NN and SVM are more sensitive to label noise. However, the performance degradation of NN and SVM based on SALP is usually smaller than that based on the other baselines as the noise rate increases, which further demonstrates the effectiveness of our method for cleansing the label noise. The OA trends of SALP and RLPA (the best baseline) are shown in Figure 6. As can be seen, our method exhibits smoother curves and more stable performance than RLPA on the two datasets.

4.4. Results Comparison and Analysis with the “Both” Setting

In this section, we evaluate the effectiveness of SALP with the “both” setting, in which training samples are polluted by both random label noise and boundary label noise; this is a more practical case than the “random” setting. However, few studies employ boundary label noise. Thus, we choose RLPA, which is the only existing label cleansing method for HSI classification, and NLA as baselines. The OA, AA, and Kappa coefficient of the classification results over four typical classifiers on the two hyperspectral datasets are shown in Table 3 and Table 4.
Four conclusions can be drawn from the results: (1) The proposed SALP achieves evidently better results than NLA in terms of OA, AA, and Kappa coefficient, which means that our method can effectively clean the fusion of random label noise and boundary label noise. (2) The performance of the proposed SALP algorithm surpasses that of RLPA, the state of the art in label noise cleansing for HSI classification, in most situations, especially at high noise rates, and the improvements under the “both” setting are more obvious than those under the “random” setting; this indicates that SALP is well suited to dealing with both random label noise and boundary label noise. (3) For RF and ELM, which are insensitive to label noise, SALP can preserve their robustness. For the more sensitive NN and SVM, our method can slow down the degradation of classification performance as $\rho$ increases. The OA trends of SALP and RLPA are compared in Figure 7, and the OA curves of our method are steadier than those of RLPA as the noise rate rises. (4) To further present the visualization results, the classification maps at two noise levels ($\rho = 0.1$ and $\rho = 0.5$) are shown in Figure 8 and Figure 9 for the Indian Pines dataset, and in Figure 10 for the Salinas dataset. The classification maps of SALP show improved labels, especially for the boundary pixels between different surface types, and achieve a smoother effect for the hyperspectral pixels in some tiny surface classes, e.g., in the northeast of Indian Pines.

4.5. Further Discussion

Comparison with the KNN graph. Figure 3 illustrates the difference between a full graph and the spectral–spatial sparse graph. To present a quantitative comparison, we test a commonly used full graph constructed by KNN, and the results in the “both” setting are shown in Figure 11. We observe that the performance of SALP is significantly better than that of the KNN graph at every noise rate. Moreover, the results of the KNN graph degrade rapidly as the noise rate grows, especially when NN and SVM are adopted as classifiers. This robustness demonstrates the effectiveness of SALP, whereas the KNN graph is sensitive to label noise in HSI classification.
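The contrast between the two graphs can be sketched as follows: a dense KNN affinity graph over all pixels versus a variant whose edges are restricted to pixels inside the same superpixel, mirroring the spatial constraint of the spectral–spatial sparse graph. This is a simplified sketch: the Gaussian kernel, parameter values, and function names are assumptions for illustration, and the paper's sparse-representation edge weights are replaced here by a KNN affinity for brevity.

```python
import numpy as np

def knn_affinity(X, k=10, sigma=1.0):
    """Dense Gaussian KNN affinity over all pixels (the full-graph baseline)."""
    # Pairwise squared Euclidean distances between spectral vectors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Keep only the k strongest edges per row, then symmetrize.
    keep = np.argsort(W, axis=1)[:, -k:]
    mask = np.zeros_like(W, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=1)
    return np.where(mask | mask.T, W, 0.0)

def superpixel_sparse_affinity(X, segments, k=10, sigma=1.0):
    """Spatially constrained variant: drop every edge that crosses
    a superpixel boundary, as in the spectral–spatial sparse graph."""
    W = knn_affinity(X, k=k, sigma=sigma)
    same_region = segments[:, None] == segments[None, :]
    return np.where(same_region, W, 0.0)
```

Removing cross-region edges is what blocks noisy labels from propagating between different surface types, which matches the behavior observed in Figure 11.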
Influence of boundary label noise on SALP. The proportions of random label noise and boundary label noise were equal in the previous “both” setting. In this section, we evaluate the OA trend of SALP when this proportion is altered. As shown in Figure 12, two phenomena can be observed: First, the curves for different proportions of boundary label noise on the two datasets are steady, which further proves the superiority of SALP. Second, OA shows an increasing trend until the proportion of boundary label noise exceeds 0.8, especially on the Indian Pines dataset. On the one hand, this demonstrates that SALP is well suited to coping with boundary label noise. On the other hand, Indian Pines contains more small surface patches than Salinas, and noisy labels can submerge these patches when there is too much boundary label noise, which may reduce the effectiveness of SALP.
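A minimal sketch of how a mixed noise setting of this kind can be generated is given below. The sampling details (how boundary samples are flagged, and drawing flips uniformly over the wrong classes) are assumptions for illustration rather than our exact protocol; `boundary_frac` plays the role of the proportion varied in Figure 12.

```python
import numpy as np

def inject_label_noise(labels, rho, boundary_mask, boundary_frac=0.5,
                       num_classes=16, rng=None):
    """Corrupt a fraction rho of the labels with a mix of boundary and
    random label noise. `boundary_mask` flags samples lying near class
    boundaries (e.g., derived from the segmentation map); `boundary_frac`
    is the share of noisy labels drawn from those boundary samples."""
    rng = rng or np.random.default_rng(0)
    labels = labels.copy()
    n_noisy = int(round(rho * len(labels)))
    n_boundary = min(int(round(boundary_frac * n_noisy)),
                     int(boundary_mask.sum()))
    boundary_idx = np.flatnonzero(boundary_mask)
    interior_idx = np.flatnonzero(~boundary_mask)
    chosen = np.concatenate([
        rng.choice(boundary_idx, size=n_boundary, replace=False),
        rng.choice(interior_idx, size=n_noisy - n_boundary, replace=False),
    ])
    for i in chosen:
        # Flip to a uniformly random wrong class.
        wrong = [c for c in range(num_classes) if c != labels[i]]
        labels[i] = rng.choice(wrong)
    return labels
```

With `boundary_frac = 0` this reduces to the “random” setting; with `boundary_frac = 1` all corrupted labels concentrate along class boundaries.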

5. Conclusions

Label noise is an unavoidable and pressing issue for HSI classification. In contrast to state-of-the-art methods that consider only random label noise, this study proposes the SALP algorithm to handle a more practical noise condition, since a real HSI classification task is more likely to involve both random label noise and boundary label noise. SALP first constructs a robust sparse graph based on the spectral similarity between hyperspectral pixels under a spatial constraint, and then adaptively propagates label information from “clean” samples to “polluted” samples by exploiting this graph. Three major conclusions can be drawn from extensive experimental results on two standard public datasets: First, our method always achieves better performance than the baselines in terms of OA, AA, and Kappa coefficient in both the “random” and “both” settings, indicating that SALP can effectively clean random label noise and boundary label noise. Second, SALP slows down the degradation of classification performance as the label noise rate increases for label noise-sensitive classifiers, e.g., NN and SVM. Third, the performance of our method is usually steady and sometimes improved when the proportion of boundary label noise is increased in the “both” setting, demonstrating that SALP can handle concentrated boundary label noise.
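For reference, the propagation step that SALP builds on can be sketched in its generic textbook form (Zhu and Ghahramani [56]): one-hot labels spread from “clean” seeds through the row-normalized transition matrix of the graph. The per-region adaptive relabeling of SALP is omitted here, and the interface is an illustrative assumption rather than our implementation.

```python
import numpy as np

def propagate_labels(W, labels, clean_mask, alpha=0.99, n_iter=200):
    """Generic graph label propagation: spread one-hot labels from the
    'clean' samples to the 'polluted' ones through the transition matrix
    derived from the affinity graph W."""
    n = W.shape[0]
    num_classes = labels.max() + 1
    # Row-normalized transition probabilities between pixels.
    T = W / W.sum(axis=1, keepdims=True).clip(min=1e-12)
    F = np.zeros((n, num_classes))
    F[np.arange(n), labels] = 1.0
    Y = F.copy()
    for _ in range(n_iter):
        F = alpha * (T @ F) + (1 - alpha) * Y
        # Clamp the 'clean' seeds to their original labels each iteration.
        F[clean_mask] = Y[clean_mask]
    return F.argmax(axis=1)
```

Because the spectral–spatial sparse graph has no edges between homogeneous regions, this iteration can only relabel a polluted pixel from evidence inside its own region, which is the mechanism behind the boundary-noise robustness reported above.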

Author Contributions

Conceptualization, Q.L. and J.J.; Methodology, Q.L., H.Y. and J.J.; Software, Q.L. and H.Y.; Validation, Q.L. and H.Y.; Formal analysis, Q.L. and H.Y.; Investigation, Q.L. and H.Y.; Resources, Q.L.; Data curation, H.Y.; Writing—original draft preparation, Q.L.; Writing—review and editing, Q.L., H.Y. and J.J.; Visualization, H.Y.; Supervision, J.J.; Project administration, J.J.; Funding acquisition, Q.L. and J.J.

Funding

The research was supported by the National Natural Science Foundation of China (61562048).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SALP: Spectral–spatial sparse graph-based adaptive label propagation
HSI: Hyperspectral image
SVM: Support vector machine
ELM: Extreme learning machine
RLPA: Random label propagation algorithm
ESR: Entropy rate superpixel segmentation
MVA: Majority vote algorithm
OA: Overall accuracy
NLA: Noisy label based algorithm
NN: Nearest neighbor
RF: Random forest
KNN: k-nearest neighbors
iForest: Isolation forest

References

1. Camps-Valls, G.; Tuia, D.; Bruzzone, L.; Benediktsson, J.A. Advances in hyperspectral image classification: Earth monitoring with statistical learning methods. IEEE Signal Process. Mag. 2014, 31, 45–54.
2. Tuia, D.; Persello, C.; Bruzzone, L. Domain adaptation for the classification of remote sensing data: An overview of recent advances. IEEE Geosci. Remote Sens. Mag. 2016, 4, 41–57.
3. Schneider, S.; Murphy, R.J.; Melkumyan, A. Evaluating the performance of a new classifier—The GP-OAD: A comparison with existing methods for classifying rock type and mineralogy from hyperspectral imagery. ISPRS J. Photogramm. Remote Sens. 2014, 98, 145–156.
4. Tiwari, K.; Arora, M.; Singh, D. An assessment of independent component analysis for detection of military targets from hyperspectral images. Int. J. Appl. Earth Obs. Geoinf. 2011, 13, 730–740.
5. He, L.; Li, J.; Liu, C.; Li, S. Recent advances on spectral–spatial hyperspectral image classification: An overview and new guidelines. IEEE Trans. Geosci. Remote Sens. 2018, 56, 1579–1597.
6. Ghamisi, P.; Plaza, J.; Chen, Y.; Li, J.; Plaza, A.J. Advanced spectral classifiers for hyperspectral images: A review. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–32.
7. Landgrebe, D.A. Signal Theory Methods in Multispectral Remote Sensing; John Wiley & Sons: Hoboken, NJ, USA, 2005; Volume 29.
8. Ratle, F.; Camps-Valls, G.; Weston, J. Semisupervised neural networks for efficient hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2010, 48, 2271–2282.
9. Melgani, F.; Bruzzone, L. Classification of hyperspectral remote sensing images with support vector machines. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1778–1790.
10. Chen, Y.; Nasrabadi, N.M.; Tran, T.D. Hyperspectral image classification using dictionary-based sparse representation. IEEE Trans. Geosci. Remote Sens. 2011, 49, 3973–3985.
11. Jiang, J.; Chen, C.; Yu, Y.; Jiang, X.; Ma, J. Spatial-aware collaborative representation for hyperspectral remote sensing image classification. IEEE Geosci. Remote Sens. Lett. 2017, 14, 404–408.
12. Jiang, X.; Song, X.; Zhang, Y.; Jiang, J.; Gao, J.; Cai, Z. Laplacian regularized spatial-aware collaborative graph for discriminant analysis of hyperspectral imagery. Remote Sens. 2019, 11, 29.
13. Li, W.; Chen, C.; Su, H.; Du, Q. Local binary patterns and extreme learning machine for hyperspectral imagery classification. IEEE Trans. Geosci. Remote Sens. 2015, 53, 3681–3693.
14. Samat, A.; Du, P.; Liu, S.; Li, J.; Cheng, L. Ensemble extreme learning machines for hyperspectral image classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 1060–1069.
15. Zhu, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. Generative adversarial networks for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 5046–5063.
16. Ma, J.; Yu, W.; Liang, P.; Li, C.; Jiang, J. FusionGAN: A generative adversarial network for infrared and visible image fusion. Inf. Fusion 2019, 48, 11–26.
17. Pelletier, C.; Valero, S.; Inglada, J.; Champion, N.; Marais Sicre, C.; Dedieu, G. Effect of training class label noise on classification performances for land cover mapping with satellite image time series. Remote Sens. 2017, 9, 173.
18. Angluin, D.; Laird, P. Learning from noisy examples. Mach. Learn. 1988, 2, 343–370.
19. Lawrence, N.D.; Schölkopf, B. Estimating a kernel Fisher discriminant in the presence of label noise. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, Williamstown, MA, USA, 28 June–1 July 2001; Volume 1, pp. 306–313.
20. Natarajan, N.; Dhillon, I.S.; Ravikumar, P.K.; Tewari, A. Learning with noisy labels. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 1196–1204.
21. Liu, T.; Tao, D. Classification with noisy labels by importance reweighting. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 447–461.
22. Tu, B.; Zhang, X.; Kang, X.; Zhang, G.; Li, S. Density peak-based noisy label detection for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1573–1584.
23. Kang, X.; Duan, P.; Xiang, X.; Li, S.; Benediktsson, J.A. Detection and correction of mislabeled training samples for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 5673–5686.
24. Gao, Y.; Ma, J.; Yuille, A.L. Semi-supervised sparse representation based classification for face recognition with insufficient labeled samples. IEEE Trans. Image Process. 2017, 26, 2545–2560.
25. Frénay, B.; Verleysen, M. Classification in the presence of label noise: A survey. IEEE Trans. Neural Netw. Learn. Syst. 2014, 25, 845–869.
26. You, Y.L.; Kaveh, M. Fourth-order partial differential equations for noise removal. IEEE Trans. Image Process. 2000, 9, 1723–1730.
27. Zhu, Z.; You, X.; Chen, C.P.; Tao, D.; Ou, W.; Jiang, X.; Zou, J. An adaptive hybrid pattern for noise-robust texture analysis. Pattern Recognit. 2015, 48, 2592–2608.
28. Condessa, F.; Bioucas-Dias, J.; Kovačević, J. Supervised hyperspectral image classification with rejection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 2321–2332.
29. Jiang, J.; Ma, J.; Wang, Z.; Chen, C.; Liu, X. Hyperspectral image classification in the presence of noisy labels. IEEE Trans. Geosci. Remote Sens. 2019, 57, 851–865.
30. Ji, R.; Gao, Y.; Hong, R.; Liu, Q.; Tao, D.; Li, X. Spectral–spatial constraint hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2014, 52, 1811–1824.
31. Pu, H.; Chen, Z.; Wang, B.; Jiang, G.M. A novel spatial–spectral similarity measure for dimensionality reduction and classification of hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2014, 52, 7008–7022.
32. Kang, X.; Li, S.; Benediktsson, J.A. Spectral–spatial hyperspectral image classification with edge-preserving filtering. IEEE Trans. Geosci. Remote Sens. 2014, 52, 2666–2677.
33. Zheng, X.; Yuan, Y.; Lu, X. Dimensionality reduction by spatial–spectral preservation in selected bands. IEEE Trans. Geosci. Remote Sens. 2017, 55, 5185–5197.
34. Cheng, B.; Yang, J.; Yan, S.; Fu, Y.; Huang, T. Learning with ℓ1-graph for image analysis. IEEE Trans. Image Process. 2010, 19, 858–866.
35. Gu, Y.; Feng, K. L1-graph semisupervised learning for hyperspectral image classification. In Proceedings of the 2012 IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, 22–27 July 2012; pp. 1401–1404.
36. Wang, X.; Zhang, X.; Zeng, Z.; Wu, Q.; Zhang, J. Unsupervised spectral feature selection with l1-norm graph. Neurocomputing 2016, 200, 47–54.
37. Liu, L.; Chen, L.; Chen, C.P.; Tang, Y.Y. Weighted joint sparse representation for removing mixed noise in image. IEEE Trans. Cybern. 2017, 47, 600–611.
38. Liu, L.; Chen, C.P.; You, X.; Tang, Y.Y.; Zhang, Y.; Li, S. Mixed noise removal via robust constrained sparse representation. IEEE Trans. Circuits Syst. Video Technol. 2018, 28, 2177–2189.
39. Fan, F.; Ma, Y.; Li, C.; Mei, X.; Huang, J.; Ma, J. Hyperspectral image denoising with superpixel segmentation and low-rank representation. Inf. Sci. 2017, 397, 48–68.
40. Liu, M.Y.; Tuzel, O.; Ramalingam, S.; Chellappa, R. Entropy rate superpixel segmentation. In Proceedings of the CVPR 2011, Colorado Springs, CO, USA, 20–25 June 2011; pp. 2097–2104.
41. Freund, Y. Boosting a weak learning algorithm by majority. Inf. Comput. 1995, 121, 256–285.
42. Kalai, A.T.; Servedio, R.A. Boosting in the presence of noise. J. Comput. Syst. Sci. 2005, 71, 266–290.
43. Chen, C.; Li, W.; Su, H.; Liu, K. Spectral–spatial classification of hyperspectral image based on kernel extreme learning machine. Remote Sens. 2014, 6, 5795–5814.
44. Ma, L.; Crawford, M.M.; Zhu, L.; Liu, Y. Centroid and covariance alignment-based domain adaptation for unsupervised classification of remote sensing images. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2305–2323.
45. Jiang, J.; Ma, J.; Chen, C.; Wang, Z.; Cai, Z.; Wang, L. SuperPCA: A superpixelwise PCA approach for unsupervised feature extraction of hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4581–4593.
46. Jia, S.; Zhang, X.; Li, Q. Spectral–spatial hyperspectral image classification using ℓ1/2 regularized low-rank representation and sparse representation-based graph cuts. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 2473–2484.
47. Ma, D.; Yuan, Y.; Wang, Q. Hyperspectral anomaly detection via discriminative feature learning with multiple-dictionary sparse representation. Remote Sens. 2018, 10, 745.
48. Ma, J.; Zhao, J.; Jiang, J.; Zhou, H.; Guo, X. Locality preserving matching. Int. J. Comput. Vis. 2019, 127, 512–531.
49. Zhang, S.; Li, S.; Fu, W.; Fang, L. Multiscale superpixel-based sparse representation for hyperspectral image classification. Remote Sens. 2017, 9, 139.
50. Li, J.; Zhang, H.; Zhang, L. Efficient superpixel-level multitask joint sparse representation for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2015, 53, 5338–5351.
51. Xue, Z.; Du, P.; Li, J.; Su, H. Simultaneous sparse graph embedding for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2015, 53, 6114–6133.
52. Chen, M.; Wang, Q.; Li, X. Discriminant analysis with graph learning for hyperspectral image classification. Remote Sens. 2018, 10, 836.
53. Wold, S.; Esbensen, K.; Geladi, P. Principal component analysis. Chemom. Intell. Lab. Syst. 1987, 2, 37–52.
54. Donoho, D.L. For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 2006, 59, 797–829.
55. Kothari, R.; Jain, V. Learning from labeled and unlabeled data. In Proceedings of the 2002 International Joint Conference on Neural Networks, IJCNN’02, Honolulu, HI, USA, 12–17 May 2002; Volume 3, pp. 2803–2808.
56. Zhu, X.; Ghahramani, Z. Learning from Labeled and Unlabeled Data with Label Propagation; Technical Report CMU-CALD-02-107; Carnegie Mellon University: Pittsburgh, PA, USA, 2002.
57. Cheng, G.; Li, Z.; Han, J.; Yao, X.; Guo, L. Exploring hierarchical convolutional features for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 6712–6722.
58. Qin, Y.; Bruzzone, L.; Li, B.; Ye, Y. Cross-domain collaborative learning via cluster canonical correlation analysis and random walker for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2019.
59. Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation-based anomaly detection. ACM Trans. Knowl. Discov. Data 2012, 6, 3.
Figure 1. The framework of the proposed SALP algorithm, which mainly includes spectral–spatial sparse graph construction and adaptive label propagation.
Figure 2. The OA of NLA over four typical classifiers at different noise rates on the two standard hyperspectral datasets. The results of NN, SVM, RF, and ELM are labeled in red “o”, green “+”, blue “*”, and black “x”, respectively.
Figure 3. Graphical illustration of the spectral–spatial sparse graph. In a full graph, vertices are densely connected with one another. In a spectral–spatial sparse graph, vertices are sparsely linked only with pixels located in the same homogeneous region. The links between different homogeneous regions and some weak links, marked with red arrows, are removed from the full graph.
Figure 4. (a) The gray image and (b) the corresponding ground truth map of the Indian Pines dataset.
Figure 5. (a) The gray image and (b) the corresponding ground truth map of Salinas dataset.
Figure 6. OA trend of RLPA and SALP over NN and SVM with the “random” setting. The proposed SALP is marked by a dashed line, and RLPA is marked by a solid line. The results of NN are labeled in red “+”, and those of SVM in blue “*”.
Figure 7. OA trend of RLPA and SALP over NN and SVM with the “both” setting. The proposed SALP is marked by a dashed line, and RLPA is marked by a solid line. The results of NN are labeled in red “+”, and those of SVM in blue “*”.
Figure 8. The classification maps of the baselines and SALP over four classifiers on the Indian Pines dataset when ρ = 0.1. From the top row to the last row: NN, SVM, RF, and ELM; from the first column to the last column: NLA, RLPA, and SALP. Please zoom in to see the details.
Figure 9. The classification maps of the baselines and SALP over four classifiers on the Indian Pines dataset when ρ = 0.5. From the top row to the last row: NN, SVM, RF, and ELM; from the first column to the last column: NLA, RLPA, and SALP. Please zoom in to see the details.
Figure 10. The classification maps of the baselines and SALP over four classifiers on the Salinas dataset when (a) ρ = 0.1 and (b) ρ = 0.5. From the top row to the last row: NN, SVM, RF, and ELM; from the first column to the last column: NLA, RLPA, and SALP. Please zoom in to see the details.
Figure 11. OA of the KNN graph and SALP with the “both” setting. The proposed SALP is marked by a dashed line, and the KNN graph is marked by a solid line. The results of NN, SVM, RF, and ELM are labeled in red “o”, green “+”, blue “*”, and black “x”, respectively.
Figure 12. OA trend of SALP with different proportions of boundary label noise in the “both” setting. The results of NN, SVM, RF, and ELM are labeled in red “o”, green “+”, blue “*”, and black “x”, respectively.
Table 1. Comparisons of the baselines and SALP over four typical classifiers with the “random” setting on the Indian Pines dataset. The best results are bolded.
| ρ | Classifier | OA (NLA) | OA (iForest) | OA (RLPA) | OA (SALP) | AA (NLA) | AA (iForest) | AA (RLPA) | AA (SALP) | Kappa (NLA) | Kappa (iForest) | Kappa (RLPA) | Kappa (SALP) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | NN | 73.24 | 74.34 | 79.30 | **80.49** | 69.55 | 69.50 | 72.63 | **72.90** | 0.6966 | 0.7085 | 0.7635 | **0.777** |
| 0.1 | SVM | 84.21 | 83.73 | **88.60** | 86.75 | 61.37 | 64.28 | 74.56 | **75.44** | 0.8176 | 0.8122 | **0.8695** | 0.8480 |
| 0.1 | RF | **80.29** | 79.11 | 79.72 | 78.69 | **68.37** | 66.68 | 66.26 | 65.85 | **0.7734** | 0.7597 | 0.7665 | 0.7538 |
| 0.1 | ELM | **91.35** | 90.11 | 91.66 | 90.07 | **84.92** | 82.53 | 83.31 | 81.88 | 0.9012 | 0.8869 | **0.9047** | 0.8844 |
| 0.2 | NN | 65.24 | 73.08 | 78.88 | **80.46** | 62.73 | 66.14 | 72.32 | **73.17** | 0.6086 | 0.6925 | 0.7587 | **0.7766** |
| 0.2 | SVM | 77.16 | 77.96 | **88.01** | 86.85 | 54.93 | 62.78 | 74.67 | **75.63** | 0.7340 | 0.7444 | **0.8627** | 0.8491 |
| 0.2 | RF | 78.41 | 74.03 | **79.46** | 78.96 | **67.37** | 61.26 | 66.12 | 66.64 | 0.7518 | 0.7006 | **0.7635** | 0.7571 |
| 0.2 | ELM | 88.65 | 83.30 | **91.31** | 90.03 | 82.26 | 73.44 | **83.23** | 82.50 | 0.8704 | 0.8082 | **0.9006** | 0.8860 |
| 0.3 | NN | 57.46 | 70.95 | 78.02 | **79.55** | 54.28 | 61.75 | 70.31 | **71.74** | 0.5247 | 0.6688 | 0.7491 | **0.7667** |
| 0.3 | SVM | 71.10 | 76.82 | **87.03** | 86.68 | 46.58 | 58.29 | 71.18 | **74.23** | 0.6603 | 0.7315 | 0.8514 | **0.8595** |
| 0.3 | RF | 75.95 | 72.54 | 79.13 | **79.27** | 64.40 | 58.52 | 65.55 | **66.39** | 0.7243 | 0.6837 | 0.7598 | **0.7609** |
| 0.3 | ELM | 86.41 | 81.51 | **90.59** | 89.70 | 77.28 | 68.72 | 80.94 | **81.67** | 0.8447 | 0.7878 | **0.8925** | 0.8822 |
| 0.4 | NN | 49.76 | 68.71 | 76.73 | **79.15** | 47.82 | 60.12 | 69.66 | **70.49** | 0.4421 | 0.6426 | 0.7348 | **0.7622** |
| 0.4 | SVM | 65.04 | 73.01 | 85.86 | **86.52** | 39.00 | 57.25 | 71.21 | **72.62** | 0.5851 | 0.6867 | 0.8381 | **0.8533** |
| 0.4 | RF | 72.43 | 69.59 | **78.79** | 78.64 | 61.06 | 56.91 | 65.61 | **64.78** | 0.6844 | 0.6494 | **0.7562** | 0.7538 |
| 0.4 | ELM | 82.93 | 76.91 | 89.62 | **90.27** | 72.53 | 64.61 | 80.63 | **81.46** | 0.8050 | 0.7345 | 0.8815 | **0.8880** |
| 0.5 | NN | 40.56 | 64.64 | 72.91 | **75.18** | 40.06 | 55.39 | 65.31 | **65.69** | 0.3458 | 0.5967 | 0.6923 | **0.7205** |
| 0.5 | SVM | 61.91 | 68.90 | 82.16 | **84.17** | 35.80 | 53.36 | 64.02 | **65.86** | 0.5469 | 0.6388 | 0.7952 | **0.8241** |
| 0.5 | RF | 66.68 | 66.08 | **76.75** | 76.39 | 57.04 | 53.39 | 63.47 | **63.55** | 0.6212 | 0.6096 | **0.7329** | 0.7292 |
| 0.5 | ELM | 76.94 | 72.43 | 87.07 | **87.11** | 67.74 | 58.94 | 76.69 | **77.40** | 0.7373 | 0.6826 | 0.8522 | **0.8540** |
Table 2. Comparisons of the baselines and SALP over four typical classifiers with the “random” setting on the Salinas dataset. The best results are bolded.
| ρ | Classifier | OA (NLA) | OA (iForest) | OA (RLPA) | OA (SALP) | AA (NLA) | AA (iForest) | AA (RLPA) | AA (SALP) | Kappa (NLA) | Kappa (iForest) | Kappa (RLPA) | Kappa (SALP) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | NN | 78.07 | 85.10 | 86.45 | **87.08** | 83.95 | 91.72 | 93.01 | **93.51** | 0.7579 | 0.8350 | 0.8497 | **0.8566** |
| 0.1 | SVM | 84.44 | 88.22 | 91.43 | **92.81** | 91.74 | 93.14 | 95.45 | **95.93** | 0.8272 | 0.8694 | 0.9047 | **0.9200** |
| 0.1 | RF | 86.97 | 86.71 | 88.09 | **90.10** | 92.18 | 92.33 | 93.27 | **94.52** | 0.8553 | 0.8526 | 0.8677 | **0.8901** |
| 0.1 | ELM | 92.69 | 90.54 | 92.96 | **93.85** | 96.31 | 95.20 | 96.58 | **96.84** | 0.9186 | 0.8949 | 0.9216 | **0.9315** |
| 0.2 | NN | 70.22 | 84.93 | 85.89 | **86.59** | 75.12 | 91.33 | 92.76 | **93.34** | 0.6721 | 0.8331 | 0.8436 | **0.8504** |
| 0.2 | SVM | 85.87 | 88.24 | 91.13 | **91.70** | 91.30 | 93.16 | 95.20 | **95.70** | 0.8415 | 0.8694 | 0.9013 | **0.9079** |
| 0.2 | RF | 85.54 | 85.98 | 87.82 | **89.15** | 90.54 | 91.66 | 93.12 | **94.22** | 0.8395 | 0.8445 | 0.8648 | **0.8775** |
| 0.2 | ELM | 92.27 | 89.63 | 92.82 | **93.41** | 95.92 | 94.59 | 96.49 | **96.80** | 0.9139 | 0.8848 | 0.9201 | **0.9267** |
| 0.3 | NN | 60.85 | 84.08 | 84.79 | **86.5** | 65.44 | 90.26 | 92.14 | **93.36** | 0.5710 | 0.8237 | 0.8318 | **0.8512** |
| 0.3 | SVM | 76.62 | 85.99 | 90.91 | **91.63** | 89.47 | 91.48 | 95.12 | **95.63** | 0.7437 | 0.8445 | 0.8989 | **0.9069** |
| 0.3 | RF | 82.59 | 84.69 | 87.12 | **88.95** | 87.52 | 90.34 | 92.78 | **94.21** | 0.8070 | 0.8301 | 0.8571 | **0.8796** |
| 0.3 | ELM | 91.34 | 88.22 | 92.56 | **93.38** | 95.07 | 93.52 | 96.31 | **96.65** | 0.9036 | 0.8692 | 0.9172 | **0.9263** |
| 0.4 | NN | 53.99 | 83.72 | 83.27 | **85.28** | 57.83 | 89.91 | 91.28 | **92.65** | 0.4958 | 0.8197 | 0.8150 | **0.8369** |
| 0.4 | SVM | 77.52 | 84.79 | 90.08 | **90.91** | 85.98 | 90.52 | 94.37 | **95.34** | 0.7525 | 0.8313 | 0.8897 | **0.9066** |
| 0.4 | RF | 79.03 | 84.27 | 86.47 | **88.88** | 83.62 | 90.01 | 92.54 | **94.31** | 0.7675 | 0.8256 | 0.8500 | **0.8767** |
| 0.4 | ELM | 90.44 | 86.89 | 92.03 | **93.17** | 94.17 | 92.69 | 96.10 | **96.62** | 0.8936 | 0.8545 | 0.9114 | **0.9241** |
| 0.5 | NN | 44.86 | 82.89 | **80.79** | 80.32 | 47.17 | 88.53 | 89.22 | **89.49** | 0.3978 | **0.8104** | 0.7879 | 0.783 |
| 0.5 | SVM | 75.03 | 84.52 | 89.28 | **89.56** | 75.97 | 89.54 | **94.45** | 93.85 | 0.7206 | 0.8281 | 0.8808 | **0.8842** |
| 0.5 | RF | 73.19 | 83.25 | **85.20** | 85.10 | 77.06 | 88.62 | 91.50 | **91.81** | 0.7034 | 0.8143 | **0.8360** | 0.8352 |
| 0.5 | ELM | 89.39 | 86.39 | 91.55 | **91.99** | 93.26 | 91.70 | **95.67** | 95.43 | 0.8819 | 0.8490 | 0.9061 | **0.9109** |
Table 3. Comparisons of the baselines and SALP over four typical classifiers with the “both” setting on the Indian Pines dataset. The best results are bolded.
| ρ | Classifier | OA (NLA) | OA (RLPA) | OA (SALP) | AA (NLA) | AA (RLPA) | AA (SALP) | Kappa (NLA) | Kappa (RLPA) | Kappa (SALP) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | NN | 75.23 | 79.48 | **79.99** | 67.80 | 72.14 | **72.88** | 0.7071 | 0.7513 | **0.7537** |
| 0.1 | SVM | 83.93 | 85.12 | **85.74** | 62.57 | 76.38 | **77.42** | 0.8150 | **0.8627** | 0.8362 |
| 0.1 | RF | 78.24 | 78.90 | **79.73** | 62.71 | 66.06 | **66.99** | 0.7498 | 0.7568 | **0.7664** |
| 0.1 | ELM | 89.94 | 92.02 | **92.43** | 81.42 | 82.63 | **83.91** | 0.8850 | 0.9089 | **0.9135** |
| 0.2 | NN | 65.86 | 78.45 | **78.53** | 63.31 | 71.81 | **71.91** | 0.6169 | 0.7566 | **0.7714** |
| 0.2 | SVM | 78.51 | 84.85 | **87.82** | 55.69 | 73.41 | **76.45** | 0.7539 | 0.8481 | **0.8602** |
| 0.2 | RF | **78.72** | 77.65 | 76.73 | 62.06 | **65.37** | 64.70 | **0.7560** | 0.7422 | 0.7314 |
| 0.2 | ELM | 89.06 | **91.93** | 91.35 | 80.48 | 81.22 | **82.46** | 0.8752 | **0.9077** | 0.9009 |
| 0.3 | NN | 59.63 | 77.55 | **77.93** | 56.04 | 71.27 | **71.56** | 0.5368 | 0.7446 | **0.7485** |
| 0.3 | SVM | 71.41 | 84.03 | **86.89** | 45.48 | 72.47 | **73.45** | 0.6693 | 0.8395 | **0.8495** |
| 0.3 | RF | 75.20 | 77.98 | **78.69** | 63.96 | 64.15 | **64.92** | 0.7149 | 0.7475 | **0.7557** |
| 0.3 | ELM | 87.99 | **91.60** | 90.45 | 79.40 | **82.41** | 82.13 | 0.8625 | **0.8939** | 0.8909 |
| 0.4 | NN | 49.58 | 77.46 | **79.36** | 44.97 | 71.97 | **72.10** | 0.4417 | 0.7439 | **0.7553** |
| 0.4 | SVM | 64.24 | 83.18 | **85.45** | 36.40 | 64.37 | **67.35** | 0.5783 | 0.8129 | **0.8325** |
| 0.4 | RF | 72.11 | 78.62 | **78.45** | 58.67 | **65.93** | 64.98 | 0.6823 | **0.7544** | 0.7524 |
| 0.4 | ELM | 82.57 | **91.06** | 90.16 | 69.94 | 81.73 | **82.04** | 0.8010 | **0.8922** | 0.8875 |
| 0.5 | NN | 42.42 | 72.54 | **74.50** | 42.18 | 66.20 | **68.17** | 0.3665 | 0.6892 | **0.7117** |
| 0.5 | SVM | 54.47 | 80.80 | **85.76** | 28.55 | 61.92 | **64.45** | 0.4615 | 0.7898 | **0.8313** |
| 0.5 | RF | 65.53 | 78.05 | **78.79** | 53.61 | 62.89 | **63.03** | 0.6061 | 0.7468 | **0.7490** |
| 0.5 | ELM | 78.10 | 89.98 | **90.74** | 67.51 | 81.26 | **81.78** | 0.7505 | 0.8856 | **0.8881** |
Table 4. Comparisons of the baselines and SALP over four typical classifiers with the “both” setting on the Salinas dataset. The best results are bolded.
| ρ | Classifier | OA (NLA) | OA (RLPA) | OA (SALP) | AA (NLA) | AA (RLPA) | AA (SALP) | Kappa (NLA) | Kappa (RLPA) | Kappa (SALP) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | NN | 78.20 | **86.98** | 86.62 | 84.43 | **93.63** | 93.52 | 0.7602 | 0.8528 | **0.8556** |
| 0.1 | SVM | 90.19 | 92.18 | **92.26** | 94.09 | 95.85 | **95.97** | 0.8906 | 0.9082 | **0.9087** |
| 0.1 | RF | 86.75 | 87.65 | **87.65** | 92.23 | 93.25 | **93.25** | 0.8640 | 0.8632 | **0.8632** |
| 0.1 | ELM | 92.96 | 93.86 | **93.95** | 96.68 | **96.99** | 96.97 | 0.9281 | 0.9309 | **0.9310** |
| 0.2 | NN | 70.96 | 85.75 | **86.15** | 76.77 | 93.01 | **93.34** | 0.6803 | 0.8368 | **0.8416** |
| 0.2 | SVM | 89.49 | 91.78 | **92.10** | 91.79 | **95.87** | 95.78 | 0.8831 | 0.9014 | **0.9121** |
| 0.2 | RF | 84.44 | 86.48 | **87.00** | 90.37 | **92.93** | 92.86 | 0.8272 | 0.8503 | **0.8514** |
| 0.2 | ELM | 92.02 | 93.79 | **93.79** | 95.67 | 96.12 | **96.38** | 0.9113 | 0.9191 | **0.9237** |
| 0.3 | NN | 62.78 | 84.27 | **86.06** | 66.43 | 92.16 | **92.88** | 0.5911 | 0.8287 | **0.844** |
| 0.3 | SVM | 83.4 | 91.18 | **91.78** | 88.29 | 94.98 | **95.32** | 0.8136 | 0.9021 | **0.9065** |
| 0.3 | RF | 81.60 | 86.11 | **86.59** | 88.21 | 92.29 | **93.14** | 0.7967 | 0.8544 | **0.8605** |
| 0.3 | ELM | 90.76 | **93.83** | 93.44 | 93.96 | 96.10 | **96.50** | 0.8970 | 0.9116 | **0.9267** |
| 0.4 | NN | 53.23 | 83.97 | **84.29** | 58.89 | 91.21 | **92.77** | 0.4891 | 0.8080 | **0.8365** |
| 0.4 | SVM | 75.60 | 91.25 | **91.60** | 83.71 | 94.31 | **95.40** | 0.7301 | **0.9027** | 0.9004 |
| 0.4 | RF | 77.40 | 85.99 | **86.26** | 82.90 | 91.95 | **92.99** | 0.7501 | 0.8450 | **0.8477** |
| 0.4 | ELM | 90.70 | **93.68** | 93.27 | 93.47 | 96.08 | **96.26** | 0.8989 | 0.9144 | **0.9247** |
| 0.5 | NN | 42.82 | 78.87 | **80.74** | 46.64 | 87.47 | **88.93** | 0.3788 | 0.7521 | **0.7680** |
| 0.5 | SVM | 72.84 | 87.00 | **91.05** | 79.77 | 92.66 | **94.03** | 0.6986 | 0.8563 | **0.8704** |
| 0.5 | RF | 75.00 | **86.58** | 86.24 | 77.21 | 91.05 | **92.16** | 0.7222 | 0.8313 | **0.8569** |
| 0.5 | ELM | 88.90 | **93.59** | 93.16 | 93.16 | **95.71** | 95.51 | 0.8767 | 0.9073 | **0.9100** |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).