1. Introduction
The development of satellite and unmanned aerial vehicle (UAV) technology has made it easy to capture high spatial resolution remote sensing images. However, making full use of these images remains a challenge for many applications. As an important processing step in the analysis of remote sensing images, image classification can provide valuable information for various practical applications, e.g., urban planning, change detection, crop yield estimation, and sustainable forest management. Although unsupervised classification methods have also been proposed, supervised classification methods show clear advantages in practical applications. However, obtaining enough training samples is an obstacle to using supervised classification methods: collecting a large number of training samples is time-consuming and costly, and without enough training samples, many state-of-the-art supervised classification methods cannot generate the expected classification result. Therefore, achieving the expected classification performance with a limited number of training samples has become an active research topic in recent years.
Much research has been devoted to reducing the requirement for training samples. Among these efforts, active learning (AL) is a promising direction [1, 2, 3]. AL provides an interactive solution that allows commonly used classifiers to achieve excellent classification performance even with a limited number of training samples. AL repeatedly selects valuable unlabeled samples based on the previous round of classification, which uses a small training set; the selected samples are then labeled and added to the training set. After multiple iterations, because the updated training set contains more representative samples and avoids redundant samples, the workload of labeling training samples is reduced and the classifier can perform well. AL has been applied successfully in the pixel-based classification of remote sensing images, especially hyperspectral images. In [4], different batch-mode AL techniques for the classification of remote sensing images with a support vector machine (SVM) were investigated. Query functions based on uncertainty criteria and diversity criteria were examined, and the combination of these two criteria showed the ability to select the potentially most informative set of samples at each iteration of the AL process. In [5], to integrate spectral-spatial information with AL, the supervised classification and AL were based on the information extracted from a patch. In [6], to address the problem of limited training samples in hyperspectral image classification, an algorithm that combined semisupervised learning and AL was proposed. A supervised clustering method was utilized to find high-confidence clusters to enrich the training data, and the remaining clusters were the candidates for active learning.
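The iterative AL procedure described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure of any cited work: the nearest-centroid classifier, the margin-based uncertainty score, and the names `centroids`, `margin`, `active_learning`, and `oracle` are all hypothetical choices made for the example. In practice the classifier would be an SVM or a similar model, and `oracle` stands in for the human expert.

```python
import math

def centroids(X, y):
    """Mean feature vector per class (a stand-in for any real classifier)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def margin(x, cents):
    """Gap between the two nearest class centroids; a small gap means uncertain."""
    d = sorted(math.dist(x, c) for c in cents.values())
    return d[1] - d[0] if len(d) > 1 else float("inf")

def active_learning(X, oracle, seed_idx, n_queries):
    """Pool-based AL: retrain, query the most uncertain sample, label it, repeat."""
    labeled = {i: oracle(i) for i in seed_idx}  # small initial training set
    for _ in range(n_queries):
        idx = list(labeled)
        cents = centroids([X[i] for i in idx], [labeled[i] for i in idx])
        pool = [i for i in range(len(X)) if i not in labeled]
        if not pool:
            break
        query = min(pool, key=lambda i: margin(X[i], cents))
        labeled[query] = oracle(query)  # expert labels the queried sample
    return labeled
```

Each iteration asks for only one label, so the total labeling effort equals the number of seed samples plus the number of queries. Batch-mode variants such as those in [4] would select several samples per iteration and add a diversity criterion, which this sketch omits.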
In addition to pixel-based AL, object-based AL has also been explored in recent years. Compared to pixel-based methods, object-based methods can make fuller use of the spatial information within the data. In particular, for high spatial resolution images, object-based classification has many advantages over pixel-based classification [7, 8, 9, 10]. A superpixel is also a representation of an object. Superpixel segmentation methods are commonly used in remote sensing image analysis because superpixels effectively combine spectral and spatial features and reduce the computational effort. Superpixels may be directly applied to remote sensing image classification [11, 12, 13]. They can also be combined with deep learning [14, 15, 16, 17, 18] or graph neural networks [19, 20, 21] for remote sensing image classification. Similarly, as a basic processing unit, the superpixel can be combined with active learning and applied to remote sensing image classification. In [22], over-segmented superpixels were used as the basic unit for classification and AL. The results showed that superpixel-based AL was superior to pixel-based AL. In [23], AL and the random forest (RF) [24] classifier were adopted to classify segmented objects. Because the object was used as the classification unit, the negative influence of speckle noise was mitigated. In [25], information entropy was used to evaluate the classification uncertainty of segmented objects. According to the information entropy, the training set was enriched by adding a certain proportion of zero-entropy objects acquired via random sampling, and non-zero-entropy objects were used as the candidate set for active learning. In [26], AL was integrated with an object-based classification method, and the informativeness of samples could be estimated by using various object-based features.
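The entropy-based separation of objects mentioned above can be illustrated with a short sketch. It assumes that each object comes with a vector of predicted class probabilities and uses Shannon entropy in bits; the function names `entropy` and `split_by_entropy` and the tolerance `tol` are hypothetical, and the exact formulation used in [25] may differ.

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of one object's class-probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def split_by_entropy(prob_rows, tol=1e-12):
    """Separate zero-entropy objects from AL candidates.

    Returns (certain, candidates), where candidates are sorted from the
    most uncertain object to the least uncertain one.
    """
    certain, candidates = [], []
    for i, probs in enumerate(prob_rows):
        (certain if entropy(probs) <= tol else candidates).append(i)
    candidates.sort(key=lambda i: entropy(prob_rows[i]), reverse=True)
    return certain, candidates
```

An object whose probability mass falls entirely on one class has zero entropy and is treated as confidently classified, whereas an object with probabilities split evenly between two classes has an entropy of one bit and is placed at the front of the candidate list.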
Although some object-based AL methods have achieved promising performance [22, 23, 25, 26], the contextual information between adjacent and spatially close objects has seldom been considered. Because current segmentation techniques cannot generate accurate segments for real ground objects, most object-based methods use over-segmented objects or superpixels to avoid the negative influence of under-segmentation [27, 28]. Therefore, a real ground object usually contains multiple segmented objects, and ignoring the contextual information of the objects within a real ground object will negatively affect classification accuracy. Thus, the contextual information between adjacent objects is important. Moreover, commonly used contextual information extraction methods extend only one layer of neighbors, which may not be enough in some scenarios. When an expert labels training objects, the expert uses not only the information of the target object but also the information of its neighboring objects. However, in traditional active learning methods [22, 23, 25, 26], the label is assigned only to the target object. In fact, it is possible to spread the expert labeling information to neighboring objects.
In this paper, we first separate superpixels whose categories are certain from uncertain superpixels by using a supervised classifier, XGBoost or SVM. We then process the uncertain superpixels with active learning. We propose the Similar Neighboring Superpixels Search and Labeling (SNSSL) strategy in the AL process to efficiently spread the expert labeling information. The novelty of the proposed method lies in making full use of the expert label information obtained from active learning. In most papers on active learning, the expert labeling information is used only to enrich the training set for updating the classifier. In this paper, by contrast, the expert labeling information is used not only to enrich the training set but also to label similar neighboring superpixels, which is our main contribution. In each round of active query, the most uncertain superpixel (named the target superpixel) is selected for labeling. The neighboring superpixels that are highly similar to the target superpixel are selected by a search method based on superpixel similarity, and the label of the target superpixel is then assigned to the selected neighboring superpixels. The main idea behind exploiting superpixel-based contextual information is Tobler’s First Law of Geography (near things are more related than distant things) [29]. If two superpixels are spatially close, there is a high probability that they belong to the same ground class; therefore, the method propagates the expert label information to spatially adjacent superpixels by computing the similarity between the expert-labeled superpixel and its neighboring superpixels. The final classification map is composed of the supervised learning classification map and the classification map from active learning with SNSSL. To demonstrate that the proposed AL method can exploit contextual information and expert labeling information more efficiently, the proposed method is compared in terms of classification accuracy with classifications based on state-of-the-art AL strategies [4, 30].
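The general idea of spreading an expert label to similar neighboring superpixels can be sketched as a breadth-first search over the superpixel adjacency graph. This is a rough illustration only and should not be read as the SNSSL algorithm itself, whose details are given in Section 2: the cosine similarity measure, the `threshold` and `max_layers` parameters, and the names `cosine_similarity` and `spread_label` are assumptions introduced for this example.

```python
import math
from collections import deque

def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors (an assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def spread_label(target, label, features, adjacency, labels,
                 threshold=0.95, max_layers=2):
    """Assign the expert label of `target` to similar neighbors, layer by layer.

    `adjacency` maps a superpixel id to the ids of its spatial neighbors and
    `labels` is updated in place. Returns the ids that received the label.
    """
    labels[target] = label
    visited = {target}
    frontier = deque([(target, 0)])
    newly_labeled = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_layers:
            continue  # do not search beyond the allowed number of layers
        for nb in adjacency.get(node, ()):
            if nb in visited:
                continue
            visited.add(nb)
            if nb in labels:
                continue  # keep labels that already exist
            if cosine_similarity(features[target], features[nb]) >= threshold:
                labels[nb] = label
                newly_labeled.append(nb)
                frontier.append((nb, depth + 1))  # expand only via similar ones
    return newly_labeled
```

In this sketch the search expands only through superpixels that pass the similarity test, so a dissimilar neighbor blocks propagation to anything behind it. Setting `max_layers` above one lets the label reach beyond the first ring of neighbors, which reflects the observation that a single layer may not be enough.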
The rest of this article is organized as follows. In Section 2, the details of the proposed method are presented. The experimental results are provided in Section 3, and the discussions are provided in Section 4. In Section 5, the conclusion of our work is presented.