Article

Few-Shot Unsupervised Domain Adaptation Based on Refined Bi-Directional Prototypical Contrastive Learning for Cross-Scene Hyperspectral Image Classification

1 National University of Defense Technology, Changsha 410073, China
2 Army Engineering University of PLA, Nanjing 210017, China
3 Laboratory for Big Data and Decision, Changsha 410073, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Remote Sens. 2025, 17(13), 2305; https://doi.org/10.3390/rs17132305
Submission received: 17 May 2025 / Revised: 26 June 2025 / Accepted: 1 July 2025 / Published: 4 July 2025

Abstract

Hyperspectral image cross-scene classification (HSICC) is confronted with tremendous challenges due to the spectral shift phenomenon across scenes and the difficulty of obtaining labels. Unsupervised domain adaptation has proven effective in tackling this issue, but it has a fundamental limitation: it narrows the disparity between source and target domains by relying on fully labeled source data and unlabeled target data. However, in many cases it is costly even to obtain labels for the source domain, rendering the sufficient labeling assumed in prior work impractical. In this work, we investigate an extreme yet realistic scenario in which unsupervised domain adaptation methods must handle HSICC tasks with only sparsely labeled source data, namely, few-shot unsupervised domain adaptation. We propose an end-to-end refined bi-directional prototypical contrastive learning (RBPCL) framework for overcoming the HSICC problem with only a few labeled samples in the source domain. RBPCL captures category-level semantic features of hyperspectral data and performs feature alignment through in-domain refined prototypical self-supervised learning and bi-directional cross-domain prototypical contrastive learning, respectively. Furthermore, our framework introduces a class-balanced multicentric dynamic prototype strategy to generate more robust and representative prototypes. To facilitate prototypical contrastive learning, we employ a Siamese-style distance metric loss function to aggregate intra-class features while increasing the discrepancy of inter-class features. Finally, extensive experiments and ablation analysis on two public cross-scene data pairs and three pairs of self-collected ultralow-altitude hyperspectral datasets acquired under different illumination conditions verify the effectiveness of our method, which will further enhance the practicality of hyperspectral intelligent sensing technology.


1. Introduction

Hyperspectral imagery (HSI) holds prominent research value owing to its hundreds of continuous spectral channels carrying abundant geospatial and spectral information [1,2,3]. Fine-grained HSI classification (HSIC) aims to assign each pixel to a predefined land cover category based on its inherent spectral and spatial properties [4,5,6,7]. It is one of the most fundamental and commonly addressed tasks in various remote sensing applications, such as environmental monitoring, resource investigation, and smart cities [8,9,10]. Over the past decade, deep learning (DL) has shown enormous superiority over conventional machine learning (ML) in natural language processing and computer vision owing to its powerful representation capability [11,12,13,14]. Inspired by this, researchers have applied deep learning algorithms to HSIC and achieved remarkable results. An impressive number of HSIC methods for the single-scene setting have been investigated [15,16,17,18,19]; they benefit from massive annotated data and typically follow a common hypothesis, namely, that the training and testing sets are drawn from independent and identical distributions.
However, in practical remote sensing applications, hyperspectral images (HSIs) collected at different locations or times often violate the assumption of distribution consistency. Differences in climate conditions or changes in the physical properties of the target materials over time give rise to distinct inter-class similarity and intra-class variability between HSIs, which is called the spectral shift phenomenon [20]. Therefore, HSIC tasks often encounter cross-scene challenges that yield a significant performance drop when a well-trained model is generalized to another unlabeled scene [21]. Transfer learning (TL) attempts to transfer knowledge from a source domain to a target domain by relaxing the assumption that the training and test data must follow independent and identical distributions [22]. Consequently, task-oriented transfer learning, also known as domain adaptation (DA), has emerged, and it is well suited to the real hyperspectral image cross-scene classification (HSICC) task [23,24,25,26,27,28,29,30,31,32], especially unsupervised domain adaptation (UDA) [33,34,35]; recent literature is elaborated in Section 2. UDA methods aim to bridge the distribution gap between source and target domains so as to transfer knowledge from a sufficiently labeled source domain to an unlabeled target domain, predicting the target data without leveraging any labeling information from the target samples.
Although many UDA methods achieve high accuracy on general HSICC tasks, they all rely on adequate supervision information in the source domain, together with unlabeled target samples, for domain matching. In real-world scenarios, owing to the high annotation cost and difficulty, even acquiring a fully annotated source domain is an extravagant expectation. In fields such as battlefield situational awareness and resource investigation, we can usually obtain annotations for only a few source-domain samples. We refer to this setting as few-shot unsupervised domain adaptation (FUDA), which must also cope with the absence of labels in the target domain. Unfortunately, this challenging topic has not yet been addressed in the hyperspectral community. Almost all related work focuses on the cross-domain few-shot learning setting, which assumes that the target domain also provides a few labeled data [36,37,38,39,40,41]. For natural RGB images, several FUDA methods [42,43,44,45,46] have been proposed to solve the cross-domain classification problem under scarce source labels; they are discussed extensively in Section 2. Since only a very limited number of labeled source samples can be used, it is arduous to learn distinguishable features in the source domain. These FUDA methods mine the information of unlabeled samples with the assistance of self-supervised contrastive learning strategies [42,43], which possess excellent representational ability, or few-shot learning schemes [44,45,46], so as to achieve domain alignment.
In this paper, inspired by the above works, we propose a refined bi-directional prototypical contrastive learning (RBPCL) architecture for the FUDA setting in the HSICC task that integrates unsupervised clustering, prototypical representation learning, and domain alignment with few-shot labeled source data. RBPCL comprises four major components to learn both domain-invariant and category-discriminative features. First, RBPCL encodes the semantic structure of source and target data into the embedding space and performs k-means clustering according to the class-balanced multicentric dynamic (BMD) prototype strategy [47] to generate more robust and representative prototypes (i.e., representative embeddings for groups of semantically similar instances); this avoids the negative transfer caused by class imbalance and by the inability of the routine monocentric prototype strategy [48] to aggregate the potential typical samples of each class. Second, RBPCL leverages the preliminarily encoded semantic information and the clustered prototypes to conduct in-domain prototypical contrastive learning for a better semantic structure. In this process, we design a Siamese-style distance metric loss function to aggregate intra-class features while increasing the discrepancy between inter-class features, ultimately mitigating the impact of false negatives. Third, RBPCL performs cross-domain bi-directional prototypical self-supervision following a robust instance-to-prototype matching pattern to transfer knowledge from the source to the target domain; this is more robust to cross-domain abnormal samples than instance-to-instance matching and thus promotes faster and smoother optimization convergence. Fourth, RBPCL integrates prototypical contrastive learning with a cosine classifier, which is adaptively updated by feature vectors from the source to the target embedding space step by step, contributing to better classification performance on the target domain. The cosine classifier computes the dot product between feature vectors and each of its weight vectors and then feeds the results into a softmax layer for probabilistic classification. To further alleviate cross-domain mismatching, similar to [43], we maximize the entropy of the expected network prediction to obtain more diversified outputs over the source and target datasets and minimize the entropy of each prediction to obtain high-confidence outputs for each sample; together, these two operations maximize the mutual information (MI) between the input and the network output.
The main contributions of our proposed RBPCL can be summarized as follows:
1. We propose a refined bi-directional prototypical contrastive learning (RBPCL) framework for the few-shot unsupervised domain adaptation (FUDA) setting in hyperspectral cross-scene classification (HSICC) tasks. To the best of our knowledge, this work is the first attempt to tackle the FUDA setting in the field of hyperspectral cross-scene classification.
2. We leverage refined in-domain and bi-directional cross-domain prototypical contrastive learning to simultaneously realize efficient category-discriminative feature representation and cross-domain alignment in an end-to-end, unsupervised, and adaptive manner.
3. We employ the class-balanced multicentric dynamic (BMD) prototype strategy to facilitate the generation of more representative and robust clustering prototypes. Furthermore, we design a Siamese-style distance metric loss function to gather intra-class features while segregating inter-class features, ultimately promoting refined prototypical self-supervised learning.
4. We carry out exhaustive crossover experiments on five data pairs with different degrees of domain shift, including hyperspectral images of spatially disjoint geographical regions and multitemporal images, to demonstrate the effectiveness of the proposed method. Notably, the practical value of RBPCL is demonstrated on three ultralow-altitude hyperspectral images independently collected by an unmanned aerial vehicle (UAV) at different geographic locations and under different illumination conditions.
The remainder of this paper is organized as follows. Section 2 introduces the latest developments in related work. Section 3 describes the specific implementation of the proposed RBPCL model. Section 4 provides the details of the experiments together with the corresponding results and analysis. Section 5 presents a brief discussion, and the conclusions are given in Section 6.

2. Related Work

2.1. Unsupervised Domain Adaptation

The conventional unsupervised domain adaptation (UDA) setting aims to reduce the domain gap and utilize sufficient labeled source-domain data to accomplish classification in the unlabeled target domain. Many effective UDA methods are based on a Maximum Mean Discrepancy (MMD) loss to minimize the feature divergence across domains [34,49,50,51]. Besides, with the development of generative adversarial networks, adversarial training is widely used to learn representations that are simultaneously discriminative in the source domain and domain-invariant, thereby tackling domain shift [52,53,54,55,56]. Moreover, several methods train the classifier in both the source and target domains by leveraging pseudo-labels from the target domain to compute the classification loss [57,58,59,60]. In addition, self-supervised learning has recently been adopted for unsupervised domain adaptation [61,62]. For HSICC tasks, Wang et al. designed a domain adversarial broad adaptation network (DABAN), which integrates a bottleneck adaptation module and a conditional adaptation broad network [25]. Cheng et al. proposed an unsupervised HSI classification method called soft instance-level domain adaptation with a virtual classifier [26]. Moreover, UDA with dense-based compaction (UDAD) and UDA with content-wise alignment (UDACA) were proposed for the HSICC task successively [28,29]. Zhao et al. recently proposed a novel UDA framework toward multilevel features and decision boundaries (ToMF-B) for the HSICC task [30]. Overall, these UDA methods expect full supervision of the source domain for domain alignment and cross-domain classification and thus fail when confronted with the scarcely labeled source domain of the FUDA setting for HSICC tasks.

2.2. Prototypical Contrastive Learning

Self-supervised learning (SSL) [63] is a subset of unsupervised learning that learns strong semantic feature representations without requiring class label supervision; instead, it automatically constructs supervision signals from unlabeled data. Instance-wise contrastive learning (ICL) [63,64,65] is an extensively used self-supervised strategy that improves discriminative ability by pulling different views of the same sample together and pushing negative samples far apart using an instance-level contrastive loss. Several works focusing on prototypical contrastive learning (PCL) [66,67,68] have demonstrated superiority over the massive ICL frameworks in downstream classification tasks. PCL introduces prototypes as latent variables to help search for the maximum-likelihood estimate of the network parameters within an Expectation-Maximization framework. PCL [66] combines a clustering algorithm with contrastive learning and replaces the InfoNCE loss with the ProtoNCE loss, pulling embeddings closer to their assigned prototypes and away from negative prototypes. SPCL [67] employs a Siamese-style metric loss to mitigate the impact of hard false-negative samples, matching intra-prototype features while expanding the distance between inter-prototype representations. In the hyperspectral community, Cao et al. [69] applied unsupervised prototypical contrastive learning to hyperspectral classification by designing two autoencoder-based modules. Wang et al. [70] proposed an adversarial prototype learning (APL) method for training an accurate HSI classification model in a uniform manner when the training set contains few, high-dimensional, and biased samples.

2.3. Few-Shot Unsupervised Domain Adaptation

Compared with UDA, few-shot unsupervised domain adaptation (FUDA) handles UDA tasks with only a few labeled source-domain samples. Moreover, cross-domain FSL [36,37,38,39,40,41,70,71,72,73,74,75,76,77,78] cannot address the absence of labels in the target domain or the large domain gap between the support and query sets of each task, both of which are settled in FUDA. In the field of computer vision, there have recently been a few attempts concerning the FUDA setting. Kim et al. proposed a cross-domain self-supervised (CDS) learning approach [42] for domain adaptation, which captures apparent visual similarity with in-domain self-supervision in a domain-adaptive manner and performs cross-domain feature alignment with cross-domain self-supervised learning. However, the cross-domain instance-to-instance matching in CDS is very sensitive to abnormal samples. Yue et al. proposed the prototypical cross-domain self-supervised learning (PCS) framework [43], which performs in-domain and cross-domain prototypical self-supervised learning concurrently, but its coarse-grained clustering can easily lead to negative transfer. In addition, Huang et al. [44] put forward the image-to-class sparse similarity encoding (IMSE) method, which utilizes local features to learn similarity patterns for cross-domain similarity measurement. Meta-FUDA [45] performs task-level and domain-level transfer jointly by leveraging meta-learning-based optimization. Yu et al. [46] proposed a task-specific semantic feature learning method (TSECS) for FUDA, which learns high-level semantic features for image-to-class similarity measurement and designs a cross-domain self-training strategy to fully utilize the few labeled source samples to establish the classifier in the target domain.

3. Methodology

3.1. Problem Definition

The few-shot unsupervised domain adaptation (FUDA) setting involves two domains: a source domain $\mathcal{X}_{s+} = \mathcal{X}_s \cup \mathcal{X}_{su}$, comprising a very limited number of labeled samples per category $\mathcal{X}_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$ and unlabeled samples $\mathcal{X}_{su} = \{x_i^{su}\}_{i=1}^{n_{su}}$, where $n_s$ and $n_{su}$ denote the numbers of labeled and unlabeled source samples, and a target domain $\mathcal{X}_{tu} = \{x_i^{tu}\}_{i=1}^{n_{tu}}$, consisting of unlabeled samples with different data characteristics but the same label space as the source domain, where $n_{tu}$ denotes the number of unlabeled target samples. Here, $x_i^s, x_i^{su}, x_i^{tu} \in \mathbb{R}^{H \times W \times B}$; $x_i^s$ and $x_i^{su}$ are collectively referred to as $x_i^{s+}$; and $y_i^s \in \{0, 1, 2, \ldots, K\}^{H \times W}$, where $H$ and $W$ denote the spatial dimensions (height and width) of the hyperspectral image, $B$ is the total number of spectral bands, and $K$ is the number of ground-truth categories. Moreover, in the FUDA setting, the number of classes is known and denoted by $n_c$. Our goal is to train a cross-domain classification model by utilizing these three data sources ($\mathcal{X}_s$, $\mathcal{X}_{su}$, and $\mathcal{X}_{tu}$) and to evaluate it on the target domain $\mathcal{X}_{tu}$.

3.2. Overall Framework

The overview of our proposed RBPCL is presented in Figure 1. The base model comprises a feature generator based on the Double-Branch Dual-Attention Mechanism Network (DBDA) [79], an $\ell_2$ normalization layer that outputs a normalized feature vector, and a cosine-similarity-based classifier. Specifically, hyperspectral image patches extracted around the pixels being processed are used as input samples. DBDA serves as the shared feature generator for both domains and consists of two distinct branches, namely a spectral branch and a spatial branch, designed to capture spectral and spatial features independently. Moreover, channel-wise and spatial-wise attention mechanisms are adopted to refine the feature maps. Thus, joint spectral-spatial features are extracted for the HSICC task, benefiting from the synergy between spatial and spectral information during training. On this basis, our approach achieves discriminative feature learning and domain alignment by unifying k-means clustering, in-domain refined prototypical contrastive learning, bi-directional cross-domain prototypical contrastive learning, and adaptive prototypical classifier learning.

3.3. In-Domain Refined Prototypical Contrastive Learning

For domain adaptation, naively applying prototypical contrastive self-supervised learning to the combined set $\mathcal{X}_{s+} \cup \mathcal{X}_{tu}$ would lead to potential problems. Primarily owing to the domain shift, samples of different categories from different domains could be mistakenly gathered into the same cluster, and samples of the same category from different domains could be mapped into clusters that are far away from each other. To alleviate these problems, we perform prototypical contrastive learning separately in $\mathcal{X}_{s+}$ and in $\mathcal{X}_{tu}$, preventing the negative transfer caused by incorrect cluster generation across domains and indistinguishable feature learning.
Concretely, we maintain two memory banks, $V_s$ and $V_t$, for the source and target domains, respectively:

$$V_s = \big[v_1^s, v_2^s, \ldots, v_{n_s + n_{su}}^s\big], \qquad V_t = \big[v_1^t, v_2^t, \ldots, v_{n_{tu}}^t\big],$$

where $v_i^s$ and $v_i^t$ are the stored feature vectors of $x_i^{s+}$ and $x_i^{tu}$, respectively, initialized with the outputs of the feature extractor $F$, i.e., $f_i^s = F(x_i^{s+})$ and $f_i^t = F(x_i^{tu})$, and updated with a momentum $m = 0.99$ after each batch:

$$v_i^s \leftarrow m\, v_i^s + (1 - m) f_i^s, \qquad v_i^t \leftarrow m\, v_i^t + (1 - m) f_i^t.$$
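As a concrete illustration, the following minimal PyTorch sketch shows how such a momentum update of the memory banks could be implemented; the function name and the re-normalization of the updated entries are our own assumptions rather than details stated in the paper.

```python
import torch
import torch.nn.functional as Fn

def update_memory_bank(bank: torch.Tensor, indices: torch.Tensor,
                       features: torch.Tensor, m: float) -> None:
    """Momentum update of stored feature vectors, followed by re-normalization.

    bank:     (N, D) memory bank V_s or V_t
    indices:  (B,)   positions of the current mini-batch samples in the bank
    features: (B, D) l2-normalized outputs f_i of the feature generator F
    m:        momentum coefficient
    """
    with torch.no_grad():
        bank[indices] = m * bank[indices] + (1.0 - m) * features
        bank[indices] = Fn.normalize(bank[indices], dim=1)  # keep entries on the unit sphere
```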
For in-domain prototypical contrastive learning, k-means clustering is performed on $V_s$ and $V_t$ to obtain source and target clusters. However, the domain gaps between source and target typically differ across categories, resulting in relatively higher prediction confidence scores for the easy-transfer classes in both domains. Moreover, owing to the domain gap, a rough monocentric feature prototype for each class cannot effectively represent the source and target data and would also induce negative transfer, particularly for hard-transfer samples. Therefore, we introduce the class-balanced multicentric dynamic (BMD) prototype strategy to generate more representative and robust prototypes, thereby promoting prototypical self-supervised training.
The BMD strategy consists of two main components, namely inter-class balanced prototype and intra-class multicentric prototype, which can jointly prevent the gradual dominance of easy-transfer categories on prototype generation and reduce the noisy labels for those hard-transfer samples. We compare the BMD strategy with existing methods in Figure 2. The inter-class balanced prototype and the intra-class multicentric prototype schemes are presented in Figure 3 and Figure 4, respectively.
In the inter-class balanced prototype scheme, for the $k$-th class we select the $M$ samples with the top-$M$ scores as potential representative samples in both domains $\mathcal{X}_{s+}$ and $\mathcal{X}_{tu}$ and apply spherical k-means for clustering, where $k \in \{1, 2, \ldots, K\}$ and $K = n_c$. Since the top instances are most likely to be positive for the $k$-th class, we can use them to obtain class-balanced feature prototypes $\mu_k^s$ and $\mu_k^t$.
In the intra-class multicentric prototype scheme, we denote the index set of the sampled top-$M$ instances of the $k$-th class in the source domain as $\mathcal{M}_k^s$, and the predefined number of feature prototypes per class as $R$. We represent each instance $x^{s+}$ by its extracted feature $F(x^{s+})$ and employ the classical k-means algorithm for intra-class clustering. This process can be iterated to obtain more stable and representative source prototypes as follows; the target domain is handled analogously.
$$\mathcal{M}_k^s = \underset{\mathcal{M}_k^s \subset \mathcal{X}_{s+},\ |\mathcal{M}_k^s| = M}{\arg\max} \sum_{x^{s+} \in \mathcal{M}_k^s} \frac{\max_{1 \le r \le R} \exp\!\big(F(x^{s+}) \cdot \mu_{k,r}^s\big)}{\sum_{k'=1}^{K} \max_{1 \le r \le R} \exp\!\big(F(x^{s+}) \cdot \mu_{k',r}^s\big)},$$

$$\big\{\mu_{k,r}^s\big\}_{r=1}^{R} = \mathrm{Kmeans}\Big(\big\{F(x_i^{s+})\big\}_{i \in \mathcal{M}_k^s}\Big),$$

where $M = \max\!\big(1, \lfloor (n_s + n_{su}) / (\gamma \times K) \rfloor\big)$ and $\gamma$ is a hyperparameter controlling the top-$M$ selection ratio.
It follows from the above that k-means clustering is implemented on $V_s$ and $V_t$ to obtain the source clusters $C^{(s)} = \{C_{k,r}^{(s)}\}_{k \in [1,K],\, r \in [1,R]}$ and, analogously, the target clusters $C^{(t)}$, together with the normalized source prototypes $\{\mu_{k,r}^s\}_{k \in [1,K],\, r \in [1,R]}$ and normalized target prototypes $\{\mu_{k,r}^t\}_{k \in [1,K],\, r \in [1,R]}$. Concretely, $\mu_{k,r}^s = u_{k,r}^s / \|u_{k,r}^s\|$, where $u_{k,r}^s = \frac{1}{|C_{k,r}^{(s)}|} \sum_{v_i^s \in C_{k,r}^{(s)}} v_i^s$. For succinct notation, we elaborate only on the source domain; all operations are applied equally to the target domain.
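The following sketch illustrates how class-balanced multicentric prototypes of this kind could be generated in PyTorch; the helper names, the plain k-means routine, and the use of current class-confidence scores for the top-M selection are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as Fn

def simple_kmeans(x: torch.Tensor, k: int, iters: int = 20) -> torch.Tensor:
    """Plain k-means over the rows of x; returns (k, D) centroids."""
    centers = x[torch.randperm(x.size(0))[:k]].clone()
    for _ in range(iters):
        assign = torch.cdist(x, centers).argmin(dim=1)
        for j in range(k):
            mask = assign == j
            if mask.any():
                centers[j] = x[mask].mean(dim=0)
    return centers

def bmd_prototypes(bank: torch.Tensor, probs: torch.Tensor,
                   num_classes: int, R: int, M: int) -> torch.Tensor:
    """Class-balanced multicentric dynamic prototypes (illustrative sketch).

    bank:  (N, D) memory bank of l2-normalized features
    probs: (N, K) current class-confidence scores for every sample
    For each class k, the M most confident samples are selected (inter-class
    balance) and clustered into R sub-prototypes (intra-class multicentric step).
    """
    protos = torch.zeros(num_classes, R, bank.size(1))
    for k in range(num_classes):
        top_idx = probs[:, k].topk(M).indices       # class-balanced top-M selection
        centers = simple_kmeans(bank[top_idx], R)   # R centroids for class k
        protos[k] = Fn.normalize(centers, dim=1)    # normalized prototypes mu_{k,r}
    return protos
```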
During training, we compute a feature vector $f_i^s = F(x_i^{s+})$ with the feature generator $F$. To perform in-domain prototypical contrastive learning, the similarity distribution vector between $f_i^s$ and $\{\mu_{k,r}^s\}_{k \in [1,K],\, r \in [1,R]}$ is calculated as $P_i^s = [P_{i,1}^s, P_{i,2}^s, \ldots, P_{i,K}^s]$, with

$$P_{i,k}^s = \frac{\sum_{r=1}^{R} \exp\big(\mu_{k,r}^s \cdot f_i^s / \phi\big)}{\sum_{k'=1}^{K} \sum_{r=1}^{R} \exp\big(\mu_{k',r}^s \cdot f_i^s / \phi\big)},$$

where $\phi$ is a temperature parameter that determines the degree of concentration. The in-domain prototypical contrastive loss can then be calculated as

$$\mathcal{L}_{PC} = \sum_{i=1}^{n_s + n_{su}} \mathcal{L}_{CE}\big(P_i^s, c_s(i)\big) + \sum_{i=1}^{n_{tu}} \mathcal{L}_{CE}\big(P_i^t, c_t(i)\big),$$

where $c_s(\cdot)$ and $c_t(\cdot)$ indicate the cluster index of the sample.
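A minimal sketch of this class-level contrastive term for a single domain is given below; the tensor shapes and the small numerical epsilon are illustrative assumptions.

```python
import torch

def in_domain_pc_loss(feats: torch.Tensor, protos: torch.Tensor,
                      cluster_idx: torch.Tensor, phi: float = 0.1) -> torch.Tensor:
    """In-domain prototypical contrastive loss for one domain (sketch).

    feats:       (B, D) l2-normalized features f_i
    protos:      (K, R, D) normalized multicentric prototypes mu_{k,r}
    cluster_idx: (B,) class-level cluster assignment c(i) of each sample
    phi:         concentration (temperature) parameter
    """
    K, R, D = protos.shape
    sims = torch.exp(feats @ protos.reshape(K * R, D).t() / phi)  # (B, K*R)
    class_scores = sims.reshape(-1, K, R).sum(dim=2)              # sum over sub-prototypes r
    p = class_scores / class_scores.sum(dim=1, keepdim=True)      # P_{i,k}
    return torch.nn.functional.nll_loss(torch.log(p + 1e-12), cluster_idx)
```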
In addition, we design a Siamese-style distance metric (SDM) loss function $\mathcal{L}_{Metric}$ to aggregate intra-class features more effectively while increasing the discrepancy between inter-class features, ultimately mitigating the influence of false negatives. Concretely, for the source domain, we randomly draw feature embedding pairs $(f_k^s, \hat{f}_k^s)$ belonging to the same prototype as positive pairs and pairs $(f_k^s, \hat{f}_{k'}^s)$ with $k \neq k'$ belonging to different prototypes as negative pairs.
The overall $\mathcal{L}_{Metric}$ can then be computed as

$$\mathcal{L}_{Metric}^s = \frac{1}{K(K-1)} \sum_{\substack{1 \le k, k' \le K \\ k \neq k'}} \Big[ \mathcal{L}_{CE}\big(\mathrm{dis}(f_k^s, \hat{f}_k^s), 1\big) + \mathcal{L}_{CE}\big(\mathrm{dis}(f_k^s, \hat{f}_{k'}^s), 0\big) \Big],$$

$$\mathcal{L}_{Metric}^t = \frac{1}{K(K-1)} \sum_{\substack{1 \le k, k' \le K \\ k \neq k'}} \Big[ \mathcal{L}_{CE}\big(\mathrm{dis}(f_k^t, \hat{f}_k^t), 1\big) + \mathcal{L}_{CE}\big(\mathrm{dis}(f_k^t, \hat{f}_{k'}^t), 0\big) \Big],$$

$$\mathcal{L}_{Metric} = \mathcal{L}_{Metric}^s + \mathcal{L}_{Metric}^t,$$

where $\mathrm{dis}(f, \hat{f}) = 0.5 + 0.5 \cos(f, \hat{f})$ and $\cos(\cdot, \cdot)$ denotes the cosine similarity.
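A sketch of this Siamese-style metric loss is shown below; the pair-sampling interface is a simplification we assume for illustration.

```python
import torch
import torch.nn.functional as Fn

def sdm_loss(pos_pairs, neg_pairs) -> torch.Tensor:
    """Siamese-style distance metric (SDM) loss (sketch).

    pos_pairs: iterable of (f, f_hat) embeddings drawn from the same prototype (target 1)
    neg_pairs: iterable of (f, f_hat) embeddings drawn from different prototypes (target 0)
    dis(f, f_hat) = 0.5 + 0.5 * cos(f, f_hat) maps cosine similarity into [0, 1].
    """
    loss, n = torch.tensor(0.0), 0
    for pairs, target in ((pos_pairs, 1.0), (neg_pairs, 0.0)):
        for f, f_hat in pairs:
            d = 0.5 + 0.5 * Fn.cosine_similarity(f, f_hat, dim=0)
            loss = loss + Fn.binary_cross_entropy(d.clamp(0.0, 1.0),
                                                  torch.tensor(target))
            n += 1
    return loss / max(n, 1)
```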
Finally, we run k-means on the instances $A$ times with $K$ clustered categories and $R$ feature prototypes per class. The ultimate loss for in-domain prototypical contrastive self-supervised learning is thus written as

$$\mathcal{L}_{InDomain} = \frac{1}{A} \sum_{a=1}^{A} \Big( \mathcal{L}_{PC}^a + \mathcal{L}_{Metric}^a \Big).$$

3.4. Bi-Directional Cross-Domain Prototypical Contrastive Learning

Through the above operations, we obtain category-discriminative semantic features in each domain. To explicitly enforce domain matching between the source and target domains, we conduct bi-directional cross-domain prototypical self-supervised learning in an instance-to-prototype manner. To find a match for an instance $x_i$, we minimize the entropy of the similarity distribution vector between its representation and the centroids of the other domain.
Concretely, given the clustering centroids $\{\mu_{k,r}^t\}_{k \in [1,K],\, r \in [1,R]}$ in the target domain and a feature vector $f_i^s$ in the source domain, we first calculate the similarity distribution vector $P_i^{s \to t} = [P_{i,1}^{s \to t}, P_{i,2}^{s \to t}, \ldots, P_{i,K}^{s \to t}]$, in which

$$P_{i,k}^{s \to t} = \frac{\sum_{r=1}^{R} \exp\big(\mu_{k,r}^t \cdot f_i^s / \tau\big)}{\sum_{k'=1}^{K} \sum_{r=1}^{R} \exp\big(\mu_{k',r}^t \cdot f_i^s / \tau\big)}.$$

Subsequently, we minimize the entropy of $P_i^{s \to t}$:

$$H\big(P_i^{s \to t}\big) = -\sum_{k=1}^{K} P_{i,k}^{s \to t} \log P_{i,k}^{s \to t}.$$

Correspondingly, we can calculate $H(P_i^{t \to s})$, and the ultimate loss for cross-domain prototypical contrastive learning is

$$\mathcal{L}_{CrossDomain} = \sum_{i=1}^{n_s + n_{su}} H\big(P_i^{s \to t}\big) + \sum_{i=1}^{n_{tu}} H\big(P_i^{t \to s}\big).$$
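A sketch of one direction of this instance-to-prototype objective (source features matched against target prototypes) follows; the full bi-directional loss is obtained by also evaluating the term with the domains swapped.

```python
import torch

def cross_domain_entropy(feats: torch.Tensor, protos_other: torch.Tensor,
                         tau: float = 0.1) -> torch.Tensor:
    """One direction of the bi-directional cross-domain loss (sketch).

    feats:        (B, D) l2-normalized features from one domain
    protos_other: (K, R, D) normalized prototypes of the other domain
    Returns the summed entropy H(P_i) over the batch; minimizing it sharpens
    each instance's assignment to a single class prototype of the other domain.
    """
    K, R, D = protos_other.shape
    sims = torch.exp(feats @ protos_other.reshape(K * R, D).t() / tau)  # (B, K*R)
    class_scores = sims.reshape(-1, K, R).sum(dim=2)                    # (B, K)
    p = class_scores / class_scores.sum(dim=1, keepdim=True)            # P_{i,k}^{s->t}
    return -(p * torch.log(p + 1e-12)).sum(dim=1).sum()
```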

3.5. Adaptive Prototypical Classifier Learning

The objective of this section is to learn an even better domain-aligned, discriminative feature encoder F and, more significantly, a cosine classifier C that could achieve high classification accuracy on the target data.
The cosine classifier $C$ consists of weight vectors $W = [w_1, w_2, \ldots, w_K]$, where $K = n_c$ denotes the total number of categories. The output of $C$, $\frac{1}{T} W^{\top} f$, is fed into a softmax layer $\sigma$ to obtain the final probabilistic output $p(x) = \sigma\big(\frac{1}{T} W^{\top} f\big)$, where $T$ is a temperature parameter. With the labeled samples in $\mathcal{X}_s$ available, we first train $F$ and $C$ straightforwardly with a standard cross-entropy classification loss:

$$\mathcal{L}_{cls} = \mathbb{E}_{(x, y) \in \mathcal{X}_s}\, \mathcal{L}_{CE}\big(p(x), y\big).$$
However, in the FUDA setting, owing to the few-shot labeled samples in $\mathcal{X}_s$, it is almost impossible to obtain a classifier $C$ with excellent performance on the target data by leveraging $\mathcal{L}_{cls}$ alone.
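The cosine classifier described above can be sketched as follows; normalizing the weight vectors in the same way as the feature vectors is an assumption consistent with the unit(·) normalization used in the APCU update below rather than an explicitly stated detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as Fn

class CosineClassifier(nn.Module):
    """Cosine-similarity classifier with temperature T (sketch)."""

    def __init__(self, dim: int, num_classes: int, T: float = 0.05):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, dim))  # W = [w_1, ..., w_K]
        self.T = T

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        f = Fn.normalize(f, dim=1)            # l2-normalized feature vectors
        w = Fn.normalize(self.weight, dim=1)  # unit-length class weight vectors
        return f @ w.t() / self.T             # logits (1/T) W^T f; softmax applied by the CE loss
```

In training, the supervised term then amounts to `torch.nn.functional.cross_entropy(classifier(F(x)), y)` evaluated on the few labeled source samples.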
Thus, we employ an adaptive prototype-classifier update (APCU) to obtain a $C$ that classifies target samples accurately. The update direction of the weight vector $w_k$ should be representative of the features belonging to the corresponding class $k$; in other words, the semantics of $w_k$ should coincide with the ideal cluster prototype of class $k$. We therefore employ an estimate of the ideal cluster prototypes to update $W$. Yet the computed $\mu_{k,r}^s$ and $\mu_{k,r}^t$ cannot be naively adopted for this purpose, not only because the correspondence between $w_k$ and $\mu_{k,r}$ is unclear, but also because the k-means result may contain very impure clusters, resulting in non-representative prototypes.
Therefore, we utilize the few-shot labeled data as well as samples with high-confidence predictions to estimate the prototype of each category. Formally, we define $\mathcal{X}_s(k) = \{x \mid (x, y) \in \mathcal{X}_s,\ y = k\}$ for $k \in \{1, 2, \ldots, K\}$ and denote the sets of unlabeled samples with high-confidence label $k$ as $\mathcal{X}_{su}(k)$ and $\mathcal{X}_{tu}(k)$. With the assistance of $p(x) = [p(x)_1, p(x)_2, \ldots, p(x)_K]$, $\mathcal{X}_{su}(k) = \{x \mid x \in \mathcal{X}_{su},\ p(x)_k > \xi\}$, and $\mathcal{X}_{tu}(k)$ is defined analogously, where $\xi$ is a confidence threshold. Thus, the estimates of $w_k$ from the source and target domains can be calculated as

$$\hat{w}_k^s = \frac{1}{|\mathcal{X}_{s+}(k)|} \sum_{x \in \mathcal{X}_{s+}(k)} V_s(x), \qquad \hat{w}_k^t = \frac{1}{|\mathcal{X}_{tu}(k)|} \sum_{x \in \mathcal{X}_{tu}(k)} V_t(x),$$

where $\mathcal{X}_{s+}(k) = \mathcal{X}_s(k) \cup \mathcal{X}_{su}(k)$ and $V(x)$ is the embedding in the memory bank corresponding to $x$.
Subsequently, we adopt a domain-adaptive scheme to update $w_k$: with $\hat{w}_k^s$ in the early training stage and with $\hat{w}_k^t$ during the later stage. The estimate $\hat{w}_k^s$ is more robust early in training thanks to the support of the few labeled source samples, whereas $\hat{w}_k^t$ becomes more representative of the target domain later on, yielding better cross-domain adaptation performance. Specifically, we use $|\mathcal{X}_{tu}(k)|$ to decide whether $\hat{w}_k^t$ is robust enough to adopt:

$$w_k = \begin{cases} \mathrm{unit}\big(\hat{w}_k^s\big) & \text{if } |\mathcal{X}_{tu}(k)| < t_w, \\ \mathrm{unit}\big(\hat{w}_k^t\big) & \text{otherwise}, \end{cases}$$

where $\mathrm{unit}(\cdot)$ denotes normalization of the input vector to unit length and $t_w$ is a threshold hyperparameter.
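A sketch of this adaptive update is given below; the per-class index bookkeeping is a hypothetical interface used only for illustration.

```python
import torch
import torch.nn.functional as Fn

@torch.no_grad()
def apcu_update(W, V_s, src_idx_per_class, V_t, tgt_idx_per_class, t_w):
    """Adaptive prototype-classifier update (sketch).

    W:                  (K, D) cosine-classifier weight matrix
    V_s, V_t:           source / target memory banks
    src_idx_per_class:  dict k -> indices of labeled + high-confidence source samples of class k
    tgt_idx_per_class:  dict k -> indices of high-confidence target samples of class k
    t_w:                threshold deciding when the target-side estimate is trusted
    """
    for k in range(W.size(0)):
        w_s = V_s[src_idx_per_class[k]].mean(dim=0)      # source-side estimate \hat{w}_k^s
        tgt_idx = tgt_idx_per_class[k]
        if len(tgt_idx) < t_w:                           # too few confident target samples
            W[k] = Fn.normalize(w_s, dim=0)
        else:
            w_t = V_t[tgt_idx].mean(dim=0)               # target-side estimate \hat{w}_k^t
            W[k] = Fn.normalize(w_t, dim=0)
```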
In addition, for the above joint prototype-classifier learning paradigm to work well, the output predictions of the network should be sufficiently confident; in other words, we need robust $\hat{w}_k^s$ and $\hat{w}_k^t$ for $k = 1, 2, \ldots, K$. First, we maximize the entropy of the expected network prediction $H\big(\mathbb{E}_{x \in \mathcal{X}}[p(y \mid x; \theta)]\big)$ to encourage the network to produce diversified outputs over the dataset, where $\theta$ denotes the learnable parameters of both $F$ and $C$ and $\mathcal{X} = \mathcal{X}_s \cup \mathcal{X}_{su} \cup \mathcal{X}_{tu}$. Second, we apply entropy minimization to the network output to obtain a high-confidence prediction for each instance, which has proven effective in label-scarce scenarios [80]. The prior distribution $p_0$ is given by $\mathbb{E}_{x \in \mathcal{X}}[p(y \mid x; \theta)]$.
Thus, the above two operations are equivalent to maximizing the mutual information between input and output:

$$I(y; x) = H(p_0) - \mathbb{E}_x\big[H\big(p(y \mid x; \theta)\big)\big].$$
The detailed derivation can be found in [43]. The mutual-information maximization loss $\mathcal{L}_{MIM}$ can then be written as

$$\mathcal{L}_{MIM} = -I(y; x).$$
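The mutual-information term can be sketched directly from the definition above; approximating the expectation by a mini-batch average is an assumption of the sketch.

```python
import torch

def mim_loss(logits: torch.Tensor) -> torch.Tensor:
    """Mutual-information maximization loss (sketch), returned as a quantity to minimize.

    logits: (N, K) classifier outputs over a batch drawn from X_s, X_su and X_tu
    I(y; x) = H(E_x[p(y|x)]) - E_x[H(p(y|x))]; the training loss is L_MIM = -I(y; x).
    """
    p = torch.softmax(logits, dim=1)                            # p(y|x; theta)
    p_mean = p.mean(dim=0)                                      # marginal prediction p_0
    h_marginal = -(p_mean * torch.log(p_mean + 1e-12)).sum()    # H(p_0): encourages diversity
    h_cond = -(p * torch.log(p + 1e-12)).sum(dim=1).mean()      # E_x[H(p(y|x))]: encourages confidence
    return -(h_marginal - h_cond)
```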
To sum up, our proposed RBPCL framework performs class-balanced multicentric dynamic prototype generation, in-domain refined prototypical contrastive learning, bi-directional cross-domain prototypical contrastive learning, and unified adaptive prototype-classifier learning. Together with the APCU in Equations (15) and (16), the overall objective of model training is

$$\mathcal{L}_{RBPCL} = \mathcal{L}_{cls} + \lambda_{in} \mathcal{L}_{InDomain} + \lambda_{cross} \mathcal{L}_{CrossDomain} + \lambda_{mim} \mathcal{L}_{MIM}.$$
In order to provide a concise yet comprehensive description, the training procedure of RBPCL, derived from the preceding elaboration, is formally outlined in Algorithm 1. Subsequently, the classification performance of the proposed framework is evaluated on target datasets using the trained feature generator F and the cosine similarity-based classifier C.
Algorithm 1: Training Procedure for RBPCL
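Algorithm 1 itself is presented as a figure in the published article. As a rough, non-authoritative sketch only, one training step consistent with the overall objective $\mathcal{L}_{RBPCL}$ could look as follows, reusing the hypothetical helpers from the earlier sketches (in_domain_pc_loss, cross_domain_entropy, mim_loss) and omitting the SDM term and the per-epoch prototype and classifier refresh.

```python
import torch

def rbpcl_training_step(F_net, C_head, optimizer,
                        batch_s_lab, batch_s_unl, batch_t_unl,
                        protos_s, protos_t, cluster_s, cluster_t,
                        lam_in=1.0, lam_cross=0.5, lam_mim=0.05):
    """One optimization step of L_RBPCL (illustrative sketch).

    batch_s_lab: (x_s, y_s) few-shot labeled source samples
    batch_s_unl / batch_t_unl: unlabeled source / target patches
    protos_*: current BMD prototypes; cluster_*: batch-wise cluster assignments.
    """
    (x_s, y_s), x_su, x_tu = batch_s_lab, batch_s_unl, batch_t_unl
    f_s, f_su, f_tu = F_net(x_s), F_net(x_su), F_net(x_tu)

    # supervised loss on the few labeled source samples
    l_cls = torch.nn.functional.cross_entropy(C_head(f_s), y_s)

    # in-domain refined prototypical contrastive learning (both domains)
    l_in = in_domain_pc_loss(f_su, protos_s, cluster_s) + \
           in_domain_pc_loss(f_tu, protos_t, cluster_t)

    # bi-directional cross-domain instance-to-prototype alignment
    l_cross = cross_domain_entropy(f_su, protos_t) + cross_domain_entropy(f_tu, protos_s)

    # mutual-information maximization on all unlabeled predictions
    l_mim = mim_loss(torch.cat([C_head(f_su), C_head(f_tu)], dim=0))

    loss = l_cls + lam_in * l_in + lam_cross * l_cross + lam_mim * l_mim
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```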

4. Experiments

This section presents an empirical evaluation of RBPCL on five representative hyperspectral dataset pairs, each reflecting domain shifts induced by differences in geographic locations and acquisition dates. Specifically, two pairs are widely adopted benchmark datasets, whereas the others were acquired over two Chinese cities using a UAV-based hyperspectral sensor system under diverse lighting environments.

4.1. Datasets

4.1.1. Pavia Data Pair

The Pavia dataset pair comprises two distinct scenes: Pavia University (PU) and Pavia Center (PC). Both datasets were acquired by the ROSIS sensor during an airborne mission over the city of Pavia in northern Italy. After excluding several noisy spectral bands, the remaining data offer high-resolution spectral measurements spanning the wavelength interval from 0.43 to 0.86 μm. The imagery has a spatial resolution of 1.3 m. PU initially contains 103 spectral bands and PC contains 102 bands; to ensure consistency across the pair, the final band of PU is discarded. It is important to note that the two scenes do not share an identical set of land cover categories; therefore, we select seven common classes for our experimental evaluations. As discussed in Yang et al. [81], classifying these images is highly challenging due to substantial intra-class variation and limited inter-class distinction. The number of labeled samples in each selected class is provided in Table 1. The false-color composites for PU and PC are shown in Figure 5a and Figure 5b, respectively. Their corresponding ground-truth maps, containing the seven shared categories, are illustrated in Figure 5c,d, where the background regions are marked in black.

4.1.2. Houston Data Pair

The Houston dataset pair was provided by the NSF-funded National Center for Airborne Laser Mapping (NCALM), covering both the University of Houston campus and adjacent urban areas. It comprises two temporally distinct hyperspectral images, acquired in 2012 and 2017, respectively, using the ITRES-CASI 1500 hyperspectral sensor, DATA SURPASS TECHNOLOGY CO., LTD., Taichung City, Taiwan. These images were featured in the IEEE GRSS Data Fusion Contests of 2013 and 2018. The spatial resolutions differ between the two datasets: 2.5 m for Houston 2013 (HO2013) and 1 m for Houston 2018 (HO2018). Specifically, HO2013 consists of 349 × 1905 pixels with 144 spectral bands, while HO2018 includes 601 × 2384 pixels and 48 bands. Both datasets span a spectral range of 380–1050 nm. To align the spectral dimensionality between the two datasets, every three adjacent bands in HO2013 were averaged into one, resulting in a 48-band structure consistent with HO2018 [82]. For the cross-scene classification task, seven shared classes were selected. Prior studies by Liu et al. [21,24] have discussed spectral amplitude shifts that challenge domain adaptation on this dataset. The number of labeled samples per class is summarized in Table 2. Figure 6a,b displays the false-color representations of HO2013 and HO2018, while the corresponding ground-truth maps with the selected seven categories are shown in Figure 6c and Figure 6d, respectively.

4.1.3. Self-Collected Data Pairs

To evaluate the real-world applicability of the proposed RBPCL method under ultra-low-altitude flight conditions, we conducted aerial hyperspectral imaging using a hexacopter UAV equipped with a SPECIM_FX17E hyperspectral camera (SPECIM Company, Oulu, Finland). Atmospheric correction was applied to the captured data. A total of three hyperspectral images were collected and organized into three distinct data pairs, each exhibiting a different level of spectral shift. One of these images, referred to as CSSunny, was acquired in Changsha at 4:00 PM on 27 September 2021, under sunny conditions at a flight altitude of 30 m; this scene includes six categories of ground objects. The remaining two images were captured in Tianjin at an altitude of 60 m, each containing twelve object classes. The first, named TJSunny, was obtained at 3:00 PM on 21 March 2023, under sunny weather, while the second, designated TJCloudy, was recorded at 3:30 PM on 22 March 2023, in cloudy conditions. All three images share the same spatial size of 482 × 640 pixels and cover the spectral range from 935 nm to 1720 nm with 224 spectral bands. The false-color visualizations of these self-collected datasets are shown in Figure 7a–c, and the corresponding ground-truth maps with eight common classes are illustrated in Figure 7d–f. We then combine the three self-collected images into three data pairs, namely TJSunny-TJCloudy, TJSunny-CSSunny, and TJCloudy-CSSunny. In particular, the categories of the HSI images in the TJSunny-CSSunny and TJCloudy-CSSunny scenes are not exactly the same, and we select the six shared classes for experiments. The sample counts for each class in the three self-collected dataset pairs are detailed in Table 3 and Table 4. As illustrated in Figure 8, despite variations in illumination caused by differences in date and location, the spectral profiles of individual materials remain largely consistent in shape, while their amplitudes exhibit notable fluctuations. This suggests that different materials respond to changes in light intensity with varying degrees of sensitivity. It is worth noting that the light intensity of CSSunny is significantly stronger than that of TJSunny, even though both were collected in sunny weather. Moreover, close inspection of Figure 8 reveals differences in the spectral curve shapes between hyperspectral images collected on sunny and cloudy days within certain wavelength ranges.

4.2. Implementation Details

In our experiments, the proposed method was evaluated alongside seven baseline techniques. For each data pair, both images were alternately treated as the source domain to ensure a comprehensive cross-domain comparison. Training in all experiments is conducted with three labeled source samples per class (three-shot) for every data pair, with the assistance of unlabeled source and target samples. Subsequently, the classification accuracy of all methods was assessed on the corresponding target domains. For input construction, we extracted HSI patches of size 9 × 9, with the label assigned according to the class of the pixel located at the geometric center of each patch.
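A simple sketch of the patch extraction described here is given below; the mirror padding at image borders is our assumption rather than a detail stated in the paper.

```python
import numpy as np

def extract_patch(hsi: np.ndarray, row: int, col: int, size: int = 9) -> np.ndarray:
    """Extract a size x size spatial patch centered at (row, col) from an HSI cube.

    hsi is an (H, W, B) array; borders are mirror-padded so every labeled pixel,
    including those near the image edge, yields a full patch. The patch label is
    taken from the center pixel's ground-truth class.
    """
    half = size // 2
    padded = np.pad(hsi, ((half, half), (half, half), (0, 0)), mode="reflect")
    r, c = row + half, col + half
    return padded[r - half: r + half + 1, c - half: c + half + 1, :]
```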
To facilitate a comprehensive comparison among intra-domain classification, unsupervised domain adaptation (UDA), and few-shot unsupervised domain adaptation (FUDA) approaches, we implemented a baseline using the DBDA model [79], which is trained solely on the source domain and directly applied to the target data without any adaptation. This baseline, referred to as “S-only”, serves as a non-transfer setting. The performance difference between “S-only” and other domain adaptation strategies reflects the relative effectiveness of each method in transferring knowledge across domains. In addition, four representative UDA techniques (DANN [52], MDDIA [60], DABAN [25], and SILDA-VC [26]) and four FUDA methods (CDS [42], PCS [43], PCS+BMD, and PCS+SDM) are included to assess the capability of our proposed RBPCL framework in addressing cross-domain hyperspectral image classification with limited labeled source data. These comparisons also highlight the advantages of integrating both the BMD clustering strategy and SDM loss within the in-domain and inter-domain prototypical contrastive learning process. It is worth mentioning that, aside from SILDA-VC—which maintains a graph convolutional backbone—the feature generator architecture F is kept consistent across all evaluated methods.
All experiments were conducted using the PyTorch 2.2.0 framework and Python 3.10 on a system equipped with an NVIDIA TITAN RTX GPU (24 GB RAM; NVIDIA, Santa Clara, CA, USA). To accelerate clustering operations, we employed the GPU-based implementation of k-means provided by FAISS [83]. The network was trained for 200 epochs. During self-supervised learning, both source and target domains used a mini-batch size of 32, while a batch size of 16 was applied for the classification loss. Optimization was carried out using stochastic gradient descent (SGD) with a learning rate of 0.01, a momentum of 0.9, and a weight decay of $5 \times 10^{-4}$. The learning rate ratio between the linear layer and the convolution layer is set to 1:0.1. We adaptively set the temperature according to [66]; the temperature $\tau$ is set to 0.1 in all experiments. We choose the hyperparameters $\lambda_{in} = 1$, $\lambda_{cross} = 0.5$, and $\lambda_{mim} = 0.05$. As for the parameters $m$ (momentum for the memory bank update) and $A$ (number of k-means runs in $\mathcal{L}_{InDomain}$), we set $m = 0.5$ and $A = 20$. We calculate the weights of the cosine classifier using only source data for the first five epochs and set $t_w$ to be around half of the average number of instances per class. Besides, for k-means clustering, we fix the hyperparameters $\gamma$ and $T$ to 3 and 0.05, respectively, for all data pairs, and we set $R = 4$ for the Pavia and Houston data pairs and $R = 2$ for the self-collected data pairs. It is worth noting that the cluster centroids and the weights of the cosine classifier are updated every epoch for both self-supervised learning and classification.
To comprehensively assess the performance of the compared methods, we employ three objective evaluation metrics: overall accuracy (OA), average accuracy (AA), and the Kappa coefficient. OA is the number of correctly classified samples divided by the total number of samples. AA is the average of the per-class classification accuracies. The Kappa coefficient is adopted in consistency tests to measure the agreement between the model predictions and the actual classification results. All classification results reported in this article are averaged over ten random runs.
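The three metrics can be computed from the confusion matrix as sketched below.

```python
import numpy as np

def classification_metrics(y_true: np.ndarray, y_pred: np.ndarray, num_classes: int):
    """Compute OA, AA, and the Kappa coefficient from predicted labels (sketch)."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                                            # confusion matrix
    n = cm.sum()
    oa = np.trace(cm) / n                                        # overall accuracy
    per_class = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)      # per-class accuracy
    aa = per_class.mean()                                        # average accuracy
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / (n * n)       # chance agreement
    kappa = (oa - pe) / (1 - pe)                                 # Cohen's kappa
    return oa, aa, kappa
```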

4.3. Results and Ablation Analysis

For the Pavia data pair, we randomly select three labeled samples per class from the PU dataset to form the training set together with unlabeled samples from the PU and PC datasets, and then perform unsupervised transfer and classification tests on the PC scene. The classification results are given in Table 5. Correspondingly, the second experiment swaps the domain roles of the PU and PC images, and the unsupervised classification results for the PU scene are shown in Table 6.
Moreover, the comparison of experimental results on the Houston data pair and the self-collected data pairs is displayed in Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13 and Table 14. It can be seen that our RBPCL method provides nearly optimal results on the three objective indicators in all experimental scenarios, which indicates the superiority of the proposed method in dealing with the FUDA problem for HSICC tasks. Furthermore, the classification performance of the three FUDA methods (i.e., RBPCL, PCS, and CDS) is significantly better than that of the UDA methods, suggesting that it is essential to research FUDA methods for HSICC tasks. It is also worth noting that the PCS+SDM, PCS+BMD, and our RBPCL methods achieve better classification results than PCS, which shows that the SDM loss and the BMD strategy play a positive role in prototypical self-supervised learning and adaptive prototypical classifier learning. The classification performance of PCS+BMD overall outperforms PCS+SDM, which validates that the BMD strategy is more advantageous, since it generates more robust and representative prototypes during clustering and thus promotes intra-domain and cross-domain prototypical contrastive learning.
To visualize the classification results and further illustrate the effectiveness of our proposed method, we present classification maps of the labeled pixels for the Pavia University → Pavia Center task in Figure 9 and all-pixel classification maps of the TJSunny → CSSunny scene in Figure 10. Since the ground objects in the PC hyperspectral image are relatively scattered and lack strong spatial continuity, the classification results are more distinguishable when background pixels are excluded. Observing the classification maps in Figure 9 and Figure 10, we can draw conclusions consistent with the experimental results displayed in Table 10, Table 11, Table 12, Table 13 and Table 14: except for individual categories, our proposed method delivers the best overall classification performance.
Besides, to demonstrate the advantages of the proposed RBPCL method more intuitively, the t-SNE visualizations of the three FUDA-based methods (CDS, PCS, and RBPCL) are presented, in which different colors represent different categories of target objects. The t-SNE scatter plots of the cross-scene learned features on the Pavia data pair are presented in Figure 11, covering our RBPCL method and the baselines CDS and PCS, and the t-SNE visualizations of the cross-scene learned features on the Houston data pair and the self-collected data pairs are displayed in Figure 12, Figure 13, Figure 14 and Figure 15. It can be observed from Figure 11, Figure 12, Figure 13, Figure 14 and Figure 15 that the feature distribution of our RBPCL method is significantly better than that of the other two algorithms, CDS and PCS, which means that the proposed RBPCL method is able to learn more discriminative semantic features than the CDS and PCS methods. In particular, the t-SNE visualizations qualitatively reveal that RBPCL clusters instances of the same category well in the feature space, and the features learned by RBPCL are more closely aggregated than those of CDS and PCS. This demonstrates that the refined prototypical contrastive self-supervised training process enhances the learning capability of our framework to some extent.
In particular, we constructed three data pairs based on self-collected datasets, representing different degrees of spectral shift. The TJCloudy-CSSunny data pair exhibits the most obvious spectral shift phenomenon due to the large span of time and space, as well as the large difference in illumination conditions, which can be confirmed by Figure 8. Moreover, our proposed method can stand out in all data pairs with varying degrees of spectral shift, which once again verifies its superiority and stability.

4.4. Impact of Different Parameter Settings

Our experiments involve several important hyperparameter settings that directly affect the experimental results. Based on previous works [43,47], we preliminarily determined parameter schemes, adjusted them according to the characteristics of the different hyperspectral datasets in our experiments, and ultimately obtained the optimal experimental parameters through parameter analysis experiments. In this section, we conducted parameter analysis experiments on the number of prototypes $R$ in the multi-prototype strategy and on the weight parameters ($\lambda_{in}$, $\lambda_{cross}$, and $\lambda_{mim}$) of the loss function. First, we set $R$ from 1 to 6 for comparative experiments; the classification results on the Pavia and TJSunny-CSSunny data pairs are shown in Figure 16, Figure 17, Figure 18 and Figure 19. We can observe that the cross-domain classification performance of the RBPCL method first increases and then decreases as the number of feature prototypes $R$ grows. The optimal $R$ value on the Pavia data pair is 4, and the optimal $R$ value on the TJSunny-CSSunny data pair is 2. This indicates that more prototypes do not necessarily yield better classification performance, and a reasonable $R$ value should be set according to the characteristics of each dataset.
Then, we conducted analysis experiments on different combinations of the loss weights ($\lambda_{in}$, $\lambda_{cross}$, and $\lambda_{mim}$). The experimental results for the Pavia and TJSunny-CSSunny data pairs are presented in Table 15, Table 16, Table 17 and Table 18. We can conclude that the optimal combination of loss weights is $\lambda_{in} = 1$, $\lambda_{cross} = 0.5$, and $\lambda_{mim} = 0.05$. Moreover, we obtained a consistent conclusion in the parameter analysis experiments for the other three data pairs.

4.5. Effects of Losses

In this section, we analyze the three loss modules described in Section 3. Specifically, we compare average accuracy (AA), overall accuracy (OA), and Kappa when the loss modules ($\mathcal{L}_{InDomain}$, $\mathcal{L}_{CrossDomain}$, and $\mathcal{L}_{MIM}$) are added one by one with fixed weights in experiments on all data pairs. The experimental results are shown in Table 19 and Table 20; the setting with only $\mathcal{L}_{cls}$ is equivalent to the S-only method in Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13 and Table 14. As can be observed in Table 19 and Table 20, adding each component contributes to the final adaptation accuracy without any performance degradation, which demonstrates the effectiveness of all loss modules in our RBPCL framework.

4.6. Impact of the Labeled Samples in Source Domain

In this section, we conduct comparison experiments to demonstrate the effect of different numbers of labeled source samples on three FUDA algorithms. Concretely, 3, 5, 7, and 9 labeled samples per class in the source domain dataset are selected for the training; the cross-domain classification results are shown in Figure 20. It is obvious that the classification performance of the proposed RBPCL method shows an upward trend in all three indicators for each data pair as the labeled sample size increases. Overall, our method consistently exceeds the other two methods by a large margin in OA. Finally, the above experimental results demonstrate that the advantage of our proposed algorithm is its capability of adapting to circumstances with variations in the number of labeled source samples.
Table 15. Cross-scene classification results (%) of PU → PC under different lambda ($\lambda_{in}$, $\lambda_{cross}$, and $\lambda_{mim}$) value settings.

λ_in | λ_cross | λ_mim | AA    | OA    | Kappa
1    | 1       | 0.05  | 72.39 | 72.01 | 63.74
0.5  | 1       | 0.05  | 71.30 | 68.65 | 62.59
1    | 0.5     | 0.05  | 74.57 | 71.24 | 65.96
0.5  | 0.5     | 0.05  | 69.93 | 69.22 | 61.81
1    | 1       | 0.01  | 72.61 | 70.25 | 64.15
0.5  | 1       | 0.01  | 70.84 | 68.38 | 62.07
1    | 0.5     | 0.01  | 73.45 | 70.69 | 64.28
0.5  | 0.5     | 0.01  | 68.73 | 69.47 | 61.46
Table 16. Cross-scene classification results (%) of PC → PU under different lambda ($\lambda_{in}$, $\lambda_{cross}$, and $\lambda_{mim}$) value settings.

λ_in | λ_cross | λ_mim | AA    | OA    | Kappa
1    | 1       | 0.05  | 72.47 | 70.69 | 60.24
0.5  | 1       | 0.05  | 71.22 | 69.38 | 59.77
1    | 0.5     | 0.05  | 73.60 | 71.48 | 61.35
0.5  | 0.5     | 0.05  | 70.86 | 68.56 | 59.14
1    | 1       | 0.01  | 71.67 | 70.84 | 59.58
0.5  | 1       | 0.01  | 70.75 | 67.92 | 58.76
1    | 0.5     | 0.01  | 72.51 | 68.89 | 60.43
0.5  | 0.5     | 0.01  | 70.83 | 67.15 | 59.06
Table 17. Cross-scene classification results (%) of TJSunny → CSSunny under different lambda ($\lambda_{in}$, $\lambda_{cross}$, and $\lambda_{mim}$) value settings.

λ_in | λ_cross | λ_mim | AA    | OA    | Kappa
1    | 1       | 0.05  | 81.77 | 80.04 | 69.81
0.5  | 1       | 0.05  | 80.13 | 78.66 | 70.28
1    | 0.5     | 0.05  | 85.62 | 81.15 | 71.64
0.5  | 0.5     | 0.05  | 79.78 | 77.59 | 70.94
1    | 1       | 0.01  | 80.37 | 79.41 | 69.36
0.5  | 1       | 0.01  | 80.21 | 77.64 | 69.13
1    | 0.5     | 0.01  | 82.31 | 81.86 | 70.02
0.5  | 0.5     | 0.01  | 79.55 | 76.97 | 68.87
Table 18. Cross-scene classification results (%) of CSSunny → TJSunny under different lambda ($\lambda_{in}$, $\lambda_{cross}$, and $\lambda_{mim}$) value settings.

λ_in | λ_cross | λ_mim | AA    | OA    | Kappa
1    | 1       | 0.05  | 72.61 | 76.33 | 65.24
0.5  | 1       | 0.05  | 71.06 | 76.87 | 66.42
1    | 0.5     | 0.05  | 76.47 | 80.55 | 70.62
0.5  | 0.5     | 0.05  | 72.18 | 77.35 | 67.53
1    | 1       | 0.01  | 71.36 | 75.47 | 64.95
0.5  | 1       | 0.01  | 70.11 | 76.04 | 65.88
1    | 0.5     | 0.01  | 74.39 | 78.76 | 68.13
0.5  | 0.5     | 0.01  | 70.99 | 74.82 | 66.75
Table 19. Performance contribution (%) of each part in the RBPCL framework (AA/OA/Kappa).

Method          | PU → PC           | HO2013 → HO2018   | TJSunny → TJCloudy | TJSunny → CSSunny | TJCloudy → CSSunny
L_cls           | 49.79/48.97/39.15 | 47.40/37.10/18.75 | 44.38/39.08/32.29  | 67.81/58.06/42.26 | 49.07/47.05/35.53
+ L_InDomain    | 56.74/52.11/43.98 | 56.22/41.53/24.87 | 51.63/49.16/43.21  | 72.10/63.88/51.42 | 62.52/53.31/44.48
+ L_CrossDomain | 64.37/60.09/52.34 | 65.56/57.93/42.75 | 60.74/57.37/52.46  | 77.34/75.73/62.19 | 70.49/62.16/51.83
+ L_MIM + APCU  | 74.57/71.24/65.96 | 77.01/64.09/49.27 | 67.45/68.05/63.23  | 85.62/81.15/71.64 | 78.26/69.69/60.65
Table 20. Performance contribution (%) of each part in the RBPCL framework (AA/OA/Kappa).

Method          | PC → PU           | HO2018 → HO2013   | TJCloudy → TJSunny | CSSunny → TJSunny | CSSunny → TJCloudy
L_cls           | 55.51/40.68/30.96 | 39.05/42.36/32.22 | 50.46/50.25/44.32  | 62.59/41.55/28.28 | 52.67/28.18/15.94
+ L_InDomain    | 60.50/49.37/39.61 | 52.08/50.62/41.97 | 62.72/54.52/48.44  | 68.97/60.65/41.03 | 60.46/40.73/30.51
+ L_CrossDomain | 65.93/58.82/50.06 | 63.63/61.79/54.45 | 71.83/66.39/61.81  | 72.36/70.24/57.89 | 67.28/55.19/38.72
+ L_MIM + APCU  | 73.60/71.48/61.35 | 74.00/74.48/69.79 | 82.18/70.88/66.78  | 76.47/80.55/70.62 | 75.25/64.38/49.04

5. Discussion

Our method requires sufficient unlabeled source samples and unlabeled target samples to form the training set, which may be restricted by data privacy protection. In future work, we will therefore further explore the applicability of few-shot learning to reduce the dependence on unlabeled training samples.

6. Conclusions

In this paper, we explore breaking through the dilemma of hyperspectral cross-scene classification (HSICC), that is, the spectral shift phenomenon between different scenes, unlabeled target domains, and even label-scarce source domains. In response to this dilemma, we propose an end-to-end refined bi-directional prototypical contrastive learning (RBPCL) framework to learn more discriminative feature representation and achieve better domain alignment for the FUDA setting in HSICC tasks and finally enable efficient unsupervised classification in target hyperspectral images. The proposed RBPCL method comprises refined in-domain prototypical self-supervised learning, bi-directional cross-domain prototypical instance-to-prototype contrastive learning, and adaptive prototypical classifier learning, which are jointly optimized within an end-to-end framework and function in a complementary manner. Besides, in order to promote prototypical self-supervised learning effectively, we leverage the class-balanced multicentric dynamic (BMD) prototype strategy to facilitate the generation of more representative and robust clustering prototypes. Furthermore, we design a Siamese-style distance metric (SDM) loss function to aggregate intra-class features while separating inter-class features. Moreover, our RBPCL method leverages the Double-Branch Dual-Attention Mechanism Network (DBDA) as the feature generator, enabling joint modeling of spatial and spectral characteristics in hyperspectral data, thereby enhancing the discriminative power of the learned semantic representations. Finally, we demonstrate the effectiveness of the proposed method over several typical advanced UDA methods and excellent FUDA methods by carrying out experiments on public data pairs and self-collected ultra-low-altitude hyperspectral data pairs captured on different dates and locations. It is noteworthy that our RBPCL method shows potential regarding tackling the spectral shift problem of real-world hyperspectral images across different illumination conditions, despite encountering the scarcity of labeled source samples.

Author Contributions

Conceptualization, X.T. and H.S.; methodology, X.T.; validation, H.S.; investigation, L.Z.; data curation, X.Z. (Xiaoxiong Zhang) and X.Z. (Xiaolei Zhou); writing—original draft preparation, X.T. and H.S.; writing—review and editing, C.L. and C.J.; supervision, X.Z. (Xiaolei Zhou); project administration, X.Z. (Xiaoxiong Zhang); funding acquisition, X.Z. (Xiaoxiong Zhang) and L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Research Program of the Army Engineering University of PLA under Grant No. KYGYJKQTZQ24003, and the National Natural Science Foundation of China under Grant No. 62402510 and No. 72471236.

Data Availability Statement

Publicly available datasets used in this paper are open-sourced and can be accessed through the following URLs: https://www.ehu.eus/ccwintco/index.php?title=Hyperspectral_Remote_Sensing_Scenes (Pavia University, accessed on 11 April 2025); https://aistudio.baidu.com/datasetdetail/244699 (Houston, accessed on 11 April 2025); https://ieee-dataport.org/documents/tjcshyperspectral (TJ_CS_HYPERSPECTRAL, accessed on 11 April 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Camps-Valls, G.; Tuia, D.; Bruzzone, L.; Benediktsson, J.A. Advances in Hyperspectral Image Classification: Earth Monitoring with Statistical Learning Methods. IEEE Signal Process. Mag. 2014, 31, 45–54. [Google Scholar] [CrossRef]
  2. Ghamisi, P.; Yokoya, N.; Li, J.; Liao, W.; Liu, S.; Plaza, J.; Rasti, B.; Plaza, A. Advances in Hyperspectral Image and Signal Processing: A Comprehensive Overview of the State of the Art. IEEE Geosci. Remote. Sens. Mag. 2017, 5, 37–78. [Google Scholar] [CrossRef]
  3. He, L.; Li, J.; Liu, C.; Li, S. Recent Advances on Spectral–Spatial Hyperspectral Image Classification: An Overview and New Guidelines. IEEE Trans. Geosci. Remote Sens. 2018, 56, 1579–1597. [Google Scholar] [CrossRef]
  4. Kumar, B.; Dikshit, O.; Gupta, A.; Singh, M.K. Feature extraction for hyperspectral image classification: A review. Int. J. Remote Sens. 2020, 41, 6248–6287. [Google Scholar] [CrossRef]
  5. Samaniego, L.; Bárdossy, A.; Schulz, K. Supervised Classification of Remotely Sensed Imagery Using a Modified k-NN Technique. IEEE Trans. Geosci. Remote Sens. 2008, 46, 2112–2125. [Google Scholar] [CrossRef]
  6. Chang, C.I. Statistical Detection Theory Approach to Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2057–2074. [Google Scholar] [CrossRef]
  7. Imani, M.; Ghassemian, H. An overview on spectral and spatial information fusion for hyperspectral image classification: Current trends and challenges. Inf. Fusion 2020, 59, 59–83. [Google Scholar] [CrossRef]
  8. Wan, Y.; Fan, Y.; Jin, M. Application of hyperspectral remote sensing for supplementary investigation of polymetallic deposits in Huaniushan ore region, northwestern China. Sci. Rep. 2021, 11, 440. [Google Scholar] [CrossRef]
  9. Meng, S.; Wang, X.; Hu, X.; Luo, C.; Zhong, Y. Deep learning-based crop mapping in the cloudy season using one-shot hyperspectral satellite imagery. Comput. Electron. Agric. 2021, 186, 106188. [Google Scholar] [CrossRef]
  10. Zhang, F.; Li, X.; Qiu, S.; Feng, J.; Wang, D.; Wu, X.; Cheng, Q. Hyperspectral imaging combined with convolutional neural network for outdoor detection of potato diseases. In Proceedings of the 2021 6th International Symposium on Computer and Information Processing Technology (ISCIPT), Changsha, China, 11–13 June 2021; pp. 846–850. [Google Scholar]
  11. Li, S.; Song, W.; Fang, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. Deep Learning for Hyperspectral Image Classification: An Overview. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6690–6709. [Google Scholar] [CrossRef]
  12. Paoletti, M.E.; Haut, J.M.; Plaza, J.; Plaza, A.J. Deep learning classifiers for hyperspectral imaging: A review. ISPRS J. Photogramm. Remote Sens. 2019, 158, 279–317. [Google Scholar] [CrossRef]
  13. Liu, B.; Yu, A.; Yu, X.; Wang, R.; Gao, K.; Guo, W. Deep Multiview Learning for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7758–7772. [Google Scholar] [CrossRef]
  14. Liu, Q.; Xiao, L.; Yang, J.; Chan, J.C.W. Content-Guided Convolutional Neural Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2020, 58, 6124–6137. [Google Scholar] [CrossRef]
  15. Wang, H.; Cheng, Y.; Chen, C.L.P.; Wang, X. Broad Graph Convolutional Neural Network and Its Application in Hyperspectral Image Classification. IEEE Trans. Emerg. Top. Com. Intell. 2023, 7, 610–616. [Google Scholar] [CrossRef]
  16. Yu, C.; Gong, B.; Song, M.; Zhao, E.; Chang, C.I. Multiview Calibrated Prototype Learning for Few-Shot Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5544713. [Google Scholar] [CrossRef]
  17. Deng, C.; Xue, Y.; Liu, X.; Li, C.; Tao, D. Active Transfer Learning Network: A Unified Deep Joint Spectral–Spatial Feature Learning Model for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1741–1754. [Google Scholar] [CrossRef]
  18. Su, Y.; Gao, L.; Jiang, M.; Plaza, A.J.; Sun, X.; Zhang, B. NSCKL: Normalized Spectral Clustering With Kernel-Based Learning for Semisupervised Hyperspectral Image Classification. IEEE Trans. Cybern. 2022, 53, 6649–6662. [Google Scholar] [CrossRef]
  19. Su, Y.; Chen, J.; Gao, L.; Plaza, A.J.; Jiang, M.; Xu, X.; Sun, X.; Li, P. ACGT-Net: Adaptive Cuckoo Refinement-Based Graph Transfer Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5521314. [Google Scholar] [CrossRef]
  20. Liu, H.; Li, W.; Xia, X.G.; Zhang, M.; Gao, C.; Tao, R. Spectral Shift Mitigation for Cross-Scene Hyperspectral Imagery Classification. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2021, 14, 6624–6638. [Google Scholar] [CrossRef]
  21. Peng, J.; Huang, Y.; Sun, W.; Chen, N.; Ning, Y.; Du, Q. Domain Adaptation in Remote Sensing Image Classification: A Survey. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2022, 15, 9842–9859. [Google Scholar] [CrossRef]
  22. Pan, S.J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359. [Google Scholar] [CrossRef]
  23. Zhang, Y.; Li, W.; Tao, R.; Peng, J.; Du, Q.; Cai, Z. Cross-Scene Hyperspectral Image Classification With Discriminative Cooperative Alignment. IEEE Trans. Geosci. Remote Sens. 2021, 59, 9646–9660. [Google Scholar] [CrossRef]
  24. Liu, T.; Zhang, X.; Gu, Y. Unsupervised Cross-Temporal Classification of Hyperspectral Images With Multiple Geodesic Flow Kernel Learning. IEEE Trans. Geosci. Remote Sens. 2019, 57, 9688–9701. [Google Scholar] [CrossRef]
  25. Wang, H.; Cheng, Y.; Chen, C.L.P.; Wang, X. Hyperspectral Image Classification Based on Domain Adversarial Broad Adaptation Network. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5517813. [Google Scholar] [CrossRef]
  26. Cheng, Y.; Chen, Y.; Kong, Y.; Wang, X. Soft Instance-Level Domain Adaptation With Virtual Classifier for Unsupervised Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5509013. [Google Scholar] [CrossRef]
  27. Persello, C.; Bruzzone, L. Kernel-Based Domain-Invariant Feature Selection in Hyperspectral Images for Transfer Learning. IEEE Trans. Geosci. Remote Sens. 2016, 54, 2615–2626. [Google Scholar] [CrossRef]
  28. Yu, C.; Liu, C.; Yu, H.; Song, M.; Chang, C.I. Unsupervised Domain Adaptation With Dense-Based Compaction for Hyperspectral Imagery. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2021, 14, 12287–12299. [Google Scholar] [CrossRef]
  29. Yu, C.; Liu, C.; Song, M.; Chang, C.I. Unsupervised Domain Adaptation With Content-Wise Alignment for Hyperspectral Imagery Classification. IEEE Geosci. Remote Sens. Lett. 2022, 19, 5511705. [Google Scholar] [CrossRef]
  30. Zhao, C.; Qin, B.; Feng, S.; Zhu, W.; Zhang, L.; Ren, J. An Unsupervised Domain Adaptation Method Towards Multi-Level Features and Decision Boundaries for Cross-Scene Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5546216. [Google Scholar] [CrossRef]
  31. Zhang, Y.; Li, W.; Zhang, M.; Qu, Y.; Tao, R.; Qi, H. Topological Structure and Semantic Information Transfer Network for Cross-Scene Hyperspectral Image Classification. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 2817–2830. [Google Scholar] [CrossRef]
  32. Huang, Y.; Peng, J.; Sun, W.; Chen, N.; Du, Q.; Ning, Y.; Su, H. Two-Branch Attention Adversarial Domain Adaptation Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5540813. [Google Scholar] [CrossRef]
  33. Li, S.; Liu, C.H.; Lin, Q.; Xie, B.; Ding, Z.; Huang, G.; Tang, J. Domain Conditioned Adaptation Network. Proc. AAAI Conf. Artif. Intell. 2020, 34, 11386–11393. [Google Scholar] [CrossRef]
  34. Long, M.; Cao, Y.; Wang, J.; Jordan, M.I. Learning Transferable Features with Deep Adaptation Networks. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 97–105. [Google Scholar]
  35. Long, M.; Zhu, H.; Wang, J.; Jordan, M.I. Deep Transfer Learning with Joint Adaptation Networks. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 2208–2217. [Google Scholar]
  36. Tseng, H.Y.; Lee, H.Y.; Huang, J.B.; Yang, M.H. Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation. arXiv 2020, arXiv:2001.08735. [Google Scholar]
  37. Zhao, A.; Ding, M.; Lu, Z.; Xiang, T.; Niu, Y.; Guan, J.; Wen, J.; Luo, P. Domain-Adaptive Few-Shot Learning. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Online, 5–9 January 2021; pp. 1389–1398. [Google Scholar]
  38. Zhang, Y.; Li, W.; Zhang, M.; Tao, R. Dual Graph Cross-Domain Few-Shot Learning for Hyperspectral Image Classification. In Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 3573–3577. [Google Scholar]
  39. Zhang, Y.; Li, W.; Zhang, M.; Wang, S.; Tao, R.; Du, Q. Graph Information Aggregation Cross-Domain Few-Shot Learning for Hyperspectral Image Classification. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 1912–1925. [Google Scholar] [CrossRef]
  40. Wang, B.; Xu, Y.; Wu, Z.; Zhan, T.; Wei, Z. Spatial–Spectral Local Domain Adaption for Cross Domain Few Shot Hyperspectral Images Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5539515. [Google Scholar] [CrossRef]
  41. Li, Z.; Liu, M.; Chen, Y.; Xu, Y.; Li, W.; Du, Q. Deep Cross-Domain Few-Shot Learning for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5501618. [Google Scholar] [CrossRef]
  42. Kim, D.; Saito, K.; Oh, T.H.; Plummer, B.A.; Sclaroff, S.; Saenko, K. Cross-domain Self-supervised Learning for Domain Adaptation with Few Source Labels. arXiv 2020, arXiv:2003.08264. [Google Scholar]
  43. Yue, X.; Zheng, Z.; Zhang, S.; Gao, Y.; Darrell, T.; Keutzer, K.; Vincentelli, A.S. Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 13829–13839. [Google Scholar]
  44. Yang, W.; Yang, C.; Huang, S.; Wang, L.; Yang, M. Few-Shot Unsupervised Domain Adaptation via Meta Learning. In Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan, 18–22 July 2022; pp. 1–6. [Google Scholar]
  45. Huang, S.; Yang, W.; Wang, L.; Zhou, L.; Yang, M. Few-shot Unsupervised Domain Adaptation with Image-to-Class Sparse Similarity Encoding. In Proceedings of the 29th ACM International Conference on Multimedia, Online, 20–24 October 2021; pp. 677–685. [Google Scholar]
  46. Yu, L.; Yang, W.; Huang, S.; Wang, L.; Yang, M. High-level semantic feature matters few-shot unsupervised domain adaptation. Proc. Aaai Conf. Artif. Intell. 2023, 37, 11025–11033. [Google Scholar] [CrossRef]
  47. Qu, S.; Chen, G.; Zhang, J.; Li, Z.; He, W.; Tao, D. BMD: A General Class-balanced Multicentric Dynamic Prototype Strategy for Source-free Domain Adaptation. In Proceedings of the 17th European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 165–182. [Google Scholar]
  48. Zhang, P.; Zhang, B.; Zhang, T.; Chen, D.; Wang, Y.; Wen, F. Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 12409–12419. [Google Scholar]
  49. Tzeng, E.; Hoffman, J.; Zhang, N.; Saenko, K.; Darrell, T. Deep Domain Confusion: Maximizing for Domain Invariance. arXiv 2014, arXiv:1412.3474. [Google Scholar]
  50. Yan, H.; Ding, Y.; Li, P.; Wang, Q.; Xu, Y.; Zuo, W. Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 945–954. [Google Scholar]
  51. Long, M.; Zhu, H.; Wang, J.; Jordan, M.I. Unsupervised Domain Adaptation with Residual Transfer Networks. In Proceedings of the Annual Conference on Neural Information Processing Systems 2016, Barcelona, Spain, 5–10 December 2016; pp. 136–144. [Google Scholar]
  52. Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M.; Lempitsky, V.S. Domain-Adversarial Training of Neural Networks. J. Mach. Learn. Res. 2015, 17, 1–35. [Google Scholar]
  53. Long, M.; Cao, Z.; Wang, J.; Jordan, M.I. Conditional Adversarial Domain Adaptation. In Proceedings of the Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, Montréal, QC, Canada, 3–8 December 2018; pp. 1647–1657. [Google Scholar]
  54. Tzeng, E.; Hoffman, J.; Saenko, K.; Darrell, T. Adversarial Discriminative Domain Adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2962–2971. [Google Scholar]
  55. Saito, K.; Watanabe, K.; Ushiku, Y.; Harada, T. Maximum Classifier Discrepancy for Unsupervised Domain Adaptation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3723–3732. [Google Scholar]
  56. Hoffman, J.; Tzeng, E.; Park, T.; Zhu, J.Y.; Isola, P.; Saenko, K.; Efros, A.A.; Darrell, T. CyCADA: Cycle-Consistent Adversarial Domain Adaptation. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 1989–1998. [Google Scholar]
  57. Cao, Y.; Long, M.; Wang, J. Unsupervised Domain Adaptation With Distribution Matching Machines. Proc. AAAI Conf. Artif. Intell. 2018, 32, 2795–2802. [Google Scholar] [CrossRef]
  58. Hu, L.; Kan, M.; Shan, S.; Chen, X. Unsupervised Domain Adaptation With Hierarchical Gradient Synchronization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 4042–4051. [Google Scholar]
  59. Wang, Q.; Breckon, T. Unsupervised Domain Adaptation via Structured Prediction Based Selective Pseudo-Labeling. Proc. AAAI Conf. Artif. Intell. 2020, 34, 6243–6250. [Google Scholar] [CrossRef]
  60. Jiang, X.; Lao, Q.; Matwin, S.; Havaei, M. Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation. In Proceedings of the 37th International Conference on Machine Learning, Online, 13–18 July 2020; pp. 4816–4827. [Google Scholar]
  61. Ghifary, M.; Kleijn, W.; Zhang, M.; Balduzzi, D.; Li, W. Deep Reconstruction-Classification Networks for Unsupervised Domain Adaptation. In Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 597–613. [Google Scholar]
  62. Sun, Y.; Tzeng, E.; Darrell, T.; Efros, A.A. Unsupervised Domain Adaptation through Self-Supervision. arXiv 2019, arXiv:1909.11825. [Google Scholar]
  63. Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G.E. A Simple Framework for Contrastive Learning of Visual Representations. In Proceedings of the 37th International Conference on Machine Learning, Online, 13–18 July 2020; pp. 1597–1607. [Google Scholar]
  64. Tian, Y.; Chen, X.; Ganguli, S. Understanding self-supervised Learning Dynamics without Contrastive Pairs. In Proceedings of the 38th International Conference on Machine Learning, Online, 18–24 July 2021; pp. 10268–10278. [Google Scholar]
  65. Chen, X.; He, K. Exploring Simple Siamese Representation Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 15745–15753. [Google Scholar]
  66. Li, J.; Zhou, P.; Xiong, C.; Socher, R.; Hoi, S.C.H. Prototypical Contrastive Learning of Unsupervised Representations. arXiv 2020, arXiv:2005.04966. [Google Scholar]
  67. Mo, S.; Sun, Z.; Li, C. Siamese Prototypical Contrastive Learning. arXiv 2022, arXiv:2208.08819. [Google Scholar]
  68. Wang, X.; Liu, Z.; Yu, S.X. Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 12581–12590. [Google Scholar]
  69. Cao, Z.; Li, X.; Zhao, L. Unsupervised Feature Learning by Autoencoder and Prototypical Contrastive Learning for Hyperspectral Classification. arXiv 2020, arXiv:2009.00953. [Google Scholar] [CrossRef]
  70. Wang, S.; Du, B.; Zhang, D.; Wan, F. Adversarial Prototype Learning for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5511918. [Google Scholar] [CrossRef]
  71. Cheng, Y.; Zhang, W.; Wang, H.; Wang, X. Causal Meta-Transfer Learning for Cross-Domain Few-Shot Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5521014. [Google Scholar] [CrossRef]
  72. Zhang, S.; Chen, Z.; Wang, D.; Wang, Z.J. Cross-Domain Few-Shot Contrastive Learning for Hyperspectral Images Classification. IEEE Geosci. Remote Sens. Lett. 2022, 19, 5514505. [Google Scholar] [CrossRef]
  73. Zhang, M.; Liu, H.; Gong, M.; Li, H.; Wu, Y.; Jiang, X. Cross-Domain Self-Taught Network for Few-Shot Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4501719. [Google Scholar] [CrossRef]
  74. Hu, L.; He, W.; Zhang, L.; Zhang, H. Cross-Domain Meta-Learning Under Dual-Adjustment Mode for Few-Shot Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5526416. [Google Scholar] [CrossRef]
  75. Wang, H.; Wang, X.; Cheng, Y. Graph Meta Transfer Network for Heterogeneous Few-Shot Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5501112. [Google Scholar] [CrossRef]
  76. Liu, Q.; Peng, J.; Ning, Y.; Chen, N.; Sun, W.; Du, Q.; Zhou, Y. Refined Prototypical Contrastive Learning for Few-Shot Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5506214. [Google Scholar] [CrossRef]
  77. Qin, B.; Feng, S.; Zhao, C.; Li, W.; Tao, R.; Xiang, W. Cross-Domain Few-Shot Learning Based on Feature Disentanglement for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote. Sens. 2024, 62, 5514215. [Google Scholar] [CrossRef]
  78. Hou, W.; Peng, J.; Yang, B.; Wu, L.; Sun, W. A Cross-Scene Few-Shot Learning Based on Intra–Inter Domain Contrastive Alignment for Hyperspectral Image Change Detection. IEEE Trans. Geosci. Remote. Sens. 2025, 63, 5513014. [Google Scholar] [CrossRef]
  79. Li, R.; Zheng, S.; Duan, C.; Yang, Y.; Wang, X. Classification of Hyperspectral Image Based on Double-Branch Dual-Attention Mechanism Network. Remote. Sens. 2020, 12, 582. [Google Scholar] [CrossRef]
  80. Berthelot, D.; Carlini, N.; Goodfellow, I.J.; Papernot, N.; Oliver, A.; Raffel, C. MixMatch: A Holistic Approach to Semi-Supervised Learning. In Proceedings of the Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, Vancouver, BC, Canada, 8–14 December 2019; pp. 5049–5059. [Google Scholar]
  81. Yang, J.; Zhao, Y.; Chan, J.C.W. Learning and Transferring Deep Joint Spectral–Spatial Features for Hyperspectral Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4729–4742. [Google Scholar] [CrossRef]
  82. Ge, C.; Du, Q.; Li, Y.; Li, J. Multitemporal Hyperspectral Image Classification using Collaborative Representation-based Classification with Tikhonov Regularization. In Proceedings of the 10th International Workshop on the Analysis of Multitemporal Remote Sensing Images (MultiTemp), Shanghai, China, 5–7 August 2019; pp. 1–4. [Google Scholar]
  83. Johnson, J.; Douze, M.; Jégou, H. Billion-Scale Similarity Search with GPUs. IEEE Trans. Big Data 2021, 7, 535–547. [Google Scholar] [CrossRef]
Figure 1. The framework of the proposed RBPCL for the FUDA setting in the HSICC task. Refined in-domain and bi-directional cross-domain prototypical contrastive learning is conducted between normalized feature vectors f and prototypes μ aggregated from clustering vectors v stored in memory banks using the BMD strategy. Features with confident predictions p are adopted to update the cosine classifier vectors w adaptively. The MI maximization loss L_MIM and the supervised classification loss L_cls are further leveraged to obtain discriminative embeddings.
Figure 2. Comparison between the existing prototype strategy (top) and our multicentric prototype strategy (bottom). Existing prototype strategies are monocentric and class-biased, which easily leads to negative transfer for hard-to-transfer instances, while the class-balanced multicentric dynamic prototype (BMD) strategy effectively addresses this issue.
Figure 3. A ground-object example compares the existing class-biased strategy (left) with the BMD class-balanced strategy (right). To better understand the difference, we annotate the decision direction of these two sampling schemes. In the presence of class imbalance, the existing strategy is prone to gathering class-biased instances.
Figure 4. Comparison between the existing monocentric prototype strategy (left) and our multicentric prototype strategy (right). The multicentric strategy can be expected to yield more robust prototypes and more accurate decision boundaries.
Figure 5. Hyperspectral images of Pavia data pair.
Figure 6. Hyperspectral images of Houston data pair.
Figure 7. Hyperspectral images of self-collected data.
Figure 8. Spectral curve charts of self-collected datasets.
Figure 9. Cross-scene classification maps with all the compared methods in Pavia University → Pavia Center.
Figure 10. Cross-scene all-pixel classification maps with all the compared methods in TJSunny → CSSunny.
Figure 11. The t-SNE scatter plots on Pavia data pair.
Figure 12. The t-SNE scatter plots on Houston data pair.
Figure 13. The t-SNE scatter plots on TJSunny-TJCloudy data pair.
Figure 14. The t-SNE scatter plots on TJSunny-CSSunny data pair.
Figure 15. The t-SNE scatter plots on TJCloudy-CSSunny data pair.
Figure 16. Cross-scene classification results (%) of PU → PC under different R-value settings.
Figure 17. Cross-scene classification results (%) of PC → PU under different R-value settings.
Figure 18. Cross-scene classification results (%) of TJSunny → CSSunny under different R-value settings.
Figure 19. Cross-scene classification results (%) of CSSunny → TJSunny under different R-value settings.
Figure 20. Classification performance (%) achieved by the RBPCL, PCS and CD with increasing training samples from the source domain.
Table 1. Number of samples in each land cover category within Pavia data pair.
ID | Name | PU | PC
1 | Trees | 3064 | 7598
2 | Asphalt | 6631 | 9248
3 | Bricks | 3682 | 2685
4 | Bitumen | 1330 | 7287
5 | Shadows | 947 | 2863
6 | Meadows | 18,649 | 3090
7 | Bare soil | 5029 | 6584
Total | | 39,332 | 39,355
Table 2. Number of samples in each land cover category within Houston data pair.
ID | Name | HO2013 | HO2018
1 | Healthy grass | 1251 | 9799
2 | Stressed grass | 1254 | 32,502
3 | Trees | 1244 | 13,588
4 | Water | 325 | 266
5 | Residential buildings | 1268 | 39,762
6 | Commercial buildings | 1244 | 223,684
7 | Road | 1252 | 45,810
Total | | 7838 | 365,411
Table 3. Number of samples in each land cover category within TJSunny-TJCloudy data pair.
ID | Name | TJSunny | TJCloudy
1 | Sand | 5885 | 7052
2 | Red plastic track | 1335 | 1042
3 | Green fake turf | 4711 | 3687
4 | White cloth | 285 | 237
5 | Gray cloth | 274 | 230
6 | Green bushes | 1605 | 2340
7 | Red bushes | 1828 | 2019
8 | Asphalt pavement | 6252 | 3950
9 | Silver-colored metal box | 42 | 35
10 | Grey floor tiles | 2622 | 3683
11 | Red floor tiles | 6575 | 5392
12 | Metal manhole cover | 185 | 90
Total | | 31,599 | 29,757
Table 4. Number of samples in each land cover category within TJSunny-CSSunny and TJCloudy-CSSunny data pairs.
ID | Name | TJSunny | TJCloudy | CSSunny
1 | Sand | 5885 | 7052 | 6758
2 | Red plastic track | 1335 | 1042 | 3478
3 | White cloth | 300 | 271 | 1202
4 | Gray cloth | 184 | 239 | 1369
5 | Asphalt pavement | 6216 | 3182 | 2940
6 | Silver-colored metal box | 51 | 41 | 101
Total | | 13,971 | 11,827 | 15,848
Table 5. Cross-scene classification results (%) with different methods on three shots per class of PU → PC.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
149.936.2669.9642.3640.1441.4842.9258.7263.8460.76
260.9953.9263.0440.9425.6831.1846.0257.1332.0960.51
370.2265.6710099.8598.8499.2579.6973.7681.1487.07
427.1972.1813.3136.4786.2781.4783.9779.2280.375.03
532.1239.9233.6982.8463.6851.8776.6579.2497.5984.73
65966.5562.7651.3956.2282.1660.5254.0870.8972.7
749.0943.0650.958.9869.9968.8576.8971.381.2281.22
AA49.7953.8456.2458.9862.9865.1866.6767.6372.4474.57
OA48.9752.853.551.2957.2658.9363.2766.466.5271.24
Kappa39.1544.2444.9843.0850.0852.0257.0160.1861.365.96
Table 6. Cross-scene classification results (%) with different methods on three shots per class of PC → PU.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
141.0561.4376.5219.262.4154.9362.4164.480.3762.8
233.3145.5924.7544.3834.2639.4324.3920.2449.7262.54
388.2328.5382.3686.1111.4977.2384.2490.5771.8872.77
491.8771.1699.788.790.4498.4299.8599.1795.4194.5
519.2662.4314.3935.9862.7543.1780.9598.296.487.09
615.4735.1735.3644.7356.6547.5360.0667.3967.2476.85
799.3688.7898.1599.8297.5197.3784.2384.757.6358.62
AA55.5156.1661.659.8559.3665.4470.8874.9574.0973.6
OA40.6847.0850.8754.8755.6157.5161.4365.4166.1771.48
Kappa30.9637.1840.6443.7745.0547.3550.8755.815661.35
Table 7. Cross-scene classification results (%) with different methods on three shots per class of HO2013 → HO2018.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
133.8596.6658.8295.5243.383.0852.3981.8894.6968.42
257.675.3852.9453.6982.340.9548.9548.2754.7485.53
351.6758.9468.6796.3689.7352.9796.7384.6584.7895.1
468.5699.6268.1885.6168.5689.3968.5693.9467.05100
564.791.5388.8686.3312.5689.2676.4640.0479.2760.54
631.0640.8138.7928.4769.0367.1854.4570.965.8657.99
724.274.7725.5357.8928.0410.5958.0343.4129.2871.46
AA47.456.8257.471.9856.2261.9265.0866.1567.9577.01
OA37.140.8745.5145.065960.0758.3362.963.2264.09
Kappa18.7522.9329.832.2634.3940.4743.0542.6945.4749.27
Table 8. Cross-scene classification results (%) with different methods on three shots per class of HO2018 → HO2013.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
134.035327.731.9573.1877.0270.1484.6379.1878.38
258.7184.982.6780.1153.8379.8759.6647.6867.4972.52
332.5331.7267.7174.2491.2233.0972.0682.2975.5277.05
411.7680.1987.3168.4282.3587.3185.1477.7193.1970.28
563.1953.6363.1159.445.124.2572.3582.7856.5682.78
631.5625.5221.7414.9812.9651.2141.319.8142.3548.39
741.664.1656.864.6452.6491.4472.7286.1687.3688.56
AA39.0556.1658.1556.2558.7663.4667.6368.7271.6674
OA42.3653.3754.7454.8455.9460.665.5867.7169.1174.48
Kappa32.2244.9346.7346.5548.2353.3659.5161.7863.5769.79
Table 9. Cross-scene classification results (%) with different methods on three shots per class of TJSunny → TJCloudy.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
145.098341.5531.8364.2324.3124.3860.5273.9278.94
236.2271.6620.6535.4581.2743.6172.5394.8142.6556.77
38.4614.6831.7138.4456.5439.9138.4742.332.0912.75
433.4780.5140.6859.3219.0763.5658.4710044.4980.51
56957.646992.5814.4172.0579.0499.5698.2596.94
617.2742.9216.9755.5881.8344.2576.1456.3128.7791.58
743.461.2983.4549.04666.9543.6111.8465.5148.12
814.2320.1356.6256.1741.242.7299.5788.3388.2596.33
997.0610010010088.2410010082.3510079.41
1012.6315.8314.4853.3912.7681.8836.21.9628.9542.07
1195.0559.6294.0579.9384.9299.0595.290.7310086.68
1260.6740.4540.4540.4540.4540.4540.4558.4340.4539.33
AA44.3848.9850.857.6849.2459.963.6765.5961.9567.45
OA39.0844.2248.9451.2254.7855.3158.2958.3864.4768.05
Kappa32.2938.5842.7947.350.4149.7453.3953.0559.3163.23
Table 10. Cross-scene classification results (%) with different methods on three shots per class of TJCloudy → TJSunny.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
136.2263.980.1310010094.1296.8268.5999.3599.63
260.1922.3676.2485.6932.1334.9354.7253.5264.3289.06
337.267.0329.5844.628.415.0537.4554.1213.9997.35
432.3986.5791.945.1757.2852.1197.1870.42100100
551.2897.4349.4512.7775.5499.6398.5310099.27100
648.3253.0385.8549.7242.8797.0768.4592.5210098
720.0922.2960.7639.592.4547.4565.9659.7774.4498.41
810076.8677.2493.0292.0594.984.7397.3610015.17
910010010092.3487.4910010010010058.54
1039.0366.1136.0569.1845.9694.0122.2457.3895.99100
1136.7162.5922.369.7518.3125.3745.553.7328.8446.88
1244.0246.4572.2836.2261.6221.7434.2459.7844.0283.15
AA50.4658.7265.1556.5961.1864.767.1572.2776.6982.18
OA50.2553.6255.1760.7359.2362.3763.3668.4568.5970.88
Kappa44.3247.4949.7755.3653.9357.1758.363.8864.166.78
Table 11. Cross-scene classification results (%) with different methods on three shots per class of TJSunny → CSSunny.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
138.1968.5234.7346.754.8865.5864.1363.8377.5864.72
215.2252.8599.9430.8824.1416.5729.9331.1116.3462.29
371.5747.4924.8146.1534.4566.2277.1892.3150.8486.96
496.7296.7279.910092.996.7296.791.2699.45100
585.1560.7610097.0191.8799.3799.499.2499.299.77
6100100100100100100100100100100
AA67.8171.0673.2370.1266.3774.0877.8979.6373.985.62
OA58.663.664.7168.4668.6376.4877.477.5781.1581.15
Kappa42.2648.4855.5955.9456.0864.966.5466.3470.771.64
Table 12. Cross-scene classification results (%) with different methods on three shots per class of CSSunny → TJSunny.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
134.0255.5953.8855.5965.4864.0760.3348.9867.6470.73
253.6779.4647.7179.467.1238.2661.1457.0535.6143.4
398.6653.8595.9753.8564.8871.4852.3586.2956.8650.17
446.4596.7291.7696.7293.9910095.0576.580.8793.44
542.747.2966.1747.2863.5774.7279.1193.3989.5998.1
6100100100100100100100100100100
AA62.5972.1575.9272.1565.8474.7574.6677.0371.7676.47
OA41.5554.8460.3254.8459.5467.169.1970.8674.4180.55
Kappa28.2843.2943.3643.2943.7252.9156.6758.6262.670.62
Table 13. Cross-scene classification results (%) with different methods on three shots per class of TJCloudy → CSSunny.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
129.7530.6111.8532.5951.6126.9628.958.3468.462.82
240.5599.9710040.1210010010010010047.74
334.972.1634.897.7521.5737.348.2122.442.869.28
449.7112.7937.0673.689.4333.8548.6133.1162.7292.69
598.4379.7210098.7467.9510010092.5847.12100
6419686978510097829497
AA49.0753.5461.6273.3155.9266.3570.4564.7469.1778.26
OA47.0551.6651.9455.4159.5558.3861.2969.0869.1269.69
Kappa35.5339.8643.0646.0747.649.6352.9958.9259.4760.65
Table 14. Cross-scene classification results (%) with different methods on three shots per class of CSSunny → TJCloudy.
ClassS-OnlyDANNMDDIADABANSILDA-VCCDSPCSPCS+SDMPCS+BMDRBPCL
116.344.9418.3533.3532.7654.9162.7673.2250.2871.85
224.0231.0313.263022.8328.2752.8358.8970.9687.7
325.9331.8537.0469.334.4476.5844.4499.2665.4377.78
410058.8294.1210010092.8310099.5832.0776.47
549.835.3391.2958.1276.1839.2540.1825.590.4737.69
610010010067.510010010080.49100100
AA52.6750.3359.0159.7161.0365.366.772.8268.275.25
OA28.1841.339.7641.8545.1949.7556.2660.2763.0664.38
Kappa15.9420.3827.9930.4633.6432.0640.1543.0848.2349.04
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
