Article

Meta-Learning-Integrated Neural Architecture Search for Few-Shot Hyperspectral Image Classification

1 Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin 150080, China
2 School of Integrated Circuit, Shenzhen Polytechnic University, Shenzhen 518115, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(15), 2952; https://doi.org/10.3390/electronics14152952
Submission received: 11 June 2025 / Revised: 15 July 2025 / Accepted: 22 July 2025 / Published: 24 July 2025

Abstract

In order to address the limited number of labeled samples in practical classification scenarios, and the overfitting and insufficient generalization that Few-Shot Learning (FSL) suffers from in hyperspectral image classification (HSIC), this paper designs and implements a meta-learning-integrated neural architecture search (NAS) method for few-shot HSI classification. Firstly, a multi-source domain learning framework is constructed that integrates heterogeneous natural images and homogeneous remote sensing images to broaden the information available for few-shot learning, enabling the final network to improve its generalization ability under limited labeled samples by learning the similarities between different data sources. Secondly, by constructing an accurate and robust search space and deploying different units at different positions, both the classification accuracy and the transfer robustness of the final network are improved. The method fully utilizes the spatial texture information and rich category information of multi-source data, and transfers the learned meta-knowledge to the optimal architecture for HSIC through the accurate and robust search space design, thereby accomplishing HSIC tasks with limited samples. Experimental results show that the proposed method achieved an overall accuracy (OA) of 98.57%, 78.39%, and 98.74% on the Pavia Center, Indian Pines, and WHU-Hi-LongKou datasets, respectively, demonstrating that the learned meta-knowledge is effectively transferred to the optimal architecture and that classification with few-shot samples is achieved.

1. Introduction

Among various remote sensing technologies, the hyperspectral image (HSI) originates from the ground-object imaging technology developed in aerospace, achieving long-range, detailed imaging of ground objects [1]. This imaging capability enables HSI to simultaneously capture the spatial distribution characteristics [2,3], intensity characteristics, and material fingerprint spectral characteristics of a target, forming a unique triple-information fusion advantage [4]. Therefore, HSI, which contains rich spatial and spectral information, has deep research value in precision agriculture [5], land cover analysis [6], marine hydrological detection [7], geological exploration [7], and other fields.
This paper integrates heterogeneous natural images and homogeneous remote sensing images within a multi-source domain learning framework. Here, "homogeneous" refers to the relative consistency of remote sensing images in terms of data generation logic, feature distribution patterns, and acquisition conditions, in contrast to the "heterogeneity" of natural images. In terms of imaging mechanisms, remote sensing images are usually captured by specific sensors at fixed altitudes and within preset spectral ranges, with relatively stable imaging parameters and little interference from manual shooting. Natural images, in contrast, mostly come from devices such as ordinary cameras and mobile phones, with significant differences in shooting angle, lighting, distance, and device model, resulting in a far less uniform imaging mechanism (i.e., they are "heterogeneous"). Regarding the regularity of feature distributions, the underlying features of remote sensing images (such as spectral reflectance and texture structure) often conform to the physical properties of the corresponding ground objects (for example, vegetation has strong near-infrared reflectance, and water bodies have smooth textures), leading to more concentrated feature distributions and more stable noise patterns, whereas the randomness of scenes, objects, and shooting conditions makes the feature distributions of natural images far more scattered. This is the distinction drawn between homogeneous and heterogeneous images.
Hyperspectral image classification (HSIC) is a key direction in HSI research, with the core goal of accurately extracting discriminative features from high-dimensional data to address the classification of surface cover in complex scenes. In the era of deep learning (DL) dominated by convolutional neural networks (CNNs) [8,9,10], end-to-end feature extraction has effectively improved the performance of HSIC tasks. Manually designed models such as the Spectral-Spatial Residual Network (SSRN) [11] and the Attention-Based Adaptive Spectral-Spatial Kernel ResNet (A2S2K-ResNet) [12] have achieved satisfactory classification performance. However, these models require a large number of labeled samples. In practical application scenarios, HSI data collection is costly: scenes contain tens of thousands of pixels, making large quantities of labeled samples expensive and difficult to obtain [13]. Therefore, under resource constraints, the scarcity of labeled samples poses a challenge to HSIC. Although unsupervised and semi-supervised learning have made progress, these methods require complete model retraining on the data, which incurs significant resource overhead.
To address the challenge of resource constraints, meta-learning methods based on few-shot learning (FSL) have been proposed and applied to HSIC. These methods learn transferable knowledge from a source domain with abundant labeled samples, allowing the model to learn how to learn. The model trained on the source domain is then transferred to the target domain and, combined with the limited labeled samples there, used to classify the unlabeled samples.
In 2019, Chen et al. [14] proposed a one-dimensional automatic neural network, the first application of neural architecture search (NAS) with the DARTS search strategy to HSI classification, automating the design of classification models. Subsequently, Zhang et al. [15] designed 3-D asymmetric NAS, a pixel-to-pixel HSI classification architecture in which all operations are performed within the three-dimensional architecture of a hierarchical search space and the network width is adaptively adjusted to the characteristics of different HSIs. Xue et al. [16] proposed an HSI classification method that combines automatically designed convolutional neural networks with Transformers, the first to combine NAS and Transformers for HSI classification tasks. Cao et al. [17] proposed a lightweight multi-scale NAS-based hyperspectral classification method with a new lightweight and efficient search space that reduces the number of model parameters, achieving high classification performance at low computational cost. In 2024, Xiao et al. [18] first explored the application of few-shot learning with NAS to HSI classification, automatically designing HSIC embedding extractors under limited labeled samples.
In this context, NAS can effectively reduce manual intervention and save parameter-tuning time by automatically designing network structures, providing a more efficient and accurate approach to network design and optimization for HSIC research with limited samples. Therefore, to further explore the application of NAS in FSL, this paper proposes a few-shot hyperspectral image classification method combined with meta-learning. The contributions of this article are as follows:
  • We explored the application of NAS to few-shot hyperspectral classification tasks, constructing a multi-source domain learning framework combined with meta-learning that improves the richness of the learnable meta-knowledge.
  • By designing an accurate and robust search space with attention convolutions, the automatic design of the HSIC feature-extractor architecture under limited samples was achieved. The optimal accurate and robust units are deployed at different positions of the architecture, ensuring that the architecture maintains both classification accuracy and transfer robustness on HSIC.
  • Within the search space, an attention convolution operator is proposed that combines efficient attention mechanisms with depthwise separable convolutions to enhance the discriminative feature extraction capability of the optimal architecture while maintaining the efficiency of the convolution.
  • By combining focal loss and label-distribution-aware margin loss, the optimal architecture effectively improves the classification performance of the model on imbalanced samples.

2. Materials and Methods

2.1. Overall Framework of MLFS-NAS

This paper proposes Meta-Learning-Integrated NAS for Few-Shot HSI Classification (MLFS-NAS) for HSIC tasks with limited labeled samples. The workflow of MLFS-NAS is shown in Figure 1; it consists of three main parts: a NAS stage that searches for the optimal embedded feature-extractor architecture, a training stage over the multi-source domains and the labeled target domain, and a testing stage on the unlabeled target domain. A support set $S$ and a query set $Q$ with disjoint samples are drawn from the source-domain datasets and the labeled target-domain dataset.
Firstly, the natural image dataset Mini ImageNet (MI) and the hyperspectral dataset Chikusei (CK) are used to jointly construct a multi-source domain on which a NAS supernet (i.e., a network containing all possible candidate architectures within a predefined search space) is built; the optimal unit architecture of the feature extractor is then found through a differentiable search strategy and used to construct the final network. Next, a large number of labeled samples from the multi-source domains and a limited number of labeled samples from the target-domain HSI dataset (the Pavia Center, Indian Pines, or WHU-Hi-LongKou dataset) are used for meta-learning, with alternating optimization fine-tuning the parameters of the final network, yielding an optimal architecture suited to the unlabeled samples of the target-domain HSIC. Finally, the optimal architecture is transferred to classify the unlabeled samples. An accurate and robust search space is constructed to improve the accuracy and robustness of the architecture, ensuring both stability and classification accuracy during transfer, and depthwise separable convolution operators for the internal search space are proposed by combining adaptive fine-grained channel attention (FCA) and Mixed Local Channel Attention (MLCA) to improve the final network's ability to learn features.

2.2. Few-Shot Sample Learning of Multi-Source Domain and Target Domain

In the method of this paper, an FSL framework is established over the multi-source domain and the target domain. Each dataset in the multi-source domain is denoted by $D_u$, where $u$ indexes the source domain and ranges from 1 to 2, and each dataset contains $C_u$ categories. In each episode, $c$ categories are randomly sampled from the source-domain datasets, and for each category, $k$ labeled samples are extracted as the support set $S$. Similarly, $n$ labeled samples are randomly sampled from the same $c$ categories as the query set $Q$, excluding the samples already in the support set. The support set of the multi-source domain is therefore denoted as $S_u = \{(x_i, y_i)\}_{i=1}^{c \times k}$, and the query set is defined as $Q_u = \{(x_j, y_j)\}_{j=1}^{c \times n}$. The dataset $D_t$ of the target domain is divided into a few-shot dataset with labeled samples and a test dataset with unlabeled samples. Similarly, its support set is defined as $S_t = \{(x_i, y_i)\}_{i=1}^{c \times k}$ ($k$ labeled samples, with $c$ sampled classes in each episode), and its query set is defined as $Q_t = \{(x_j, y_j)\}_{j=1}^{c \times n_t}$ ($n_t$ samples from the same classes).
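As a concrete illustration of this episodic sampling, the following minimal Python sketch draws disjoint support and query indices for one episode; the function name sample_episode and its interface are illustrative, not taken from the authors' code.

import numpy as np

def sample_episode(labels, c, k, n, seed=None):
    """Draw disjoint support/query index sets for one c-way k-shot n-query episode.
    labels: 1-D integer class labels of a source-domain dataset."""
    rng = np.random.default_rng(seed)
    classes = rng.choice(np.unique(labels), size=c, replace=False)
    support, query = [], []
    for cls in classes:
        idx = rng.permutation(np.nonzero(labels == cls)[0])
        support.extend(idx[:k])        # k labeled samples per class -> support set S
        query.extend(idx[k:k + n])     # n further samples per class -> query set Q (disjoint from S)
    return np.asarray(support), np.asarray(query)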
After the network search, the training process is carried out through episode-based meta-training that alternates between the source datasets and the target dataset, which supports collaborative learning of the discriminative embedding space shared by the source and target domains. Firstly, three mapping layers, $M_1$, $M_2$, and $M_3$, are used to equalize the input dimensions of the multi-source domain datasets and the target-domain dataset; the data is then input into Finalnet. The feature output of a mapping layer is $F' = F \times M$, where the input $F \in \mathbb{R}^{H \times W \times ch}$, $M \in \mathbb{R}^{ch \times 100}$, and $ch$ is determined by the number of input spectral bands [18].

$$f_u^s = \mathrm{Finalnet}(M_u(S_u)), \qquad f_u^q = \mathrm{Finalnet}(M_u(Q_u))$$

where $u \in \{1, 2, t\}$ is determined by the dataset of each domain.
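Since each mapping layer is a linear projection over the spectral axis, it can be realized as a 1 x 1 convolution. The sketch below assumes PyTorch (the framework used in Section 3.2); the band counts passed to the constructors follow the dataset descriptions in this paper, and everything else is illustrative.

import torch
import torch.nn as nn

class MappingLayer(nn.Module):
    """F' = F x M with M in R^{ch x 100}, realized as a 1 x 1 convolution."""
    def __init__(self, in_bands, out_dim=100):
        super().__init__()
        self.proj = nn.Conv2d(in_bands, out_dim, kernel_size=1, bias=False)

    def forward(self, x):              # x: (B, ch, H, W)
        return self.proj(x)            # (B, 100, H, W), a shared input dimension for Finalnet

# One mapping layer per domain (M1: MI, M2: CK, M3: a target HSI such as PC):
m1, m2, m3 = MappingLayer(3), MappingLayer(128), MappingLayer(102)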
In order to address the problem of class imbalance, this paper employs the focal loss (FL) [19] and the label-distribution-aware margin (LDAM) loss [20] jointly for few-shot learning in the HSIC task. The total loss function is

$$L_{FSL}^u = L_{LDAM}^u + L_{FL}^u, \qquad u \in \{1, 2, t\}$$

where $L_{LDAM}$ represents the label-distribution-aware margin loss and $L_{FL}$ represents the focal loss.
In the test phase, classification is carried out by minimizing the distance between a sample $x_q$ in the query set $Q$ and the samples in the support set $S$:

$$p(x_q) = \arg\min_{x_i \in S} d\left(f_\theta(x_q), f_\theta(x_i)\right)$$

where $x_q$ represents an unlabeled sample in the query set and $p(x_q)$ represents its predicted class label.

$$L_{FL} = -\sum_{x_q \in Q} \sum_{c=1}^{C} (1 - y_c)^{\gamma} \log\left(\hat{p}(x_q)\right)$$

where $C$ represents the number of classes within the domain; $y_c$ represents the true label distribution of $x_q$; $\hat{p}(x_q)$ represents the predicted probability distribution; and $\gamma$ is a hyperparameter whose value ranges within $[0, 5]$.

$$L_{LDAM}\left((x, y), f\right) = -\log \frac{u}{u + \sum_{j \neq y} e^{z_j}}, \qquad u = e^{z_y - p_y}$$

$$p_j = \frac{C}{n_j^{1/4}}, \qquad j \in \{1, 2, \ldots, k\}$$

where $z_y$ represents the model output for the sample $x$ with label $y$; $z_j$ represents the output of this sample for the $j$th class; and $n_j$ represents the number of samples in class $j$.
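A minimal sketch of this combined loss, following the cited formulations of focal loss [19] and LDAM [20], is given below; the hyperparameter values and function names are illustrative, not the authors' exact implementation.

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # Down-weight well-classified samples by the factor (1 - p)^gamma.
    log_p = F.log_softmax(logits, dim=1)
    p_true = log_p.gather(1, targets.unsqueeze(1)).squeeze(1).exp()
    return (-(1.0 - p_true) ** gamma * p_true.log()).mean()

def ldam_loss(logits, targets, class_counts, scale=1.0):
    # Enlarge the margin of rare classes: margin_j is proportional to 1 / n_j^(1/4).
    margins = 1.0 / class_counts.float() ** 0.25
    margins = margins / margins.max()
    shifted = logits.clone()
    shifted[torch.arange(len(targets)), targets] -= scale * margins[targets]
    return F.cross_entropy(shifted, targets)

def fsl_loss(logits, targets, class_counts):
    # Total loss L_FSL = L_LDAM + L_FL, applied per domain u.
    return ldam_loss(logits, targets, class_counts) + focal_loss(logits, targets)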

2.3. Accurate and Robust Search Space

In the ARNAS method [21], guiding conclusions for NAS were established, demonstrating that the depth and width of unit structures at different positions in a neural architecture have different effects on accuracy and robustness: by placing different unit structures at different positions, the accuracy and robustness of the architecture can be improved simultaneously. However, most existing search spaces are composed of normal units and reduction units, with multiple normal units stacked to maximize classification accuracy and reduction units used to suppress invalid data. The final architecture is therefore composed mainly of identical normal units, which limits its accuracy and robustness.

2.3.1. Design of Accurate and Robust Search Space

This paper constructs an accurate and robust search space for few-shot HSIC. Reduction units are retained to prevent invalid data from interfering with the architecture, while accurate and robust units replace the single type of normal unit, improving the accuracy and robustness of the neural architecture by placing different unit structures at different positions. The accurate unit and the robust unit return feature maps of the same dimensions but are placed at different positions of the architecture to gain classification accuracy and transfer robustness, respectively. The names reflect their composition: accurate units tend to contain more attention separable convolution operators internally, letting the model learn more parameters and improve classification performance, whereas robust units tend to use dilated convolutions, which fix certain weights to zero so that input perturbations can hardly alter the output, giving stronger robustness. The complete structure is shown in Figure 2.
The reduction units are placed at one-third and two-thirds of the depth of the architecture; accurate units are placed before the second reduction unit and robust units after it. This design exploits the strong correlation between a unit's performance and its position in the architecture, emphasizing accuracy in the front of the network and robustness in the back (a layout sketch is given below). The NAS then automatically searches for the optimal combination of operations and connections within the accurate and robust units. Operations between nodes in an accurate unit tend toward more separable convolutions, fewer dilated convolutions, and skip connections, while operations between nodes in a robust unit tend toward more dilated convolutions, fewer separable convolutions, and no skip connections.
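The placement rule can be summarized by a short sketch; the cell names below are placeholders for the searched structures, and the layout (reduction at one-third and two-thirds of the depth, accurate cells before the second reduction, robust cells after) follows the description above.

def build_cell_sequence(num_cells):
    """Return the cell type at each depth of the final architecture."""
    r1, r2 = num_cells // 3, 2 * num_cells // 3   # reduction-cell positions
    layout = []
    for i in range(num_cells):
        if i in (r1, r2):
            layout.append("reduction")
        elif i < r2:
            layout.append("accurate")   # accuracy-oriented front of the network
        else:
            layout.append("robust")     # robustness-oriented back of the network
    return layout

print(build_cell_sequence(9))
# ['accurate', 'accurate', 'accurate', 'reduction', 'accurate',
#  'accurate', 'reduction', 'robust', 'robust']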

2.3.2. Internal Design of Search Space

This paper constructs the unit structures of the accurate and robust search space, in which the design of the internal operators is a key factor determining unit performance. Previous CNN methods rely too heavily on local spatial feature extraction, while existing channel attention mechanisms (such as SE modules) use static weighting by fully connected layers, severing the dynamic correlation between spatial details and the global spectral response and making it difficult to fully exploit the ground features and complex spectral-spatial information in hyperspectral data [22].
To this end, as shown in Figure 3, this paper designs three convolution operators to construct the search space. Dilated convolution (Dilated_Conv) expands the receptive field through multi-level dilation rates and captures global contextual information without increasing the number of parameters. The Adaptive Fine-Grained Channel Attention Depthwise Separable Convolution (FCA_SepConv) combines depthwise separable convolution with adaptive fine-grained channel attention to capture the relationship between global and local information at different granularities, achieving a dynamic correlation between local spatial details and the global spectral response and thereby improving feature selection efficiency. The Mixed Local Channel Attention Depthwise Separable Convolution (MLCA_SepConv) achieves fine fusion of spectral-spatial features through multi-scale local feature interaction and channel-adaptive weighting. The search space thus achieves joint spectral-spatial modeling through the global perception of dilated convolution, the local refinement of FCA_SepConv, and the multi-scale interactive collaboration of MLCA_SepConv. Designing these three operators in conjunction with the accurate and robust search space makes it possible to search for a final network that balances classification performance and transfer robustness.
(1) Fine-Grained Channel Attention Depthwise Separable Convolution
In order to effectively integrate the global and local information of hyperspectral data, this paper constructs the FCA_SepConv operator by incorporating the adaptive fine-grained channel attention (FCA) [23] mechanism. The FCA creates correlation matrices through cross-correlation operations to capture the relationships between global and local information at different granularities. This enhances the interaction between global and local information and enables a more precise division of their correlations at each granularity level. Finally, by constructing a trainable parameter that dynamically merges global and local information, the FCA achieves adaptive allocation of channel weights, thereby improving feature extraction capability. The architecture of the FCA is shown in Figure 4.
The core idea of the FCA is to achieve multi-scale feature interaction and adaptive allocation of channel attention through global-local contrastive modeling, thereby enhancing feature representation. Firstly, in order to obtain global information, the feature map $F$ containing global spatial information is transformed into a channel descriptor $U$ through global average pooling. Given the feature map $F \in \mathbb{R}^{C \times H \times W}$, where $C$, $H$, and $W$ represent the number of channels, height, and width, respectively, global averaging generates the channel descriptor $U \in \mathbb{R}^{C}$, whose $n$th element is expressed as

$$U_n = GAP(F_n) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} F_n(i, j)$$

where $F_n(i, j)$ is the value of the $n$th channel of the feature map at position $(i, j)$, and $GAP(\cdot)$ is the global average pooling function, which transforms the shape of $F$ from $C \times H \times W$ to $C \times 1 \times 1$.
In order to obtain local channel information while ensuring a small number of model parameters, a band matrix $B = [b_1, b_2, \ldots, b_k]$ is used for local channel interaction:

$$U_{lc} = \sum_{i=1}^{k} U \cdot b_i$$

where $U$ is the channel descriptor, $U_{lc}$ represents the local information, and $k$ represents the number of adjacent channels. In order to obtain the global channel information and enhance the ability to represent global information, a diagonal matrix $D$ is utilized to capture the dependencies among all channels as global information.
$$U_{gc} = \sum_{i=1}^{C} U \cdot d_i$$

where $U_{gc}$ represents the global information, $C$ is the number of channels, and the diagonal matrix is $D = [d_1, d_2, d_3, \ldots, d_C]$.
In order to promote effective interaction between global information and local information, the global information obtained through the diagonal matrix is combined with the local information obtained through the band matrix. The correlation between the two is captured at different granularities through a cross-correlation operation:

$$M = U_{gc} \cdot U_{lc}^{T}$$

where $M$ represents the correlation matrix.
In order to assign feature weights accurately and reduce the computational complexity, an adaptive fusion strategy is adopted. Row and column information is extracted from the correlation matrix and transposed into weight vectors of the global and local information, and dynamic fusion is achieved through a learnable factor; the process is expressed as follows:

$$U_{gc}^{w} = \sum_{j}^{c} M_{i,j}, \qquad i \in \{1, 2, \ldots, c\}$$

$$U_{lc}^{w} = \sum_{j}^{c} \left(U_{lc} \cdot U_{gc}^{T}\right) = \sum_{j}^{c} M_{i,j}^{T}$$

$$W = \sigma\left(\sigma(\theta) \times \sigma(U_{gc}^{w}) + (1 - \sigma(\theta)) \times \sigma(U_{lc}^{w})\right)$$

where $U_{gc}^{w}$ and $U_{lc}^{w}$ are the fused global channel weight and local channel weight, respectively; $\sigma$ is the Sigmoid activation function; and $\theta$ is the learnable fusion factor.
Through this method, redundant cross-correlation between local and global information is effectively avoided, while the interaction between the two is further promoted. The FCA selectively emphasizes informative features, suppresses useless ones, and achieves a more precise weight allocation for denoising the relevant features. Finally, the obtained weights are multiplied with the input feature map:

$$F^{*} = W \otimes F$$

where $F$ is the input feature map and $F^{*}$ is the feature map output by the FCA.
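Putting the above equations together, the FCA branch can be sketched compactly as follows. Here the band matrix is realized as a 1-D convolution over k adjacent channels and the diagonal matrix as a per-channel weight; this is one reading of the equations above, and the parameterization in the cited FCA work [23] may differ in detail.

import torch
import torch.nn as nn

class FCA(nn.Module):
    def __init__(self, channels, k=3):
        super().__init__()
        self.local = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)  # band matrix B
        self.diag = nn.Parameter(torch.ones(channels))                           # diagonal matrix D
        self.theta = nn.Parameter(torch.zeros(1))                                # learnable fusion factor

    def forward(self, f):                             # f: (B, C, H, W)
        b, c, _, _ = f.shape
        u = f.mean(dim=(2, 3))                        # GAP -> channel descriptor U, (B, C)
        u_lc = self.local(u.unsqueeze(1)).squeeze(1)  # local channel information U_lc
        u_gc = u * self.diag                          # global channel information U_gc
        m = u_gc.unsqueeze(2) * u_lc.unsqueeze(1)     # correlation matrix M, (B, C, C)
        w_gc = torch.sigmoid(m.sum(dim=2))            # row-wise global channel weights
        w_lc = torch.sigmoid(m.sum(dim=1))            # column-wise local channel weights
        t = torch.sigmoid(self.theta)
        w = torch.sigmoid(t * w_gc + (1.0 - t) * w_lc)  # adaptive fusion W
        return f * w.view(b, c, 1, 1)                 # F* = W (x) F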
(2) Mixed Local Channel Attention Depthwise Separable Convolution
Due to the complexity and high computational cost of spatial attention modules, it is difficult to integrate them directly into lightweight convolutions. Even simple attention methods that successfully reduce model parameters and include spatial and channel information tend to exclude local information and provide only long-range information over the entire range. This article adopts the Mixed Local Channel Attention (MLCA) [24] module, which combines channel attention and spatial attention according to HSIC requirements, to improve the performance of depthwise separable convolution. The module addresses the tendency of existing attention mechanisms to ignore spatial feature information and improves the model's expressive power while remaining lightweight. The principle of MLCA_SepConv is shown in Figure 5.
The MLCA mechanism first performs block processing, converting the input feature vector into a vector of shape $1 \times C \times k_s \times k_s$, and extracts local spatial information through a first local pooling, where $k_s$ represents the number of blocks and determines the block size. In the initial stage, two branches convert the input into one-dimensional vectors: the first branch contains global information and the second contains local spatial information. After one-dimensional convolution, the original resolution of the two vectors is restored through the un-average pooling operation (UNAP), and the information is then fused to achieve mixed attention. Conv1d in the figure is a one-dimensional convolution that processes the vectors of the two branches. The size of the convolution kernel $k$ is proportional to the channel dimension $C$, which means that when capturing local cross-channel interaction information, only the relationship between each channel and its $k$ adjacent channels is considered. The selection of $k$ is determined by Formula (15):

$$k = \phi(C) = \left| \frac{\log_2(C)}{\gamma} + \frac{b}{\gamma} \right|_{odd}$$

where $C$ is the number of channels; $k$ is the size of the convolution kernel; both $\gamma$ and $b$ are hyperparameters with default values of 2; and $|\cdot|_{odd}$ means that $k$ must be odd (if $k$ is even, 1 is added to it).
Figure 6 illustrates the main processes of global average pooling (GAP), local average pooling (LAP), and un-average pooling (UNAP). GAP is adaptive average pooling with an output size of 1, which reduces the feature map to $1 \times 1$. When multiplication or addition with the source input is necessary, Expand or UNAP is used for expansion; UNAP preserves the attributes of the map while extending it to the required size, and can be implemented as adaptive pooling whose output size equals that of the source feature map. LAP differs from GAP in that it divides the feature map into $k_s \times k_s$ patches and performs average pooling on each patch, which can be executed as $k_s \times k_s$ adaptive average pooling. When expanding the LAP output, the size is not $1 \times 1$, so the Expand operation cannot be used directly; instead, UNAP must be used to restore the feature map to its original size.
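The adaptive kernel size of Formula (15) and the two pooling branches can be sketched as follows; UNAP is approximated here with nearest-neighbor interpolation back to the source size, and the fusion of the two branches is one plausible reading of Figure 5, not the authors' exact implementation.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def adaptive_kernel_size(channels, gamma=2, b=2):
    k = int(abs(math.log2(channels) / gamma + b / gamma))  # Formula (15)
    return k if k % 2 == 1 else k + 1                      # force k to be odd

class MLCA(nn.Module):
    def __init__(self, channels, ks=2):
        super().__init__()
        self.ks = ks                                  # number of local blocks per side
        k = adaptive_kernel_size(channels)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                             # x: (B, C, H, W)
        b, c, h, w = x.shape
        g = F.adaptive_avg_pool2d(x, 1).view(b, 1, c)             # GAP branch
        g = torch.sigmoid(self.conv(g)).view(b, c, 1, 1)          # global channel attention
        l = F.adaptive_avg_pool2d(x, self.ks)                     # LAP branch, (B, C, ks, ks)
        lv = l.permute(0, 2, 3, 1).reshape(-1, 1, c)              # one channel vector per patch
        lv = torch.sigmoid(self.conv(lv))                         # shared 1-D conv per patch
        l = lv.reshape(b, self.ks, self.ks, c).permute(0, 3, 1, 2)
        l = F.interpolate(l, size=(h, w), mode="nearest")         # UNAP: restore resolution
        return x * g * l                                          # fused mixed attention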
The overall algorithm flow of the proposed MLFS-NAS is given as Algorithm 1.
Algorithm 1 MLFS-NAS
Initialization and Data Preparation:
1. source_domains = [Domain_D1, Domain_D2] //Source domain data
2. target_domain_labeled = Domain_Dt_labeled //Labeled data of the target domain
3. target_domain_unlabeled = Domain_Dt_unlabeled //Unlabeled data of the target domain
Stage 1: Supernet Architecture Search:
1. SuperNet = InitializeSuperNet() //Initialize the supernet
2. for epoch = 1 to SUPERNET_EPOCHS do
  for each domain in source_domains + [target_domain_labeled] do
    S_samples, Q_samples = SplitIntoSupportQuery(domain) //Split into support set and query set
    features = SuperNet(S_samples, Q_samples) //Supernet feature extraction
    loss = Loss(features) //Calculate loss
    UpdateSuperNet(SuperNet, loss) //Update supernet parameters
  end for
end for
Stage 2: Optimal Architecture Extraction and Final Network Construction:
1. EpisodeOptimizer = InitializeOptimizer(FinalNet) //Initialize the final network optimizer
2. for episode = 1 to EPISODES do
  episode_data = SampleEpisodeData(source_domains, target_domain_labeled) //Sample episode data
  for each batch in episode_data do
    M_samples, other_samples = SplitBatch(batch) //Split into M set and other sets
    features = FinalNet(M_samples, other_samples) //Final network feature extraction
    loss = Loss(features) //Calculate loss
    UpdateFinalNet(FinalNet, loss, EpisodeOptimizer) //Update final network parameters
  end for
end for
Stage 3: Transfer Application to Unlabeled Data in the Target Domain:
1. NNClassifier = InitializeClassifier() //Initialize the classifier
2. for each sample in target_domain_unlabeled do
  features = FinalNet(sample) //Feature extraction
  prediction = NNClassifier(features) //Classification prediction
end for
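Stage 3 reduces to nearest-neighbor matching in the learned embedding space, implementing $p(x_q) = \arg\min_{x_i \in S} d(f_\theta(x_q), f_\theta(x_i))$ from Section 2.2. A minimal sketch, with final_net and the tensors as placeholders:

import torch

@torch.no_grad()
def classify_queries(final_net, support_x, support_y, query_x):
    """Assign each unlabeled query the label of its nearest support embedding."""
    f_s = final_net(support_x).flatten(1)   # (Ns, D) support embeddings
    f_q = final_net(query_x).flatten(1)     # (Nq, D) query embeddings
    dist = torch.cdist(f_q, f_s)            # pairwise Euclidean distances d(., .)
    return support_y[dist.argmin(dim=1)]    # predicted class labels p(x_q)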

3. Results

3.1. Dataset Description

The experimental datasets include a multi-source dataset with sufficient labeled samples and target-domain datasets with few labeled samples. For the multi-source dataset, this paper chooses Mini ImageNet (MI) and Chikusei (CK) to jointly form the multi-source domain. For the target domain, this paper uses three representative HSI datasets: the Pavia Center (PC), Indian Pines (IN), and WHU-Hi-LongKou (LK) datasets.
Mini ImageNet dataset: MI is a subset of the ImageNet dataset consisting of 100 categories with 600 images each, 60,000 images in total. Owing to its rich natural images, Mini ImageNet is widely used as a benchmark for meta-learning and few-shot learning research.
Chikusei dataset: CK is a hyperspectral image of Chikusei, Ibaraki, Japan, acquired by the Headwall Hyperspec-VNIR-C imaging sensor. The ground sampling distance is 2.5 m, the scene size is 2517 × 2335 pixels, and the data contain 19 categories and 128 bands.
The PC dataset was captured by the ROSIS sensor during a flight over Pavia, northern Italy, in 2001. The Pavia Center image has 102 spectral bands, 1096 × 1096 pixels, and a spatial resolution of 1.3 m. Nine categories are available for experimental research, as shown in Table 1.
The IN dataset was acquired by the AVIRIS sensor at the Indian Pines test site in northwestern Indiana in 1992. It consists of 16 crop categories, 145 × 145 pixels, and 224 spectral reflectance bands (200 effective bands), with a spatial resolution of 20 m, as shown in Table 2.
The WHU-Hi-LongKou dataset was acquired on 7 July 2018 in Longkou Town, Hubei Province, China, using an 8 mm focal-length Headwall Nano-Hyperspec imaging sensor mounted on a DJI Matrice 600 Pro (Shenzhen Dajiang Innovation Technology Co., Ltd., Shenzhen, China) drone platform. The study area is a simple agricultural scene with nine categories in total. The image size is 550 × 400 pixels, with 270 bands between 0.4 and 1 μm, and the spatial resolution of the drone-borne hyperspectral image is approximately 0.463 m, as shown in Table 3.

3.2. Experimental Environment Configuration and Implementation Details

All experiments were conducted under the following computer configuration: an Intel(R) Xeon(R) E5-2620 v4 CPU @ 2.10 GHz, 128 GB of RAM, and an NVIDIA GeForce RTX 2080 Ti graphics processing unit (GPU) (Shanghai Fenghu Information Technology Co., Ltd., Shanghai, China). The software environment is a 64-bit Windows 10 system with the open-source framework PyTorch 1.12.1. The Adam optimizer is used to optimize the architecture parameters with a learning rate of 0.004.
In this experiment, for the single-domain methods SSRN [11], the adaptive spectral-spatial kernel improved residual network A2S2K-ResNet [12], lightweight multi-scale neural architecture search (LMSS-NAS) [17], and the dual-view spectral and global spatial feature fusion network (DSGSF) [25], five labeled samples were randomly selected for each class in the target domain. For the multi-domain methods Deep Cross-Domain Few-Shot Learning (DCFSL) [26] and Heterogeneous Few-Shot Learning (HFSL) [27], an additional 200 labeled source-domain samples were randomly selected. The remaining samples in the target domain are retained as the test set. For each episode, the input size of HSI in the CK and target datasets is set to 9 × 9 × (number of bands) to keep the spatial size of the hyperspectral data consistent across cross-domain scenes. In addition, in MI, the input size of each image is set to 33 × 33 × 3 to exploit its rich spatial and texture information.
Specifically, the FSL task in each episode is c-way k-shot n-query, where c, k, and n represent the number of selected classes, the number of labeled samples per class in the support set, and the number of labeled samples per class in the query set, respectively. Usually, c is set to the number of classes in the target dataset, following the FSL methods previously applied to both the single-source and target datasets; for the target dataset, $k$ and $n_t$ are the same as in previous FSL methods. To ensure stable comparisons, each experiment was repeated ten times, and the average was taken as the final result. The search iteration count was set to 500 and the training iteration count to 20,000, with the Adam optimizer and a learning rate of 0.004.

3.3. Comparison of the Proposed Method with the State-of-the-Art Methods

In order to evaluate the performance of different methods on HSI classification tasks comprehensively and objectively, this section combines quantitative and qualitative analysis. For the quantitative evaluation, three indicators are used: overall accuracy (OA), average accuracy (AA), and the Kappa coefficient (K). The proposed method is validated through comparative experiments against SSRN and A2S2K-ResNet, based on manually designed convolutional residual networks; DSGSF, based on a dual-channel attention mechanism; LMSS-NAS, based on NAS; and DCFSL and HFSL, based on FSL, as shown in Table 4, Table 5 and Table 6.
As can be seen in Table 4, Table 5 and Table 6, among the non-FSL methods, SSRN (OA: PC 91.64%, IN 53.70%, and LK 87.48%) and A2S2K-ResNet (OA: PC 93.39%, IN 57.28%, and LK 88.42%) show lower classification performance; when training with limited labeled samples, manually designed convolutional neural networks have insufficient advantages for accurate classification. The NAS-based LMSS-NAS (OA: PC 96.85%, IN 60.96%, and LK 94.60%) improves on SSRN and A2S2K-ResNet and approaches the overall classification performance of DSGSF, but, owing to overfitting caused by limited samples, it lags behind the FSL methods.
Compared with the manually designed FSL methods DCFSL and HFSL, MLFS-NAS achieved the most competitive classification performance on all three datasets under the limitation of few labeled samples, through multi-source learning and the automatic search of a final network composed of optimal units. On the PC, IN, and LK datasets, the OA reached 98.57%, 78.39%, and 98.74%, respectively; compared with the suboptimal method HFSL, the OA improved by 2.33%, 8.78%, and 1.90%. In addition, on the IN dataset, the AA of MLFS-NAS (84.35%) was higher than that of HFSL (80.43%), and MLFS-NAS correctly classified more samples across categories, resulting in a higher Kappa coefficient (76.91%). Overall, MLFS-NAS abstracts the texture information of natural images and the spectral-spatial commonalities of hyperspectral data into transferable meta-knowledge through multi-source learning, and uses accurate and robust units at different positions to enhance classification accuracy and transfer robustness; it therefore achieved the best classification results on all three datasets.
In order to present the classification results more clearly, the results of the seven methods on the three hyperspectral datasets are visualized in Figure 7, Figure 8 and Figure 9. Compared with the other methods, MLFS-NAS evidently produces more accurate classification maps: SSRN and A2S2K-ResNet show more noise and scattered points in their classification maps, while LMSS-NAS, DCFSL, and HFSL still exhibit some misclassification. Comparison with the ground-truth maps shows that the proposed method obtains more accurate classification results, further proving the effectiveness of NAS and multi-source learning for few-shot hyperspectral classification.
Figure 10, Figure 11 and Figure 12 show the per-class accuracies of the different classification methods on the three datasets. As shown in the figures, MLFS-NAS achieves high accuracy in the vast majority of categories across the three datasets. For example, in the Bitumen and Tiles categories of the PC dataset, MLFS-NAS is significantly higher than the suboptimal method HFSL, and in the Sesame and Mixed weed classes of the LK dataset, MLFS-NAS also holds an advantage in classification accuracy over the other methods. The IN dataset, in particular, is prone to overfitting because of the imbalance in sample numbers between classes (e.g., Oats has only 20 samples), which makes classification challenging under limited samples; this is why non-FSL methods perform poorly on this class (e.g., SSRN reaches only 12.28% accuracy). The FSL methods, however, use meta-learning and implicit knowledge transfer to learn rapid generalization from a small number of samples during training, achieving 100% accuracy on this class. At the same time, MLFS-NAS effectively transfers the meta-knowledge abstracted from MI and CK to HSI through multi-source learning, and strengthens discriminative feature extraction for each class through the accurate and robust attention convolution operators in the search space, avoiding the feature flooding caused by sparse samples and achieving accurate classification on the IN dataset.

4. Discussion

4.1. Analysis of Optimal Cell Structure

This paper proposes a multi-source FSL framework based on the fusion of heterogeneous natural images (Mini ImageNet) and homogeneous remote sensing images (Chikusei). The optimal unit structures obtained on the source datasets after the search are shown in Figure 13. Operations between nodes within accurate units tend toward more separable convolutions, fewer dilated convolutions, and skip connections, while operations within robust units tend toward more dilated convolutions, fewer separable convolutions, and no skip connections. This design arises because more learnable parameters benefit accuracy, so accurate units select more separable convolution types; during the search, the robust units at the output lean toward dilated convolution, which fixes certain weights to zero and leaves fewer learnable parameters, so input perturbations can hardly alter the output, giving stronger robustness. The NAS automatically searches for the optimal accurate and robust unit structures in the search space and fully extracts the global and local information of HSI through its efficient convolution operators, further improving the classification accuracy of NAS in few-shot HSIC and ensuring robustness during transfer.

4.2. Analysis of Related Parameters

The results of the ablation experiment are shown in Table 7. When MI and CK are treated as independent single source domains, the MI single domain performs better than the CK single domain on the PC and LK datasets, while its classification performance on the IN dataset is slightly lower. This shows that in single-source domain learning, different source domains have their own advantages. When MI and CK are combined in multi-source learning, however, the learned multi-source information significantly improves the accuracy and generalization of HSI classification under limited labeled samples, achieving the best results on all three target-domain datasets, which further proves the effectiveness of multi-source learning.
This section also conducted experiments on MLFS-NAS to verify its robustness under limited labeled samples. In the experiment, one to five labeled samples were selected from each class of the target-domain dataset to validate the performance of the model. As shown in Figure 14, the classification performance of the proposed method improves steadily as the number of labeled samples increases. The experimental results indicate that MLFS-NAS effectively alleviates the risk of overfitting in low-sample scenarios through the final network and multi-source feature representation, demonstrating excellent information extraction efficiency and model robustness.

5. Conclusions

This paper proposes MLFS-NAS, which aims to improve the classification performance and generalization ability of the model with only a small number of labeled samples by incorporating meta-learning strategies. The method uses a multi-source learning framework that integrates heterogeneous natural images (Mini ImageNet) and homogeneous remote sensing images (Chikusei), enhancing generalization by allowing the model to learn the similarities between homogeneous and heterogeneous data. Meanwhile, by designing an accurate, robust, and efficient search space, different optimal unit structures can be deployed at different positions to improve the classification accuracy and transfer robustness of the final network. In order to effectively exploit the global and local information of hyperspectral data, an efficient internal search space composed of separable convolutions combined with the FCA and MLCA mechanisms is designed. The experimental results on the PC, IN, and LK datasets show that the proposed method effectively improves the accuracy of land cover classification under limited labeled samples.

Author Contributions

Conceptualization, A.W., K.Z., H.C., H.W. and M.W.; methodology, K.Z., H.C. and M.W.; software, K.Z. and M.W.; validation, K.Z. and H.C.; writing—review and editing, A.W., K.Z., H.C., H.W. and M.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Key Research and Development Plan Project of Heilongjiang (JD2023SJ19), the Natural Science Foundation of Heilongjiang Province (LH2023F034), the Science and Technology Project of Heilongjiang Provincial Department of Transportation (HJK2024B002), and Shenzhen Polytechnic University Research Fund (6025310007K).

Data Availability Statement

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cantalloube, H.M.J.; Nahum, C.E. Airborne SAR-Efficient Signal Processing for Very High Resolution. Proc. IEEE 2013, 101, 784–797. [Google Scholar] [CrossRef]
  2. Gao, Z.D.; Hao, Q.; Liu, Y.; Zhu, Y.Y.; Cao, J.; Meng, H.M.; Liu, J.; Chen, H.L. Development of Hyperspectral Imaging and Application Technology. Metrol. Meas. Technol. 2019, 24–34. [Google Scholar] [CrossRef]
  3. Rasti, B.; Hong, D.; Hang, R.; Ghamisi, P.; Kang, X.; Chanussot, J.; Benediktsson, J.A. Feature Extraction for Hyperspectral Imagery: The Evolution From Shallow to Deep: Overview and Toolbox. IEEE Geosci. Remote Sens. Mag. 2020, 8, 60–88. [Google Scholar] [CrossRef]
  4. Landgrebe, D. Hyperspectral image data analysis. IEEE Signal Process. Mag. 2002, 19, 17–28. [Google Scholar] [CrossRef]
  5. Bioucas-Dias, J.M.; Plaza, A.; Camps-Valls, G.; Scheunders, P.; Nasrabadi, N.; Chanussot, J. Hyperspectral Remote Sensing Data Analysis and Future Challenges. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–36. [Google Scholar]
  6. Kavita, B.; Vijaya, M. Evaluation of Deep Learning CNN Model for Land Use Land Cover Classification and Crop Identification Using Hyperspectral Remote Sensing Images. J. Indian Soc. Remote Sens. 2019, 47, 1949–1958. [Google Scholar]
  7. Gao, A.F.; Rasmussen, B.; Kulits, P.; Scheller, E.L.; Greenberger, R.; Ehlmann, B.L. Generalized Unsupervised Clustering of Hyperspectral Images of Geological Targets in the Near Infrared. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Nashville, TN, USA, 19–25 June 2021; pp. 4289–4298. [Google Scholar]
  8. Yue, J.; Zhao, W.; Mao, S.; Liu, H. Spectral-Spatial Classification of Hyperspectral Images using Deep Convolutional Neural Networks. Remote Sens. Lett. 2015, 6, 468–477. [Google Scholar] [CrossRef]
  9. Zhang, H.; Li, Y.; Zhang, Y.; Shen, Q. Spectral-Spatial Classification of Hyperspectral Imagery using a Dual-channel Convolutional Neural Network. Remote Sens. Lett. 2017, 8, 438–447. [Google Scholar] [CrossRef]
  10. Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep Feature Extraction and Classification of Hyperspectral Images based on Convolutional Neural Networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251. [Google Scholar] [CrossRef]
  11. Zhong, Z.; Li, J.; Luo, Z.; Chapman, M. Spectral-Spatial Residual Network for Hyperspectral Image Classification: A 3-D Deep Learning Framework. IEEE Trans. Geosci. Remote Sens. 2018, 56, 847–858. [Google Scholar] [CrossRef]
  12. Roy, S.K.; Manna, S.; Song, T.; Bruzzone, L. Attention-Based Adaptive Spectral-Spatial Kernel ResNet for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7831–7843. [Google Scholar] [CrossRef]
  13. Liu, B.; Yu, X.; Yu, A.; Zhang, P.; Wan, G.; Wang, R. Deep few-shot learning for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2290–2304. [Google Scholar] [CrossRef]
  14. Chen, Y.; Zhu, K.; Zhu, L.; He, X.; Ghamisi, P.; Benediktsson, J.A. Automatic Design of Convolutional Neural Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 7048–7066. [Google Scholar] [CrossRef]
  15. Zhang, H.; Gong, C.; Bai, Y.; Bai, Z.; Li, Y. 3-D-ANAS: 3-D Asymmetric Neural Architecture Search for Fast Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5508519. [Google Scholar] [CrossRef]
  16. Xue, X.; Zhang, H.; Fang, B.; Bai, Z.; Li, Y. Grafting Transformer on Automatically Designed Convolutional Neural Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5531116. [Google Scholar] [CrossRef]
  17. Cao, C.; Xiang, H.; Song, W.; Yi, H.; Xiao, F.; Gao, X. Lightweight Multiscale Neural Architecture Search With Spectra-Spatial Attention for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5505315. [Google Scholar] [CrossRef]
  18. Xiao, F.; Xiang, H.; Cao, C.; Gao, X. Neural Architecture Search-Based Few-Shot Learning for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5513715. [Google Scholar] [CrossRef]
  19. Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
  20. Cao, K.; Wei, C.; Gaidon, A.; Arechiga, N.; Ma, T. Learning imbalanced datasets with label-distribution-aware margin loss. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar] [CrossRef]
  21. Ou, Y.; Feng, Y.; Sun, Y. Towards Accurate and Robust Architectures via Neural Architecture Search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 5967–5976. [Google Scholar]
  22. Feng, J.; Fu, C. An Enhanced YOLOv8 Model for Flame and Smoke Detection. In Proceedings of the 2024 4th International Conference on Computer Science and Blockchain (CCSB), Shenzhen, China, 6–8 September 2024; pp. 109–113. [Google Scholar]
  23. Sun, H.; Wen, Y.; Feng, H.; Zheng, Y.; Mei, Q.; Ren, D.; Yu, M. Unsupervised Bidirectional Contrastive Reconstruction and Adaptive Fine-Grained Channel Attention Networks for image dehazing. Neural Netw. 2024, 176, 106314. [Google Scholar] [CrossRef] [PubMed]
  24. Wan, D.; Lu, R.; Shen, S.; Xu, T.; Lang, X.; Ren, Z. Mixed Local Channel Attention for Object Detection. Eng. Appl. Artif. Intell. 2023, 123, 106442. [Google Scholar] [CrossRef]
  25. Guo, T.; Wang, R.; Luo, F.; Gong, X.; Zhang, L.; Gao, X. Dual-View Spectral and Global Spatial Feature Fusion Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5512913. [Google Scholar] [CrossRef]
  26. Li, Z.; Liu, M.; Chen, Y.; Xu, Y.; Li, W.; Du, Q. Deep Cross-Domain Few-Shot Learning for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5501618. [Google Scholar] [CrossRef]
  27. Wang, Y.; Liu, M.; Yang, Y.; Li, Z.; Du, Q.; Chen, Y.; Li, F.; Yang, H. Heterogeneous Few-Shot Learning for Hyperspectral Image Classification. IEEE Geosci. Remote Sens. Lett. 2022, 19, 5510405. [Google Scholar] [CrossRef]
Figure 1. The overall architecture of MLFS-NAS.
Figure 2. The accurate and robust search space.
Figure 3. The convolution operators of the internal search space.
Figure 4. Adaptive Fine-Grained Channel Attention Depthwise Separable Convolution structure.
Figure 5. Mixed Local Channel Attention Depthwise Separable Convolution architecture.
Figure 6. Schematic diagram of GAP, LAP, and UNAP.
Figure 7. The classification results of the PC dataset. (a) Ground truth; (b) SSRN; (c) A2S2K-ResNet; (d) DSGSF; (e) LMSS-NAS; (f) DCFSL; (g) HFSL; and (h) MLFS-NAS.
Figure 8. The classification results of the IN dataset. (a) Ground truth; (b) SSRN; (c) A2S2K-ResNet; (d) DSGSF; (e) LMSS-NAS; (f) DCFSL; (g) HFSL; and (h) MLFS-NAS.
Figure 9. The classification results of the LK dataset. (a) Ground truth; (b) SSRN; (c) A2S2K-ResNet; (d) DSGSF; (e) LMSS-NAS; (f) DCFSL; (g) HFSL; and (h) MLFS-NAS.
Figure 10. Classification results of different methods for the PC dataset.
Figure 11. Classification results of different methods for the IN dataset.
Figure 12. Classification results of different methods for the LK dataset.
Figure 13. Optimal cell structures on the source datasets. (a) Accurate cell; (b) robust cell.
Figure 14. Classification accuracy of MLFS-NAS with different numbers of labeled samples on the target-domain datasets. (a) PC; (b) IN; and (c) LK.
Table 1. Description of PC dataset information.

| No. | Class | Sample Numbers |
| --- | --- | --- |
| 1 | Water | 65,971 |
| 2 | Trees | 7598 |
| 3 | Asphalt | 3090 |
| 4 | Self-Blocking Bricks | 2685 |
| 5 | Bitumen | 6584 |
| 6 | Tiles | 9248 |
| 7 | Shadows | 7287 |
| 8 | Meadows | 42,826 |
| 9 | Bare Soil | 2863 |
| | Total | 148,152 |
Table 2. Description of IN dataset information.

| No. | Class Name | Sample Numbers |
| --- | --- | --- |
| 1 | Alfalfa | 46 |
| 2 | Corn-notill | 1428 |
| 3 | Corn-mintill | 830 |
| 4 | Corn | 237 |
| 5 | Grass-pasture | 483 |
| 6 | Grass-trees | 730 |
| 7 | Grass-pasture-mowed | 28 |
| 8 | Hay-windrowed | 478 |
| 9 | Oats | 20 |
| 10 | Soybean-notill | 972 |
| 11 | Soybean-mintill | 2455 |
| 12 | Soybean-clean | 593 |
| 13 | Wheat | 205 |
| 14 | Woods | 1265 |
| 15 | Buildings-grass-trees-drives | 386 |
| 16 | Stone-steel-towers | 93 |
| | Total | 10,249 |
Table 3. Description of LK dataset information.

| No. | Class | Sample Numbers |
| --- | --- | --- |
| 1 | Corn | 34,511 |
| 2 | Cotton | 8374 |
| 3 | Sesame | 3031 |
| 4 | Broad-leaf soybean | 63,212 |
| 5 | Narrow-leaf soybean | 4151 |
| 6 | Rice | 11,854 |
| 7 | Water | 67,056 |
| 8 | Roads and houses | 7124 |
| 9 | Mixed weed | 5229 |
| | Total | 204,542 |
Table 4. Classification results of PC dataset.

| Class | SSRN | A2S2K-ResNet | DSGSF | LMSS-NAS | DCFSL | HFSL | MLFS-NAS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 99.14 ± 0.20 | 99.99 ± 0.01 | 99.98 ± 0.02 | 99.95 ± 0.04 | 99.50 ± 0.12 | 99.78 ± 0.02 | 99.65 ± 0.11 |
| 2 | 84.13 ± 4.67 | 88.48 ± 11.20 | 75.05 ± 6.39 | 96.87 ± 1.86 | 92.70 ± 4.31 | 95.56 ± 5.10 | 93.46 ± 1.58 |
| 3 | 66.49 ± 7.85 | 57.80 ± 18.19 | 97.56 ± 2.01 | 83.84 ± 5.94 | 84.73 ± 3.56 | 90.26 ± 2.71 | 93.67 ± 2.63 |
| 4 | 61.27 ± 12.26 | 62.32 ± 6.68 | 51.49 ± 14.78 | 52.91 ± 15.23 | 99.51 ± 0.21 | 88.20 ± 0.84 | 93.66 ± 3.56 |
| 5 | 81.32 ± 8.73 | 90.04 ± 7.34 | 99.97 ± 4.85 | 91.65 ± 5.35 | 86.23 ± 3.83 | 89.85 ± 4.10 | 94.86 ± 2.68 |
| 6 | 84.24 ± 4.82 | 73.76 ± 6.72 | 93.29 ± 3.41 | 91.55 ± 0.77 | 94.07 ± 1.25 | 71.71 ± 0.45 | 97.98 ± 1.52 |
| 7 | 91.27 ± 1.70 | 96.68 ± 2.66 | 99.96 ± 1.49 | 98.71 ± 0.91 | 84.45 ± 3.64 | 99.86 ± 4.55 | 90.37 ± 2.89 |
| 8 | 94.26 ± 5.31 | 97.63 ± 2.62 | 98.67 ± 0.99 | 99.93 ± 0.05 | 98.75 ± 0.22 | 99.80 ± 0.91 | 99.52 ± 0.14 |
| 9 | 93.71 ± 0.18 | 99.98 ± 0.01 | 100.00 ± 0.00 | 99.25 ± 0.77 | 95.98 ± 4.73 | 92.51 ± 3.04 | 96.58 ± 1.46 |
| OA (%) | 91.64 ± 1.75 | 93.39 | 95.77 ± 0.56 | 96.85 ± 1.25 | 96.89 ± 0.30 | 96.24 ± 0.95 | 98.57 ± 0.53 |
| AA (%) | 83.98 ± 0.93 | 85.21 ± 0.04 | 90.67 ± 2.08 | 90.52 ± 2.44 | 92.88 ± 2.56 | 91.95 ± 1.11 | 95.85 ± 0.47 |
| K × 100 | 89.94 ± 2.55 | 90.70 ± 0.03 | 94.00 ± 0.80 | 95.55 ± 1.76 | 95.60 ± 0.09 | 94.68 ± 0.76 | 97.89 ± 0.66 |
Table 5. Classification results of IN dataset.

| Class | SSRN | A2S2K-ResNet | DSGSF | LMSS-NAS | DCFSL | HFSL | MLFS-NAS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 29.93 ± 23.74 | 27.28 ± 18.94 | 91.66 ± 2.09 | 26.39 ± 9.81 | 98.80 ± 1.69 | 98.78 ± 1.22 | 96.67 ± 2.57 |
| 2 | 52.13 ± 14.64 | 51.99 ± 4.89 | 47.63 ± 3.80 | 59.97 ± 18.25 | 38.15 ± 6.16 | 56.71 ± 4.64 | 61.73 ± 1.33 |
| 3 | 24.07 ± 3.28 | 42.27 ± 1.02 | 41.86 ± 2.76 | 46.78 ± 24.84 | 51.63 ± 3.94 | 50.61 ± 4.30 | 75.78 ± 1.81 |
| 4 | 29.30 ± 12.91 | 30.40 ± 10.22 | 16.01 ± 16.77 | 40.86 ± 3.87 | 72.70 ± 13.01 | 79.74 ± 18.97 | 82.93 ± 9.21 |
| 5 | 76.71 ± 12.49 | 77.49 ± 13.48 | 54.72 ± 14.19 | 79.60 ± 12.79 | 61.93 ± 13.01 | 63.39 ± 10.25 | 91.10 ± 0.92 |
| 6 | 89.19 ± 7.26 | 89.66 ± 1.85 | 82.06 ± 3.61 | 85.75 ± 10.73 | 94.14 ± 0.69 | 78.41 ± 11.10 | 87.66 ± 1.71 |
| 7 | 35.31 ± 26.51 | 25.44 ± 4.12 | 40.00 ± 22.74 | 32.23 ± 19.72 | 100.00 ± 0.00 | 97.83 ± 2.17 | 98.92 ± 1.25 |
| 8 | 95.02 ± 6.90 | 100.00 ± 0.00 | 99.32 ± 0.45 | 99.11 ± 0.86 | 97.14 ± 3.73 | 99.68 ± 0.11 | 96.57 ± 2.01 |
| 9 | 12.28 ± 5.63 | 33.60 ± 2.42 | 11.59 ± 1.49 | 18.50 ± 15.11 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 |
| 10 | 63.05 ± 7.61 | 51.89 ± 11.39 | 45.04 ± 11.45 | 70.90 ± 18.40 | 59.82 ± 0.79 | 67.99 ± 3.36 | 75.32 ± 2.20 |
| 11 | 61.22 ± 5.14 | 67.19 ± 2.78 | 73.21 ± 3.88 | 56.15 ± 8.85 | 33.31 ± 3.86 | 64.65 ± 1.71 | 71.52 ± 1.07 |
| 12 | 28.32 ± 4.98 | 41.61 ± 2.06 | 42.85 ± 5.74 | 46.20 ± 25.19 | 41.52 ± 6.70 | 55.78 ± 23.98 | 71.44 ± 4.27 |
| 13 | 94.99 ± 4.15 | 84.42 ± 0.56 | 80.61 ± 1.22 | 91.56 ± 7.69 | 99.00 ± 0.71 | 98.75 ± 0.25 | 98.77 ± 1.77 |
| 14 | 91.43 ± 4.60 | 96.97 ± 4.77 | 93.06 ± 1.95 | 96.30 ± 2.31 | 92.10 ± 5.33 | 80.99 ± 9.48 | 91.48 ± 0.87 |
| 15 | 58.55 ± 25.51 | 59.61 ± 6.78 | 55.50 ± 15.89 | 64.14 ± 12.83 | 79.27 ± 8.58 | 99.21 ± 0.52 | 95.67 ± 3.01 |
| 16 | 73.47 ± 29.26 | 72.21 ± 4.91 | 89.77 ± 8.01 | 83.60 ± 8.13 | 98.30 ± 2.41 | 94.32 ± 5.68 | 96.55 ± 2.77 |
| OA (%) | 53.70 ± 4.84 | 57.28 ± 0.91 | 59.17 ± 2.23 | 60.96 ± 4.27 | 66.55 ± 1.81 | 69.61 ± 1.71 | 78.39 ± 0.78 |
| AA (%) | 57.18 ± 5.35 | 59.49 ± 3.39 | 60.31 ± 2.69 | 62.38 ± 2.63 | 78.53 ± 1.09 | 80.43 ± 2.26 | 84.35 ± 3.19 |
| K × 100 | 48.66 ± 5.22 | 52.70 ± 0.94 | 54.30 ± 2.18 | 55.23 ± 4.97 | 61.99 ± 1.49 | 65.83 ± 1.30 | 76.91 ± 0.90 |
Table 6. Classification results of LK dataset.

| Class | SSRN | A2S2K-ResNet | DSGSF | LMSS-NAS | DCFSL | HFSL | MLFS-NAS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 93.97 ± 4.58 | 92.07 ± 0.73 | 99.12 ± 0.01 | 98.17 ± 0.53 | 99.35 ± 0.37 | 94.39 ± 4.03 | 98.45 ± 0.53 |
| 2 | 54.57 ± 15.75 | 76.51 ± 19.53 | 94.93 ± 0.42 | 95.23 ± 8.08 | 93.71 ± 4.22 | 99.00 ± 0.39 | 97.58 ± 0.30 |
| 3 | 59.98 ± 11.32 | 89.33 ± 3.31 | 65.59 ± 0.69 | 74.45 ± 22.62 | 93.75 ± 6.20 | 81.18 ± 6.03 | 98.32 ± 0.08 |
| 4 | 91.85 ± 6.70 | 74.08 ± 2.13 | 94.86 ± 0.21 | 98.40 ± 0.91 | 82.06 ± 5.31 | 97.04 ± 1.25 | 98.53 ± 0.23 |
| 5 | 43.85 ± 6.01 | 84.16 ± 0.86 | 76.99 ± 3.43 | 77.33 ± 12.01 | 98.29 ± 1.17 | 91.41 ± 3.71 | 98.42 ± 0.85 |
| 6 | 85.98 ± 12.90 | 97.38 ± 2.30 | 98.49 ± 1.21 | 97.39 ± 4.19 | 87.85 ± 3.62 | 96.40 ± 1.96 | 98.31 ± 0.88 |
| 7 | 98.65 ± 0.88 | 99.48 ± 0.39 | 99.34 ± 0.30 | 99.31 ± 0.65 | 99.94 ± 0.31 | 99.58 ± 0.37 | 98.31 ± 1.22 |
| 8 | 87.20 ± 15.57 | 93.64 ± 2.53 | 94.73 ± 1.17 | 68.24 ± 13.74 | 88.96 ± 3.86 | 96.38 ± 0.86 | 97.64 ± 0.36 |
| 9 | 46.62 ± 15.21 | 90.63 ± 1.46 | 74.13 ± 2.42 | 59.23 ± 4.65 | 92.97 ± 1.49 | 87.08 ± 0.75 | 98.55 ± 0.48 |
| OA (%) | 87.48 ± 1.31 | 88.42 ± 0.49 | 94.60 ± 0.14 | 94.99 ± 1.36 | 92.67 ± 1.14 | 96.84 ± 0.97 | 98.74 ± 0.46 |
| AA (%) | 73.63 ± 1.24 | 73.25 ± 1.23 | 88.69 ± 0.47 | 86.84 ± 2.19 | 92.99 ± 0.55 | 93.61 ± 0.45 | 98.93 ± 0.11 |
| K × 100 | 83.64 ± 1.76 | 85.22 ± 0.61 | 92.88 ± 0.23 | 93.34 ± 1.83 | 90.54 ± 1.35 | 95.86 ± 1.27 | 98.61 ± 0.35 |
Table 7. The ablation experiment of the source domain.

| Source Datasets | Target Datasets | OA (%) | AA (%) | K × 100 |
| --- | --- | --- | --- | --- |
| MI | PC | 97.49 ± 0.98 | 94.62 ± 0.46 | 96.41 ± 0.76 |
| MI | IN | 73.77 ± 0.89 | 76.94 ± 1.37 | 71.35 ± 1.22 |
| MI | LK | 96.50 ± 1.69 | 96.63 ± 0.91 | 96.39 ± 1.13 |
| CK | PC | 97.36 ± 0.51 | 94.56 ± 1.07 | 96.38 ± 0.84 |
| CK | IN | 77.42 ± 1.02 | 79.81 ± 1.14 | 75.29 ± 1.27 |
| CK | LK | 96.47 ± 0.70 | 96.53 ± 0.39 | 96.32 ± 0.51 |
| MI&CK | PC | 98.57 ± 0.53 | 95.85 ± 0.47 | 97.89 ± 0.66 |
| MI&CK | IN | 79.85 ± 0.78 | 80.06 ± 3.19 | 76.91 ± 0.90 |
| MI&CK | LK | 98.74 ± 0.46 | 98.93 ± 0.11 | 98.61 ± 0.35 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
