Article

Small Sample Palmprint Recognition Based on Image Augmentation and Dynamic Model-Agnostic Meta-Learning

1 Xiangjiang Laboratory, Changsha 410205, China
2 School of Intelligent Engineering and Intelligent Manufacturing, Hunan University of Technology and Business, Changsha 410205, China
* Authors to whom correspondence should be addressed.
Electronics 2025, 14(16), 3236; https://doi.org/10.3390/electronics14163236
Submission received: 24 June 2025 / Revised: 8 August 2025 / Accepted: 13 August 2025 / Published: 14 August 2025
(This article belongs to the Section Artificial Intelligence)

Abstract

Palmprint recognition is increasingly used in security authentication, mobile payment, and crime detection. To address the small sample sizes and low recognition rates typical of palmprint data, a small-sample palmprint recognition method based on image expansion and Dynamic Model-Agnostic Meta-Learning (DMAML) is proposed. For data augmentation, a multi-connected conditional generative network is designed to synthesize palmprints; the network is trained with a gradient-penalized hybrid loss function and a two time-scale update rule so that training converges stably, and the trained network is then used to generate an expanded palmprint dataset. On this basis, a palmprint feature extraction network that exploits frequency-domain information and residual Inception structures is designed to extract palmprint features. A DMAML training method for this network is investigated: it builds a multi-step loss list from the query-set losses in the inner loop and dynamically adjusts the outer-loop learning rate by combining gradient warm-up with a cosine annealing strategy. The experimental results show that the proposed palmprint dataset expansion method effectively improves the training efficiency of the palmprint recognition model. Evaluated on the IIT-D dataset in an N-way K-shot setting, the proposed method achieves an accuracy of 94.62% ± 0.06% on the 5-way 1-shot task and 87.52% ± 0.29% on the 10-way 1-shot task, significantly outperforming ProtoNets (90.57% ± 0.65% and 81.15% ± 0.50%, respectively): a 4.05% improvement under the 5-way 1-shot condition and a 6.37% improvement under the 10-way 1-shot condition, demonstrating the effectiveness of our method.

1. Introduction

In recent years, palmprint recognition has become an important biometric technique, and with the development of the technology, deep learning methods [1], which require large numbers of training samples, have been widely applied to palmprint recognition [2]. However, existing publicly available palmprint datasets are usually small, typically containing only 5–20 samples [3], and collecting large-scale labeled data is very challenging due to confidentiality issues [4]. Training deep learning models on a small amount of labeled data easily leads to overfitting, which limits their performance in practical applications [5]. How to achieve high-accuracy palmprint recognition with limited labeled data has therefore become a hot research issue. At present, there are two main ways to improve the accuracy of palmprint recognition: expanding the size of the dataset and applying small-sample learning methods.
Data augmentation techniques are commonly used to address data scarcity and to alleviate model overfitting. Existing techniques fall into traditional data augmentation methods and deep learning-based image augmentation methods. Traditional methods increase the diversity of the training data by rotating, scaling, translating, cropping, and flipping existing images [6,7]. However, they can only apply limited transformations to the original data and cannot generate completely new images. With the development and wide application of deep learning, some scholars have applied deep learning methods to data expansion; according to how new samples are produced, these can be classified into adversarial training-based and generative model-based methods. Adversarial training [8] generates new training samples by applying small perturbations to the training data, while a generative adversarial network (GAN) [9] learns the distribution of the input data through a trained generative model and produces new images similar to the originals, thereby increasing the number and diversity of training samples.
The goal of small-sample learning is for a model to acquire a certain generalization ability from a small amount of labeled data and to gain recognition ability for a new task through a small amount of fast fine-tuning. Existing small-sample learning techniques mainly include transfer learning and meta-learning. Transfer learning reuses the first few layers of a network pre-trained on other classification tasks and then fine-tunes the whole network on the target dataset, so that the model adapts to the small-sample dataset after acquiring sufficient prior knowledge. However, in real scenarios there are differences in quantity and distribution between the target data and the samples used for model training [10], and the model parameters may not be tuned to the most appropriate state. Model-Agnostic Meta-Learning (MAML) [11] is a representative meta-learning algorithm that optimizes the initial parameters of the model during multi-task training, enabling the model to adapt quickly after receiving a small amount of data from a new task. The core idea of MAML is to learn, through meta-training, generalized initial parameters so that the model can adapt to new tasks after only a few gradient updates, thus alleviating the data scarcity problem in few-sample learning. In palmprint recognition, MAML can effectively utilize a small amount of labeled data to improve the generalization ability of the recognition model. However, palmprint images often contain complex background information, which may affect recognition performance. To effectively improve palmprint recognition accuracy under small-sample conditions, this paper studies the problem from the perspectives of data expansion and small-sample learning simultaneously. For data expansion, a palmprint expansion method based on a Multi-Connection Conditional Generative Adversarial Network (MCCGAN) is proposed. For small-sample learning, Model-Agnostic Meta-Learning is introduced into palmprint recognition, a Frequency-Domain Inception-RSE Palmprint Network (FDIR Net) that considers frequency-domain information and residual inspiration is designed, and a Dynamic Model-Agnostic Meta-Learning (DMAML) training method is developed. The contributions of this paper are as follows:
  • To address the small sample size of palmprint data, MCCGAN is designed to expand the palmprint dataset.
  • To address the low recognition rate of small-sample palmprint images, the MAML method is introduced and improved to enhance the fast adaptive learning capability of the feature extraction network on small-sample tasks.

2. Methods

2.1. MCCGAN-Based Palmprint Image Generation Model

To address the difficulties of small-sample palmprint recognition, this paper proposes a palmprint image generation model based on MCCGAN, which consists of a multi-connected palmprint generator (MC-G) and a multi-connected palmprint discriminator (MC-D). The generator encodes and decodes images from the palmprint training set together with random noise to generate new palmprint images of the same class. The discriminator judges combinations of real and generated data, and after adversarial training the generator is used to expand the small-sample palmprint dataset.

2.1.1. The Structure of MCCGAN

Palmprints contain complex, multilevel texture information; for this reason, this paper proposes a palmprint expansion method based on MCCGAN. The structure of MCCGAN is shown in Figure 1.
MCCGAN consists of two parts: a multi-connected palmprint generator (MC-G) and a multi-connected palmprint discriminator (MC-D). For palmprints of the same category $x_i\ (i = 1, 2, \ldots, n)$, MC-G receives an original palmprint image $x_1$ and random noise $z$ and outputs a generated palmprint image $x_1^g$. MC-D receives the generated sample pair $x_g$, consisting of one real and one generated palmprint, and the real sample pair $x_r$, consisting of two real palmprints, and outputs the probability that each pair is real, which guides the training of MC-G.
The multi-connected palmprint generator is a U-Net [12] improved with dense connections (DC) [13] and skip connections (SC) [14]; its structure is shown in Figure 2.
MC-G mainly consists of a feature encoder, a noise encoder, a decoder, and an output module, and each component is described as follows:
(1)
The feature encoder extracts features from the palmprint ($x_1$) and produces feature maps at different scales. Feature encoder 0 contains one 3 × 3 convolutional layer with stride 2, which performs preliminary feature encoding of the input palmprint image. Feature encoders 1–3 first contain one 3 × 3 convolutional layer with stride 2 and one 3 × 3 convolutional layer with stride 1; through dense concatenation of the input and output features of the previous level, they achieve feature reuse and strengthen gradient flow across the hierarchy. Each of these encoders then applies three 3 × 3 convolutions with stride 1 to complete feature extraction at the current level.
(2)
The noise encoder processes the input noise (z) with one linear layer, reshaping the random noise to match the size of the decoder input.
(3)
To complement the feature encoder, the decoder integrates multiple input sources and employs a series of upsampling operations to progressively restore spatial details from the encoded features. In addition to receiving features from the previous layer and noise, the decoder incorporates outputs from the U-Net encoder via full-size and regular skip connections, enabling effective gradient flow between the encoder and decoder to enhance training stability. Each decoder layer consists of an upsampling block, which includes a 3 × 3 transpose convolution with a stride of 2 to double the feature map resolution, followed by a 3 × 3 convolution with a stride of 1 to refine features, a ReLU activation for non-linearity, and batch normalization to stabilize training. This upsampling block is repeated three times, with each iteration processing the feature maps to produce higher-resolution outputs. The resulting decoder output aligns with the spatial dimensions of the corresponding encoder layer, facilitating tasks such as image reconstruction or semantic segmentation.
(4)
The output module processes the decoded features to produce the target image ($x_1^g$); it consists of three stride-1 3 × 3 convolution + ReLU activation + batch normalization stages followed by a hyperbolic tangent (Tanh) output function.
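To make the generator description concrete, the following PyTorch sketch shows one possible implementation of a decoder upsampling block and the output module described above. The class names, channel arguments, the placement of the skip-connection concatenation, and the final single-channel (grayscale) projection are our own assumptions for illustration, not the authors' released code.

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """One decoder upsampling block: a 3x3 transpose conv with stride 2 doubles
    the feature-map resolution, a 3x3 stride-1 conv refines the features,
    followed by ReLU and batch normalization (order as described in the text)."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=3, stride=2,
                                     padding=1, output_padding=1)
        self.refine = nn.Sequential(
            nn.Conv2d(out_ch + skip_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x, skip):
        x = self.up(x)                        # double the spatial resolution
        x = torch.cat([x, skip], dim=1)       # skip connection from the matching encoder level
        return self.refine(x)

class OutputModule(nn.Module):
    """Output module: three stride-1 3x3 conv + ReLU + BN stages, then a Tanh.
    The final projection to 1 channel (grayscale palmprint) is an assumption."""
    def __init__(self, ch):
        super().__init__()
        layers = []
        for _ in range(3):
            layers += [nn.Conv2d(ch, ch, kernel_size=3, stride=1, padding=1),
                       nn.ReLU(inplace=True), nn.BatchNorm2d(ch)]
        layers += [nn.Conv2d(ch, 1, kernel_size=3, stride=1, padding=1), nn.Tanh()]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)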
The multi-connected palmprint discriminator consists of DenseNet, which introduces the Pyramid Split Attention (PSA) [15] mechanism, and its structure is schematically shown in Figure 3, which mainly consists of the Pyramid Split Attention DenseBlock (PSADenseBlock), the transition block, and the output processing module.
(1)
The structure of PSADenseBlock is shown in Figure 4, where Figure 4a shows the overall structure of the Pyramid Split Attention DenseBlock. It corresponds to the feature levels extracted by MC-G and constructs a 4-layer densely connected network whose lth layer receives the concatenated input $x_l = \mathrm{concat}[x_0, x_1, \ldots, x_{l-1}]$. The nonlinear transformation [16] function h(·) in DenseNet extracts features through a composite operation consisting of batch normalization, ReLU activation, and a 3 × 3 convolution. To enhance the network's ability to distinguish real from fake palmprint features, we integrate a Pyramid Split Attention (PSA) mechanism into this transformation, as illustrated in Figure 4b (a PyTorch sketch of this weighting step is given after this list). The PSA module enhances channel-wise feature weighting by capturing multi-scale information in three steps. First, the input feature map is divided into S channel groups, and each group is convolved with a kernel of size $k_i = 3 + 2 \times (i - 1)$, where i is the group index, so that the network extracts features at progressively larger scales. Second, the feature maps from all groups are concatenated into a unified representation, which is processed by a Squeeze-and-Excitation (SE) block to compute channel attention weights, emphasizing the most informative channels across scales. Finally, the attention weights of the S groups are normalized with a SoftMax function, as described in Equation (1), to produce a weighted combination of multi-scale features.
$$att_l^i = \frac{\exp(z_l^i)}{\sum_{i=0}^{S-1} \exp(z_l^i)} \qquad (1)$$
The weight $att_l^i$ is multiplied element-wise with the corresponding feature map, $f_l^i \odot att_l^i$, to obtain the weighted feature map.
(2)
The input of each PSADenseBlock contains the feature maps of all previous layers stacked along the channel dimension, and the feature size is consistent within the module. To connect PSADenseBlocks and control model complexity, a Transition module is inserted in the network to reduce the size of the feature maps. The Transition module connects two neighboring PSADenseBlocks; its features undergo the operations "batch normalization + ReLU activation + 1 × 1 convolution + 2 × 2 average pooling", which compresses the $h \times w$ feature map from the previous layer to $\frac{h}{2} \times \frac{w}{2}$.
(3)
The output processing consists of an average pooling layer, a linear layer, a LeakyReLU activation layer, and a second linear layer; after output processing, the judgment of whether the image is real or generated is produced.
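The following is a minimal PyTorch sketch of the PSA weighting step described in item (1): the input is split into S channel groups, each group is convolved with kernel size $k_i = 3 + 2 \times (i-1)$, an SE block produces per-group channel weights, and a softmax across groups (Equation (1)) re-weights the multi-scale features. The group count, the SE reduction ratio, and sharing one SE block across groups are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PSA(nn.Module):
    """Sketch of the Pyramid Split Attention step used inside PSADenseBlock."""
    def __init__(self, channels, groups=4, reduction=4):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        gc = channels // groups
        # One multi-scale conv per group: kernel sizes 3, 5, 7, 9, ...
        self.convs = nn.ModuleList([
            nn.Conv2d(gc, gc, kernel_size=3 + 2 * i, padding=(3 + 2 * i) // 2)
            for i in range(groups)
        ])
        # Squeeze-and-Excitation block producing per-group channel weights.
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(gc, max(gc // reduction, 1), 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(max(gc // reduction, 1), gc, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        splits = torch.chunk(x, self.groups, dim=1)                 # S channel groups
        feats = [conv(s) for conv, s in zip(self.convs, splits)]    # multi-scale features
        feats = torch.stack(feats, dim=1)                           # (b, S, c/S, h, w)
        att = torch.stack([self.se(f) for f in torch.unbind(feats, dim=1)], dim=1)
        att = F.softmax(att, dim=1)                                 # Equation (1): softmax over the S groups
        out = (feats * att).reshape(b, c, h, w)                     # weighted multi-scale features
        return out
```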

2.1.2. MCCGAN Network Training Method

Since the loss function of CGAN is mainly based on the Jensen-Shannon divergence to measure the difference between the real and generated data distributions, the logarithmic operation may cause the gradient to vanish when there is a large performance gap between the generator and the discriminator, making training difficult. To improve the stability of MCCGAN training, this paper introduces the gradient penalty method of WGAN-GP [17] into MCCGAN training, which constrains the gradient norm of the network; the improved loss function is given in Equation (2).
$$L_{\mathrm{MCCGAN}} = \mathbb{E}_{x_r \sim p_{data}(x_r)}\big[D(x_r)\big] - \mathbb{E}_{x_g \sim p_z(x_g)}\big[D(x_g)\big] + \lambda\, \mathbb{E}_{\hat{x} \sim p(\hat{x})}\Big[\big(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1\big)^2\Big] \qquad (2)$$
where $x_r$ denotes a real palmprint data pair containing two palmprint images with the same label and $p_{data}(x_r)$ denotes its data distribution; $x_g$ denotes a generated palmprint data pair consisting of a real palmprint and an MC-G output palmprint of the same category, and $p_z(x_g)$ denotes its data distribution; the mixed sample $\hat{x}$ is constructed by linear interpolation between $x_r$ and $x_g$, as shown in Equation (3), with $p(\hat{x})$ denoting its distribution; and $D(\cdot)$ denotes the output of MC-D.
$$\hat{x} = \mu x_r + (1 - \mu)\, x_g, \quad \mu \sim U(0, 1) \qquad (3)$$
Here $\mu$ is a random weight uniformly sampled from the interval [0, 1], so that $\hat{x}$ lies on the line connecting the real sample and the generated sample. The gradient penalty term $\lambda\, \mathbb{E}_{\hat{x} \sim p(\hat{x})}[(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1)^2]$ uses the L2 norm of the MC-D gradient to keep the discriminator uniformly smooth over the whole space, and $\lambda$ denotes the gradient penalty coefficient, usually taken as $\lambda = 10$.
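As an illustration, the gradient penalty term of Equations (2) and (3) can be computed in PyTorch as sketched below; here critic stands for MC-D applied to a (real or interpolated) sample pair, and the function name and batching details are our own.

```python
import torch

def gradient_penalty(critic, x_r, x_g, lambda_gp=10.0):
    """Gradient penalty of Equation (2): interpolate real and generated pairs
    (Equation (3)), then penalize deviations of the critic's gradient norm from 1."""
    mu = torch.rand(x_r.size(0), 1, 1, 1, device=x_r.device)   # mu ~ U(0, 1), one per sample
    x_hat = mu * x_r + (1.0 - mu) * x_g                        # mixed sample x_hat
    x_hat.requires_grad_(True)
    d_hat = critic(x_hat)
    grads = torch.autograd.grad(outputs=d_hat.sum(), inputs=x_hat,
                                create_graph=True)[0]
    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)      # L2 norm per sample
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()
```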
Traditional GANs update the generator and discriminator simultaneously during training. However, in the initial iterations the palmprint images generated by MC-G are of poor quality, and MC-D easily learns to separate them from real samples, so the gradient information received by MC-G becomes very small or nearly zero, leading to the vanishing gradient phenomenon. To avoid this issue, the Two Time-Scale Update Rule (TTUR) is applied in MCCGAN to balance the training of the generator and the discriminator: separate Adam optimizers are established for MC-G and MC-D. The learning rate of MC-G is set to 0.0001; forward propagation is performed on 95% of the data, the loss is calculated, the gradients of the model parameters are computed, and the generator parameters are updated according to the gradients and the optimizer's rules. The learning rate of MC-D is set to 0.0004; for every batch, forward propagation is performed, the loss is calculated, the gradients are computed, and the parameters of MC-D are updated accordingly. The settings are summarized in Table 1.
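A minimal sketch of this TTUR setup is given below; the helper loss functions, the noise handling, and the interpretation of the "95% of the data" schedule as skipping the generator update on roughly 5% of batches are assumptions for illustration, not the authors' implementation.

```python
import torch

def train_ttur(mc_g, mc_d, loader, compute_d_loss, compute_g_loss):
    """TTUR sketch: separate Adam optimizers with a smaller learning rate for
    MC-G (0.0001) and a larger one for MC-D (0.0004). compute_d_loss and
    compute_g_loss are hypothetical helpers evaluating Equation (2) and the
    generator objective on a batch of same-class palmprint pairs."""
    opt_g = torch.optim.Adam(mc_g.parameters(), lr=1e-4)
    opt_d = torch.optim.Adam(mc_d.parameters(), lr=4e-4)
    for step, batch in enumerate(loader):
        # Discriminator update on every batch.
        opt_d.zero_grad()
        d_loss = compute_d_loss(mc_d, mc_g, batch)   # Eq. (2), incl. gradient penalty
        d_loss.backward()
        opt_d.step()
        # Generator update skipped on ~5% of batches (one reading of Table 1).
        if step % 20 != 0:
            opt_g.zero_grad()
            g_loss = compute_g_loss(mc_d, mc_g, batch)
            g_loss.backward()
            opt_g.step()
```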

2.2. A Small Sample Palmprint Feature Learning Method Based on FDIR Network

2.2.1. Multi-Branch Parallel Feature Extraction Module

In neural networks, the learning goal of each layer is to learn a mapping of the input data [18], and as the number of layers increases, the network becomes harder to train. He et al. [19] proposed the Residual Network (ResNet), which guides the network to learn the difference between the input data and the desired output and mitigates problems such as vanishing or exploding gradients. In this paper, the residual connection is applied to the Inception module [20]; this allows the network to pass the original input features through directly, retaining more low-level feature information, aiding the transfer and fusion of multi-level features, and enhancing the network's ability to perceive features at different scales and with different semantics.
The parallel multi-branch residual feature extraction module designed in this paper is shown in Figure 5. For the feature map from the previous layer, the following four convolutional branches are used for feature extraction: the first branch applies a 1 × 1 convolution with stride 1 and outputs the $b_1$-channel feature data $F_{c1}(X^w_{:,b_1,:,:}, W_1)$; the second branch average-pools the previous layer's output and then applies a 1 × 1 convolution, outputting the feature data $F_{c2}(X^w_{:,b_2,:,:}, W_2)$; the third branch applies a 1 × 1 convolution with stride 1 followed by a 5 × 5 convolution with stride 1 and padding 2, outputting the $b_3$-channel feature data $F_{c3}(X^w_{:,b_3,:,:}, W_3)$; and the fourth branch applies a 1 × 1 convolution with stride 1 followed by a 3 × 3 convolution with stride 1 and padding 1, outputting the $b_4$-channel feature data $F_{c4}(X^w_{:,b_4,:,:}, W_4)$. The outputs of the four branches are concatenated by channel, as in Equation (4), to give the feature data $F_c$.
$$F_c = F_{c1}(X^w_{:,b_1,:,:}, W_1) \oplus F_{c2}(X^w_{:,b_2,:,:}, W_2) \oplus F_{c3}(X^w_{:,b_3,:,:}, W_3) \oplus F_{c4}(X^w_{:,b_4,:,:}, W_4) \qquad (4)$$
where $\oplus$ denotes concatenation along the channel dimension.
The residual connection branch aligns $X^w_{:,c,h,w}$ with the channel dimension of the feature data $F_c$ through the projection matrix $W_{cs}$, as in Equation (5).
$$F_r = X^w_{:,c,h,w}\, W_{cs} \qquad (5)$$
Finally, $F_c$ and $F_r$ are concatenated to give the output of the current module.
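The following PyTorch sketch illustrates the parallel multi-branch residual module of Equations (4) and (5); the branch channel counts b1–b4 and the use of a 1 × 1 convolution as the projection $W_{cs}$ are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiBranchResidualBlock(nn.Module):
    """Four convolutional branches (1x1; avg-pool + 1x1; 1x1 + 5x5; 1x1 + 3x3)
    concatenated by channel (Equation (4)), plus a projected residual branch
    (Equation (5)) concatenated to the result."""
    def __init__(self, in_ch, b1, b2, b3, b4):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, b1, kernel_size=1, stride=1)
        self.branch2 = nn.Sequential(
            nn.AvgPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, b2, kernel_size=1, stride=1),
        )
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, b3, kernel_size=1, stride=1),
            nn.Conv2d(b3, b3, kernel_size=5, stride=1, padding=2),
        )
        self.branch4 = nn.Sequential(
            nn.Conv2d(in_ch, b4, kernel_size=1, stride=1),
            nn.Conv2d(b4, b4, kernel_size=3, stride=1, padding=1),
        )
        # Residual projection W_cs aligning the input channels with F_c.
        self.project = nn.Conv2d(in_ch, b1 + b2 + b3 + b4, kernel_size=1)

    def forward(self, x):
        f_c = torch.cat([self.branch1(x), self.branch2(x),
                         self.branch3(x), self.branch4(x)], dim=1)   # Equation (4)
        f_r = self.project(x)                                        # Equation (5)
        return torch.cat([f_c, f_r], dim=1)   # concatenate F_c and F_r as the module output
```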

2.2.2. Considering the Channel Attention Mechanism in the Frequency Domain

Appearance, texture, and color similarity between palmprint samples is high, and small samples exhibit only local subtle differences, which are reflected in variations of the frequency-domain information. The attention mechanism can improve the model's performance [21] by focusing on important features and suppressing irrelevant information. Given that attention mechanisms are widely used in computer vision, and in order to capture subtle feature changes and enhance feature expression, a channel attention mechanism that considers frequency-domain information is introduced into the model. Its operation is as follows:
Step 1. The feature map $X_{:,C,H,W}$ output by the previous layer is divided into $X^0, X^1, \ldots, X^{C-1}$ along the channel dimension, where $X^i \in \mathbb{R}^{C \times H \times W}$, $i \in \{0, 1, \ldots, C-1\}$, and a Fast Fourier Transform (FFT) is applied to each component to compute the frequency components $Freq^i$, as in Equation (6).
$$Freq^i(u, v) = \mathcal{F}_{u,v}(X^i) = \sum_{h=0}^{H-1} \sum_{w=0}^{W-1} X^i_{:,h,w}\, e^{-2\pi j \left(\frac{hu}{H} + \frac{wv}{W}\right)}, \quad i \in \{0, 1, \ldots, C-1\} \qquad (6)$$
where u and v are the 2D frequency coordinates corresponding to $X^i$, $h \in \{0, 1, \ldots, H-1\}$, $w \in \{0, 1, \ldots, W-1\}$, and $e^{-2\pi j(hu/H + wv/W)}$ is the FFT basis function applied to the image.
Step 2. The computed frequency-domain components $Freq^i(u, v)$ are combined, as in Equation (7).
$$Freq = \mathrm{cat}\big(Freq^i(u, v)\big), \quad i \in \{0, 1, \ldots, C-1\} \qquad (7)$$
where cat(·) denotes concatenation along the channel dimension.
Step 3. The correlation between feature channels is constructed by a fully connected layer and an activation function, outputting 1 × 1 × C feature weights, as in Equation (8).
$$m_{att} = \mathrm{sigmoid}\big(fc(Freq)\big) \qquad (8)$$
Step 4. The weights are normalized and applied by a channel-wise product, weighting the (1 × 1 × C) attention vector against the (H × W × C) feature map to output the weighted features $X^w_{:,c,h,w}$. The process is shown in Figure 6.
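A compact PyTorch sketch of Steps 1–4 is given below. Reducing the per-channel FFT output to its mean magnitude before the fully connected layer is our simplification; the paper keeps the full frequency components $Freq^i(u, v)$.

```python
import torch
import torch.nn as nn

class FrequencyChannelAttention(nn.Module):
    """Frequency-domain channel attention sketch: an FFT-based per-channel
    descriptor, a fully connected layer with sigmoid producing 1x1xC weights
    (Equation (8)), and channel-wise re-weighting of the input feature map."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Linear(channels, channels)

    def forward(self, x):                        # x: (B, C, H, W)
        freq = torch.fft.fft2(x)                 # Steps 1-2: 2D FFT per channel
        desc = freq.abs().mean(dim=(2, 3))       # (B, C) frequency descriptor (our simplification)
        m_att = torch.sigmoid(self.fc(desc))     # Step 3: channel weights, Equation (8)
        return x * m_att.view(x.size(0), -1, 1, 1)   # Step 4: channel-wise product
```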

2.2.3. DMAML-Based Training Method for Small-Sample Learning Networks

(1)
Weighted multi-step loss parameter update method
In the backpropagation of MAML, only the last update step of the task-specific feature extraction network contributes explicitly to the meta-learner weights [22], while the preceding multi-step updates are optimized only implicitly, which leads to unstable training [23]. To preserve the influence of the inner-loop query set on the meta-learner parameters, an annealed weighted summation of the query-set losses obtained after each support-set parameter update is performed. The parameter update process is described below together with the formulas.
Step 1. Randomly sample a task $T_i$ from the task set in the inner loop.
Step 2. Use the support set $T_i^S = \{(x_1^i, y_1^i), \ldots, (x_k^i, y_k^i)\}$ of the current task as the input to the small-sample palmprint feature extraction network to obtain the predicted probability $p_{ic}$ that observation sample i belongs to category c. Compute the cross-entropy loss $L_{T_i^S}$ according to Equation (9).
$$L_{T_i^S} = \frac{1}{N_i}\sum_i L_i = -\frac{1}{N_i}\sum_i \sum_{c=1}^{n} y_{ic} \log(p_{ic}) \qquad (9)$$
where $y_{ic}$ is the indicator function (1 if the true category of sample i equals c, and 0 otherwise), and n denotes the number of recognized categories.
Step 3. The inner-loop parameter update is performed with the cross-entropy loss $L_{T_i^S}$ according to Equation (10), giving the new task-specific model parameters $\theta_{i,m}$ for the current task.
$$\theta_{i,m} = \theta_{i,m-1} - \alpha \nabla_\theta L_{T_i^S}\big(f(\theta_{i,m-1})\big) \qquad (10)$$
where $\nabla_\theta L_{T_i^S}(f(\theta_{i,m-1}))$ denotes the loss gradient, $\alpha$ is the inner-loop learning rate, and the initial network weight parameters are $\theta_{i,0} = \varphi$.
Step 4. Input the query set $T_i^Q$ of the current task into the palmprint image recognition model $f_{\theta_{i,m}}$ and calculate the query-set cross-entropy loss $L_{T_i^Q}$.
Step 5. Calculate the weight w for the current query-set loss $L_{T_i^Q}$ according to Equation (11) and save $w \cdot L_{T_i^Q}$ to the multi-step loss list.
$$w = \max\left(\frac{1}{m+1} - r_m,\ \frac{0.03}{m+1}\right) \qquad (11)$$
where m is the current number of update steps and $r_m$ is the weight decay rate. The function $\frac{1}{m+1}$ is chosen as the basic annealing function because it decreases smoothly and is commonly used in optimization algorithms; introducing the decay rate $r_m = \frac{1}{(m+1) \cdot 10}$ adjusts the weight to $\frac{0.9}{m+1}$, accelerating the decay adaptation for small-sample tasks.
Step 6. Steps 2 to 5 are repeated, using the task to update the inner-loop parameters M times. Step 5 is a data preparation action of the inner-loop phase for the subsequent outer-loop parameter update, i.e., it generates the multi-step loss list.
Step 7. In the MAML outer loop, the multi-step loss list generated in Step 5 of the inner-loop phase is used to update the meta-learner weight parameters $\varphi$ with the RMSprop optimizer, as in Equation (12).
$$\varphi = \begin{cases} \varphi - \beta \nabla_{\theta_{i,m}} \sum_{T_i \sim p(T)} L_{T_i}\big(f(\theta_{i,m})\big), & i < \frac{N}{3} \\ \varphi - \beta \nabla_{\varphi} \sum_{T_i \sim p(T)} L_{T_i}\big(f(\theta_{i,m})\big), & i \geq \frac{N}{3} \end{cases} \qquad (12)$$
where $\varphi$ denotes the meta-learner weight parameters, $p(T)$ denotes the distribution of the task set, $L_{T_i}(f(\theta_{i,m}))$ is the query-set test loss under the parameters of a particular sub-task, $\nabla_{\theta_{i,m}} \sum_{T_i \sim p(T)} L_{T_i}(f(\theta_{i,m}))$ is the (second-order) derivative of the loss function with respect to the inner-loop parameters, $\nabla_{\varphi} \sum_{T_i \sim p(T)} L_{T_i}(f(\theta_{i,m}))$ is the (first-order) derivative of the loss function with respect to the outer-loop model parameters, and $\beta$ is the outer-loop learning rate. To balance model performance and computational efficiency, the second-order derivatives are used for the first $\frac{N}{3}$ tasks, and the first-order approximation is used for the remaining tasks.
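The weighted multi-step loss of Equations (9)-(12) can be sketched as follows; the helper name multistep_weight and the commented training-loop fragment are illustrative, not the authors' code.

```python
def multistep_weight(m):
    """Annealing weight of Equation (11) for the query loss after the m-th
    inner-loop update: w = max(1/(m+1) - r_m, 0.03/(m+1)),
    with decay rate r_m = 1/((m+1) * 10), i.e. effectively 0.9/(m+1)."""
    r_m = 1.0 / ((m + 1) * 10)
    return max(1.0 / (m + 1) - r_m, 0.03 / (m + 1))

# Inside the inner loop the weighted query losses are accumulated, e.g.:
# loss_list = []
# for m in range(1, M + 1):
#     ...support-set update of theta (Equation (10))...
#     q_loss = cross_entropy(model(query_x, theta), query_y)
#     loss_list.append(multistep_weight(m) * q_loss)
# meta_loss = sum(loss_list)   # drives the outer-loop update (Equation (12))
```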
(2)
Learning Rate Dynamic Adjustment Strategy
The outer-loop learning rate of MAML determines the update speed and stability of the meta-learner parameters. With a small learning rate, the model adapts slowly to changes in the training data, convergence is slow, and more iterations are required to reach a reasonable performance level; with a large learning rate, the model is prone to overfitting on small-sample palmprint data, which makes training unstable and can cause the final model to diverge. To make the model converge faster and find suitable parameters stably, this paper combines progressive warm-up and cosine annealing [24,25] to design a dynamic adjustment method for the outer-loop learning rate. Specifically, the following steps are performed within a learning round.
Step 1. Set upper and lower bounds $[\beta_{min}, \beta_{max}]$ on the outer-loop learning rate.
Step 2. Apply Equation (13) to gradually increase the outer-loop learning rate β during the progressive warm-up phase at the beginning of the first learning round.
$$\beta = \beta_{min} + (\beta_{max} - \beta_{min}) \cdot (s_c / s_w) \qquad (13)$$
where $s_c$ is the current number of training iteration steps and $s_w$ is the total number of progressive warm-up iteration steps.
Step 3. Repeat Step 2 until the $s_w$ warm-up steps are completed.
Step 4. Starting the cosine annealing phase from the current learning round, the outer-loop learning rate β is dynamically updated according to Equation (14).
$$\beta = \frac{1}{2}\beta_{max}\left(1 + \cos\left(\frac{s_c - s_w}{s_t - s_w}\pi\right)\right) \qquad (14)$$
where $s_t$ is the total number of training iteration steps in the current learning round; whenever all tasks in the learning round have been processed, the learning rate is restarted using a cosine annealing warm restart.
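A small sketch of the resulting learning-rate schedule (Equations (13) and (14)) follows; the example values of β_min, β_max, s_w, and s_t are placeholders rather than the paper's settings.

```python
import math

def outer_loop_lr(s_c, s_w, s_t, beta_min, beta_max):
    """Dynamic outer-loop learning rate: linear warm-up (Equation (13)) for the
    first s_w steps, then cosine annealing (Equation (14)) up to s_t steps."""
    if s_c <= s_w:
        return beta_min + (beta_max - beta_min) * (s_c / s_w)
    return 0.5 * beta_max * (1.0 + math.cos((s_c - s_w) / (s_t - s_w) * math.pi))

# Example: beta_min = 1e-4, beta_max = 1e-3, 100 warm-up steps out of 1000 total.
# lrs = [outer_loop_lr(s, 100, 1000, 1e-4, 1e-3) for s in range(1, 1001)]
```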
The pseudocode for DMAML training is given in Algorithm 1.
Algorithm 1 MAML with Weighted Multi-step Loss and Dynamic Learning Rate
Require:
  • Task distribution p(T)
  • Inner-loop steps M
  • Learning rates α, β_min, β_max
  • Warm-up steps s_w, total steps s_t, current iteration step s_c
Ensure: Updated meta-parameters φ
  1: Initialize φ randomly.
  2: for each task T_i ~ p(T) do
  3:   θ_{i,0} ← φ
  4:   for m = 1 to M do
  5:     Compute support loss: L_{T_i^S} ← −(1/N_i) Σ_i Σ_{c=1}^{n} y_{ic} log(p_{ic})
  6:     Update parameters: θ_{i,m} ← θ_{i,m−1} − α ∇_θ L_{T_i^S}(f(θ_{i,m−1}))
  7:     Compute query loss: L_{T_i^Q} ← CrossEntropy(f(θ_{i,m}), T_i^Q)
  8:     Compute weight: w ← max(1/(m+1) − r_m, 0.03/(m+1))
  9:     Save w · L_{T_i^Q} to the multi-step loss list
10:   end for
11:   Dynamic learning rate adjustment:
12:   if s_c ≤ s_w then
13:     β ← β_min + (β_max − β_min) × (s_c / s_w)
14:   else
15:     β ← (1/2) β_max (1 + cos(((s_c − s_w)/(s_t − s_w)) π))
16:   end if
17:   s_c ← s_c + 1
18:   if i < N/3 then
19:     Update φ ← φ − β ∇_{θ_{i,m}} Σ_{T_i} L_{T_i}(f(θ_{i,m}))
20:   else
21:     Update φ ← φ − β ∇_φ Σ_{T_i} L_{T_i}(f(θ_{i,m}))
22:   end if
23: end for
24: return φ

3. Experiments and Analysis

In this section, the proposed model is compared with classical and state-of-the-art small-sample learning methods on three widely used palmprint datasets.

3.1. Datasets and Data Preprocessing

The palmprint data used in the experiments were obtained from the Hong Kong Polytechnic University (PolyU Palmprint Image Database), the Indian Institute of Technology Delhi (IIT Delhi Touchless Palmprint Database), and Tongji University (Tongji Palmprint Image Database).
To meet the requirements of the network architecture and of palmprint recognition, the palmprint images are preprocessed as follows. The raw image dataset is read with the Python image processing library PIL and converted into data arrays. To achieve robust region localization across diverse palmprint images while keeping the computational cost suitable for the network input requirements, a keypoint detection-based method is used to crop a 128 × 128-pixel region of interest (ROI): key points in the palmprint images (the center of the palm and the positions of the finger creases) are detected, and a 128 × 128 region centered on the palm that contains the primary texture features is cropped. The processed image data are saved with NumPy 2.2.3 as .npy files for subsequent network training.
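A minimal sketch of this preprocessing pipeline is given below; it substitutes a simple center crop for the keypoint-based ROI localization, and the file extension, grayscale conversion, and normalization to [0, 1] are assumptions.

```python
import numpy as np
from PIL import Image
from pathlib import Path

def preprocess_palmprints(image_dir, out_file, size=128):
    """Read raw palmprint images with PIL, crop a size x size ROI, and save the
    stack as an .npy file. The keypoint-based ROI localization described above
    would replace the center crop used here."""
    arrays = []
    for path in sorted(Path(image_dir).glob("*.bmp")):   # adjust the extension per dataset
        img = Image.open(path).convert("L")               # grayscale palmprint
        w, h = img.size
        left, top = (w - size) // 2, (h - size) // 2
        roi = img.crop((left, top, left + size, top + size))
        arrays.append(np.asarray(roi, dtype=np.float32) / 255.0)
    np.save(out_file, np.stack(arrays))                   # (N, 128, 128) array
```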

3.2. Environment and Evaluation Metrics

The experiments are programmed in Python 3.7 with the deep learning framework PyTorch 2.7.1; the hardware GPU is an NVIDIA GeForce RTX 2060 with CUDA 11.6, the operating system is Windows 10 21H2, and the memory is 16 GB.
To verify the effectiveness of the data expansion algorithm proposed in this paper, a combination of qualitative and quantitative methods is used for evaluation. The quantitative metrics are the Fréchet Inception Distance (FID) [26], the Structural Similarity Index Measure (SSIM) [27], the Peak Signal-to-Noise Ratio (PSNR) [28], and the Inception Score (IS) [29]. A brief introduction to these four metrics follows.
FID: The quality of the generated images is measured by comparing the distribution differences between real images and generated images in a high-dimensional feature space.
SSIM: Comparing the similarity of two images based on three aspects—brightness, contrast, and structural information—is a structure-based image quality evaluation method.
PSNR: Measures the reconstruction quality of an image as the ratio between the maximum signal value and the noise, expressed in decibels (dB).
IS: Inception Score is a widely used metric for evaluating the performance of generative models, primarily measuring the quality and diversity of generated images. This metric evaluates the conditional probability distribution of generated images in category prediction and the marginal probability distribution by calculating the KL divergence between them.
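For the image-level metrics, SSIM and PSNR can be computed per image pair, for example with scikit-image as sketched below (the use of scikit-image here is our own choice, not necessarily the authors' tooling); FID and IS are computed over the whole set of generated images with an Inception network and are not shown.

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def image_quality(real, generated):
    """Per-pair SSIM and PSNR between a real and a generated palmprint,
    both given as float arrays scaled to [0, 1]."""
    ssim = structural_similarity(real, generated, data_range=1.0)
    psnr = peak_signal_noise_ratio(real, generated, data_range=1.0)
    return ssim, psnr
```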

3.3. Palmprint Expansion Comparison Experiment

To illustrate the state-of-the-art of the proposed method, subjective and objective comparisons are made using some open-source generative adversarial network data augmentation methods, all of which are supervised learning models that can constrain the classes of generated palmprints to meet the needs of the recognition experiments, including conditional generative networks (CGANs) [30], deep convolutional generative adversarial networks (DC-GAN) [31], Wasserstein generative adversarial network (WGAN) [32], conditional diffusion model (CDM) [33] and Few-shot image generation (FewshotGAN) [34].
Table 2 shows the comparison between the original dataset and the results of each algorithm after data expansion. As can be seen from Table 2, because the PolyU palmprint dataset contains few training samples per class, the other generative adversarial network models can hardly capture the palmprint texture features, and their outputs contain a large amount of irregular noise. On the IIT-D dataset, which contains more palm shadows, the DCGAN model mistakenly treats the unevenly illuminated regions as palmprint feature information, which reduces the plausibility of the generated images. On the Tongji dataset, which has more samples but insufficient illumination, most methods can capture the image information, but the clarity of the generated images is limited. In contrast, the MCCGAN proposed in this paper can extract palmprint feature information even from the limited PolyU dataset; the generated images retain rich palmprint texture details, reasonably simulate the palm shadows produced during acquisition in the IIT-D dataset, and show clearer texture details.
In the tables, ↑ indicates that a larger value of the metric is better, while ↓ indicates that a smaller value is better. The quantitative performance of each comparison algorithm on the different datasets is shown in Table 3.
On the PolyU dataset, MCCGAN improves the FID metric by 10%, the SSIM value by 25%, and the PSNR metric by 0.7% compared with the other generative algorithms. On the IIT-D dataset, MCCGAN improves the FID metric by 29%, reaches an SSIM value of 0.748, and improves the PSNR metric by 13.2%. On the Tongji dataset, the FID value of MCCGAN is slightly inferior to that of WGAN, indicating that the distribution of its generated data is further from the real data distribution than WGAN's. However, combined with Table 2, the images generated by MCCGAN are visually closer to real palmprints than those generated by WGAN; the SSIM value reaches 0.580 and the PSNR value is improved by 14.1%. Overall, MCCGAN achieves better structural similarity and less distortion than the other expansion methods while maintaining good similarity between the generated and real data distributions, and its generation quality is more stable across datasets. In terms of running time, the generator of MCCGAN connects low-level features to high-level features through skip connections, which provides a faster path for information transfer and reconstruction, helps reduce information loss and blurring, and accelerates image generation.

3.4. Palmprint Expansion Ablation Experiment

To investigate the impact of the generator structure on image generation quality, ablation experiments were designed. Group 1 used a plain U-Net to construct the generator, Group 2 used a U-Net with dense connections (U-Net + DC), Group 3 used a U-Net with skip connections (U-Net + SC), and Group 4 used a U-Net with both dense and skip connections (U-Net + DC + SC). Each group of generators was tested with 0–4 noise encoders, while the other structural settings remained consistent with MCCGAN. The ablation results are shown in Table 4.
As shown in Table 4, when four noise encoders are set, the FID value of the (2) U-Net + DC group is lower than that of the (1) group, while the IS, SSIM, and PSNR values improved. This indicates that the encoders effectively capture and utilize both low-level and high-level feature information from the image encoding process by receiving and leveraging the outputs of the first two layers through dense connections. This results in improvements in the distributional consistency, structural similarity, and reconstruction quality between the generated images and the real images.
The (3) U-Net + SC group showed a significant decrease in FID values compared to the (1) group when using 2 noise encoders, while SSIM and PSNR values improved significantly. This indicates that the decoder receives and utilizes the outputs from the previous layer and multiple feature encoders through skip connections, fully leveraging low-level and high-level semantic information during decoding and preserving more feature details during upsampling. The (4) U-Net + DC + SC group simultaneously improved the feature encoder and decoder, and when combined with three noise encoders, all four metrics achieved good results. This is because the model under this structure can more fully utilize low-level and high-level feature information and accurately reconstruct images through the combination of multiple encoders, thereby generating fingerprint images with small distribution differences from the original images, diverse categories, similar structures, and low distortion. In terms of runtime, increasing model complexity leads to longer runtime, but the increase is small and remains within an acceptable range.
To investigate the impact of the discriminator design on image generation quality, 0 to 3 pyramid split attention modules were inserted into the nonlinear transformation of PSADenseBlock. Four experiments were designed, with the other structural settings consistent with MCCGAN. The image generation results are shown in Table 5.
From the comparison results, it can be seen that the pyramid split attention module introduces a multi-scale attention mechanism over image features in the discriminator. As the number of PSA modules increases, the distribution difference between the generated and real images decreases and the FID score gradually falls, while the IS score keeps rising, indicating that the PSA module provides a more global feature representation and multi-scale perception capability and thus improves the generated images in several respects (such as category diversity and realism). The fluctuations in SSIM, PSNR, and runtime suggest that using two PSA modules in the PSADenseBlock is appropriate, as this enhances the diversity and realism of the generated images while maintaining reasonable computational efficiency.

3.5. Small Sample Palmprint Recognition Experiment

In this section, the proposed FDIR Net and DMAML are compared with the current mainstream small-sample learning methods, including MatchingNets [35], ProtoNets [36], MAML, Reptile [37], and FETA [38], using the MCCGAN expanded dataset. The experiments use the image recognition accuracy in the N-way K-shot case as the evaluation index. The comparison of recognition accuracy is shown in Table 6 and Table 7.
From the comparison of the results in Tables 6 and 7, it can be seen that FDIR Net + DMAML achieves a better small-sample palmprint recognition rate across a variety of tasks, with a best recognition rate of 97.09% on the IIT-D dataset and 99.15% on the Tongji dataset; the fluctuation range of the recognition rate is further reduced, and the model is more stable.
To test the palmprint verification performance of each model, each test sample is matched against the templates, and different matching similarity thresholds are applied to obtain the DET curves shown in Figures 7 and 8 and the equal error rates shown in Figures 9 and 10.
From the comparison of the experimental results, it can be seen that the DET curve of FDIR Net + DMAML lies entirely below those of the other compared models, which indicates that the proposed model achieves a better trade-off between the security and the usability of the recognition system.

4. Conclusions

In this paper, to address the low recognition rate of small-sample palmprints, a small-sample palmprint recognition method based on image expansion and DMAML is proposed. For small-sample palmprint image expansion, the MCCGAN network is designed, which contains two parts: the multi-connected palmprint generator encodes and decodes the palmprint training images together with random noise to generate new palmprint images of the same category, while the multi-connected palmprint discriminator judges the similarity between real and generated data and guides the generator to improve the quality of the generated images. During adversarial training, a loss function mixed with a gradient penalty is used to regularize the gradient, keeping the training process stable and avoiding gradient explosion or vanishing. For small-sample palmprint feature learning, a feature extraction module is designed on the basis of the Inception multi-branch structure combined with the ResNet residual idea, and a channel attention mechanism that considers the frequency domain is designed to strengthen the network's attention to specific regions of the palmprint image, improving the model's feature learning and representation ability. To meet the parameter update requirements of MAML for this feature extraction network, a multi-step loss list is constructed in the inner loop, and progressive warm-up is combined with cosine annealing to dynamically adjust the outer-loop learning rate. Experiments show that the proposed method can generate high-quality palmprint images to expand the palmprint dataset and performs well in terms of small-sample palmprint recognition accuracy, recognition efficiency, and model robustness. In future work, the application of MCCGAN to data-imbalanced settings will be considered to widen the scope of the expansion method, and different modalities will be combined to enhance the model's ability to understand and recognize palm texture, realizing multimodal palmprint recognition.

Author Contributions

Conceptualization, H.B. and Z.D.; methodology, X.Z. and H.B.; software, Y.L.; validation, H.B. and Y.L.; investigation, X.Z.; writing—original draft preparation, H.B. and Z.D.; writing—review and editing, X.Z. and K.Z.; visualization, Y.L.; supervision, X.Z.; project administration, K.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Major Program of Xiangjiang Laboratory (Grant No. 23XJ01001, Grant No. 22XJ01002).

Data Availability Statement

This study utilized publicly available palmprint datasets. The PolyU Palmprint Image Database can be accessed via: https://www4.comp.polyu.edu.hk/~csajaykr/IITD/Database_Palm.htm (accessed on 3 April 2025). The IIT Delhi Touchless Palmprint Database is available at: http://www4.comp.polyu.edu.hk/~csajaykr/IITD/Database_Palm.htm (accessed on 3 April 2025). The Tongji Palmprint Image Database can be obtained from: http://sse.tongji.edu.cn/linzhang/cr3dpalm/cr3dpalm.htm (accessed on 3 April 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

DMAML: Dynamic Model-Agnostic Meta-Learning
MAML: Model-Agnostic Meta-Learning
MCCGAN: Multi-Connection Conditional Generative Adversarial Network
FDIR Net: Frequency-Domain Inception-RSE Palmprint Network
MC-G: The generator of MCCGAN
MC-D: The discriminator of MCCGAN
PSA: Pyramid Split Attention
PSADenseBlock: Pyramid Split Attention DenseBlock

References

  1. Zhou, X.; Liang, W.; Kevin, I. Deep-learning-enhanced human activity recognition for internet of healthcare things. IEEE Internet Things J. 2020, 7, 6429–6438. [Google Scholar] [CrossRef]
  2. Zhou, K.; Zhou, X.; Yu, L.; Shen, L.; Yu, S. Double biologically inspired transform network for robust palmprint recognition. Neurocomputing 2019, 337, 24–45. [Google Scholar] [CrossRef]
  3. Xue, Y.; Xue, M.; Liu, Y.; Bai, X. Palmprint Recognition Based on 2DGabor Wavelet and BDPCA. Comput. Eng. 2014, 40, 196–199. [Google Scholar]
  4. Fei, F.; Li, S.; Dai, H.; Hu, C.; Dou, W.; Ni, Q. A k-anonymity based schema for location privacy preservation. IEEE Trans. Sustain. Comput. 2017, 4, 156–167. [Google Scholar] [CrossRef]
  5. Zhenan, S.; Ran, H.; Liang, W. Overview of biometrics research. J. Image Graph. 2021, 26, 1254–1329. [Google Scholar] [CrossRef]
  6. Wang, C.; Xu, Z.; Ma, X.; Hong, Z.; Fang, Q.; Guo, Y. Mask R-CNN and Data Augmentation and Transfer Learning. Chin. J. Biomed. Med. Eng. 2021, 40, 410–418. [Google Scholar]
  7. Ren, K.; Chang, L.; Wan, M.; Gu, G.; Chen, Q. An improved retinal vascular segmentation study based U-Net enhanced pretreatment. Laser J. 2022, 43, 192–196. [Google Scholar]
  8. Mao, X.; Shan, Y.; Li, F.; Chen, X.; Zhang, S. CLSpell: Contrastive learning with phonological and visual knowledge for Chinese spelling check. Neurocomputing 2023, 554, 126468. [Google Scholar] [CrossRef]
  9. Zhang, Y.; Xie, H.; Zhuang, S.; Zhan, X. Image Processing and Optimization Using Deep Learning-Based Generative Adversarial Networks (GANs). J. Artif. Intell. Gen. Sci. 2024, 5, 50–62. [Google Scholar] [CrossRef]
  10. Zhou, X.; Li, Y.; Liang, W. CNN-RNN based intelligent recommendation for online medical pre-diagnosis support. IEEE/ACM Trans. Comput. Biol. Bioinform. 2020, 18, 912–921. [Google Scholar] [CrossRef]
  11. Finn, C.; Abbeel, P.; Levine, S. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. arXiv 2017, arXiv:1703.03400. [Google Scholar]
  12. Lv, P.; Wang, J.; Zhang, X.; Shi, C. Deep supervision and atrous inception-based U-Net combining CRF for automatic liver segmentation from CT. Sci. Rep. 2022, 12, 16995. [Google Scholar] [CrossRef] [PubMed]
  13. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21 July 2017; pp. 4700–4708. [Google Scholar]
  14. Shi, C.; Cheng, Y.; Wang, J.; Wang, Y.; Mori, K.; Tamura, S. Low-rank and sparse decomposition based shape model and probabilistic atlas for automatic pathological organ segmentation. Med. Image Anal. 2017, 38, 30–49. [Google Scholar] [CrossRef] [PubMed]
  15. Wang, J.; Lv, P.; Wang, H. SAR-U-Net: Squeeze and excitation block and atrous spatial pyramid pooling based residual U-Net for automatic liver segmentation in Computed Tomography. Comput. Methods Programs Biomed. 2021, 208, 106268. [Google Scholar] [CrossRef]
  16. Zhou, X.; Tian, J.; Wang, Z.; Yang, C.; Huang, T.; Xu, X. Nonlinear bilevel programming approach for decentralized supply chain using a hybrid state transition algorithm. Knowl. Based Syst. 2022, 240, 108119. [Google Scholar] [CrossRef]
  17. Gulrajan, I.; Ahmed, F.; Arjovsky, M. Improved Training of Wasserstein GANs. arXiv 2017, arXiv:1704.00028. [Google Scholar] [CrossRef]
  18. Liang, W.; Chen, X.; Huang, S.; Huang, G.; Yan, K.; Zhou, X. Federal learning edge network based sentiment analysis combating global COVID-19. Comput. Commun. 2023, 204, 33–42. [Google Scholar] [CrossRef]
  19. He, K.; Zhang, X.; Ren, S. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778. [Google Scholar]
  20. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
  21. Liu, C.; Wang, Z. Efficient complex ISAR object recognition using adaptive deep relation learning. IET Comput. Vis. 2020, 14, 185–191. [Google Scholar] [CrossRef]
  22. Yang, Y.; Yang, F.; Chen, J. Pythagorean fuzzy Bonferroni mean with weighted interaction operator and its application in fusion of online multidimensional ratings. Int. J. Comput. Intell. Syst. 2022, 15, 94. [Google Scholar] [CrossRef]
  23. Li, F.; Shan, Y.; Mao, X. Multi-task joint training model for machine reading comprehension. Neurocomputing 2022, 488, 66–77. [Google Scholar] [CrossRef]
  24. Liu, D.; Liu, Y.; Chen, X. The new similarity measure and distance measure of a hesitant fuzzy linguistic term set based on a linguistic scale function. Symmetry 2018, 10, 367. [Google Scholar] [CrossRef]
  25. Jiang, F.; Wang, K.; Dong, L. Stacked autoencoder-based deep reinforcement learning for online resource scheduling in large-scale MEC networks. IEEE Internet Things J. 2020, 7, 9278–9290. [Google Scholar] [CrossRef]
  26. Fréchet, M. Sur quelques points du calcul fonctionnel. Rend. Del Circ. Mat. Di Palermo 1906, 22, 1–72. [Google Scholar] [CrossRef]
  27. Wang, Z.; Bovik, A.; Sheikh, H.R. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
  28. Korhonen, J.; You, J. Peak signal-to-noise ratio revisited: Is simple beautiful? In Proceedings of the Fourth International Workshop on Quality of Multimedia Experience, Yarra Valley, Australia, 5–7 July 2012; pp. 37–38. [Google Scholar]
  29. Salimans, T.; Goodfellow, I.; Zaremba, W. Improved techniques for training gans. In Proceedings of the Advances in neural information processing systems, NIPS 2016, Barcelona, Spain, 5–10 December 2016; pp. 2234–2242. [Google Scholar]
  30. Mirza, M.; Osindero, S. Conditional generative adversarial nets. arXiv 2014, arXiv:1411.1784. [Google Scholar] [CrossRef]
  31. Radford, A.; Metz, L.; Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv 2015, arXiv:1511.06434. [Google Scholar]
  32. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein Generative Adversarial Networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 214–223. [Google Scholar]
  33. Dhariwal, P.; Nichol, A. Diffusion models beat gans on image synthesis. Adv. Neural Inf. Process. Syst. 2021, 34, 8780–8794. [Google Scholar]
  34. Gu, B.; Zhai, J. Few-shot image generation based on meta-learning and generative adversarial network. Signal Process. Image Commun. 2025, 137, 117307. [Google Scholar] [CrossRef]
  35. Vinyals, O.; Blundell, C.; Lillicrap, T. Matching Networks for One Shot Learning. Adv. Neural Inf. Process. Syst. 2016, 29, 3637–3645. [Google Scholar]
  36. Snell, J.; Swersky, K.; Zemel, R. Prototypical Networks for Few-shot Learning. Adv. Neural Inf. Process. Syst. 2017, 30, 4080–4090. [Google Scholar]
  37. Nichol, A.; Achiam, J.; Schulman, J. On First-Order Meta-Learning Algorithms. arXiv 2018, arXiv:1803.02999. [Google Scholar]
  38. Maniparambil, M.; Mcguinness, K.; Connor, N. BaseTransformers: Attention over base data-points for One Shot Learning. In Proceedings of the 33rd British Machine Vision Conference, London, UK, 21–24 November 2022; pp. 1–14. [Google Scholar]
Figure 1. The structure of MCCGAN.
Figure 2. The overall architecture of the multi-connected palmprint generator.
Figure 3. The architecture of the multi-connected palmprint discriminator.
Figure 4. The architecture of the Pyramid Split Attention DenseBlock.
Figure 5. Parallel multi-branch residual feature extraction module.
Figure 6. Channel attention mechanism considering frequency-domain information.
Figure 7. DET curves for the IIT-D dataset.
Figure 8. DET curves for the Tongji dataset.
Figure 9. Equal error rate for the IIT-D dataset.
Figure 10. Equal error rate for the Tongji dataset.
Table 1. Hyperparameter settings of MC-G and MC-D.
Component | Optimizer | Learning Rate | Forward Propagation | Notes
MC-G | Adam | 0.0001 | 95% of data | Updates based on gradients
MC-D | Adam | 0.0004 | 100% of data | Updates based on gradients
Table 2. Comparison of the results of the expansion experiments with different models.
Dataset | Original | CGAN | DCGAN | WGAN | CDM | FewshotGAN | MCCGAN
(Generated palmprint image samples for the PolyU, IIT-D, and Tongji datasets under each method; images not reproduced here.)
Table 3. Quantitative comparison of expansion effects.
Dataset | Method | D_FID ↓ | D_SSIM ↑ | D_PSNR ↑ | Generation Time/s
PolyU | CGAN | 55.999 | 0.401 | 63.001 | 0.898
PolyU | DCGAN | 26.313 | 0.127 | 58.700 | 0.043
PolyU | WGAN | 11.836 | 0.473 | 65.607 | 0.118
PolyU | FewshotGAN | 10.922 | 0.523 | 65.902 | 0.015
PolyU | MCCGAN | 10.644 | 0.592 | 66.083 | 0.010
IIT-D | CGAN | 88.128 | 0.106 | 55.732 | 5.301
IIT-D | DCGAN | 74.547 | 0.087 | 56.146 | 0.045
IIT-D | WGAN | 46.898 | 0.287 | 59.205 | 0.188
IIT-D | FewshotGAN | 37.891 | 0.635 | 63.614 | 0.027
IIT-D | MCCGAN | 33.256 | 0.748 | 67.067 | 0.019
Tongji | CGAN | 41.456 | 0.067 | 60.532 | 1.696
Tongji | DCGAN | 53.584 | 0.098 | 60.339 | 0.057
Tongji | WGAN | 17.695 | 0.082 | 61.426 | 0.454
Tongji | FewshotGAN | 20.347 | 0.215 | 66.926 | 0.253
Tongji | MCCGAN | 21.720 | 0.580 | 69.120 | 0.164
Table 4. Comparison of generator structure ablation experiment results.
Generator Structure | Noise Encoders | D_FID ↓ | D_IS ↑ | D_SSIM ↑ | D_PSNR ↑ | Single Generation Time/s
(1) U-Net | 0 | 124.674 | 641.512 | 0.259 | 59.593 | 0.004
(1) U-Net | 1 | 120.214 | 626.858 | 0.267 | 60.087 | 0.005
(1) U-Net | 2 | 95.583 | 494.934 | 0.311 | 61.057 | 0.005
(1) U-Net | 3 | 1068.315 | 259.114 | 0.072 | 55.863 | 0.005
(1) U-Net | 4 | 278.241 | 767.821 | 0.182 | 57.878 | 0.005
(2) U-Net + DC | 0 | 489.501 | 1045.354 | 0.228 | 56.881 | 0.007
(2) U-Net + DC | 1 | 193.432 | 604.092 | 0.261 | 60.644 | 0.007
(2) U-Net + DC | 2 | 464.470 | 483.684 | 0.270 | 58.856 | 0.007
(2) U-Net + DC | 3 | 414.555 | 998.162 | 0.454 | 62.326 | 0.007
(2) U-Net + DC | 4 | 74.688 | 785.008 | 0.511 | 63.800 | 0.007
(3) U-Net + SC | 0 | 328.943 | 1358.110 | 0.191 | 58.794 | 0.007
(3) U-Net + SC | 1 | 99.780 | 1108.590 | 0.632 | 64.800 | 0.007
(3) U-Net + SC | 2 | 48.138 | 532.765 | 0.642 | 66.564 | 0.007
(3) U-Net + SC | 3 | 240.056 | 524.670 | 0.378 | 60.017 | 0.006
(3) U-Net + SC | 4 | 50.822 | 1149.481 | 0.552 | 63.365 | 0.006
(4) U-Net + DC + SC | 0 | 54.202 | 796.625 | 0.311 | 61.698 | 0.007
(4) U-Net + DC + SC | 1 | 82.920 | 959.496 | 0.538 | 65.169 | 0.007
(4) U-Net + DC + SC | 2 | 132.010 | 553.192 | 0.056 | 58.920 | 0.007
(4) U-Net + DC + SC | 3 | 33.951 | 878.983 | 0.747 | 67.030 | 0.007
(4) U-Net + DC + SC | 4 | 178.731 | 1543.469 | 0.324 | 59.066 | 0.007
Table 5. Comparison of discriminator structure ablation experiment results.
Discriminator Structure | D_FID ↓ | D_IS ↑ | D_SSIM ↑ | D_PSNR ↑ | Time/s
DenseBlock | 34.067 | 679.980 | 0.628 | 65.990 | 0.014
DenseBlock + 1 PSA | 48.974 | 752.877 | 0.596 | 64.932 | 0.015
DenseBlock + 2 PSA | 33.256 | 876.254 | 0.749 | 67.068 | 0.014
DenseBlock + 3 PSA | 32.870 | 1305.052 | 0.672 | 66.595 | 0.019
Table 6. IIT-D small-sample palmprint image recognition experiment.
Method | 5-Way 1-Shot | 5-Way 3-Shot | 10-Way 1-Shot | 10-Way 3-Shot
MatchingNets | 84.95% ± 0.43 | 92.97% ± 0.16 | 78.74% ± 0.44 | 92.12% ± 0.49
ProtoNets | 90.57% ± 0.65 | 95.56% ± 0.40 | 81.15% ± 0.50 | 91.93% ± 0.43
MAML | 79.24% ± 0.14 | 86.69% ± 1.35 | 51.95% ± 1.46 | 58.03% ± 3.35
Reptile | 52.17% ± 3.90 | 69.33% ± 6.50 | 54.41% ± 4.53 | 71.83% ± 4.46
FETA | 58.49% ± 0.10 | 62.59% ± 0.21 | 59.42% ± 0.35 | 63.49% ± 0.11
FDIR Net + DMAML | 94.62% ± 0.06 | 97.09% ± 0.03 | 87.52% ± 0.29 | 95.35% ± 0.12
Table 7. Tongji small-sample palmprint image recognition experiment.
Method | 5-Way 1-Shot | 5-Way 3-Shot | 10-Way 1-Shot | 10-Way 3-Shot
MatchingNets | 90.44% ± 1.76 | 96.07% ± 1.00 | 87.60% ± 1.03 | 91.61% ± 2.79
ProtoNets | 95.86% ± 0.38 | 96.59% ± 0.37 | 89.35% ± 0.49 | 98.24% ± 0.15
MAML | 88.84% ± 1.65 | 87.63% ± 0.79 | 67.87% ± 6.94 | 65.44% ± 6.66
Reptile | 63.17% ± 5.61 | 72.24% ± 6.14 | 57.40% ± 4.32 | 77.73% ± 2.47
FETA | 61.19% ± 0.64 | 62.83% ± 0.41 | 60.47% ± 0.29 | 64.91% ± 0.81
FDIR Net + DMAML | 97.71% ± 0.10 | 99.15% ± 0.01 | 96.25% ± 1.03 | 98.42% ± 0.07