Few-Shot Leukocyte Classification Algorithm Based on Feature Reconstruction Network with Improved EfficientNetV2

Wang, Xinzheng; Ou, Cuisi; Pan, Guangjian; Hu, Zhigang; Cao, Kaiwen

doi:10.3390/app15179377


by Xinzheng Wang *, Cuisi Ou, Guangjian Pan, Zhigang Hu and Kaiwen Cao

School of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang 471000, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(17), 9377; https://doi.org/10.3390/app15179377

Submission received: 14 July 2025 / Revised: 15 August 2025 / Accepted: 18 August 2025 / Published: 26 August 2025


Abstract

Deep learning has excelled in image classification largely due to large, professionally labeled datasets. However, in the field of medical imaging, data annotation often relies on experienced experts, especially in tasks such as white blood cell classification, where staining methods vary greatly across cells and the number of samples in certain categories is relatively small. To evaluate leukocyte classification performance with limited labeled samples, a few-shot learning method based on a Feature Reconstruction Network with Improved EfficientNetV2 (FRNE) is proposed. First, this paper presents a feature extractor based on the improved EfficientNetV2 architecture. To enlarge the receptive field and extract multi-scale features effectively, the network incorporates an atrous spatial pyramid pooling (ASPP) module with dilated convolutions at different dilation rates. This enhancement improves the model’s spatial reconstruction capability during feature extraction. Subsequently, the support set and query set are processed by the feature extractor to obtain their respective feature maps. A feature reconstruction-based classification method is then applied. Specifically, ridge regression reconstructs the query feature map using features from the support set. By analyzing the reconstruction error, the model determines the likelihood of the query sample belonging to a particular class, without requiring additional modules or extensive parameter tuning. Evaluated on the LDWBC and Raabin datasets, the proposed method achieves accuracy improvements of 3.67% and 1.27%, respectively, over the compared method that showed strong overall accuracy (OA) on both datasets.

Keywords: feature reconstruction; few-shot learning; leukocyte classification; EfficientNetV2

1. Introduction

Leukocytes are an indispensable component of the human immune system, and most of them are produced in bone marrow and lymphoid tissues. Through a series of intricate physiological processes, they resist the invasion of pathogens, including bacteria, viruses, and fungi, thereby maintaining the health of the organism [1,2,3]. Leukocytes are traditionally divided into two major groups, granulocytes and non-granulocytes, based on their morphological characteristics [4,5]. The granulocyte population comprises a very low percentage of basophils (approximately 0–1%), a moderate number of eosinophils (approximately 1–5%), and a dominant number of neutrophils (approximately 50–70%). The non-granulocyte population is primarily composed of monocytes (approximately 2–10%) and lymphocytes (approximately 20–45%), which also play a pivotal role in the immune response [4,5,6]. A leukocyte count outside the normal range may indicate that the organism is in a pathological state [7,8]. Leukemia, a hematological malignancy with a significant adverse impact on human health, shows a global incidence that rises year by year. A review of the literature revealed a significant increase in the global number of newly diagnosed leukemia cases, from 354.5 thousand to 518.5 thousand between 1990 and 2017 [9].

In light of the significant risk to human health posed by leukemia and the rising prevalence of the disease, the imperative for strengthening early screening, diagnosis, and treatment protocols becomes increasingly evident. In this context, it is of particular importance to develop efficient and accurate leukocyte classification techniques to assist physicians in the rapid and accurate identification and diagnosis of the disease during its initial phases. Conventional deep learning classification techniques typically necessitate a substantial quantity of data to achieve optimal performance. However, in clinical practice the collection and annotation of medical data present considerable challenges for algorithm development. The identification and labeling of medical images require substantial time and effort from experienced healthcare professionals. These processes demand a high degree of expertise and extensive time commitment, which collectively hinder the acquisition and annotation of large-scale, high-quality medical datasets. Consequently, the availability of well-curated data essential for deep learning applications remains limited. In addition, in practical applications medical data samples, such as blood cell images, often exhibit distributional biases. General classification models may be unable to learn sufficiently generalized features, which in turn affects their performance. Few-shot learning offers an effective solution to this problem. This is a machine learning method that trains and predicts models with a minimal amount of data.

With the development of deep learning, few-shot learning methods based on meta-learning, metric learning, and graph neural networks receive increasing attention. These methods utilize the expressive power and parameter-sharing properties of neural networks and make significant progress in few-shot learning tasks. For example, Ravi et al. [10] considered the optimization process as a model for few-shot learning and proposed an end-to-end meta-learning method. They put forth a meta-learner model based on LSTM networks to learn an optimal algorithm for training another learner model in scenarios with limited data. This approach explored techniques for model parameter updating using optimization algorithms in a few-shot learning scenario. Jiang et al. [11] proposed a meta-learning method based on conditional class dependencies. Conditional class-aware meta-learning (CAML) is a method that conditionally transforms feature representations by leveraging metric spaces specifically designed to encode relationships among different classes. Through this mechanism, CAML enables the conditional modulation of feature representations utilized by the base learner, incorporating regularization that is guided by the structure of the label space. By explicitly considering and integrating inter-class dependencies, this approach significantly improves meta-learning performance.

Based on meta-learning research, Kozerawski et al. [12] proposed a multi-stage training framework aimed at predicting the parameters of SVM classifiers. The second stage focused on learning a category-specific feature space for each classification task, and the third stage underwent an end-to-end training mechanism to dynamically produce a set of neural network classifiers to deal with the problem of few-shot image recognition by combining metric learning and the classifiers. The method was able to update the metric space each time a new sample was added, thus improving the recognition performance. In addition, Gidaris et al. [13] proposed incorporating the attention mechanism in metric learning to improve the performance of small sample image recognition by combining metric learning and visual attention mechanisms to extract features. Later, researchers found that combining metric learning with graph neural networks could lead to better generalization ability of the model. For example, Mandal et al. [14] proposed the combination of meta-learning with graph neural networks to optimize the parameters of GNN through meta-learning for better generalization ability with a small number of samples. The results showed that the integration of meta-learning with graph neural networks could lead to better prediction by the model. Moreover, Hamilton et al. [15] put forth the GraphSAGE approach to achieve inductive learning on large-scale graphs by sampling neighboring nodes. Node embeddings were generated with efficiency through the use of node feature information, allowing for the learning of a function that generates embeddings through the sampling and aggregation of features from the local neighborhood of a node.

In the field of leukocyte classification, Sudhakar et al. [16] proposed a Siamese twin network (STN)-based method for few-shot learning to enable automated classification of healthy peripheral blood cells. This method incorporated STN and contrastive learning techniques and used EfficientNet-B3 as a base model, which was trained with fewer images to achieve the classification of peripheral blood cells. In the context of the small sample challenge of brain imaging modality recognition, Santi et al. [17] put forth a deep triple network structure based on a CNN. The structure learned an optimized distance metric function by constructing a triple consisting of an anchor, a positive sample, and a negative sample, and extracting a deep feature representation using a CNN. This approach successfully maximized the similarity between the anchor and the positive sample while minimizing the similarity with the negative sample, thus improving the classification performance in a resource-constrained few-shot environment. Chen et al. [18] made significant progress in the field of few-shot chest CT image analysis by using the momentum contrastive learning (MCL) strategy. The approach effectively learned feature representations and enabled accurate COVID-19 diagnosis by increasing the similarity among positive sample pairs while reducing the similarity between negative sample pairs. In addition, Alfonso et al. [19] used a Siamese neural networks-based few-sample learning approach to migrate feature knowledge learned from a complete and well-labeled source dataset (e.g., colon tissue images) to a target domain containing a wider range of tissue types (e.g., colon, lung, breast tissues) to significantly improve the classification performance. The above method not only demonstrated the substantial potential of few-shot learning in medical image analysis but also provided innovative insights into addressing the challenge posed by the limited availability of medical data.

In light of the aforementioned research in few-shot medical image classification, this study proposes FRNE, a few-shot learning method that integrates pre-trained feature extraction with spatial reconstruction, to improve prediction accuracy when only limited leukocyte samples are available. The proposed method reformulates the white blood cell classification task as a feature-space reconstruction problem. Specifically, an optimized feature extractor is employed to extract features from the support set samples, which are then used to reconstruct the query samples. Feature reconstruction is achieved through closed-form regression, and the resulting reconstruction error serves as the basis for category prediction. Optimization-based meta-learning approaches [10,20] require the training of an external optimizer (e.g., an LSTM) and involve a large number of meta-training tasks (e.g., N-way K-shot episodes), and metric learning methods [16] rely on the careful design of positive and negative sample pairs. By contrast, the feature reconstruction-based algorithm offers several advantages. It directly reflects the similarity between query and support samples, provides an interpretable classification mechanism in the medical domain, leverages the geometric relationships within the feature space, avoids complex optimization procedures, and demonstrates high computational efficiency. Therefore, it is particularly well-suited for applications with limited data and computational resources, where high classification accuracy is essential. The main contributions of this work are summarized as follows:

(1): A classification method for white blood cell images is designed based on the FRNE few-shot learning framework. A spatial reconstruction network is introduced to align the feature maps of the support set with those of the query set. By utilizing features extracted from a limited number of reference cell samples, the spatial characteristics of the target cell categories can be effectively reconstructed, thereby enabling an investigation into the applicability and performance of few-shot learning approaches in the field of blood cell analysis.
(2): The improved EfficientNetv2 model is proposed as a feature extractor and integrated with the ASPP module. By employing dilated convolution structures with different dilation rates, the model is capable of effectively capturing and adaptively extracting multi-scale features from blood cell images.

2. Related Work

Deep learning has achieved remarkable breakthroughs across various domains, owing to its powerful computational capabilities and effective learning mechanisms. However, its performance heavily relies on the availability of large-scale, accurately labeled datasets, which restricts its applicability in many real-world scenarios. For instance, in the field of medical diagnosis, ethical concerns, privacy issues, and the high cost of expert annotation make it difficult to obtain sufficient labeled data for research. In response to these challenges, researchers have increasingly focused on few-shot learning, which has emerged as a crucial branch of machine learning. Introduced by Norbert et al. [21], few-shot learning aims to enable models to achieve strong performance with only a limited number of training samples. To enhance its effectiveness, researchers incorporate methods such as meta-learning, incremental learning, metric learning, and semi-supervised approaches, often combining multiple strategies and learners. Currently, few-shot learning is being progressively applied in areas such as industrial vision inspection, robotics, and healthcare.

In the medical field, few-shot learning also faces significant developmental challenges and remains relatively underexplored due to various constraints, including data security concerns, ethical regulations, limited inter-class variability, and insufficient sample representativeness. As in other domains, few-shot learning in medicine is predominantly employed for tasks such as classification, segmentation, and detection. Frequently cited works include the following. Singh et al. [22] proposed a few-shot learning method termed “MetaMed,” which incorporates data augmentation techniques such as MixUp and CutMix into a meta-learning framework. MetaMed achieved classification accuracies above 70% on the Pap smear, ISIC 2018, and BreakHis datasets. Furthermore, it demonstrated superior average performance in both two-way and three-way classification tasks compared to transfer learning approaches. Sun et al. [23] introduced an incremental learning framework termed meta self-attention prototype incrementer, which incorporates a feature extraction embedding encoder, a prototype enhancement component, and a distance metric-based classifier. Their approach effectively addressed the challenge of incorporating new classes in medical time-series classification while preserving knowledge of previously learned classes, thereby achieving excellent results. Liu et al. [24] developed a meta Siamese network based on metric learning that utilized similarity measurements for arrhythmia diagnosis, which demonstrated strong robustness. Roy et al. [25] proposed a two-arm architectural model incorporating the ‘Squeeze & Excite’ module to enable few-shot learning and segmentation of CT images across various organs throughout the body. This architecture demonstrates effective volumetric image segmentation without requiring pre-training. Wang et al. [26] introduced a semi-supervised few-shot framework that employs loss re-weighting to balance the distribution of different lesion categories, thereby achieving accurate classification of pulmonary infection regions using only a limited number of labeled samples by leveraging high-confidence predictions. Cheng [27], Zhu [28], and their colleagues adopted a prototype-based few-shot learning approach, which enhances model performance by capturing more representative class distributions via multiple descriptors or refined sub-regions. Jiang et al. [29] combined meta-learning, transfer learning, and metric learning to develop a multi-learner few-shot learning framework. Their method incorporates real-time data augmentation and dynamic Gaussian soft labels, resulting in strong generalization capabilities for classification tasks and demonstrating superior performance on the medical benchmark datasets CHEST, BLOOD, and PATH.

Although few-shot learning has been applied in the medical domain with promising outcomes, its application in blood cell segmentation and classification remains relatively limited. A literature search conducted in the SCI-EXPANDED database of the Web of Science using the query (TS = (few shot) OR TS = (few-shot)) AND (TS = (blood cell) OR TS = (bone marrow cell)), while excluding studies not directly focused on the segmentation or classification of blood or bone marrow cells, yielded only two relevant articles—Reference [16] and Reference [30]. The approach proposed in Reference [16] employs a Siamese network architecture to quantitatively compare the embeddings of generated image pairs by computing their absolute differences, which are subsequently utilized for training and class prediction. This demonstrates the efficacy of the few-shot contrastive learning method based on the EfficientNet-B3 backbone model in the domain of blood cell classification. Reference [30] provides a diverse cell segmentation dataset comprising red blood cells, white blood cells, and infected cells, which is valuable for blood cell research and particularly beneficial for few-shot learning involving imbalanced classes. In addition, Chossegros et al. [31] implemented a model fine-tuning workflow for few-shot learning of white blood cell classification. This methodology consisted of pre-training the EfficientNet model on images from two distinct datasets, followed by fine-tuning on the target dataset, which demonstrated promising performance. The objective of this research is to evaluate the effectiveness of few-shot learning based on feature reconstruction techniques in the classification of white blood cells with similar morphological structures and features. To accomplish this, the multi-dimensional, balanced extended EfficientNet model [32] is utilized for feature extraction and further enhanced through the integration of ASPP [33]. 
Convolutions with varying dilation rates are incorporated to generate diverse feature representations, thereby improving the model’s ability to capture multi-scale contextual information and enhancing its generalization capability. Finally, the features of query samples are reconstructed using a spatial feature reconstruction network, and the corresponding sample categories are determined based on the associated reconstruction error.

3. Methods

To evaluate the effectiveness of few-shot classification methods in the field of white blood cell classification, this study proposes a feature reconstruction-based few-shot learning framework that incorporates an improved EfficientNetV2 architecture. The proposed method seeks to improve classification accuracy by effectively utilizing limited training samples. The specific structure of the model is presented in Figure 1, which consists of three primary components: the feature extraction module, the feature reconstruction network, and the few-shot classification stage. The feature extraction module leverages the improved EfficientNetv2, incorporating fused-MBConv structures to accelerate training speed and an ASPP module to expand receptive fields, thereby enhancing cellular feature capture capability. Before engaging in downstream classification tasks, the feature extraction network undergoes pre-training on the mini-ImageNet dataset to enhance its generalization capability and feature extraction precision. Subsequently, both the support sample set and the query sample set are fed into the improved EfficientNetV2 feature extractor to generate their corresponding feature maps. These feature maps are then used as inputs to the spatial reconstruction network, which is subsequently trained to produce the final, well-trained spatial reconstruction model, thereby enabling the reconstruction of the feature maps and the determination of class membership relationships. Based on these procedures, the complete FRNE few-shot learning model is established. Finally, each query sample is evaluated using the trained model, and its class label is assigned based on the corresponding reconstruction error.

3.1. Feature Extractor Based on the Improved EfficientNetv2 Network

To build a powerful feature extractor that can adapt to the special needs of few-shot learning environments, this study chooses EfficientNetV2 as the base model and makes targeted improvements to it. Specifically, as shown in Figure 2, the efficient block-convolution design of EfficientNetV2 is retained. It contains the Fused-MBConv module (located in the upper left region of Figure 2) and MBConv modules (situated in the upper middle region of Figure 2), which ensure a balance between parameter efficiency and model performance. To further enhance the model’s feature extraction ability for white blood cell images, this study introduces an ASPP module between the final MBConv block and the last convolutional block. The ASPP module is a spatial pyramid pooling technique for multi-scale feature extraction. It typically comprises multiple dilated convolutional layers with different dilation rates, where the dilation rate is the spacing at which the convolutional kernel samples the input feature map. By integrating dilated convolutions with varying dilation rates, the ASPP module can capture multi-scale information from local to global regions while maintaining spatial resolution. Because each branch has a different receptive field, the contextual information captured by ASPP helps the model to better understand and distinguish different types of white blood cells. Using several dilation rates allows the model to attend to distinct features at different scales, thereby improving its capacity to accommodate the heterogeneous sizes and shapes of blood cells. This approach contributes to the robustness of the leukocyte classifier when processing cell images of varying dimensions and shapes, while also improving the precision of the model on the leukocyte dataset.
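The relation between dilation rate and receptive field described above can be made concrete with a little arithmetic. The following is a minimal sketch, not the authors' configuration: the function names are hypothetical, and the rates (1, 6, 12, 18) are illustrative values assumed here because this section does not state the rates used in FRNE.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size: int, dilation: int) -> int:
    """Padding that keeps the spatial size unchanged at stride 1 (odd kernels)."""
    return dilation * (kernel_size - 1) // 2


def output_size(n: int, kernel_size: int, dilation: int, stride: int = 1) -> int:
    """Output length of a dilated convolution that uses same_padding."""
    pad = same_padding(kernel_size, dilation)
    return (n + 2 * pad - effective_kernel(kernel_size, dilation)) // stride + 1


# Hypothetical dilation rates for four parallel 3x3 ASPP branches.
ASSUMED_RATES = (1, 6, 12, 18)

for rate in ASSUMED_RATES:
    print(
        f"rate={rate:2d}  effective kernel={effective_kernel(3, rate):2d}  "
        f"padding={same_padding(3, rate):2d}  output size for 28 inputs={output_size(28, 3, rate)}"
    )
```

Each branch sees a wider context as the rate grows, while the matching padding keeps the feature map size fixed, which is why the branch outputs can be concatenated channel-wise.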

The study of white blood cell classification based on few-shot learning has some limitations due to the relatively small sample size in the dataset. Therefore, before training the spatial reconstruction network this research uses a large-scale dataset (mini-ImageNet), comprising categories different from the target classes, to pre-train an effective feature extraction model. Through this pre-training process and subsequent weight transfer the feature extraction model can learn diverse and generalized features from extensive data sources, thereby enhancing performance across various image classification tasks. In cases where training data is scarce, these learned features enable the classification model to better adapt to the specific task of white blood cell image classification. Furthermore, the feature extraction network integrates regularization techniques such as dropout and mixup to dynamically regulate training intensity. These strategies not only enhance classification accuracy but also expedite both training and feature extraction processes. During the feature reconstruction phase, detailed in the following section, the pre-trained feature extractor is utilized to generate feature maps for both the support set and query set inputs, upon which the feature reconstruction network is subsequently trained.
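As an illustration of the mixup regularization mentioned above, the sketch below blends two samples and their one-hot labels with a coefficient drawn from a Beta distribution. It is a generic, simplified version of the technique rather than the training code used in this study; the function name and the value `alpha=0.2` are assumptions made for the example.

```python
import random


def mixup_pair(x_a, x_b, y_a, y_b, alpha=0.2, rng=random):
    """Blend two flattened samples and their one-hot labels (hypothetical helper).

    alpha is an assumed Beta-distribution parameter, not a value from the paper.
    """
    lam = rng.betavariate(alpha, alpha)          # mixing coefficient in [0, 1]
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Two toy "images" (flattened pixel values) from different classes.
rng = random.Random(0)
x_mix, y_mix, lam = mixup_pair([0.0, 1.0, 0.5], [1.0, 0.0, 0.5],
                               [1.0, 0.0], [0.0, 1.0], rng=rng)
print(lam, x_mix, y_mix)
```

The mixed label remains a valid probability vector, so it can be used directly with a cross-entropy style loss on soft targets.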

3.2. Feature Map Reconstruction

Based on the aforementioned optimized improved EfficientNetv2 feature extractor, this study further constructs the FRNE model for application in white blood cell classification. The model combines the advantages of the spatial feature reconstruction network and the feature extractor pre-trained in this paper, achieving efficient classification under the condition of limited samples.

To address the issue that traditional classification methods are easily constrained by the scale of training data in few-shot learning scenarios, the FRNE model proposed in this study achieves network architecture fusion based on the feature reconstruction network (FRN) [34]. It leverages the feature space reconstruction mechanism of the FRN to impose reconstruction constraints on high-level information and enhances the feature extractor’s capability to capture multi-scale features. In the context of blood or bone marrow cell classification, sample acquisition is frequently constrained by high costs and ethical considerations, resulting in limited training data. The proposed classification approach based on feature space reconstruction error is capable of learning how to map query samples into a feature space comparable to that of the support set samples. It performs cross-category reconstruction of the query sample feature maps using the support features of given categories. By minimizing the error between the original and reconstructed features of the query samples, the model establishes a classification decision boundary, thereby estimating the likelihood of a query sample belonging to a specific category under data-scarce conditions. This approach is implemented in the form of a closed-form solution by performing regression directly to the query sample features through the support sample features, without the need to introduce new modules or large-scale training parameters. Additionally, in FRN the support features for each category are organized into a matrix, and the query samples are similarly mapped to the feature space. Subsequently, the query samples are classified by identifying a matrix that satisfies a specific condition, thereby minimizing the reconstruction error between the support feature matrix and the query features. The detailed architecture of the spatial feature reconstruction is illustrated in Figure 3.

In Figure 3, $X_s$ denotes the support set of labeled white blood cell images. The task is to predict the class label $y_q$ for each query instance $x_q$ from the query set $X_q$. To achieve this, both $X_s$ and $X_q$ undergo feature extraction via the improved EfficientNetV2 model, yielding the feature maps of the support set and the feature map $Q$ of the query sample, respectively. The feature maps of all support samples of class $c$ are subsequently pooled into a single matrix, $S_c$. Based on the principle of ridge regression derived from the least squares method, the query feature map $Q$ is reconstructed as a weighted sum of the rows of the support feature matrix $S_c$, expressed as Equation (1):

$$\bar{W} = \arg\min_{W} \lVert Q - W S_{c} \rVert^{2} + \lambda \lVert W \rVert^{2} \tag{1}$$

where $\lambda$ denotes the penalty term coefficient, and $W$ represents the weight matrix used for reconstructing $Q$. The penalty that $\lambda$ places on large weights is complemented by a learned recalibration term $\rho$, which rescales the reconstruction in Equation (3). A closed-form solution $\bar{W}$ for $W$ is derived and defined as Equation (2):

$$\bar{W} = Q S_{c}^{T} \left( S_{c} S_{c}^{T} + \lambda I \right)^{-1} \tag{2}$$
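For readers who want the intermediate step, the minimizer of the ridge objective in Equation (1) follows from setting its gradient with respect to $W$ to zero. The short derivation below is a sketch that assumes Frobenius norms and $\lambda > 0$, so that $S_{c} S_{c}^{T} + \lambda I$ is positive definite and therefore invertible.

```latex
\begin{aligned}
J(W) &= \lVert Q - W S_{c} \rVert_F^{2} + \lambda \lVert W \rVert_F^{2} \\
\nabla_{W} J &= -2\,(Q - W S_{c})\,S_{c}^{T} + 2\lambda W = 0 \\
W \left( S_{c} S_{c}^{T} + \lambda I \right) &= Q\,S_{c}^{T} \\
\bar{W} &= Q\,S_{c}^{T} \left( S_{c} S_{c}^{T} + \lambda I \right)^{-1}
\end{aligned}
```

Because the matrix being inverted has one row and column per support feature vector, its size depends on the amount of support data rather than on the feature dimension.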

Based on the optimal solution $\bar{W}$ derived from Equation (2) and the support set features $S_c$, the reconstructed query feature map $\bar{Q}$ is obtained as expressed in Equation (3). In other words, the reconstruction process leverages both the optimized weights and the representative features from the support set to generate the target query feature map.

$$\bar{Q} = \rho \bar{W} S_{c} \tag{3}$$

Then, for each class $c$, the Euclidean distance between the feature map $\bar{Q}$ reconstructed from $S_c$ and the actual query feature map $Q$ is calculated. These distances are subsequently converted into a probability distribution through the softmax function, yielding the predicted class probabilities and the associated class label. The complete algorithmic procedure is summarized in the pseudocode provided in Algorithm 1.

Algorithm 1: FRNE model

As illustrated in Figure 3 and in the pseudocode, the pre-trained feature extractor first produces the spatial feature information of each category from its support samples. The query set is then passed through the same extractor to generate the query feature map, and a mapping is established between the query set and the support set. This mapping is the primary function of FRN, which aims to maximize the fit between the two sets to achieve spatial reconstruction. It enables FRN to classify query samples through spatial feature reconstruction in few-shot learning scenarios. The approach not only makes efficient use of limited sample data but also improves the model’s generalization capability by reconstructing and mapping features.
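The reconstruction-and-compare procedure described in this section can be sketched in a few lines of NumPy. This is an illustrative approximation, not the authors' implementation: the function names, the fixed values of `lam` and `rho` (which the method treats as a penalty coefficient and a learned recalibration term), and the use of the negative mean squared distance as the softmax input are all assumptions made for the example.

```python
import numpy as np


def reconstruct(Q, S_c, lam=0.1, rho=1.0):
    """Reconstruct query features Q (r x d) from one class's support features S_c (m x d)."""
    m = S_c.shape[0]
    gram = S_c @ S_c.T + lam * np.eye(m)        # (m, m), positive definite for lam > 0
    # Solve gram @ W.T = S_c @ Q.T rather than forming the inverse explicitly;
    # since gram is symmetric this gives W = Q S_c^T (S_c S_c^T + lam I)^-1.
    W = np.linalg.solve(gram, S_c @ Q.T).T      # (r, m)
    return rho * W @ S_c                        # (r, d)


def classify(Q, supports, lam=0.1, rho=1.0):
    """Return (predicted class index, class probabilities) for one query feature map."""
    dists = np.array([
        np.mean(np.sum((Q - reconstruct(Q, S_c, lam, rho)) ** 2, axis=1))
        for S_c in supports
    ])
    logits = -dists                             # assumption: smaller error gives larger score
    logits = logits - logits.max()              # shift for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return int(np.argmax(probs)), probs


# Toy example: three classes, each with 10 support feature vectors of dimension 32.
rng = np.random.default_rng(0)
supports = [rng.normal(size=(10, 32)) for _ in range(3)]
# Build a query whose rows are linear combinations of the class-1 support rows.
Q = rng.normal(size=(5, 10)) @ supports[1]
pred, probs = classify(Q, supports, lam=0.01)
print(pred, probs)
```

In the toy example the query lies in the span of one class's support features, so that class reconstructs it almost exactly while the others leave a large residual.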

4. Experimental Analysis

4.1. Dataset Construction

For the few-shot leukocyte classification task, this study uses two publicly available datasets, LDWBC [35] and Raabin [36], as the experimental datasets. LDWBC is a freely accessible white blood cell image classification dataset released by Wuhan University. It was captured using Wright–Giemsa staining with an OLYMPUS BX41 microscope and a Plan N 100×/1.25 objective lens. This dataset comprises a total of 22,645 images, including 224 basophil images, 968 monocyte images, 539 eosinophil images, 10,469 neutrophil images, and 10,445 lymphocyte images. Each image has a resolution of 1280 × 1280 pixels. The Raabin dataset is a publicly available dataset of normal peripheral blood leukocytes developed by Zahra et al. for tasks such as classification and segmentation. All samples in the dataset were Giemsa-stained and captured using two microscopes: Olympus CX18 and Zeiss. It comprises a total of 14,514 white blood cell images, categorized into five distinct classes, 301 basophils, 795 monocytes, 1066 eosinophils, 8891 neutrophils, and 3461 lymphocytes, with each image having a resolution of 575 × 575 pixels. This dataset is primarily designed for white blood cell classification. The LDWBC and Raabin datasets each contain five types of white blood cells, and the morphological differences among these cell types are relatively subtle, presenting a challenge for classification in small-sample scenarios. The sample distribution across classes is relatively balanced, and each class exhibits a moderate level of intra-class variation. All cell images are well-cropped and complete. The two datasets differ in staining methods, which allows for a more comprehensive evaluation of the robustness and generalization capability of few-shot learning approaches. Therefore, this study employs both datasets to assess the performance of few-shot learning methods in this context.

Both datasets are divided into three subsets: a training set, a validation set, and a test set. The training set is further partitioned into a support set and a query set. For the grouped experiments, two support set configurations are used: five classes of cells with 10 samples each (5 way–10 shot) and five classes of cells with 20 samples each (5 way–20 shot). The query set contains the same five classes with 10, 15, or 20 samples each. The remaining samples are split equally (5:5) between the validation set and the test set. Together, these subsets constitute the dataset used in this study for white blood cell classification based on few-shot learning. Furthermore, during dataset loading, preprocessing and data augmentation are performed using scaling, shearing, rotation, pixel normalization, and color jittering. These methods are intended to make the model more robust to variations in object orientation and color.
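The support/query partition described above can be illustrated with a small episode sampler. This is a sketch only: the function name `sample_episode` and its parameters are hypothetical, and the study’s actual data loader may differ.

```python
import random


def sample_episode(labels, n_way=5, k_shot=10, n_query=15, seed=None):
    """Draw one N-way K-shot episode from a list of class labels.

    Returns (support, query): two dicts mapping each sampled class to a
    list of sample indices, with no index shared between the two.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Only classes with enough samples for both sets are eligible.
    eligible = sorted(c for c, idxs in by_class.items()
                      if len(idxs) >= k_shot + n_query)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")

    support, query = {}, {}
    for c in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[c], k_shot + n_query)
        support[c] = picked[:k_shot]   # K labeled examples per class
        query[c] = picked[k_shot:]     # held-out examples to classify
    return support, query
```

With `k_shot=20` and `n_query=15` this corresponds to the 5 way–20 shot configuration with 15 query samples per class.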

It is worth noting that most current few-shot learning studies use 1 shot and 5 shot settings. For leukocyte subtype classification, however, the results obtained with 1 shot and 5 shot are unsatisfactory because of the high similarity between the various leukocyte types. Therefore, while remaining within the scope of few-shot learning, this study increases the number of samples in the support set to 10 shot and 20 shot, thereby enabling the model to achieve better classification performance.

4.2. Experimental Configuration

All experiments presented in this study were conducted on a Windows-based GPU computing platform with an Intel Core i9-12900KF CPU, 32 GB of RAM, an NVIDIA RTX 3060 Ti graphics card, and an Intel® C612 chipset. The software environment was built using the open-source PyTorch 1.13.1 framework with Python 3.7.12. Hyperparameter settings for the experiments are detailed in Table 1. Notably, the improved EfficientNetv2 employed SELU as the activation function and used a balanced cross-entropy loss function during training.
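For readers unfamiliar with the two training components named above, the following sketch shows the SELU activation with its standard constants and one common reading of a balanced cross-entropy loss, in which each sample is weighted by the inverse frequency of its class. The weighting scheme and the function names are assumptions for illustration; the exact loss formulation used in this study is not restated here.

```python
import math

# Standard SELU constants (Klambauer et al. self-normalizing networks).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit for a single float."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def balanced_cross_entropy(logits, target, class_counts):
    """Cross-entropy of one sample, weighted by inverse class frequency.

    Assumed weighting: total / (n_classes * count[target]), so that rare
    classes contribute more to the loss than frequent ones.
    """
    total = sum(class_counts)
    weight = total / (len(class_counts) * class_counts[target])
    m = max(logits)  # subtract the max for a stable log-sum-exp
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return weight * (log_sum_exp - logits[target])
```

Under this weighting, a basophil sample from LDWBC (224 images) would receive a much larger weight than a neutrophil sample (10,469 images).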

4.3. Analysis of Experimental Results

Table 2 shows the comparative results of different few-shot learning methods on the test sets of both the LDWBC and Raabin datasets. As can be seen from Table 2, for every method the results obtained in the 5 way–20 shot scenario are significantly better than those in the 5 way–10 shot scenario. This suggests that in the classification of white blood cells with complex and variable morphologies, appropriately increasing the number of training samples can help the model better learn distinguishing features that may otherwise be obscured by morphological diversity, thereby enhancing overall classification performance. The few-shot learning method based on spatial reconstruction obtained a prediction accuracy of 62.34% for 5 way–10 shot and 77.15% for 5 way–20 shot on the LDWBC dataset, and 64.63% for 5 way–10 shot and 72.02% for 5 way–20 shot on the Raabin dataset. On each dataset, the prediction accuracy of the proposed method exceeds that of the comparison methods under both the 5 way–10 shot and 5 way–20 shot schemes. Compared with those methods, the FRNE model learns and leverages richer spatial features through feature extraction and spatial reconstruction, thereby achieving better results in cell classification under few-shot scenarios.

Table 3 presents the comparative performance of various few-shot learning methods on the LDWBC and Raabin test sets. Table 4 provides a category-wise comparison of accuracy results for each type of leukocyte under few-shot learning settings on the same datasets. As evidenced by Table 3 and Table 4, the FRNE few-shot leukocyte classification network shows improvements of varying size in each evaluation metric when compared with other few-shot learning methods. On the two datasets, the FRNE model achieved maximum OA improvements of 8.64% and 4.52% over the other methods, respectively. Even when compared to the Mandal [14] method, which demonstrated strong OA performance on both datasets, the FRNE model still attained accuracy improvements of 3.67% and 1.27% on the respective datasets. Taking LDWBC as an example, basophil, monocyte, eosinophil, neutrophil, and lymphocyte obtained accuracies of 73.18%, 71.69%, 73.23%, 73.83%, and 74.22% in the 5 way–20 shot scenario, respectively. Although the prediction accuracy for basophils in this study (73.18%) is slightly lower than the 74.08% reported by Kozerawski [12], the FRNE network still attained an average accuracy of 73.23% and an average recall of 74.66%, indicating good overall performance.
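The evaluation metrics referred to above can be computed from a confusion matrix as in the following sketch. It assumes that AP and AR are the macro-averaged precision and recall and that AF1 is the macro-averaged per-class F1-score; these readings and the function name are assumptions, and the definitions given earlier in the paper take precedence.

```python
def classification_metrics(confusion):
    """Compute OA, AP, AR, and AF1 from a square confusion matrix.

    confusion[i][j] is the number of class-i samples predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(n))  # column sum
        actual = sum(confusion[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)

    return {
        "OA": correct / total,       # overall accuracy
        "AP": sum(precisions) / n,   # macro-averaged precision
        "AR": sum(recalls) / n,      # macro-averaged recall
        "AF1": sum(f1s) / n,         # macro-averaged F1 (assumed)
    }
```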

To further demonstrate the performance of the FRNE classification method in comparison with other few-shot learning approaches in the blood cell classification task, this study conducts a significance analysis of the OA and AR performance of each method on the LDWBC and Raabin test sets using the t-test, as presented in Table 5. On the LDWBC dataset, the t-tests show that the FRNE model has stable and statistically significant advantages in both OA and AR over several mainstream methods (e.g., Jiang and Mandal). The differences with the Jiang and Mandal methods are particularly pronounced, with OA reaching a highly significant level (p ≤ 0.001), highlighting the superior cell-type discrimination ability of the proposed model. On the Raabin dataset, FRNE also shows statistically significant improvements in OA (p ≤ 0.05) compared to methods such as Ravi and Kozerawski. In terms of AR, significant differences (p ≤ 0.05) are also observed when compared to Jiang, Kozerawski, and Hamilton, further supporting the robustness of the model’s recall performance. In summary, for the task of small-sample white blood cell classification, the FRNE model consistently demonstrates advantages in both classification accuracy and recall, as evidenced by statistically significant t-test results. These findings suggest that the model maintains strong cell recognition capabilities even under limited data conditions.
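As an illustration of the kind of comparison summarized in Table 5, the sketch below computes the statistic of an unpaired, unequal-variance (Welch) t-test between two sets of repeated scores. The text does not state which t-test variant was used, so the choice of Welch’s test is an assumption, as is the function name.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Sample variances (n - 1 in the denominator), scaled by sample size.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

The p-value then follows from the Student t distribution with `df` degrees of freedom; with SciPy available, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the same statistic together with the two-sided p-value.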

Table 6 shows the comparison results for the LDWBC and Raabin datasets with different numbers of query samples per class, and Table 7 shows the per-class accuracy of the classification results under the same settings. As evidenced in Table 6 and Table 7, the accuracy of the model varies when the spatial reconstruction network is trained with different query set sizes. Moreover, the performance of the model does not improve steadily as the number of samples in the query set increases; instead, the overall classification performance is best when the query set contains 15 samples per class. Taking the LDWBC dataset as an example, with 15 query samples the AF1 score of the reconstruction network improves by 2.56% and 6.16% compared with the scores obtained using 10 and 20 query samples, respectively. Furthermore, when the number of query samples reaches 20, the model exhibits signs of overfitting, which negatively impacts its overall performance. As shown in Table 7, the prediction accuracies of the individual cell types are also higher with 15 query samples than with 10 or 20. Consequently, in this study, the number of query samples per class was fixed at 15 during the training of the spatial reconstruction network to ensure that the model attained better prediction accuracy.

Table 8 presents the comparison results of different feature extractors on the LDWBC and Raabin datasets under the 5 way–20 shot setting. Table 9 shows the corresponding per-class accuracy for each type of leukocyte (5 way–20 shot). As shown in Table 8 and Table 9, the classification results obtained with the improved EfficientNetv2 proposed in this study as the feature extractor are better than those obtained with the Conv-4 and ResNet-12 networks used by Wertheimer et al. [34]. Taking LDWBC as an example, the few-shot classification network using the improved EfficientNetv2 as a feature extractor improved the AF1 score by 5.62% and 5.97% compared to Conv-4 and ResNet-12, respectively, which shows that the leukocyte classification model proposed in this study achieves better classification accuracy.

5. Discussion

The improved EfficientNetv2 feature extractor is capable of capturing multi-scale features, thereby adapting to objects of varying sizes and shapes and effectively extracting key characteristics such as cell morphology, edge information, and texture features. To further validate its effectiveness as a feature extraction method within the few-shot learning framework proposed in this paper, ablation experiments, as described in Table 10, were conducted on both the LDWBC and Raabin datasets. The comparative results demonstrate that, across different datasets, the performance metrics OA, AP, AR, and AF1 have been improved by using the ASPP module. This indicates that the ASPP module enhances the feature extractor’s ability to better comprehend and differentiate various types of white blood cells, particularly in few-shot learning scenarios.
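To make the multi-scale effect of the ASPP branches concrete, the sketch below computes the effective kernel size of a dilated convolution, k + (k − 1)(r − 1) for kernel size k and dilation rate r, and applies a 1-D dilated filter. The dilation rates 6, 12, and 18 mentioned in the usage note are the ones commonly associated with DeepLab-style ASPP and are used here only as hypothetical values; the rates of the improved EfficientNetv2 are those given in the model description.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (r - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1-D dilated correlation (no padding, stride 1).

    Kernel taps are placed `dilation` positions apart, so the filter sees
    a wider context without adding parameters.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]
```

For a 3 × 3 kernel, rates of 1, 6, 12, and 18 correspond to effective sizes of 3, 13, 25, and 37 along each axis, which is how parallel branches with different rates cover cell structures of different scales.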

Since the FRNE model depends on morphological features extracted from cell datasets for its training, and given that white blood cells exhibit substantially higher morphological complexity and diversity compared to those in other application scenarios, this pronounced heterogeneity in characteristics renders the model more sensitive to data distribution variations in few-shot learning contexts. Specifically, variations such as the granule distribution and nuclear morphology in neutrophils, along with the irregular nuclear shapes of monocytes, can lead to blurred inter-class boundaries in the feature space, which compromises the classification performance of the FRNE model under few-shot conditions. Taking the Raabin dataset as an example, examples of misclassified images caused by the aforementioned factors are illustrated in Figure 4. Figure 4a presents representative images from the dataset, including neutrophils, eosinophils, monocytes, and lymphocytes in sequence. Figure 4b illustrates some cases in which a neutrophil is misclassified as an eosinophil. Figure 4c provides several examples of a monocyte being misclassified as a lymphocyte. Figure 4d displays some instances where a neutrophil is misclassified as a lymphocyte. These misclassification scenarios shown in Figure 4b–d occur relatively frequently when applying the FRNE model, which aligns with the observations reported in Reference [37]. The nuclear lobe morphology of neutrophils and eosinophils shows a certain degree of overlap, and their cytoplasmic characteristics also exhibit similarities. However, neutrophilic granules are typically coarser and their cytoplasm generally stains more lightly than that of eosinophils, as demonstrated in Figure 4a. During morphological classification, neutrophils may be misclassified due to their thick granule texture characteristics. Examples of such misclassification are presented in the first two images of Figure 4b. 
Additionally, when neutrophils are tilted or oriented at an angle, their nuclear lobes may resemble those of eosinophils, leading to potential misidentification, as demonstrated in the last two images of Figure 4b. Monocytes typically possess nuclei with kidney-shaped or horseshoe-shaped morphologies, characterized by loosely arranged chromatin. In contrast, lymphocytes have relatively large, round or oval nuclei with densely packed chromatin. However, when a monocyte nucleus appears nearly round and the chromatin structure is indistinct, or when the chromatin appears relatively dense, it may be misclassified as a lymphocyte, as shown in Figure 4c. Similarly, when neutrophil nuclear lobes are folded or distorted, or when staining artifacts cause certain regions to appear nearly round, these cells may also be misclassified as lymphocytes, as depicted in Figure 4d. To address these challenges, the training dataset should encompass a wide variety of cellular morphologies. In future studies, it is recommended to implement standardized staining protocols, enhance image resolution, and incorporate textual prior knowledge into the model to improve the classification accuracy of small-sample learning approaches.

The model in this paper has demonstrated commendable performance in the classification of blood cells. However, it still presents certain limitations and opportunities for further research in the following areas. Firstly, the model reconstructs the features of the query sample from the support set samples, and it relies heavily on the pre-trained feature extractor to generate high-quality feature representations. If the feature extraction of the dataset is inadequate, or if the feature space distributions across datasets are inconsistent, the reconstruction performance will deteriorate. Therefore, in cross-dataset generalization scenarios, greater attention should be given to feature extraction and the distribution of the feature space. Furthermore, pre-training offers extensive prior knowledge and establishes a strong starting point for few-shot learning. In principle, a more powerful pre-training process generally leads to better performance. However, it may also introduce confounding factors during the learning process, potentially hindering the classification model’s ability to eliminate irrelevant interference. Further research is needed to better understand the causal relationships among upstream and downstream datasets, features derived from pre-training, classification models, and overall classification performance. Secondly, the final classification of FRNE is determined by feature reconstruction and the comparison of reconstruction errors. If the feature space is perturbed, misclassification may occur due to inaccurate reconstruction error estimation. Variations in staining methods can alter the color and texture information in images, thereby affecting the stability of feature reconstruction and introducing interference. Therefore, staining normalization can be performed at an early stage of the feature extraction network to avoid potential interference.
Moreover, the addition of noise can be implemented to enhance the model’s ability to learn from diverse data variations, ultimately improving its generalization performance on unseen or perturbed data. Thirdly, the inference time of FRNE for this task is at the millisecond level. However, its clinical deployment remains constrained by limited computational resources and the need to protect patient data privacy, which hinders the practical implementation of the technology. Federated learning [38] enables local model fine-tuning while restricting interactions to model parameters only, thereby preventing the direct exchange and potential leakage of raw patient data and enhancing data privacy. Integrating federated learning with few-shot learning presents a promising approach for practical clinical deployment. Furthermore, the current method does not support the processing of heterogeneous input data, such as text and images, indicating the need for further research in this area.

It should also be noted that in practical applications within the medical field, the incidence rates of diseases vary significantly. Collecting samples for low-incidence or rare diseases often presents considerable challenges. Consequently, the support set may lack sufficient representation of certain categories, leading to class imbalance. Applying few-shot learning methods that assume a balanced support set in such scenarios may introduce potential biases. The structural design of the proposed model can partially mitigate the issue of class imbalance: it does not rely directly on the number of samples but instead maps query samples to the feature space of the support set samples, so it remains effective in distinguishing categories through reconstruction error, provided the feature distributions remain consistent. Furthermore, the model’s ability to capture subtle local differences in fine-grained classification helps reduce bias toward the majority class. However, when the number of samples is extremely limited or the feature distribution is skewed, the model may fail to fully learn distinctive features, which could result in biased predictions. Currently, research on the imbalance problem primarily focuses on the data and algorithm levels. Common strategies include data augmentation and generation [39], oversampling techniques [40], regularization methods [41], and modifications to the loss function [42]. For specific tasks, adjustments to model architecture can also be considered. For example, Li et al. [43] developed a feature interaction module tailored for white patchy skin lesions, aiming to better understand data distribution and mitigate the effects of class imbalance. Song et al. [44] employed a pyramid fusion mechanism to achieve multi-granularity perception, thereby enhancing low-contrast features. Chen et al. [45] proposed a two-stage framework for cross-modal causal representation learning, with specialized modules designed for each stage to address spurious correlations between vision and language, as well as inherent limitations in radiological imaging. These approaches offer valuable insights that can be adapted and applied to few-shot learning scenarios [46], enabling the simulation of performance under imbalanced conditions. In addition, the quality of cell images significantly influences the practical effectiveness of few-shot learning. High-quality images that clearly depict fine details of cell morphology and structure contribute to more accurate feature extraction and representation, particularly for datasets with limited samples. In future research, advancements in imaging systems, such as those demonstrated in studies on phase-manipulating Fresnel lenses and video-activated cell sorting [47,48], can be leveraged to achieve breakthroughs at the hardware level.

6. Conclusions

The present study proposes FRNE as an efficient and robust few-shot learning method for leukocyte classification. The proposed method is designed to address the limitations of traditional machine learning models in sample-scarce scenarios. It enhances the model’s characterization ability and generalization performance on limited sample data by integrating a pre-training mechanism with a feature reconstruction network. The results demonstrate that FRNE exhibits high prediction accuracy and recall on both the LDWBC and Raabin datasets. The 5 way–20 shot configuration yielded a prediction accuracy of 77.15% on the LDWBC dataset and 72.02% on the Raabin dataset. Compared with alternative few-shot learning methods, the FRNE approach exhibits higher prediction accuracy and better generalization capability, thereby further advancing the potential applications of few-shot learning in leukocyte classification.

Although the FRNE model demonstrates remarkable performance on specific datasets, there is still a significant challenge in achieving superior generalization across medical image datasets with disparate sources. In the future, further research may be conducted into more robust feature representation learning methods, or domain adaptation techniques may be introduced with a view to enhancing the model’s adaptability across different datasets.

Author Contributions

Conceptualization, X.W., G.P., C.O. and Z.H.; methodology, X.W., G.P. and C.O.; software, G.P., C.O. and X.W.; validation, X.W., G.P. and C.O.; formal analysis, X.W., G.P., Z.H. and C.O.; investigation, X.W.; resources, X.W. and Z.H.; data curation, G.P., C.O. and K.C.; writing—original draft preparation, G.P., X.W. and C.O.; writing—review and editing, X.W., C.O. and Z.H.; visualization, C.O. and K.C.; supervision, Z.H. and X.W.; project administration, X.W.; funding acquisition, X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the key specialized research and development breakthrough of Henan province (Grant No. 232102211016) and the key scientific research projects of Henan colleges and universities (Grant No. 23A416004).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The publicly available dataset LDWBC was analyzed in this study. Data can be found here: https://biod.whu.edu.cn/sjj.htm (accessed on 3 February 2025). The publicly available dataset Raabin was analyzed in this study. Data can be found here: https://www.kaggle.com/datasets/raabindata/raabin-wbc (accessed on 3 February 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

FRNE	Feature Reconstruction Network with Improved EfficientNetV2
ASPP	Atrous Spatial Pyramid Pooling
LSTM	Long Short-Term Memory
CAML	Conditional Class-Aware Meta-Learning
SVM	Support Vector Machine
GNN	Graph Neural Networks
GraphSAGE	Graph SAmple and aggreGatE
STN	Siamese Twin Network
CT	Computed Tomography
MCL	Momentum Contrastive Learning
ISIC 2018	International Skin Imaging Collaboration 2018
FRN	Feature Map Reconstruction Networks
GPU	Graphics Processing Unit
CPU	Central Processing Unit
SELU	Scaled Exponential Linear Unit
AF1	Aggregated F1-score

References

1. Khan, M.A.; Qasim, M.; Lodhi, H.M.J.; Nazir, M.; Javed, K.; Rubab, S.; Din, A.; Habib, U. Automated design for recognition of blood cells diseases from hematopathology using classical features selection and ELM. Microsc. Res. Tech. 2021, 84, 202–216.
2. Almezhghwi, A.; Serte, S. Improved classification of white blood cells with the generative adversarial network and deep convolutional neural network. Comput. Intell. Neurosci. 2020, 2020, 6490479.
3. Siddique, M.; Aziz, A.; Matin, A. An improved deep learning based classification of human white blood cell images. In Proceedings of the International Conference on Electrical and Computer Engineering (ICECE), Dhaka, Bangladesh, 17–19 December 2020; pp. 149–152.
4. Ghosh, S.; Majumder, M.; Kudeshia, A. Leukox: Leukocyte classification using least entropy combiner (LEC) for ensemble learning. IEEE Trans. Circuits Syst. II Express Briefs 2021, 68, 2977–2981.
5. Saade, P.; El, J.R.; El Hayek, S.; Abi Zeid, J.; Falou, O.; Azar, D. Computer-aided detection of white blood cells using geometric features and color. In Proceedings of the Cairo International Biomedical Engineering Conference (CIBEC), Cairo, Egypt, 20–22 December 2018; pp. 142–145.
6. Zeng, L.; Fu, Y.; Guo, J.; Hu, H.; Li, H.; Wang, N. AI-based portable white blood cells classification and counting system in POCT. IEEE Sens. J. 2024, 24, 11057–11068.
7. Özyurt, F. A fused CNN model for WBC detection with mRMR feature selection and extreme learning machine. Soft Comput. 2020, 24, 8163–8172.
8. Baby, D.; Devaraj, S.J.; Hemanth, J.; Raj, A. Leukocyte classification based on feature selection using extra trees classifier: A transfer learning approach. Turk. J. Electr. Eng. Comput. Sci. 2021, 29, 2742–2757.
9. Dong, Y.; Shi, O.; Zeng, Q.; Lu, X.; Wang, W.; Li, Y.; Wang, Q. Leukemia incidence trends at the global, regional, and national level between 1990 and 2017. Exp. Hematol. Oncol. 2020, 9, 14.
10. Ravi, S.; Larochelle, H. Optimization as a model for few-shot learning. In Proceedings of the International Conference on Learning Representations (ICLR), Toulon, France, 24–26 April 2017.
11. Jiang, X.; Havaei, M.; Varno, F.; Chartrand, G.; Chapados, N.; Matwin, S. Learning to learn with conditional class dependencies. In Proceedings of the International Conference on Learning Representations (ICLR), New Orleans, LA, USA, 6–9 May 2019.
12. Kozerawski, J.J. Meta-Learning for Few-Shot Image Classification. Ph.D. Thesis, University of California, Santa Barbara, CA, USA, 3–7 May 2021. Available online: https://api.semanticscholar.org/CorpusID:249680233 (accessed on 12 July 2025).
13. Gidaris, S.; Komodakis, N. Dynamic few-shot visual learning without forgetting. CVPR 2018, 6, 4367–4375.
14. Mandal, D.; Medyav, S.; Uzzi, B.; Aggarwal, C. Meta-learning with graph neural networks: Methods and applications. ACM SIGKDD Explor. Newsl. 2022, 23, 13–22.
15. Hamilton, W.; Ying, R.; Leskovec, J. Inductive representation learning on large graphs. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Volume 30, pp. 1025–1035. Available online: https://dl.acm.org/doi/abs/10.5555/3294771.3294869 (accessed on 12 July 2025).
16. Tummala, S.; Suresh, A.K. Few-shot learning using explainable Siamese twin network for the automated classification of blood cells. Med. Biol. Eng. Comput. 2023, 61, 1549–1563.
17. Puch, S.; Sánchez, I.; Rowe, M. Few-shot learning with deep triplet networks for brain imaging modality recognition. In Proceedings of the MICCAI Workshop on Domain Adaptation and Representation Transfer, Shenzhen, China, 13 October 2019; pp. 181–189.
18. Chen, X.; Yang, L.; Zhang, T.; Dong, J.; Zhang, Y. Momentum contrastive learning for few-shot COVID-19 diagnosis from chest CT images. Pattern Recognit. 2021, 113, 107826.
19. Medela, A.; Picón, A.; Saratxaga, C.L.; Belar, O.; Cabezón, V.; Cicchi, R.; Bilbao, R.; Glover, B. Few-shot learning in histopathological images: Reducing the need of labeled data on biological datasets. In Proceedings of the IEEE 16th International Symposium on Biomedical Imaging (ISBI), Venice, Italy, 8–11 April 2019; pp. 1860–1864.
20. Wang, Y.; Zhu, Y.; Wang, H.; Wang, R.; Duan, H.; Guo, L.; Liu, P. Meta-DDA: Meta-Learning with Diffusion and Dual Augmentation for Few-Shot Text Classification. Knowledge-Based Syst. 2025, 327, 114179.
21. Jankowski, N.; Duch, W.; Grąbczewski, K. Meta-Learning in Computational Intelligence; Springer: Berlin, Germany, 2011; pp. 97–115.
22. Singh, R.; Bharti, V.; Purohit, V.; Kumar, A.; Singh, A.K.; Singh, S.K. MetaMed: Few-shot medical image classification using gradient-based meta-learning. Pattern Recognit. 2021, 120, 108111.
23. Sun, L.; Zhang, M.; Wang, B.; Tiwar, P. Few-shot class-incremental learning for medical time series classification. IEEE J. Biomed. Health Inform. 2023, 28, 1872–1882.
24. Liu, Z.; Chen, Y.; Zhang, Y.; Ran, S.; Cheng, C.; Yang, G. Diagnosis of arrhythmias with few abnormal ECG samples using metric-based meta learning. Comput. Biol. Med. 2023, 153, 106465.
25. Roy, A.G.; Siddiqui, S.; Pölsterl, S.; Navab, N.; Wachinger, C. ‘Squeeze & excite’-guided few-shot segmentation of volumetric images. Med. Image Anal. 2020, 59, 101587.
26. Wang, X.; Yuan, Y.; Guo, D.; Huang, X.; Cui, Y.; Xia, M.; Wang, Z.; Bai, C.; Chen, S. SSA-Net: Spatial self-attention network for COVID-19 pneumonia infection segmentation with semi-supervised few-shot learning. Med. Image Anal. 2022, 79, 102459.
27. Cheng, Z.; Wang, S.; Xin, T.; Zhou, T.; Zhang, H.; Shao, L. Few-shot medical image segmentation via generating multiple representative descriptors. IEEE Trans. Med. Imaging 2024, 43, 2202–2214.
28. Zhu, Y.; Wang, S.; Xin, T.; Zhang, Z.; Zhang, H. Partition-a-medical-image: Extracting multiple representative sub-regions for few-shot medical image segmentation. IEEE Trans. Instrum. Meas. 2024, 73, 5016312.
29. Jiang, H.; Gao, M.; Li, H.; Jin, R.; Miao, H.; Liu, J. Multi-learner based deep meta-learning for few-shot medical image classification. IEEE J. Biomed. Health Inform. 2022, 27, 17–28.
30. Depto, D.S.; Rahman, S.; Hosen, M.M.; Akter, M.S.; Reme, T.R.; Rahman, A.; Zunair, H.; Rahman, S.; Mahdy, M.R.C. Automatic segmentation of blood cells from microscopic slides: A comparative analysis. Tissue Cell 2021, 73, 101653.
31. Chossegros, M.; Delhommeau, F.; Stockholm, D.; Tannier, X. Improving the generalizability of white blood cell classification with few-shot domain adaptation. J. Pathol. Inform. 2024, 15, 100405.
32. Tan, M.; Le, Q. EfficientNetV2: Smaller models and faster training. In Proceedings of the 38th International Conference on Machine Learning (ICML), Virtual Event, 18–24 July 2021; pp. 10096–10106.
33. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
34. Wertheimer, D.; Tang, L.M.; Hariharan, B. Few-shot classification with feature map reconstruction networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 8012–8021.
35. Chen, H.; Liu, J.; Hua, C.; Zuo, Z.; Feng, J.; Pang, B.; Xiao, D. TransMixNet: An attention-based double-branch model for white blood cell classification and its training with the fuzzified training data. In Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Houston, TX, USA, 9–12 December 2021; pp. 842–847.
36. Kouzehkanan, S.Z.M.; Saghari, S.; Tavakoli, E.; Rostami, P.; Abaszadeh, M.; Mirzadeh, F.; Satlsar, E.S.; Gheidishahran, M.; Gorgi, F.; Mohammadi, S.; et al. Raabin-WBC: A large free access dataset of white blood cells from normal peripheral blood. bioRxiv 2021.
37. Üzen, H.; Fırat, H. A hybrid approach based on multipath Swin transformer and ConvMixer for white blood cells classification. Health Inf. Sci. Syst. 2024, 12, 33.
38. Huang, W.; Ye, M.; Du, B.; Gao, X. Few-shot model agnostic federated learning. In Proceedings of the 30th ACM International Conference on Multimedia, Lisbon, Portugal, 10–14 October 2022; pp. 7309–7316.
39. Luan, S.; Yu, X.; Lei, S.; Ma, C.; Wang, X.; Xue, X.; Ding, Y.; Ma, T.; Zhu, B. Deep learning for fast super-resolution ultrasound microvessel imaging. Phys. Med. Biol. 2023, 68, 245023.
40. Gao, X.; Ren, B.; Zhang, H.; Sun, B.; Li, J.; Xu, J.; He, Y.; Li, K. An ensemble imbalanced classification method based on model dynamic selection driven by data partition hybrid sampling. Expert Syst. Appl. 2020, 160, 113660.
41. Triantafillou, E.; Zhu, T.; Dumoulin, V.; Lamblin, P.; Evci, U.; Xu, K.; Goroshin, R.; Gelada, C.; Swersky, K.; Manzagol, P.; et al. Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. In Proceedings of the International Conference on Learning Representations (ICLR), Addis Ababa, Ethiopia, 26–30 April 2020.
42. Ansari, M.Y.; Mangalote, I.A.C.; Masri, D.; Dakua, S.P. Neural network-based fast liver ultrasound image segmentation. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Queensland, Australia, 18–23 June 2023; IEEE: Washington, DC, USA, 2023; pp. 1–8.
43. Li, Z.; Jiang, S.; Xiang, F.; Li, C.; Li, S.; Gao, T.; He, K.; Chen, J.; Zhang, J.; Zhang, J. White patchy skin lesion classification using feature enhancement and interaction transformer module. Biomed. Signal Process. Control. 2025, 107, 107819.
44. Song, W.; Wang, X.; Guo, Y.; Li, S.; Xia, B.; Hao, A. CenterFormer: A Novel Cluster Center Enhanced Transformer for Unconstrained Dental Plaque Segmentation. IEEE Trans. Multimed. 2024, 26, 10965–10978.
45. Chen, W.; Liu, Y.; Wang, C.; Zhu, J.; Li, G.; Liu, C.L.; Lin, L. Cross-Modal Causal Representation Learning for Radiology Report Generation. IEEE Trans. Image Process. 2025, 34, 2970–2985.
46. Ochal, M.; Patacchiola, M.; Vazquez, J.; Storkey, A.; Wang, S. Few-shot learning with class imbalance. IEEE Trans. Artif. Intell. 2023, 4, 1348–1358.
47. Wang, S.; Dong, B.; Xiong, J.; Liu, L.; Shan, M.; Koch, A.W.; Gao, S. Phase manipulating Fresnel lenses for wide-field quantitative phase imaging. Opt. Lett. 2025, 50, 2683–2686.
48. He, W.; Zhu, J.; Feng, Y.; Liang, F.; You, K.; Chai, H.; Sui, Z.; Hao, H.; Li, G.; Zhao, J.; et al. Neuromorphic-enabled video-activated cell sorting. Nat. Commun. 2024, 15, 10792.

Figure 1. Few-shot leukocyte classification model based on FRNE.

Figure 2. The improved EfficientNetv2 model.

Figure 3. Schematic diagram of spatial feature reconstruction.

Figure 4. Examples of incorrect white blood cell classification. (a) Four representative cell types in the Raabin dataset. From top to bottom: neutrophil, eosinophil, monocyte, and lymphocyte (N: neutrophil, E: eosinophil, M: monocyte, L: lymphocyte); (b) four examples of neutrophils misclassified as eosinophils (N→E); (c) four examples of monocytes misclassified as lymphocytes (M→L); (d) four examples of neutrophils misclassified as lymphocytes (N→L).

Table 1. Hyperparameter setting of FRNE.

Hyperparameter	Value
Activation Function	SELU
Cost Function	BCE Loss
Learning Rate	1 × 10⁻²
Weight Decay	5 × 10⁻⁴
Optimizer	Adam
Epochs	100
Batch Size	32
Learning rate cut scalar	0.1
Training Callbacks	Model Checkpoint, Reduce LR On Plateau, Cosine Annealing, Early Stopping

Table 2. Comparison results of different few-shot methods on the LDWBC and Raabin datasets (%) (5 way–10 shot and 5 way–20 shot).

Dataset	Comparison Method	Type	5 Way–10 Shot Accuracy	5 Way–20 Shot Accuracy
LDWBC	Ravi [10]	Meta-Learning	55.36	73.68
	Jiang [11]	Meta-Learning	53.44	70.71
	Kozerawski [12]	Metric Learning	56.47	72.63
	Gidaris [13]	Metric Learning	52.11	76.43
	Mandal [14]	GNN	58.24	68.51
	Hamilton [15]	GNN	60.81	73.48
	FRNE	FRN	62.34	77.15
Raabin	Ravi [10]	Meta-Learning	54.98	68.28
	Jiang [11]	Meta-Learning	53.75	71.88
	Kozerawski [12]	Metric Learning	58.06	67.72
	Gidaris [13]	Metric Learning	50.6	67.5
	Mandal [14]	GNN	52.8	68.54
	Hamilton [15]	GNN	64.8	70.75
	FRNE	FRN	64.63	72.02

Table 3. Comparison results of different few-shot learning methods on LDWBC and Raabin test sets (%) (5 way–20 shot).

Dataset	Comparison Method	OA	AR	AP	AF1
LDWBC	Ravi [10]	73.68	68.31	65.85	67.06
	Jiang [11]	70.71	68.07	69.88	68.96
	Kozerawski [12]	72.63	69.51	71.23	70.36
	Gidaris [13]	76.43	68.81	74.61	71.59
	Mandal [14]	68.51	72.32	72.03	71.97
	Hamilton [15]	73.48	69.88	74.15	71.95
	FRNE	77.15	73.23	74.66	73.94
Raabin	Ravi [10]	68.28	71.63	67.11	69.3
	Jiang [11]	71.88	68.99	68.45	68.72
	Kozerawski [12]	67.72	69.32	66.71	67.99
	Gidaris [13]	67.5	69.43	70.07	69.75
	Mandal [14]	68.54	69.52	71.12	70.31
	Hamilton [15]	70.75	68.76	68.38	68.57
	FRNE	72.02	73.09	69.17	71.08

OA: overall accuracy, AR: average recall, AP: average precision, AF1: average F1-score.
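The footnote above only expands the abbreviations, so the following is a minimal sketch of how the four summary metrics could be computed from a confusion matrix. It assumes macro averaging over classes for AR and AP, and it assumes that AF1 is the harmonic mean of AR and AP. That second assumption is inferred from the reported numbers (for FRNE on LDWBC, 73.23 and 74.66 give 73.94, and most other rows of Table 3 agree), not stated in this section. The function name `summary_metrics` is hypothetical.

```python
def summary_metrics(confusion):
    """Compute OA, AR, AP and AF1 from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    recalls, precisions = [], []
    for i in range(n):
        true_i = sum(confusion[i])                       # samples whose true class is i
        pred_i = sum(confusion[r][i] for r in range(n))  # samples predicted as i
        recalls.append(confusion[i][i] / true_i if true_i else 0.0)
        precisions.append(confusion[i][i] / pred_i if pred_i else 0.0)

    oa = correct / total
    ar = sum(recalls) / n        # macro-averaged recall (assumed)
    ap = sum(precisions) / n     # macro-averaged precision (assumed)
    # Assumption: AF1 is the harmonic mean of AR and AP.
    af1 = 2 * ar * ap / (ar + ap) if (ar + ap) else 0.0
    return {"OA": oa, "AR": ar, "AP": ap, "AF1": af1}


if __name__ == "__main__":
    # Toy two-class example, not data from the paper.
    print(summary_metrics([[8, 2], [1, 9]]))
```

Under this reading AF1 always lies between AR and AP, which makes it easy to sanity-check a table row by hand.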

Table 4. The proportion of correct predictions (%) for each leukocyte category on LDWBC and Raabin test sets based on different few-shot learning methods (5 way–20 shot).

Dataset	Comparison Method	B	M	E	N	L
LDWBC	Ravi [10]	69.17	70.84	65.61	68.47	67.45
	Jiang [11]	65.21	74.33	65.53	69.9	65.38
	Kozerawski [12]	74.08	73.48	65.93	66.19	67.85
	Gidaris [13]	70.93	66.74	67.6	72.86	65.9
	Mandal [14]	73.43	73.09	66.87	73.57	72.64
	Hamilton [15]	71.87	68.57	69.75	69.41	69.78
	FRNE	73.18	71.69	73.23	73.83	74.22
Raabin	Ravi [10]	70.3	70.19	74.47	73.49	69.69
	Jiang [11]	68.8	71.49	65.59	73.95	65.14
	Kozerawski [12]	67.13	70.42	73.29	68.72	67.06
	Gidaris [13]	69.46	67.24	72.19	65.58	72.68
	Mandal [14]	65.51	69.32	73.17	65.87	73.73
	Hamilton [15]	69.62	66.64	67.5	72.56	67.5
	FRNE	73.13	70.78	72.25	74.33	74.98

B: basophil, M: monocyte, E: eosinophil, N: neutrophil, L: lymphocyte.

Table 5. t-test results on the LDWBC and Raabin datasets under the OA and AR metrics, comparing each method with the FRNE model (* p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001).

Dataset	Metric	Comparison Method	Group 1 (Mean ± SD)	Group 2, FRNE (Mean ± SD)	t-Statistic	p-Value
LDWBC	OA	Ravi [10]	73.68 ± 1.57	77.15 ± 2.07	−2.987	0.017 *
		Jiang [11]	70.71 ± 1.94		−5.076	0.001 ***
		Kozerawski [12]	72.63 ± 2.31		−3.258	0.012 *
		Gidaris [13]	76.43 ± 1.90		−0.573	0.582
		Mandal [14]	68.51 ± 2.94		−5.373	0.001 ***
		Hamilton [15]	73.48 ± 1.62		−3.122	0.014 *
	AR	Ravi [10]	68.31 ± 1.70	73.23 ± 2.89	−3.281	0.011 *
		Jiang [11]	68.07 ± 1.56		−3.513	0.008 **
		Kozerawski [12]	69.51 ± 1.31		−2.622	0.031 *
		Gidaris [13]	68.81 ± 2.68		−2.508	0.037 *
		Mandal [14]	72.32 ± 3.03		−0.486	0.640
		Hamilton [15]	69.88 ± 2.46		−1.974	0.084
Raabin	OA	Ravi [10]	68.28 ± 2.00	72.02 ± 2.54	−2.587	0.032 *
		Jiang [11]	71.88 ± 2.03		−0.096	0.926
		Kozerawski [12]	67.72 ± 1.97		−2.991	0.017 *
		Gidaris [13]	67.50 ± 1.92		−3.174	0.013 *
		Mandal [14]	68.54 ± 2.03		−2.393	0.044 *
		Hamilton [15]	70.75 ± 2.00		−0.878	0.405
	AR	Ravi [10]	71.63 ± 2.89	73.09 ± 2.87	−0.802	0.446
		Jiang [11]	68.99 ± 2.20		−2.535	0.035 *
		Kozerawski [12]	69.32 ± 2.01		−2.406	0.043 *
		Gidaris [13]	69.43 ± 2.23		−2.252	0.054
		Mandal [14]	69.52 ± 2.32		−2.163	0.062
		Hamilton [15]	68.76 ± 2.33		−2.619	0.031 *

OA: overall accuracy, AR: average recall.
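The t-statistics in Table 5 can be recomputed from the reported means and standard deviations. The sketch below assumes a two-sample, pooled-variance t-test with five runs per group. The group size is not given in this section; it is inferred because n = 5 reproduces the tabulated values (for example, the first two LDWBC OA rows). The function names are hypothetical, and the p-value is obtained by numerically integrating the Student t density with the standard library only.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3   # P(-|t| < T < |t|)
    return 1 - central_mass


def t_test_from_summary(mean1, sd1, mean2, sd2, n=5):
    """Pooled two-sample t-test for two groups of equal size n (n is assumed)."""
    se = math.sqrt((sd1 ** 2 + sd2 ** 2) / n)   # standard error of the difference
    t = (mean1 - mean2) / se
    df = 2 * n - 2
    return t, df, two_sided_p(t, df)


if __name__ == "__main__":
    # First row of Table 5: Ravi [10] vs. FRNE, OA on LDWBC.
    t, df, p = t_test_from_summary(73.68, 1.57, 77.15, 2.07)
    print(f"t = {t:.3f}, df = {df}, p = {p:.3f}")
```

With these inputs the sketch gives t ≈ −2.99 and p ≈ 0.017, in line with the first row of the table. A negative t-statistic means the comparison method (Group 1) has the lower mean.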

Table 6. Comparative results of the different query set numbers for LDWBC and Raabin datasets (5 way–20 shot).

Dataset	Query Set Size	OA	AR	AP	AF1
LDWBC	10	62.72	70.57	72.21	71.38
	15	77.15	73.23	74.66	73.94
	20	64.59	67.34	68.23	67.78
Raabin	10	73.76	68.16	68.27	68.22
	15	72.02	73.09	69.17	71.08
	20	64.86	68.31	70.50	70.70

OA: overall accuracy, AR: average recall, AP: average precision, AF1: average F1-score.
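To make the episode settings in Table 6 concrete, here is a minimal sketch of how one N-way K-shot episode with a given query set size could be drawn. It assumes that the query set size counts query samples per class and that support and query samples are disjoint. Neither detail is spelled out in this section, and `sample_episode` is a hypothetical helper rather than the authors' code.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way=5, k_shot=20, n_query=15, rng=None):
    """Draw one episode; returns (support, query) lists of (index, label) pairs."""
    rng = rng or random.Random()

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough samples for both support and query are usable.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")

    support, query = [], []
    for c in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[c], k_shot + n_query)  # no repeats within a class
        support += [(i, c) for i in picked[:k_shot]]
        query += [(i, c) for i in picked[k_shot:]]
    return support, query


if __name__ == "__main__":
    # Toy label list: 40 samples for each of the five leukocyte classes.
    labels = [c for c in "BMENL" for _ in range(40)]
    support, query = sample_episode(labels, rng=random.Random(0))
    print(len(support), len(query))
```

Under these assumptions, a 5 way–20 shot episode with a query set size of 15 contains 100 support samples and 75 query samples.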

Table 7. The proportion of correct predictions (%) for each leukocyte category across different query set sizes (5 way–20 shot).

Dataset	Query Set Size	B	M	E	N	L
LDWBC	10	68.25	70.43	69.33	71.96	72.88
	15	73.18	71.69	73.23	73.83	74.22
	20	71.51	65.14	65.05	68.95	66.05
Raabin	10	67.23	69.29	71.01	68.79	64.5
	15	73.13	70.78	72.25	74.33	74.98
	20	66.95	69.04	72.46	67.82	65.3

B: basophil, M: monocyte, E: eosinophil, N: neutrophil, L: lymphocyte.

Table 8. Comparative results of different feature extractors using LDWBC and Raabin datasets (5 way–20 shot).

Dataset	Feature Extractor Models	OA	AR	AP	AF1
LDWBC	Conv-4	73.56	70.22	66.53	68.32
	ResNet-12	76.53	69.81	66.23	67.97
	Improved EfficientNetv2	77.15	73.23	74.66	73.94
Raabin	Conv-4	69.31	68.48	68.25	68.36
	ResNet-12	70.26	68.77	65.41	67.05
	Improved EfficientNetv2	72.02	73.09	69.17	71.08

OA: overall accuracy, AR: average recall, AP: average precision, AF1: average F1-score.

Table 9. The proportion of correct predictions (%) for each leukocyte category by different feature extractors using LDWBC and Raabin datasets (5 way–20 shot).

Dataset	Feature Extractor Models	B	M	E	N	L
LDWBC	Conv-4	67.09	68.84	74.68	68.5	71.98
	ResNet-12	68.57	70.69	71.68	69.49	68.64
	Improved EfficientNetv2	73.18	71.69	73.23	73.83	74.22
Raabin	Conv-4	69.46	68.29	67.54	70.78	66.31
	ResNet-12	67.72	70.94	71.27	69.92	63.98
	Improved EfficientNetv2	73.13	70.78	72.25	74.33	74.98

B: basophil, M: monocyte, E: eosinophil, N: neutrophil, L: lymphocyte.

Table 10. Impact of ASPP ablation on few-shot classification performance (LDWBC and Raabin, 5 way–20 shot).

Dataset	ASPP	OA (%)	AR (%)	AP (%)	AF1 (%)
LDWBC	No	75.96	72.80	72.93	72.52
	Yes	77.15	73.23	74.66	73.94
Raabin	No	71.64	71.48	69.74	70.06
	Yes	72.02	73.09	69.17	71.08

OA: overall accuracy, AR: average recall, AP: average precision, AF1: average F1-score.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
