Test-Time Augmentation for Cross-Domain Leukocyte Classification via OOD Filtering and Self-Ensembling
Abstract
1. Introduction
- We propose a TTA procedure that exploits, at inference time, the knowledge the DL model has already learnt during training;
- We propose an OOD filtering procedure that uses multi-layer deep features and the Euclidean distance to discard generated samples that lie too far from the ID samples;
- We propose a fusion method that improves classification performance by fusing the original ID data with the generated TTA samples, without the need for external data;
- We create a Self-Ensemble classifier that uses a single DL model to produce a prediction from both the ID and the TTA information through weighted soft voting;
- We provide a model-agnostic solution that can be used with any DL architecture for WBC classification and, potentially, for any image classification task.
2. Related Work
2.1. WBC Analysis
2.2. Overcoming Domain Shift
2.3. Test-Time Augmentation
3. Proposed Method
3.1. Reference Scenario
3.2. Methodology
Algorithm 1: TTA
Require: a set of augmentations A and an input image I
Ensure: a set M of |A| + 1 images: the original image and its |A| transformed versions
1: M0 = I
2: for i = 1, …, |A| do
3: compute Mi = Ai(I)
4: end for
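Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the flips and rotations in `AUGMENTATIONS` are placeholder choices standing in for the augmentation set A, and `generate_tta` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical augmentation set A (assumption: the actual transforms are not listed here).
AUGMENTATIONS = [
    np.fliplr,                        # horizontal flip
    np.flipud,                        # vertical flip
    lambda img: np.rot90(img, k=1),   # 90-degree rotation in the image plane
    lambda img: np.rot90(img, k=2),   # 180-degree rotation in the image plane
]

def generate_tta(image, augmentations=AUGMENTATIONS):
    """Return [M_0, M_1, ..., M_|A|], where M_0 is the unmodified image."""
    versions = [image]                # step 1: M_0 = I
    for augment in augmentations:     # steps 2-4: M_i = A_i(I)
        versions.append(augment(image))
    return versions

# Toy H x W x C array standing in for a cell image.
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)
versions = generate_tta(image)
print(len(versions))  # |A| + 1 versions, including the original
```

Keeping the original image at index 0 matches the pseudocode and makes it easy to treat the ID sample differently from its augmentations in later steps.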
- Multi-Layer Feature Extraction: For every training image, features are extracted from three layers of the neural network, referred to as the “low”, “mid”, and “high” levels (the exact layers depend on the architecture used).
- kNN Model Training: For each selected feature layer (“low”, “mid”, “high”), a separate kNN model is trained. This creates a non-parametric model of the feature distribution.
- Distance Threshold Estimation: For each training image, the average distance to its k nearest neighbours (e.g., k = 5) is computed. The 95th percentile of these distances is used as a threshold to distinguish ID from OOD samples. This choice is in line with standard practices in outlier detection, where even training samples lying beyond a high percentile (such as the 95th) are regarded as outliers. Such a conservative filtering ensures that only TTA images with features sufficiently close to the training data distribution are retained.
- TTA Feature Extraction: Each TTA image is processed to extract features from the three layers.
- TTA Distance Calculation: For each TTA feature vector, its average kNN distance is computed using the corresponding trained kNN model.
- TTA Image Filtering: A TTA image is retained only if its average distance is below the above-mentioned threshold for all considered layers. Filtering an image in practice corresponds to assigning it a weight of 0 in the subsequent SEC fusion step.
- TTA Weighting: For all retained TTA images, a weight is assigned that is inversely proportional to their average distance, giving more importance to augmentations that are closer to the training distribution.
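The filtering steps above can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not the authors' code: features are taken as already extracted and stored per layer in dictionaries, a training sample's own zero distance is excluded when estimating the threshold, and the final weight uses the distance averaged over the layers. The function names and the `eps` constant are hypothetical.

```python
import numpy as np

def mean_knn_distance(queries, reference, k=5, exclude_self=False):
    """Average Euclidean distance from each query to its k nearest reference points."""
    diffs = queries[:, None, :] - reference[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    dists.sort(axis=1)
    start = 1 if exclude_self else 0  # skip a point's zero distance to itself
    return dists[:, start:start + k].mean(axis=1)

def fit_thresholds(train_feats, k=5, percentile=95):
    """Per-layer threshold: a percentile of the training mean k-NN distances."""
    return {
        layer: np.percentile(mean_knn_distance(f, f, k, exclude_self=True), percentile)
        for layer, f in train_feats.items()
    }

def filter_and_weight(tta_feats, train_feats, thresholds, k=5, eps=1e-8):
    """Weight per TTA image: 0 if rejected at any layer, else inverse mean distance."""
    layers = list(train_feats)
    dists = np.stack([mean_knn_distance(tta_feats[l], train_feats[l], k) for l in layers])
    limits = np.array([thresholds[l] for l in layers])[:, None]
    keep = np.all(dists < limits, axis=0)        # must pass at every layer
    weights = 1.0 / (dists.mean(axis=0) + eps)   # assumption: average over layers
    return np.where(keep, weights, 0.0)

# Toy features for the "low", "mid" and "high" layers (random stand-ins).
rng = np.random.default_rng(0)
dims = {"low": 8, "mid": 16, "high": 32}
train_feats = {name: rng.normal(size=(200, d)) for name, d in dims.items()}
thresholds = fit_thresholds(train_feats)

# Two TTA samples: one drawn like the training data, one shifted far away.
tta_feats = {
    name: np.vstack([rng.normal(size=(1, d)), np.full((1, d), 10.0)])
    for name, d in dims.items()
}
weights = filter_and_weight(tta_feats, train_feats, thresholds)
print(weights)  # the shifted sample receives weight 0
```

The brute-force distance matrix keeps the sketch dependency-free; for realistic feature sets a dedicated nearest-neighbour index would be the more practical choice.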
Algorithm 2: SEC
Require: a set M of different versions of the original image I, the corresponding weights w, and a DL model C
Ensure: a prediction p for the image I
1: initialise p = 0
2: for i = 0, …, |M| − 1 do
3: compute p = p + C(Mi) ∗ wi
4: end for
5: p = p/|M|
6: p = argmax(p)
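A compact Python sketch of Algorithm 2 follows. It is a toy illustration under assumptions: `model` is any callable returning a class-probability vector, the lookup table stands in for a trained DL model, and the weight of 1.0 for the original image is a placeholder choice. The name `self_ensemble_predict` is hypothetical.

```python
import numpy as np

def self_ensemble_predict(versions, weights, model):
    """Weighted soft voting over all versions of one image with a single model."""
    weights = np.asarray(weights, dtype=float)
    probs = np.stack([model(v) for v in versions])  # (n_versions, n_classes)
    fused = (weights[:, None] * probs).sum(axis=0) / len(versions)
    return int(np.argmax(fused)), fused

# Stand-in for C: fixed class probabilities for three versions of one image.
table = np.array([
    [0.6, 0.3, 0.1],   # original image M_0
    [0.2, 0.7, 0.1],   # retained augmentation
    [0.1, 0.1, 0.8],   # augmentation rejected by the OOD filter
])
model = lambda index: table[index]

# Assumed weights: 1.0 for the original, 0.5 for the retained sample, 0 for the rejected one.
label, fused = self_ensemble_predict([0, 1, 2], [1.0, 0.5, 0.0], model)
print(label)  # 0
```

Dividing by the number of versions rescales the fused scores but does not change the argmax, so the predicted class depends only on the weighted sum. With uniform weights the same three probability vectors would favour class 1, which shows how down-weighting distant augmentations can alter the decision.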
4. Experimental Evaluation
4.1. Datasets
4.2. Deep Learning Architectures
4.3. Experimental Setup
4.4. Experimental Results
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
AI | Artificial Intelligence
CNN | Convolutional Neural Network
DA | Domain Adaptation
DG | Domain Generalisation
DL | Deep Learning
DSEB | Depthwise Squeeze-and-Excitation Block
FOODS | Filtering of OOD Samples
ID | In-Distribution
k-NN | k-Nearest Neighbour
ML | Machine Learning
OOD | Out-Of-Distribution
PPM | Pyramid Pooling Module
SE | Self-Ensemble
SEC | Self-Ensemble with Confidence
SGD | Stochastic Gradient Descent
TTA | Test-Time Augmentation
UDA | Unsupervised Domain Adaptation
ViT | Vision Transformer
WBC | White Blood Cell
WBCC | Whole Blood Cell Count
Arc. | Method | AML ⟶ Raabin: A | AML ⟶ Raabin: P | AML ⟶ Raabin: R | AML ⟶ Raabin: F1 | PBC ⟶ Raabin: A | PBC ⟶ Raabin: P | PBC ⟶ Raabin: R | PBC ⟶ Raabin: F1 | LDWBC ⟶ Raabin: A | LDWBC ⟶ Raabin: P | LDWBC ⟶ Raabin: R | LDWBC ⟶ Raabin: F1
---|---|---|---|---|---|---|---|---|---|---|---|---|---
VGG-19 | Baseline | 54.1 | 62.8 | 54.1 | 48.4 | 65.5 | 72.0 | 65.5 | 58.9 | 51.5 | 55.5 | 51.5 | 45.6
VGG-19 | TTA | 55.9 | 63.4 | 55.9 | 50.5 | 67.2 | 73.2 | 67.2 | 61.6 | 49.8 | 53.8 | 49.8 | 43.4
VGG-19 | TTA+ | 55.9 | 63.4 | 55.9 | 50.5 | 67.2 | 73.6 | 67.2 | 61.7 | 49.8 | 53.9 | 49.8 | 43.4
Res.152 | Baseline | 64.2 | 79.9 | 64.2 | 56.4 | 63.3 | 67.8 | 63.3 | 60.2 | 44.5 | 43.5 | 44.5 | 36.0
Res.152 | TTA | 65.1 | 79.8 | 65.1 | 57.8 | 62.9 | 67.4 | 62.9 | 59.5 | 46.3 | 50.8 | 46.3 | 38.2
Res.152 | TTA+ | 65.9 | 80.0 | 65.9 | 58.6 | 63.8 | 68.5 | 63.8 | 60.5 | 46.3 | 50.8 | 46.3 | 38.4
Inc.v3 | Baseline | 62.9 | 73.7 | 62.9 | 59.3 | 65.5 | 74.5 | 65.5 | 62.8 | 45.0 | 51.0 | 45.0 | 37.6
Inc.v3 | TTA | 63.3 | 73.7 | 63.3 | 60.7 | 67.7 | 79.5 | 67.7 | 64.9 | 45.9 | 50.7 | 45.9 | 38.6
Inc.v3 | TTA+ | 64.2 | 74.0 | 64.2 | 62.1 | 68.6 | 80.2 | 68.6 | 65.7 | 45.9 | 50.7 | 45.9 | 38.6
Den.121 | Baseline | 40.2 | 60.4 | 40.2 | 31.9 | 63.8 | 67.6 | 63.8 | 60.5 | 51.5 | 62.0 | 51.5 | 45.3
Den.121 | TTA | 41.5 | 61.0 | 41.5 | 33.3 | 62.9 | 67.5 | 62.9 | 58.6 | 50.7 | 59.0 | 50.7 | 43.8
Den.121 | TTA+ | 41.9 | 61.8 | 41.9 | 33.7 | 62.9 | 67.5 | 62.9 | 58.6 | 51.1 | 59.7 | 51.1 | 44.7
Mob.v3 | Baseline | 53.7 | 58.4 | 53.7 | 50.9 | 47.6 | 38.9 | 47.6 | 38.7 | 32.3 | 47.7 | 32.3 | 31.8
Mob.v3 | TTA | 57.6 | 63.6 | 57.6 | 55.9 | 53.3 | 48.1 | 53.3 | 44.0 | 33.6 | 48.9 | 33.6 | 32.8
Mob.v3 | TTA+ | 60.3 | 65.6 | 60.3 | 58.7 | 55.9 | 55.3 | 55.9 | 46.6 | 33.6 | 48.9 | 33.6 | 32.8
ViT | Baseline | 65.5 | 67.0 | 65.5 | 61.6 | 68.1 | 74.0 | 68.1 | 63.9 | 59.8 | 63.6 | 59.8 | 57.9
ViT | TTA | 67.2 | 68.4 | 67.2 | 64.2 | 70.3 | 76.7 | 70.3 | 66.4 | 64.6 | 68.7 | 64.6 | 63.8
ViT | TTA+ | 67.2 | 69.0 | 67.2 | 64.9 | 71.2 | 77.2 | 71.2 | 67.6 | 66.4 | 69.0 | 66.4 | 65.5
Swin | Baseline | 63.3 | 66.4 | 63.3 | 55.6 | 67.7 | 77.1 | 67.7 | 61.5 | 71.2 | 74.2 | 71.2 | 69.6
Swin | TTA | 63.8 | 65.1 | 63.8 | 57.0 | 69.0 | 73.9 | 69.0 | 63.8 | 71.6 | 75.1 | 71.6 | 69.7
Swin | TTA+ | 63.8 | 66.5 | 63.8 | 57.1 | 69.4 | 76.7 | 69.4 | 64.3 | 71.6 | 75.1 | 71.6 | 69.9

Arc. | Method | Raabin ⟶ AML: A | Raabin ⟶ AML: P | Raabin ⟶ AML: R | Raabin ⟶ AML: F1 | PBC ⟶ AML: A | PBC ⟶ AML: P | PBC ⟶ AML: R | PBC ⟶ AML: F1 | LDWBC ⟶ AML: A | LDWBC ⟶ AML: P | LDWBC ⟶ AML: R | LDWBC ⟶ AML: F1
---|---|---|---|---|---|---|---|---|---|---|---|---|---
VGG-19 | Baseline | 63.4 | 70.7 | 63.4 | 64.6 | 83.5 | 84.0 | 83.5 | 83.6 | 78.9 | 80.3 | 78.9 | 77.4
VGG-19 | TTA | 63.9 | 72.2 | 63.9 | 65.6 | 83.5 | 84.5 | 83.5 | 83.8 | 79.9 | 83.0 | 79.9 | 78.5
VGG-19 | TTA+ | 65.5 | 73.0 | 65.5 | 66.7 | 86.1 | 86.6 | 86.1 | 86.1 | 80.4 | 83.4 | 80.4 | 78.9
Res.152 | Baseline | 76.8 | 77.7 | 76.8 | 77.1 | 73.2 | 80.5 | 73.2 | 72.9 | 66.5 | 78.7 | 66.5 | 62.8
Res.152 | TTA | 77.3 | 78.2 | 77.3 | 77.5 | 73.2 | 81.4 | 73.2 | 72.9 | 68.6 | 77.8 | 68.6 | 64.7
Res.152 | TTA+ | 77.3 | 78.2 | 77.3 | 77.5 | 73.2 | 81.4 | 73.2 | 72.9 | 69.1 | 78.0 | 69.1 | 65.5
Inc.v3 | Baseline | 71.1 | 77.7 | 71.1 | 73.3 | 77.8 | 81.3 | 77.8 | 76.7 | 67.5 | 75.6 | 67.5 | 65.5
Inc.v3 | TTA | 73.2 | 79.4 | 73.2 | 75.0 | 74.2 | 79.9 | 74.2 | 73.3 | 73.2 | 77.8 | 73.2 | 71.5
Inc.v3 | TTA+ | 74.2 | 80.1 | 74.2 | 76.0 | 74.2 | 80.1 | 74.2 | 73.3 | 73.2 | 78.1 | 73.2 | 71.6
Den.121 | Baseline | 77.8 | 77.7 | 77.8 | 76.0 | 74.7 | 80.7 | 74.7 | 74.4 | 78.4 | 82.4 | 78.4 | 75.8
Den.121 | TTA | 79.4 | 79.7 | 79.4 | 77.5 | 73.2 | 80.3 | 73.2 | 72.8 | 79.4 | 82.9 | 79.4 | 76.8
Den.121 | TTA+ | 79.9 | 80.0 | 79.9 | 78.0 | 74.2 | 80.9 | 74.2 | 73.8 | 80.4 | 83.7 | 80.4 | 77.9
Mob.v3 | Baseline | 44.3 | 65.6 | 44.3 | 43.9 | 69.1 | 73.9 | 69.1 | 66.6 | 52.1 | 47.5 | 52.1 | 46.3
Mob.v3 | TTA | 51.5 | 72.3 | 51.5 | 53.3 | 70.1 | 75.0 | 70.1 | 68.1 | 54.1 | 46.7 | 54.1 | 47.9
Mob.v3 | TTA+ | 52.1 | 73.1 | 52.1 | 53.9 | 71.6 | 76.2 | 71.6 | 69.8 | 54.6 | 47.2 | 54.6 | 48.3
ViT | Baseline | 56.7 | 67.1 | 56.7 | 56.7 | 64.9 | 79.0 | 64.9 | 65.2 | 58.8 | 70.0 | 58.8 | 56.0
ViT | TTA | 58.8 | 69.8 | 58.8 | 59.4 | 67.5 | 79.0 | 67.5 | 66.5 | 59.8 | 70.0 | 59.8 | 57.9
ViT | TTA+ | 59.3 | 70.1 | 59.3 | 60.1 | 67.5 | 79.0 | 67.5 | 66.5 | 61.9 | 71.3 | 61.9 | 60.3
Swin | Baseline | 62.9 | 70.2 | 62.9 | 63.3 | 69.6 | 75.8 | 69.6 | 67.5 | 58.8 | 71.9 | 58.8 | 51.0
Swin | TTA | 61.3 | 71.7 | 61.3 | 62.6 | 72.7 | 78.1 | 72.7 | 71.7 | 59.8 | 72.6 | 59.8 | 52.7
Swin | TTA+ | 63.9 | 73.2 | 63.9 | 64.9 | 72.7 | 78.1 | 72.7 | 71.7 | 60.3 | 73.0 | 60.3 | 53.1

Arc. | Method | Raabin ⟶ PBC: A | Raabin ⟶ PBC: P | Raabin ⟶ PBC: R | Raabin ⟶ PBC: F1 | AML ⟶ PBC: A | AML ⟶ PBC: P | AML ⟶ PBC: R | AML ⟶ PBC: F1 | LDWBC ⟶ PBC: A | LDWBC ⟶ PBC: P | LDWBC ⟶ PBC: R | LDWBC ⟶ PBC: F1
---|---|---|---|---|---|---|---|---|---|---|---|---|---
VGG-19 | Baseline | 57.9 | 55.7 | 57.9 | 53.1 | 42.6 | 57.5 | 42.6 | 38.1 | 61.2 | 68.1 | 61.2 | 56.9
VGG-19 | TTA | 62.4 | 61.9 | 62.4 | 58.0 | 49.6 | 62.9 | 49.6 | 45.6 | 69.4 | 74.6 | 69.4 | 67.6
VGG-19 | TTA+ | 62.4 | 61.9 | 62.4 | 58.0 | 51.2 | 64.5 | 51.2 | 48.0 | 75.2 | 78.8 | 75.2 | 74.0
Res.152 | Baseline | 40.1 | 50.6 | 40.1 | 32.4 | 43.8 | 70.3 | 43.8 | 34.5 | 52.5 | 54.0 | 52.5 | 44.2
Res.152 | TTA | 43.0 | 50.2 | 43.0 | 35.3 | 48.3 | 74.1 | 48.3 | 39.0 | 59.9 | 56.8 | 59.9 | 52.9
Res.152 | TTA+ | 44.2 | 52.4 | 44.2 | 36.9 | 49.6 | 74.9 | 49.6 | 40.9 | 62.0 | 77.8 | 62.0 | 55.5
Inc.v3 | Baseline | 47.5 | 47.5 | 47.5 | 39.5 | 40.9 | 68.2 | 40.9 | 32.3 | 25.2 | 15.8 | 25.2 | 14.1
Inc.v3 | TTA | 51.2 | 49.4 | 51.2 | 42.5 | 40.9 | 58.6 | 40.9 | 32.9 | 32.6 | 40.3 | 32.6 | 22.0
Inc.v3 | TTA+ | 52.1 | 69.1 | 52.1 | 43.4 | 44.2 | 61.2 | 44.2 | 38.6 | 37.6 | 59.9 | 37.6 | 27.0
Den.121 | Baseline | 41.3 | 55.6 | 41.3 | 34.2 | 27.3 | 11.0 | 27.3 | 15.5 | 82.2 | 85.7 | 82.2 | 81.2
Den.121 | TTA | 45.5 | 58.6 | 45.5 | 38.9 | 31.4 | 12.8 | 31.4 | 18.2 | 88.0 | 89.8 | 88.0 | 87.6
Den.121 | TTA+ | 46.3 | 58.9 | 46.3 | 40.0 | 32.6 | 13.4 | 32.6 | 19.0 | 89.7 | 91.3 | 89.7 | 89.4
Mob.v3 | Baseline | 55.0 | 55.6 | 55.0 | 48.4 | 71.1 | 78.7 | 71.1 | 70.1 | 33.5 | 47.8 | 33.5 | 25.4
Mob.v3 | TTA | 56.6 | 58.0 | 56.6 | 50.6 | 79.8 | 84.2 | 79.8 | 79.6 | 36.8 | 48.4 | 36.8 | 28.0
Mob.v3 | TTA+ | 59.5 | 58.7 | 59.5 | 53.6 | 81.4 | 84.9 | 81.4 | 81.3 | 39.7 | 49.5 | 39.7 | 31.2
ViT | Baseline | 59.9 | 58.4 | 59.9 | 54.5 | 79.3 | 83.4 | 79.3 | 78.9 | 63.6 | 77.7 | 63.6 | 61.7
ViT | TTA | 62.0 | 63.9 | 62.0 | 57.0 | 83.5 | 86.3 | 83.5 | 83.4 | 65.3 | 78.0 | 65.3 | 64.0
ViT | TTA+ | 62.0 | 63.9 | 62.0 | 57.0 | 85.5 | 88.0 | 85.5 | 85.4 | 67.4 | 78.6 | 67.4 | 66.3
Swin | Baseline | 60.3 | 61.8 | 60.3 | 54.2 | 52.9 | 72.9 | 52.9 | 43.4 | 71.9 | 82.6 | 71.9 | 71.4
Swin | TTA | 60.7 | 60.6 | 60.7 | 54.7 | 56.2 | 73.4 | 56.2 | 48.9 | 76.4 | 83.6 | 76.4 | 75.9
Swin | TTA+ | 61.2 | 62.7 | 61.2 | 55.8 | 58.3 | 74.7 | 58.3 | 51.3 | 78.1 | 84.7 | 78.1 | 77.7

Arc. | Method | Raabin ⟶ LDWBC: A | Raabin ⟶ LDWBC: P | Raabin ⟶ LDWBC: R | Raabin ⟶ LDWBC: F1 | AML ⟶ LDWBC: A | AML ⟶ LDWBC: P | AML ⟶ LDWBC: R | AML ⟶ LDWBC: F1 | PBC ⟶ LDWBC: A | PBC ⟶ LDWBC: P | PBC ⟶ LDWBC: R | PBC ⟶ LDWBC: F1
---|---|---|---|---|---|---|---|---|---|---|---|---|---
VGG-19 | Baseline | 58.2 | 65.3 | 58.2 | 59.7 | 42.7 | 39.7 | 42.7 | 38.5 | 67.6 | 68.7 | 67.6 | 64.6
VGG-19 | TTA | 54.7 | 64.4 | 54.7 | 56.3 | 41.8 | 39.7 | 41.8 | 37.7 | 66.7 | 67.1 | 66.7 | 63.6
VGG-19 | TTA+ | 55.1 | 65.3 | 55.1 | 56.7 | 42.2 | 41.1 | 42.2 | 38.0 | 67.1 | 67.6 | 67.1 | 64.1
Res.152 | Baseline | 53.3 | 55.5 | 53.3 | 49.0 | 50.7 | 52.3 | 50.7 | 45.8 | 79.1 | 81.1 | 79.1 | 78.2
Res.152 | TTA | 54.2 | 58.5 | 54.2 | 50.8 | 50.7 | 52.1 | 50.7 | 45.4 | 78.7 | 81.2 | 78.7 | 77.7
Res.152 | TTA+ | 54.7 | 59.8 | 54.7 | 51.6 | 50.7 | 53.3 | 50.7 | 45.4 | 78.7 | 81.2 | 78.7 | 77.7
Inc.v3 | Baseline | 51.6 | 60.4 | 51.6 | 52.1 | 45.8 | 50.7 | 45.8 | 41.8 | 59.1 | 61.7 | 59.1 | 59.0
Inc.v3 | TTA | 52.0 | 61.8 | 52.0 | 52.0 | 41.8 | 46.9 | 41.8 | 38.4 | 61.8 | 62.7 | 61.8 | 60.7
Inc.v3 | TTA+ | 54.2 | 63.0 | 54.2 | 54.1 | 42.7 | 47.6 | 42.7 | 41.6 | 62.2 | 66.9 | 62.2 | 60.9
Den.121 | Baseline | 45.3 | 62.6 | 45.3 | 41.0 | 40.9 | 38.2 | 40.9 | 33.0 | 61.3 | 62.2 | 61.3 | 59.4
Den.121 | TTA | 45.8 | 54.3 | 45.8 | 40.6 | 39.6 | 33.9 | 39.6 | 31.6 | 65.8 | 69.4 | 65.8 | 63.3
Den.121 | TTA+ | 45.8 | 54.4 | 45.8 | 41.0 | 39.6 | 35.8 | 39.6 | 31.6 | 65.8 | 69.4 | 65.8 | 63.3
Mob.v3 | Baseline | 47.6 | 62.1 | 47.6 | 46.3 | 39.1 | 50.6 | 39.1 | 33.0 | 57.8 | 71.9 | 57.8 | 52.8
Mob.v3 | TTA | 50.2 | 64.8 | 50.2 | 49.9 | 39.1 | 50.1 | 39.1 | 32.6 | 57.8 | 66.0 | 57.8 | 52.4
Mob.v3 | TTA+ | 50.2 | 65.2 | 50.2 | 50.0 | 40.4 | 52.0 | 40.4 | 34.5 | 59.1 | 67.4 | 59.1 | 54.1
ViT | Baseline | 55.1 | 66.3 | 55.1 | 53.3 | 62.7 | 75.1 | 62.7 | 61.9 | 66.2 | 69.9 | 66.2 | 65.2
ViT | TTA | 52.9 | 68.2 | 52.9 | 50.4 | 65.3 | 76.8 | 65.3 | 64.8 | 72.0 | 74.3 | 72.0 | 71.5
ViT | TTA+ | 53.3 | 70.6 | 53.3 | 51.0 | 68.0 | 77.1 | 68.0 | 67.4 | 72.0 | 74.7 | 72.0 | 71.5
Swin | Baseline | 52.9 | 56.5 | 52.9 | 50.2 | 44.9 | 40.5 | 44.9 | 35.8 | 60.4 | 71.2 | 60.4 | 57.0
Swin | TTA | 52.0 | 58.3 | 52.0 | 49.6 | 46.2 | 55.7 | 46.2 | 39.2 | 61.8 | 72.0 | 61.8 | 58.5
Swin | TTA+ | 52.0 | 58.4 | 52.0 | 49.6 | 46.2 | 60.4 | 46.2 | 39.2 | 61.8 | 72.2 | 61.8 | 58.6

SEC: TTA | SEC: FOODS | SEC: Voting | SEC: Weights | Stats: Diff. | Stats: p-Value
---|---|---|---|---|---
✓ | ✗ | Hard | Uniform | 0.893 | 0.0547
✓ | ✗ | Hard | Weighted | 0.784 | 0.0838
✓ | ✗ | Soft | Uniform | 1.234 | 0.0499
✓ | ✗ | Soft | Weighted | 1.235 | 0.0212
✓ | ✓ | Hard | Uniform | 1.009 | 0.0455
✓ | ✓ | Hard | Weighted | 0.978 | 0.0429
✓ | ✓ | Soft | Uniform | 1.267 | 0.0104
✓ | ✓ | Soft | Weighted | 1.354 | 0.0102

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Putzu, L.; Loddo, A.; Di Ruberto, C. Test-Time Augmentation for Cross-Domain Leukocyte Classification via OOD Filtering and Self-Ensembling. J. Imaging 2025, 11, 295. https://doi.org/10.3390/jimaging11090295