Federated Learning for Histopathology Image Classification: A Systematic Review
Abstract
1. Introduction
Research Objectives and Questions
- How effective is federated learning in improving histopathology image classification performance while addressing data privacy and sharing limitations?
- Which datasets and staining techniques are commonly used in federated learning for histopathology image classification?
- What federated learning frameworks, aggregation algorithms, and classification models are employed in histopathology image analysis?
- How do different federated learning approaches compare in terms of classification performance for histopathology images?
- What software frameworks and hardware infrastructures are used in implementing federated learning for cancer histopathology image classification?
2. Background
2.1. Federated Learning Overview
- Clients (participants): Entities holding local datasets, often non-independent and identically distributed (non-IID). In healthcare, these clients may include hospitals, clinics, or medical imaging devices operating in either cross-silo (institutional) or cross-device (individual) scenarios [18].
- Central Server: Coordinates the FL process by distributing the global model, collecting client updates, and updating the global model iteratively. The server can be deployed within secure environments to enhance privacy [19].
- Communication Protocol: Ensures encrypted transmission of model updates between clients and the central server, safeguarding against security threats during communication [20].
- Global Model Initialization: The central server initializes the global model parameters $w^{0}$ and sends the current global parameters $w^{t}$ to all clients at the start of each communication round $t$.
- Local Model Training: Each client $x$ trains the received global model on its local dataset for $E$ epochs. The local parameters are updated according to the following equation: $$w_x^{t+1} = w^{t} - \eta \nabla F_x(w^{t}),$$ where $\eta$ is the learning rate, $\nabla F_x(w^{t})$ is the gradient of the local loss, and $w_x^{t+1}$ are the updated local parameters.
- Model Aggregation: Clients transmit their updated parameters to the server, which aggregates them using Federated Averaging (FedAvg): $$w^{t+1} = \sum_{x=1}^{K} \frac{n_x}{n}\, w_x^{t+1},$$ where $K$ is the number of participating clients, $n_x$ is the number of samples held by client $x$, and $n = \sum_{x=1}^{K} n_x$. Weighted averaging ensures clients with larger datasets have proportionally greater impact on the global model.
- Model Redistribution: The server sends the updated global model back to all clients. This process repeats until convergence, either when performance metrics meet a threshold or after a set number of communication rounds.
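The four steps above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the implementation used by any reviewed study: the model is a single scalar parameter, the local loss is a toy quadratic, and the names `local_train`, `fedavg`, and `run_rounds` are hypothetical.

```python
from typing import List, Sequence


def local_train(global_w: float, data: Sequence[float], lr: float, epochs: int) -> float:
    """Run `epochs` full-batch gradient steps starting from the global model.

    Toy local loss (an assumption for illustration):
    F_x(w) = 0.5 * mean((w - y)^2), whose gradient is w - mean(y).
    """
    w = global_w
    local_mean = sum(data) / len(data)
    for _ in range(epochs):
        grad = w - local_mean
        w = w - lr * grad
    return w


def fedavg(client_ws: Sequence[float], client_sizes: Sequence[int]) -> float:
    """Aggregate client parameters, weighting each client by its sample count."""
    if not client_ws or len(client_ws) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    return sum(n * w for w, n in zip(client_ws, client_sizes)) / total


def run_rounds(client_data: List[List[float]], rounds: int,
               lr: float = 0.5, epochs: int = 2, w0: float = 0.0) -> float:
    """Repeat distribute -> local training -> aggregation for a fixed number of rounds."""
    w = w0  # global model initialization
    sizes = [len(d) for d in client_data]
    for _ in range(rounds):
        local_ws = [local_train(w, d, lr, epochs) for d in client_data]
        w = fedavg(local_ws, sizes)  # the new global model is redistributed next round
    return w


# Three hypothetical clients with unequal sizes and different local distributions.
clients = [[1.0, 1.0, 1.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
final_w = run_rounds(clients, rounds=20)
print(f"{final_w:.4f}")  # close to 4.0, the mean of the pooled data
```

In this toy setting the weighted average recovers the pooled-data optimum even though no client shares raw samples. With real networks and non-IID data, several local epochs can pull client models apart, which is one motivation for the alternative aggregation strategies surveyed later.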
2.1.1. Mathematical Foundations of Federated Learning
- $n_x$ denotes the number of samples in client $x$’s dataset.
- $n = \sum_{x} n_x$ is the total number of data points across all clients.
- The weighting factor $\frac{n_x}{n}$ ensures that clients with larger datasets contribute proportionally more to the global model.
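To make the weighting concrete, consider a hypothetical three-client setup, writing $n_x$ for the number of samples at client $x$, $n$ for the total, and $w_x$ for the parameters returned by client $x$. The sample counts are invented for illustration and do not come from any reviewed study.

```latex
\begin{align*}
n_1 &= 6000, \quad n_2 = 3000, \quad n_3 = 1000, \qquad n = n_1 + n_2 + n_3 = 10000, \\
\frac{n_1}{n} &= 0.6, \quad \frac{n_2}{n} = 0.3, \quad \frac{n_3}{n} = 0.1, \\
w_{\text{global}} &= 0.6\, w_1 + 0.3\, w_2 + 0.1\, w_3 .
\end{align*}
```

The weights sum to one, so the aggregate is a convex combination of the client models, and the largest site contributes six times as much as the smallest.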
2.1.2. Federated Learning Approaches
2.2. Related Work
3. Methodology
3.1. Eligibility Criteria
3.1.1. Inclusion Criteria
- Focused on histopathology image classification using histopathology datasets.
- Involved the development or application of deep learning models within a federated learning framework.
- Provided performance analysis or comparative evaluation of FL models.
- Published between 2020 and 2025.
3.1.2. Exclusion Criteria
- Did not involve federated learning for histopathology image classification.
- Focused on non-healthcare domains or irrelevant medical applications.
- Had no accessible full text.
- Were duplicate records identified across multiple databases.
- Used federated learning for cancer classification on datasets other than histopathology images.
3.2. Information Sources
3.3. Search Strategy
3.4. Data Collection Process
3.5. Data Items and Extraction
- Bibliographic Details: Information such as the study title, authors, and year of publication.
- Datasets Used: Identification of histopathology datasets, including dataset source and size, to evaluate data diversity and generalizability.
- Classification Models and Federated Learning Frameworks: Documentation of deep learning architectures (e.g., CNNs, ResNet, and EfficientNet) and federated learning aggregation techniques (e.g., FedAvg, FedProx, and FedMA) used in the studies.
- Performance Metrics and Results: Extraction of performance indicators such as accuracy, precision, recall, specificity, F1-score, and AUC to assess model effectiveness.
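The extracted performance indicators can be computed from predictions as in the following sketch. It assumes a binary task with labels in {0, 1}; the function names `binary_metrics` and `roc_auc` and the example data are hypothetical, and individual studies may use multi-class or averaged variants instead.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, specificity, and F1 from hard binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number of
    samples, which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for 4 positive and 6 negative tiles.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05, 0.5]
y_pred = [int(s > 0.5) for s in scores]  # decision threshold of 0.5 is an assumption

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy alone can be misleading on imbalanced histopathology datasets, which is why specificity, F1-score, and AUC are extracted alongside it.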
3.6. Study Risk of Bias Assessment
3.7. Effect Measures
4. Results and Analysis
4.1. Study Selection Process
4.2. Study Characteristics
4.3. Reported Performance Comparison Between Centralized and Federated Training
4.4. Federated Learning Frameworks, Aggregation Strategies, and Architectures
4.5. Staining Techniques and Datasets
4.6. Comparative Analysis of Results Across Reviewed Federated Learning Studies
4.7. Hardware and Software
5. Discussion
Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Wu, Y.; Cheng, M.; Huang, S.; Pei, Z.; Zuo, Y.; Liu, J.; Yang, K.; Zhu, Q.; Zhang, J.; Hong, H.; et al. Recent advances of deep learning for computational histopathology: Principles and applications. Cancers 2022, 14, 1199.
- Hong, R.; Fenyö, D. Deep learning and its applications in computational pathology. BioMedInformatics 2022, 2, 159–168.
- Lee, J.G.; Jun, S.; Cho, Y.W.; Lee, H.; Kim, G.B.; Seo, J.B.; Kim, N. Deep learning in medical imaging: General overview. Korean J. Radiol. 2017, 18, 570–584.
- Gerke, S.; Minssen, T.; Cohen, G. Ethical and legal challenges of artificial intelligence-driven healthcare. In Artificial Intelligence in Healthcare; Elsevier: Amsterdam, The Netherlands, 2020; pp. 295–336.
- European Parliament and Council of the European Union. General Data Protection Regulation (GDPR); European Parliament and Council of the European Union: Brussels, Belgium; Luxembourg, 2016.
- Rieke, N.; Hancox, J.; Li, W.; Milletarì, F.; Roth, H.R.; Albarqouni, S.; Bakas, S.; Galtier, M.N.; Landman, B.A.; Maier-Hein, K.; et al. The future of digital health with federated learning. Npj Digit. Med. 2020, 3, 119.
- Konečný, J.; McMahan, H.B.; Ramage, D.; Richtárik, P. Federated optimization: Distributed machine learning for on-device intelligence. arXiv 2016, arXiv:1610.02527.
- McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (PMLR 2017), Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282.
- Hard, A.S.; Rao, K.; Mathews, R.; Beaufays, F.; Augenstein, S.; Eichner, H.; Kiddon, C.; Ramage, D. Federated Learning for Mobile Keyboard Prediction. arXiv 2018, arXiv:1811.03604.
- Yang, T.; Andrew, G.; Eichner, H.; Sun, H.; Li, W.; Kong, N.; Ramage, D.; Beaufays, F. Applied federated learning: Improving google keyboard query suggestions. arXiv 2018, arXiv:1812.02903.
- Nazir, S.; Kaleem, M. Federated learning for medical image analysis with deep neural networks. Diagnostics 2023, 13, 1532.
- Guan, H.; Yap, P.T.; Bozoki, A.; Liu, M. Federated learning for medical image analysis: A survey. Pattern Recognit. 2024, 151, 110424.
- Brauneck, A.; Schmalhorst, L.; Kazemi Majdabadi, M.M.; Bakhtiari, M.; Völker, U.; Baumbach, J.; Baumbach, L.; Buchholtz, G. Federated machine learning, privacy-enhancing technologies, and data protection laws in medical research: Scoping review. J. Med. Internet Res. 2023, 25, e41588.
- Lu, M.Y.; Chen, R.J.; Kong, D.; Lipkova, J.; Singh, R.; Williamson, D.F.; Chen, T.Y.; Mahmood, F. Federated learning for computational pathology on gigapixel whole slide images. Med. Image Anal. 2022, 76, 102298.
- Scheibner, J.; Ienca, M.; Kechagia, S.; Troncoso-Pastoriza, J.R.; Raisaro, J.L.; Hubaux, J.P.; Fellay, J.; Vayena, E. Data protection and ethics requirements for multisite research with health data: A comparative examination of legislative governance frameworks and the role of data protection technologies. J. Law Biosci. 2020, 7, lsaa010.
- Hallaji, E.; Razavi-Far, R.; Saif, M.; Wang, B.; Yang, Q. Decentralized federated learning: A survey on security and privacy. IEEE Trans. Big Data 2024, 10, 194–213.
- Yuan, W.; Wang, X. FedAgg: Adaptive Federated Learning with Aggregated Gradients. arXiv 2023, arXiv:2303.15799.
- Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. Advances and open problems in federated learning. Found. Trends® Mach. Learn. 2021, 14, 1–210.
- Leroy, D.; Coucke, A.; Lavril, T.; Gisselbrecht, T.; Dureau, J. Federated learning for keyword spotting. In Proceedings of the ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 6341–6345.
- Ergün, I.; Sami, H.U.; Güler, B. Communication-efficient secure aggregation for federated learning. In Proceedings of the GLOBECOM 2022-2022 IEEE Global Communications Conference, Rio de Janeiro, Brazil, 4–8 December 2022; pp. 3881–3886.
- McMahan, H.B.; Moore, E.; Ramage, D.; y Arcas, B.A. Federated Learning of Deep Networks using Model Averaging. arXiv 2016, arXiv:1602.05629.
- Yang, Q.; Liu, Y.; Chen, T.; Tong, Y. Federated machine learning: Concept and applications. ACM Trans. Intell. Syst. Technol. (TIST) 2019, 10, 1–19.
- Pfitzner, B.; Steckhan, N.; Arnrich, B. Federated learning in a medical context: A systematic literature review. ACM Trans. Internet Technol. (TOIT) 2021, 21, 1–31.
- Prayitno; Shyu, C.R.; Putra, K.T.; Chen, H.C.; Tsai, Y.Y.; Hossain, K.T.; Jiang, W.; Shae, Z.Y. A systematic review of federated learning in the healthcare area: From the perspective of data properties and applications. Appl. Sci. 2021, 11, 11191.
- Teo, Z.L.; Jin, L.; Liu, N.; Li, S.; Miao, D.; Zhang, X.; Ng, W.Y.; Tan, T.F.; Lee, D.M.; Chua, K.J.; et al. Federated machine learning in healthcare: A systematic review on clinical applications and technical architecture. Cell Rep. Med. 2024, 5, 101419.
- Yang, T.; Yu, X.; McKeown, M.J.; Wang, Z.J. When federated learning meets medical image analysis: A systematic review with challenges and solutions. APSIPA Trans. Signal Inf. Process. 2024, 13, 55.
- Sandhu, S.S.; Gorji, H.T.; Tavakolian, P.; Tavakolian, K.; Akhbardeh, A. Medical imaging applications of federated learning. Diagnostics 2023, 13, 3140.
- Raza, A.; Guzzo, A.; Ianni, M.; Lappano, R.; Zanolini, A.; Maggiolini, M.; Fortino, G. Federated Learning in radiomics: A comprehensive meta-survey on medical image analysis. Comput. Methods Programs Biomed. 2025, 267, 108768.
- Tahir, N.; Jung, C.R.; Lee, S.D.; Azizah, N.; Ho, W.C.; Li, T.C. Federated learning-based model for predicting mortality: Systematic review and meta-analysis. J. Med. Internet Res. 2025, 27, e65708.
- Sohan, M.F.; Basalamah, A. A systematic review on federated learning in medical image analysis. IEEE Access 2023, 11, 28628–28644.
- Hossain, M.M.; Islam, M.R.; Ahamed, M.F.; Ahsan, M.; Haider, J. A collaborative federated learning framework for lung and colon cancer classifications. Technologies 2024, 12, 151.
- Gunesli, G.N.; Bilal, M.; Raza, S.E.A.; Rajpoot, N.M. A federated learning approach to tumor detection in colon histology images. J. Med. Syst. 2023, 47, 99.
- Li, L.; Xie, N.; Yuan, S. A federated learning framework for breast cancer histopathological image classification. Electronics 2022, 11, 3767.
- Yenilmez, M.; Aydin, I. A Federated Learning-Based Approach for Classification of Histopathology Images. In Proceedings of the 2024 14th International Conference on Advanced Computer Information Technologies (ACIT), Ceske Budejovice, Czech Republic, 19–21 September 2024; pp. 749–752.
- Hosseini, S.M.; Sikaroudi, M.; Babaei, M.; Tizhoosh, H.R. Cluster based secure multi-party computation in federated learning for histopathology images. In Proceedings of the International Workshop on Distributed, Collaborative, and Federated Learning; Springer: Cham, Switzerland, 2022; pp. 110–118.
- Peta, J.; Koppu, S. Enhancing breast cancer classification in histopathological images through federated learning framework. IEEE Access 2023, 11, 61866–61880.
- Deng, T.; Huang, Y.; Han, G.; Shi, Z.; Lin, J.; Dou, Q.; Liu, Z.; Guo, X.j.; Chen, C.P.; Han, C. Feddbl: Communication and data efficient federated deep-broad learning for histopathological tissue classification. IEEE Trans. Cybern. 2024, 54, 7851–7864.
- Gunesli, G.N.; Bilal, M.; Raza, S.E.A.; Rajpoot, N.M. Feddropoutavg: Generalizable federated learning for histopathology image classification. arXiv 2021, arXiv:2111.13230.
- Baid, U.; Pati, S.; Kurc, T.M.; Gupta, R.; Bremer, E.; Abousamra, S.; Thakur, S.P.; Saltz, J.H.; Bakas, S. Federated learning for the classification of tumor infiltrating lymphocytes. arXiv 2022, arXiv:2203.16622.
- Agbley, B.L.Y.; Li, J.; Haq, A.U.; Bankas, E.K.; Adjorlolo, G.; Agyemang, I.O.; Ayekai, B.J.; Effah, D.; Adjeimensah, I.; Khan, J. Federated approach for lung and colon cancer classification. In Proceedings of the 2022 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Chengdu, China, 16–18 December 2022; pp. 1–8.
- Agbley, B.L.Y.; Li, J.P.; Haq, A.U.; Bankas, E.K.; Mawuli, C.B.; Ahmad, S.; Khan, S.; Khan, A.R. Federated fusion of magnified histopathological images for breast tumor classification in the internet of medical things. IEEE J. Biomed. Health Inform. 2023, 28, 3389–3400.
- Jiang, M.; Wang, Z.; Dou, Q. Harmofl: Harmonizing local and global drifts in federated learning on heterogeneous medical images. Proc. AAAI Conf. Artif. Intell. 2022, 36, 1087–1095.
- Vyas, S.; Patra, A.N.; Shukla, R.M. Histopathological image classification and vulnerability analysis using federated learning. In Proceedings of the 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Exeter, UK, 1–3 November 2023; pp. 2372–2377.
- Lusnig, L.; Sagingalieva, A.; Surmach, M.; Protasevich, T.; Michiu, O.; McLoughlin, J.; Mansell, C.; de’Petris, G.; Bonazza, D.; Zanconati, F.; et al. Hybrid quantum image classification and federated learning for hepatic steatosis diagnosis. Diagnostics 2024, 14, 558.
- Baid, U.; Pati, S.; Kurc, T.M.; Gupta, R.; Bremer, E.; Abousamra, S.; Thakur, S.P.; Saltz, J.H.; Bakas, S. Pan-Cancer Tumor Infiltrating Lymphocyte Detection based on Federated Learning. In Proceedings of the 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 15–18 December 2024; pp. 7640–7647.
- Zhang, Y.; Li, Z.; Han, X.; Ding, S.; Li, J.; Wang, J.; Ying, S.; Shi, J. Pseudo-data based self-supervised federated learning for classification of histopathological images. IEEE Trans. Med. Imaging 2023, 43, 902–915.
- Andreux, M.; du Terrail, J.O.; Beguier, C.; Tramel, E.W. Siloed federated learning for multi-centric histopathology datasets. In Proceedings of the Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning: Second MICCAI Workshop, DART 2020, and First MICCAI Workshop, DCL 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, 4–8 October 2020; Proceedings 2. Springer: Berlin/Heidelberg, Germany, 2020; pp. 129–139.
- Bansal, S. The Classification of Breast Cancer Using a Transfer Learning Strategy in a Federated Learning Framework. In Proceedings of the 2023 2nd International Conference on Futuristic Technologies (INCOFT), Belagavi, Karnataka, India, 24–26 November 2023; pp. 1–7.
- Gupta, C.; Gill, N.S.; Gulia, P.; Alduaiji, N.; Shreyas, J.; Shukla, P.K. Applying YOLOv6 as an ensemble federated learning framework to classify breast cancer pathology images. Sci. Rep. 2025, 15, 3769.
- Banerjee, M.; Paul, A. FedImp: Federated Learning Using Important Layers of Client Models for the Diagnosis of Breast Cancer Histopathology Images. In Proceedings of the ICASSP 2025–2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 6–11 April 2025; pp. 1–5.
- Miao, Y.; Yang, X.; Fan, H.; Li, Y.; Hong, Y.; Guo, X.; Braytee, A.; Huang, W.; Anaissi, A. FedSAF: A Federated Learning Framework for Enhanced Gastric Cancer Detection and Privacy Preservation. arXiv 2025, arXiv:2503.15870.
- Jin, H.; Liu, S.; Cong, C.; Feng, Q.; Liu, Y.; Huang, L.; Hu, Y. Fedwsidd: Federated whole slide image classification via dataset distillation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Daejeon, Republic of Korea, 23–27 September 2025; Springer: Berlin/Heidelberg, Germany, 2025; pp. 178–188.
- Chowdhury, A.A.; Mahmud, S.H.; Uddin, M.P.; Kadry, S.; Kim, J.Y.; Nam, Y. Nuclei segmentation and classification from histopathology images using federated learning for end-edge platform. PLoS ONE 2025, 20, e0322749.
- Hassani, A.; Rekik, I. UniFed: A Universal Federation of a Mixture of Highly Heterogeneous Medical Image Classification Tasks. In Proceedings of the International Workshop on Machine Learning in Medical Imaging, Marrakesh, Morocco, 6 October 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 32–42.
- Reina, G.A.; Gruzdev, A.; Foley, P.; Perepelkina, O.S.; Sharma, M.; Davidyuk, I.; Trushkin, I.; Radionov, M.; Mokrov, A.; Agapov, D.; et al. OpenFL: The open federated learning library. Phys. Med. Biol. 2021, 67, 214001.
- Beutel, D.J.; Topal, T.; Mathur, A.; Qiu, X.; Fernandez-Marques, J.; Gao, Y.; Sani, L.; Li, K.H.; Parcollet, T.; de Gusmão, P.P.B.; et al. Flower: A friendly federated learning research framework. arXiv 2020, arXiv:2007.14390.
- Ghosh, A. BreakHis-Breast Cancer Histopathological Image Dataset. 2022. Available online: https://www.kaggle.com/datasets/ambarish/breakhis (accessed on 10 November 2025).
- National Cancer Institute Genomic Data Commons (GDC) Portal. 2025. Available online: https://portal.gdc.cancer.gov/ (accessed on 10 November 2025).
- Mader, K. Skin Cancer MNIST: HAM10000 Dataset. 2018. Available online: https://www.kaggle.com/datasets/kmader/skin-cancer-mnist-ham10000 (accessed on 10 November 2025).
- Database, G. GigaDB Dataset: 100439. 2025. Available online: http://gigadb.org/dataset/100439 (accessed on 10 November 2025).
- Zhu, X. LC25000: Lung and Colon Histopathological Images. 2020. Available online: https://www.kaggle.com/datasets/xilezhu/lc25000 (accessed on 10 November 2025).
- Gamper, J.; Koohbanani, N.A.; Benet, K.; Khuram, A.; Rajpoot, N. PanNuke: An open pan-cancer histology dataset for nuclei instance segmentation and classification. In Proceedings of the European Congress on Digital Pathology, Warwick, UK, 10–13 April 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 11–19.
- Yang, J.; Shi, R.; Wei, D.; Liu, Z.; Zhao, L.; Ke, B.; Pfister, H.; Ni, B. MedMNIST v2–A large-scale lightweight benchmark for 2D and 3D biomedical image classification. Sci. Data 2023, 10, 41.

| Study | Domain Focus | Key Contributions/Findings | Coverage of Histopathology |
|---|---|---|---|
| Nazir et al. (2023) [11] | Medical image FL (classification and segmentation) | Reviews FL for general medical imaging; covers architectures and performance. | Does not address histopathology tile-based classification. |
| Prayitno et al. (2021) [24] | General healthcare FL | Surveys FL across healthcare; discusses data partitioning, data distribution, and COVID-19 imaging. | Very limited; histopathology barely mentioned. |
| Teo et al. (2024) [25] | FL in healthcare (612 studies) | Finds only 5.2% of studies are real-world; radiology dominates; pathology/histology underrepresented. | Limited attention to pathology/histology. |
| Yang et al. (2024) [26] | FL challenges in medical image analysis | Discusses heterogeneity, label quality, and technical challenges. | Only peripheral mention of histopathology. |
| Sandhu et al. (2023) [27] | Medical FL categorized by disease, modality, body part | Gives broad overview of FL architectures and performance vs. traditional ML. | No dedicated section on histopathology classification. |
| Raza et al. (2025) [28] | Radiomics and medical image FL | Focuses on CT, MRI, PET radiomics workflows. | Radiomics usually excludes histopathology tile/WSI analysis. |
| Tahir et al. (2025) [29] | Healthcare FL for mortality prediction | Covers EHR and clinical prediction tasks, not imaging. | No coverage of histopathology image classification. |
| This review (2025) | FL for histopathology image classification | Outlines tissue-level variability, annotation considerations, and diagnostic challenges in digital pathology. | Focuses exclusively on histopathology images; no other imaging modalities are considered. |

| Question Label | Question |
|---|---|
| Q1 | Is the aim of the research stated clearly? |
| Q2 | Is the size of the dataset adequate for this type of analysis? |
| Q3 | Is the procedure for managing data in the federated setting described in detail? |
| Q4 | Does the study address the issue of non-IID data distribution? |
| Q5 | Are any additional privacy-preserving techniques implemented? |
| Q6 | Does the author offer enough information about the experimental setup? |
| Q7 | Are the learning methods explained thoroughly? |
| Q8 | Are the outcomes of the study presented clearly? |
| Q9 | Is there a comparison between the different methods or approaches used? |
| Q10 | Are the study’s limitations acknowledged? |
| Q11 | Does the research contribute meaningfully to the existing body of literature? |
| Q12 | Does the study make any tools or source code available online? |

| Ref | Author | Dataset | Advantages | Limitations |
|---|---|---|---|---|
| [31] | Hossain et al. | LC25000 dataset | Explainable AI Integration, Decentralized Training, Improved Generalization | IID Assumption, Computational Complexity, Data Heterogeneity |
| [32] | Gozde N. Gunesli et al. | TCGA CRC-DX Dataset | Handles Data Heterogeneity, Reduces Overfitting, Performs Well on Unseen Data | Potential for Model Bias, Client Dropout Effects, Computational Complexity |
| [33] | Lingxiao Li et al. | BreakHis Dataset | Effective Knowledge Fusion, Comparable Performance to Centralized Learning, Supports Multi-Client Training | Encryption-Related Efficiency Issues, Lack of Large-Scale Validation, Potential Performance Drop in Non-IID Settings |
| [34] | Musa Yenilmez et al. | HAM10000 Dataset | Uses Pre-Trained Models for Efficiency, Handles Multi-Client Scenarios, Ensures Data Privacy | Limited to Specific Skin Lesions, Potential Model Performance Drop, Lack of Advanced Aggregation Techniques |
| [35] | S. Maryam Hosseini et al. | The Cancer Genome Atlas (TCGA) Dataset | Enhanced Privacy Protection, Maintains Model Accuracy, Cluster-Based Model Aggregation | Higher Communication Overhead, Scalability Concerns, Computational Complexity |
| [36] | Jyothi Peta et al. | BreakHis Dataset | Secure Image Transmission, Optimal Key Generation, High Classification Performance | Potential Performance Drop in Non-IID Scenarios, Encryption Complexity, Higher Communication Costs |
| [37] | Tianpeng Deng et al. | Multicenter Colorectal Cancer (MC-CRC) Dataset | Highly Communication-Efficient, Performs Well with Limited Training Samples, Model Generalization to External Datasets | Performance Can Be Affected by Data Heterogeneity, Limited to Pretrained Deep Learning Backbones, Encryption Overhead |
| [38] | Gozde N. Gunesli et al. | TCGA CRC-DX Dataset | Handles Data Heterogeneity, Reduces Overfitting, Increases Model Robustness | Not Ideal for Small-Scale Studies, Potential Model Convergence Challenges, Communication Efficiency Not Fully Optimized |
| [39] | Ujjwal Baid et al. | The Cancer Genome Atlas (TCGA) Dataset | Handles Out-of-Distribution Data, Consensus Model, Achieves High Accuracy, Efficient Federated Training Framework | Slower Convergence, Limited to a Single Network Architecture, Data Heterogeneity Issues |
| [40] | Y. Agbley et al. | LC25000 Dataset | Preserves Data Privacy, Improves Model Generalization, Handles Multi-Class Classification | Communication Overhead, Computationally Intensive, Data Heterogeneity Issues |
| [41] | Y. Agbley et al. | BreakHis Dataset | Multi-Magnification Image Fusion, Self-Attention Mechanism, High Classification Performance | Computationally Intensive, Higher Storage Requirements, Communication Overhead in FL |
| [42] | Meirui Jiang et al. | Camelyon17 Dataset | Reduces Local and Global Model Drift, Improves Federated Model Convergence, Minimizes Communication Overhead | Requires Frequency-Domain Transformations, Limited to Certain Medical Imaging Tasks |
| [43] | Sankalp Vyas et al. | HAM10000 Dataset | Privacy-Preserving Learning, Protects Against Data Poisoning Attacks, Decentralized Model Training | Vulnerable to Malicious Clients, Computational Overhead, Higher Communication Costs |
| [44] | Luca Lusnig et al. | Liver Biopsy Image Dataset | Hybrid Quantum Neural Network (HQNN) for Feature Learning, Superior Classification Accuracy, Efficient Learning on Small Datasets | Limited Quantum Hardware Availability, Potential Performance Drop in Highly Heterogeneous Data, Federated Learning Communication Overhead |
| [45] | Ujjwal Baid et al. | The Cancer Genome Atlas (TCGA) Dataset | Preserves Data Privacy, Improves Generalization Across Cancer Types, Comparable to Centralized Training | Data Heterogeneity Challenges, Computational Complexity, Limited Evaluation on Other Architectures |
| [46] | Yuanming Zhang et al. | 2015 Bioimaging Challenge Dataset, 4th Symposium in Applied Bioimaging Dataset, ICIAR 2018 Grand Challenge on Breast Cancer Histology Images Dataset, Databiox Dataset | Multi-Task Self-Supervised Learning, Handles Non-IID Data in FL, Contrastive Learning for Robust Training | Computationally Intensive, Requires High-Quality Pseudo Data, Higher Communication Costs in FL |
| [47] | Mathieu Andreux et al. | Camelyon16 and Camelyon17 Datasets | Improves Generalization Across Institutions, Enhances Privacy Protection, Maintains High Performance on Non-IID Data | Increased Computational Overhead, Not Ideal for Small Datasets, Higher Communication Costs in Federated Learning |
| [48] | Shubhansh Bansal | BreakHis Dataset | Transfer Learning with Pretrained Models, Handles Data Imbalance with Balanced Accuracy, High Classification Accuracy | Computational Complexity, Vulnerability to Non-IID Data, Communication Overhead in Federated Learning |
| [49] | Chhaya Gupta et al. | BreakHis Dataset | Data privacy preservation, High accuracy, Reduced communication overhead, Model compression, Enhanced generalization, Efficient training | Non-IID data challenges, Security vulnerabilities, Data imbalance issues, Communication costs & scalability |
| [50] | Mangaldeep Banerjee et al. | BreakHis1, BreakHis2, BRACS1, BRACS2 | Enhanced performance, Handles data heterogeneity, Maintains model consistency, Focus on important layers, Privacy preservation | Lack of explicit data distribution utilization, Potential computational overhead, Limited task scope |
| [51] | Yuxin Miao et al. | SEED, BOT (gastric cancer histopathology) | Enhanced model personalization, Improved communication efficiency, Robustness to data heterogeneity, Higher accuracy and generalization | Limited model architecture flexibility, Synchronous update constraints, Potential overfitting with high personalization |
| [52] | Haolong Jin et al. | CAMELYON16, CAMELYON17 | Model flexibility, Reduced communication cost, Improved performance, Stain normalization integration, Rapid convergence, Enhanced generalization | Synthetic data explainability issues, High computational cost of distillation, Hyperparameter sensitivity, Limited variability in synthetic data, Performance drop without stain normalization |
| [53] | Anjir Chowdhury et al. | PanNuke dataset | Joint segmentation and classification, Context-aware learning, Handles overlapping nuclei | High annotation requirement, High computational cost, Generalizability issues, Error propagation, Challenges with small structures, Clinical adoption barriers |
| [54] | Atefe Hassani et al. | TissueMNIST (MedMNIST2D) | Handles high heterogeneity, Dynamic training strategy, Curriculum learning integration, Efficient communication, Improved convergence time, Better performance | Increased system complexity, Higher initial computation, Dependence on task complexity estimation, Sequential dependency |

| Aggregation Algorithm | Description |
|---|---|
| FedAvg | Traditional federated averaging method that computes a weighted average of client model updates. |
| FedDropoutAvg | Extends FedAvg by applying dropout at model and client levels to improve robustness and reduce overfitting. |
| FedDBL | Uses weighted averaging to optimize communication efficiency and performance with limited training samples. |
| FedImp | Emphasizes the importance of specific model layers during aggregation using a semantically weighted strategy, reducing deviation between client and global models. |
| FedImpAvg | Refines FedImp by performing weighted averaging based on the number of local training samples. |
| Cluster-Based SMC | Secure aggregation by distributing model updates within small clusters before global aggregation to enhance privacy. |
| SiloBN | Retains local batch normalization statistics while sharing only learned BN parameters, improving model adaptation across clients. |
| FedSAF | Integrates Attention Message Passing (AMP) with the Fisher Information Matrix (FIM) to dynamically adjust client contributions based on model similarity. |
| FedWSIDD | Dataset-distillation-based aggregation where clients generate synthetic slides that are aggregated and redistributed to enhance generalization. |
| Dynamic Sequential Aggregation | Orders clients based on task complexity and sequentially updates the global model, improving convergence and communication efficiency. |

| Ref | Dataset Name | Domain/Modality | Task | Samples | Source | Paper & Approach |
|---|---|---|---|---|---|---|
| [57] | The BreakHis dataset | Histopathology | Classification | 7909 microscopic images of breast tumor tissue from 82 patients | Public | [33,36,41,48] Horizontal FL |
| [58] | The TCGA (The Cancer Genome Atlas)dataset | Histopathology | Classification | contains 512 × 512 non-overlapping tiles from WSIs of colorectal cancer (CRC) across 36 institutions | Public | [32,35] Horizontal FL, [39,45] Vertical FL |
| [59] | The HAM10000 dataset | Histopathology | Classification | 10,015 dermoscopic images of seven different skin lesion types | Public | [34,43] Horizontal FL |
| [60] | The Camelyon dataset | Histopathology | Classification | 170 WSIs (100 normal, 70 with metastases); 100 WSIs (60 normal, 40 with metastases) | Public | [42,47] Horizontal FL |
| [61] | The LC25000 dataset | Histopathology | Classification | 25,000 histopathology images of lung and colon cancer biopsies | Public | [31,40] Horizontal FL |
| - | The Multicenter CRC (MC-CRC) | Histopathology | Classification | Colorectal cancer histopathology images with nine tissue classes | Public | [37] Horizontal FL |
| - | Breast Histopathology Image (BHI) | Histopathology | Classification | Contains 277,524 histopathological image patches at 40× magnification | - | [40] Horizontal FL |
| - | Liver Biopsy Image Dataset | Histopathology | Classification | Contains 4400 histopathological liver biopsy images | An anonymous teaching archive at the University of Trieste | [44] Horizontal FL |
| [62] | PanNuke dataset | Histopathology | Classification and segmentation | 205,343 annotated nuclei, each with an instance segmentation mask | Public | [53] |
| [63] | MedMNIST | Histopathology | Classification | Large-scale MNIST-like collection of standardized biomedical images, comprising 12 2D datasets and 6 3D datasets | Public | [54] |

| Paper | Aggregation | Classification Models | Results |
|---|---|---|---|
| Hossain et al., 2024 [31] | FedAvg | Inception-V3 | Near-perfect: 99.867% (lung), 100% (colon), 99.720% (combined) |
| Güneşli et al., 2023/2021 [32,38] | FedDropoutAvg | ResNet18 + GroupNorm | AUC 0.965 (local), AUC 0.954 (independent); mean F1 = 0.9102, AUC = 0.9542 |
| Li et al., 2022 [33] | FedAvg | ResNet-512, DenseNet-201, MobileNet-v2-100, EfficientNet-b7 | Image-level ACC (ACCIL): 84.02–91.06%; Patient-level ACC (ACCPL): 84.09–91.87% |
| Yenilmez et al., 2024 [34] | FedAvg | VGG16 | 82.04% accuracy |
| Peta et al., 2023 [36] | FedAvg | C2T2Net | 95.68% accuracy; “all key metrics > 95%” |
| Baid et al., 2022/2024 [39,45] | FedAvg | VGG16 | Balanced accuracies around 89% |
| Agbley et al., 2022/2023 [40,41] | FedAvg | Hybrid (e.g., ResNet18, ResNet50, GaborNet); ResNet + self-attention | 99.87% and 99.99% (lung), 99.72% (colon) |
| Vyas et al., 2023 [43] | FedAvg | - | 67.1% accuracy |
| Lüsnig et al., 2024 [44] | FedAvg | ResNet-512, Hybrid Quantum ResNet (QDI layers) | 91.06% accuracy |
| Zhang et al., 2023 [46] | FedAvg | ResNet-50 | 81.48% accuracy (DenseNet) |
| Bansal et al., 2023 [48] | FedAvg | Xception, DarkNet53 | Balanced accuracy: Xception 83.07%, DarkNet53 87.17% |
| Gupta et al., 2025 [49] | FedAvg | YOLOv6 | 98% (BreakHis), 97% (BUSI); Recall 99%, F1 98% |
| Chowdhury et al., 2025 [53] | FedAvg | U-Net-style | 84–85% accuracy (segmentation/classification) |
| Andreux et al., 2020 [47] | SiloBN | Batch-normalized DCNNs | Mean accuracy = 0.94 |
| Hosseini et al., 2022 [35] | Cluster-based SMC | MIL gated attention classifier | 76.65% (F1 = 80.48%) and 76.16% (F1 = 79.84%) (privacy-preserving settings) |
| Miao et al., 2025 [51] | FedSAF | Several CNNs compared (AlexNet, ResNet18, EfficientNet-B0, MobileNetV3 Small) | 98.43% (SEED dataset) and 81.16% (BOT dataset) |
| Jin et al., 2025 [52] | FedWSIDD | MIL methods for WSI: CLAM, TransMIL, ABMIL | 90.1% ± 0.2 (CAMELYON16), 81.2% ± 1.2 (CAMELYON17) |
| Banerjee et al., 2025 [50] | FedImp/FedImpAvg | Efficient architectures (e.g., EfficientNet-B3/B7, EfficientViT) | Accuracy = 0.86; AUROC: 0.86 ± 0.08 (BreakHis1), 0.80 ± 0.01 (BRACS1) |
| Jiang et al., 2022 [42] | HarmoFL | DenseNet-201 | 95.48% accuracy |
| Hassani et al. [54] | UniFed | Small CNNs (2-layer CNN + FFN), VGG11, ResNet18 | 69.37% (strongly non-IID) vs. 77.10% centralized; outperformed FedAvg (38.44%) and FedProx (37.92%) |
| Deng et al. [37] | FedDBL | EfficientNet/variants | 92.13% |

| Ref | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 | Q11 | Q12 | Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [31] | 1 | 1 | 1 | −1 | −1 | 1 | 1 | 1 | 1 | 1 | 1 | −1 | 6 |
| [32] | 1 | 1 | 1 | 1 | −1 | 1 | 1 | 1 | 1 | 0 | 1 | −1 | 7 |
| [33] | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | −1 | 7 |
| [34] | 1 | 1 | 1 | −1 | −1 | 1 | 1 | 1 | 1 | −1 | 1 | −1 | 4 |
| [35] | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | −1 | 9 |
| [36] | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | −1 | 8 |
| [37] | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 9 |
| [38] | 1 | 1 | 1 | 0 | −1 | 1 | 1 | 1 | 1 | 0 | 1 | −1 | 6 |
| [39] | 1 | 1 | 1 | 1 | −1 | 1 | 1 | 1 | 1 | 1 | 1 | −1 | 8 |
| [40] | 1 | 1 | 0 | −1 | −1 | 1 | 1 | 1 | 1 | −1 | 1 | −1 | 4 |
| [45] | 1 | 0 | 1 | 1 | −1 | 0 | 1 | 1 | 0 | −1 | 1 | −1 | 3 |
| [46] | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | −1 | 4 |
| [47] | 1 | 0 | 1 | 0 | −1 | 1 | 1 | 1 | 1 | 0 | 1 | −1 | 5 |
| [48] | 1 | 1 | 1 | 1 | −1 | 0 | 0 | 1 | 1 | 0 | 1 | −1 | 5 |
| [42] | 1 | 1 | 1 | 1 | −1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 9 |
| [41] | 1 | 1 | 1 | 0 | −1 | 1 | 1 | 1 | 0 | 0 | 1 | −1 | 5 |
| [43] | 1 | 1 | 1 | −1 | −1 | 1 | 1 | 1 | 0 | 0 | 1 | −1 | 4 |
| [49] | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | −1 | 10 |
| [50] | 1 | 0 | 1 | 1 | −1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 8 |
| [51] | 1 | 0 | 1 | 1 | −1 | 0 | 1 | 1 | 0 | 0 | 1 | −1 | 4 |
| [52] | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | −1 | 8 |
| [53] | 1 | 1 | 0 | 0 | −1 | 1 | 1 | 1 | 1 | 0 | 1 | −1 | 5 |
| [54] | 1 | 0 | 1 | 1 | −1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 8 |
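
If the Score column is taken to be the plain sum of the twelve item ratings Q1 to Q12 (each −1, 0, or 1), an assumption that matches rows such as [31] and [49], the total for a row can be recomputed as in the sketch below. The helper name `quality_score` is hypothetical. Note that the table uses the Unicode minus sign, which Python's `int()` does not parse, so it is normalized first.

```python
def quality_score(row: str) -> int:
    """Sum the Q1..Q12 ratings of one pipe-delimited table row (hypothetical helper)."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    ratings = cells[1:13]  # skip the Ref cell, drop the trailing Score cell
    # Normalize the Unicode minus sign (U+2212) to an ASCII hyphen before int().
    return sum(int(r.replace("\u2212", "-")) for r in ratings)


row_31 = "| [31] | 1 | 1 | 1 | −1 | −1 | 1 | 1 | 1 | 1 | 1 | 1 | −1 | 6 |"
row_49 = "| [49] | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | −1 | 10 |"

print(quality_score(row_31), quality_score(row_49))
```
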

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Touhami, M.; Ahmad Fauzi, M.F.; Ur Rehman, Z.; Mansor, S. Federated Learning for Histopathology Image Classification: A Systematic Review. Diagnostics 2026, 16, 137. https://doi.org/10.3390/diagnostics16010137

