Federated Learning Under Concept Drift: A Systematic Survey of Foundations, Innovations, and Future Research Directions
Abstract
1. Introduction
- Continuous Data Arrival: Data does not arrive all at once but continuously over time. This requires an incremental learning approach, where the model must retain useful prior knowledge while incorporating new information. It is essential to avoid “catastrophic forgetting,” where learning from new data erodes previously acquired knowledge.
- Infinite Data Streams: The data flow can be infinite, making it impractical to store all data in memory. Consequently, each data point may only be reviewed a limited number of times. Even if storage capacity permits, data may need to be deleted for legal reasons or to protect individual privacy.
- Unpredictable Statistical Changes: The statistical properties of data change unpredictably, a phenomenon known as “concept drift.” This means that insights derived from older data may become obsolete, potentially degrading the model’s performance. Therefore, it is crucial to detect these changes and update the model accordingly to maintain its efficacy.
- Clients, such as smartphones or other devices, can join or leave the network at any time due to factors like battery life, internet connectivity, or the availability of data to share.
- Different clients may process data at varying speeds, depending on their computational power and environmental conditions.
- The nature of the data (data distribution) each client receives can change over time and may differ from one client to another.
- Concept drift, which refers to the emergence of new patterns or trends in the data, can occur or be detected at different times for each client.
- It bridges the gap between FL and Concept Drift, an area that has been underexplored.
- It introduces and analyzes the relevant aspects motivating Federated Drift-Aware Learning (FDAL).
- It provides a structured taxonomy of FDAL approaches, elucidating the challenges and obstacles while identifying open research questions.
- It identifies and critically analyzes the current state of FDAL approaches, highlighting both achievements and limitations.
- It proposes directions for future research, emphasizing the need for innovative solutions to improve the adaptability and effectiveness of federated learning in dynamic environments. Serving as a valuable resource, it guides researchers in navigating and addressing the adaptation challenges of federated learning, particularly in the context of concept drift.
Motivation and Contribution
- Comprehensive Synthesis: We present the first systematic survey that consolidates studies on Federated Drift-Aware Learning (FDAL), i.e., federated learning under concept drift, integrating findings across multiple domains and methodological frameworks.
- Theoretical Framing: We formally define the relationship between FL, CL, and CD, offering a unified theoretical foundation for understanding local and global drift phenomena in federated settings.
- FDAL Taxonomy: We introduce a structured taxonomy that categorizes drift-aware learning approaches based on temporal and spatial dimensions, providing a clear framework for analyzing adaptability in federated environments.
- Critical Analysis: We evaluate 22 state-of-the-art FDAL algorithms, highlighting their mechanisms, limitations, and future research opportunities related to scalability, fairness, and resource efficiency.
- Future Directions: We identify emerging challenges—such as catastrophic forgetting, communication overhead, and fairness under drift—and outline potential research pathways toward robust, adaptive, and privacy-preserving federated systems.
2. Research Methodology
- The first phase, Identification, involves accessing various repositories to search for studies in the identified research area. A total of 66 records were identified through the SCOPUS database.
- In the second phase, Screening, a transparent selection process is applied: records were excluded after reviewing titles and abstracts, with 44 papers found to be irrelevant to the topic.
- Next, Eligibility is determined by evaluating the full-length articles.
- In the Inclusion phase, the selected articles for the review are finalized, with a total of 22 articles included.
- The article is written in English and published in either an academic journal or as a conference paper.
- The selected research must strongly relate to both federated learning and concept drift.
- The article should contribute to the application of federated learning in environments with dynamic data, ensuring adaptation to concept drift.
Investigations
- (1) How can concept drift be formally defined within FL environments, and what unique challenges arise in detecting and managing it across decentralized data sources?
- (2) How can FL models be effectively adapted to handle diverse types of concept drift (e.g., sudden, gradual, incremental) in dynamic multi-device environments without compromising model accuracy or client privacy?
- (3) What role do local versus global concept drifts play in influencing the performance and fairness of federated models across diverse client environments, and how can these be managed to prevent model degradation?
- (4) How can FDAL frameworks minimize computational and communication overhead while maintaining model adaptability in non-stationary environments?
- (5) How can FL models utilize both temporal and spatial dimensions in client data to enhance the accuracy and timeliness of drift detection?
- (6) What challenges do researchers face in constructing FL models in the presence of concept drift?
3. Federated Learning: Background and Motivational Insights
- Peer-to-Peer (P2P) Architecture: In P2P setups, each device (or “node”) operates independently, communicating directly with other nearby nodes, allowing for a decentralized structure without a central server.
- Client-Server Architecture: Here, a central server coordinates the learning among devices, but rather than centralizing data, it facilitates the aggregation of insights from each device’s learning.
- Parallel Optimization: In this method, the training of a single model is divided among several devices, each handling a portion of the data. This approach accelerates training by allowing each device to work on a subset of the task.
- Distributed Ensemble Methods: In contrast to parallel optimization, each device in an ensemble method trains its own independent model. These models are then combined or aggregated to improve decision-making, a technique known as ensemble learning, which leverages the strengths of multiple models to enhance overall accuracy.
- Statistical Heterogeneity: In FL, each device’s data reflects unique interactions with users and its environment, leading to significant variability in data across devices. This variability means that data from one device may not represent the broader population, posing challenges for model performance and consistency.
- Massive Distribution and Limited Communication: FL must operate efficiently across networks where the number of participating clients far exceeds the average number of data samples per client, and where many devices have only sporadic network connectivity. This setup necessitates models that can be updated with minimal communication, accommodating devices with limited data exchange capabilities.
- Unbalanced and Non-stationary Data: User interaction with devices varies widely, so the volume and type of data collected on each device are often unbalanced. Some users interact with a service frequently, generating a lot of data, while others use it less, leading to varying data contributions. Additionally, the data is non-stationary: it changes over time as users’ behavior and the environment evolve, requiring models to adapt to ongoing shifts.
- Limited Communication Resources: Devices participating in FL often operate under constraints, such as sporadic internet connectivity, limited battery power, and bandwidth. To be effective, FL systems must perform well even when communication with the central server is restricted, making efficient model updates essential.
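To make the client–server coordination concrete, the following is a minimal sketch of FedAvg-style weighted model averaging, the aggregation step such a server would perform. All names are illustrative; real deployments add client sampling, compression, and secure aggregation.

```python
def fedavg_aggregate(client_params, client_sizes):
    """Weighted average of client model parameters (FedAvg-style).

    client_params: list of dicts mapping parameter name -> list of floats
                   (one dict per client, all with the same shapes).
    client_sizes:  number of local samples per client, used as weights.
    """
    total = sum(client_sizes)
    # Start from zeros with the same structure as the clients' models.
    agg = {name: [0.0] * len(vals) for name, vals in client_params[0].items()}
    for params, n in zip(client_params, client_sizes):
        w = n / total  # p_k: this client's share of the total data
        for name, vals in params.items():
            for i, v in enumerate(vals):
                agg[name][i] += w * v
    return agg
```

A client holding three times as much data as another pulls the global model three times as strongly toward its local solution, which is exactly the data-proportional weighting of the FedAvg objective.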
4. Concept Drift Phenomenon: Background and Motivational Insights
- Prior probabilities p(y) may change.
- Class-conditional probabilities p(x|y) may likewise change.
- As a result, posterior probabilities p(y|x) may either change or remain the same.
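These three cases follow directly from Bayes' rule; writing it out makes the distinction between real and virtual drift explicit (this is the standard formulation from the concept-drift literature, not specific to this survey):

```latex
% Bayes' rule links the three quantities above:
p(y \mid x) \;=\; \frac{p(x \mid y)\,p(y)}{p(x)},
\qquad
p(x) \;=\; \sum_{y} p(x \mid y)\,p(y).

% Real drift:    p_t(y \mid x) \neq p_{t+1}(y \mid x)
%                (the decision boundary itself moves).
% Virtual drift: p_t(x) \neq p_{t+1}(x) \ \text{while} \ p_t(y \mid x) = p_{t+1}(y \mid x)
%                (the input distribution shifts but the boundary does not).
```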
- Sudden Drift: This is when there is a clear moment in time where the old way of understanding the data changes to a new way. It is like flipping a switch from one concept to another.
- Recurring Drift: Here, the concept drift happens more than once and might even return to the original concept. It is like a pattern that repeats or comes back after some time.
- Gradual Drift: In this type, the new concept slowly starts to mix in with the old one. It is not a sudden change, but a blend where the old and new concepts exist together for a while.
- Trigger/Active Approaches: These methods update the model only when a drift is detected. They monitor the algorithm’s error rate, as a stable data environment typically results in a decreasing error rate. However, if the data changes (drifts), the error rate rises. These approaches use two thresholds: a warning level and a drift level. When the error rate reaches the warning level, it suggests a potential drift. If the error rate continues to rise and reaches the drift level, it confirms that a drift has indeed occurred.
- Evolving/Passive Approaches: Unlike trigger/active approaches, these methods continuously update the model with each new data point, regardless of whether a drift is detected. They do not specifically monitor for changes or drifts in the data. Instead, they aim to maintain a model that consistently reflects the most recent data.
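The two-threshold logic of trigger/active approaches can be sketched as follows, in the spirit of the Drift Detection Method (DDM) of Gama et al. The class below is an illustrative simplification: the 30-sample warm-up and the omission of a post-drift model reset are assumptions of this sketch, not part of any specific published algorithm.

```python
import math

class TwoThresholdDetector:
    """Error-rate monitor with a warning level and a drift level (DDM-style).

    Tracks the running error rate p and its standard deviation s, remembers
    the best (lowest) p_min + s_min observed so far, and flags:
      - 'warning' when p + s > p_min + 2 * s_min
      - 'drift'   when p + s > p_min + 3 * s_min
    This sketch omits the model reset that usually follows a drift signal.
    """

    def __init__(self):
        self.n = 0
        self.p = 1.0                 # running error rate
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error):
        """error: 1 if the model misclassified the latest sample, else 0."""
        self.n += 1
        self.p += (error - self.p) / self.n               # incremental mean
        s = math.sqrt(self.p * (1 - self.p) / self.n)
        if self.n < 30:                                    # warm-up period
            return "stable"
        if self.p + s < self.p_min + self.s_min:           # new best point
            self.p_min, self.s_min = self.p, s
        if self.p + s > self.p_min + 3 * self.s_min:
            return "drift"
        if self.p + s > self.p_min + 2 * self.s_min:
            return "warning"
        return "stable"
```

On a stream whose error rate jumps, as after a sudden drift, the detector typically passes through the warning level before confirming the drift level, which is precisely the two-stage behavior described above.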
5. Advancing Towards Federated Drift-Aware Learning (FDAL)
5.1. Learning in a Federated Setting with Drift Awareness: Problem Formulations
- ω: Global model parameters.
- m: Number of devices.
- p_k: Proportion of the total data held by device k.
- F_k(ω): Loss function computed on device k’s local data.
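With this notation (writing p_k and F_k for the data proportion and local loss of device k, as in standard FedAvg formulations), the federated objective is:

```latex
\min_{\omega} \; f(\omega) \;=\; \sum_{k=1}^{m} p_k \, F_k(\omega),
\qquad \sum_{k=1}^{m} p_k = 1,
```

where, in the usual empirical-risk form,

```latex
F_k(\omega) \;=\; \frac{1}{n_k} \sum_{i \in \mathcal{P}_k} \ell\!\left(x_i, y_i; \omega\right),
```

with $n_k$ the number of samples on device $k$ and $\mathcal{P}_k$ its local index set (both are standard notation, introduced here rather than in the list above).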
5.2. Federated Drift-Aware Learning (FDAL): Taxonomy and State of the Art Approaches
- Temporal Features—Unique to CD
  - Form: This refers to the nature of changes in the data concept. For example, in a health monitoring FL system, a gradual drift might be observed as a slow evolution of patient health metrics over months, whereas a sudden drift could be seen with the rapid onset of an illness. This distinction emphasizes the importance of handling different forms of drift, including Virtual Drift and Real Drift.
  - Speed: The rate at which the concept changes is vital. In financial market predictions using FL, a rapid drift might occur during market crashes, requiring immediate model adjustments, whereas slower drifts could be associated with gradual economic trends. These changes can manifest as Sudden Drift, Gradual Drift, Incremental Drift, and Recurring Drift, each posing unique challenges to the learning process.
  - Severity: This measures the extent of the change. In a recommendation system, a minor change in user preferences might only need minor model updates, whereas a major shift, like a new technology trend, would necessitate significant alterations.
  - Recurrence: This deals with how often previous concepts reappear. For instance, in retail, purchasing patterns might recur annually, demonstrating a cyclic nature in the data stream.
  - Predictability: This concerns the ability to anticipate future concept changes. In a traffic management system using FL, predictable drifts might occur due to recurring events like holidays, while unpredictable drifts could arise from unexpected road closures.
- Spatial Features—Unique to FL
  - Coverage: This describes the extent of drift across clients. For example, in an FL application for social media analysis, a new trending topic might affect only certain demographics (partial coverage) or become universally popular (full coverage).
  - Synchronism: This indicates the timing of drifts across clients. In environmental monitoring, certain changes like seasonal shifts might occur synchronously across all sensors, while others, like local pollution events, happen asynchronously. This highlights the need to consider both Synchronous and Asynchronous drifts.
  - Direction: This refers to the alignment of concept drifts among clients. For instance, in a network of autonomous vehicles, some might experience similar drifts in sensor readings due to weather conditions (aligned drift), while others in different regions might not (divergent drift).
  - Correlation: This examines interdependencies among drifts in various clients. In a distributed energy grid, fluctuations in one part of the grid might be correlated with changes in another, indicating a dependent network of drifts. It is crucial to distinguish between Independent and Correlated drifts in such scenarios.
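As an illustration only, the taxonomy dimensions above map naturally onto a simple record type; the field names and example values below are ours, not a standard from the literature.

```python
from dataclasses import dataclass

@dataclass
class DriftEvent:
    """One observed drift, described along the survey's FDAL taxonomy.

    Temporal dimensions (from concept drift): form, speed, severity,
    recurrence, predictability. Spatial dimensions (from FL): coverage,
    synchronism, direction, correlation.
    """
    form: str            # "real" or "virtual"
    speed: str           # "sudden", "gradual", "incremental", "recurring"
    severity: float      # magnitude of change, e.g. 0.0 (minor) to 1.0 (major)
    recurrence: bool     # has this concept been seen before?
    predictability: bool # could the change have been anticipated?
    coverage: str        # "partial" or "full" across clients
    synchronism: str     # "synchronous" or "asynchronous"
    direction: str       # "aligned" or "divergent"
    correlation: str     # "independent" or "correlated"

# Example: a market-crash-style drift hitting all clients at once.
crash = DriftEvent("real", "sudden", 0.9, False, False,
                   "full", "synchronous", "aligned", "correlated")
```

Tagging each detected drift this way lets an FDAL system dispatch to different adaptation strategies, for instance recalling an archived model for a recurring drift versus retraining for a novel sudden one.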
Federated Drift-Aware Learning: State-of-the-Art Approaches
- Summary of Key Limitations and Research Priorities
6. Federated Drift-Aware Learning (FDAL): Addressing Research Questions
“How can concept drift be formally defined within Federated Learning environments, and what unique challenges arise in detecting and managing it across decentralized data sources?”
“How can Federated Learning models be effectively adapted to handle diverse types of concept drift (e.g., sudden, gradual, incremental) in dynamic multi-device environments without compromising model accuracy or client privacy?”
“What role do local versus global concept drifts play in influencing the performance and fairness of federated models across diverse client environments, and how can these be managed to prevent model degradation?”
“How can Federated Drift-Aware Learning (FDAL) frameworks minimize computational and communication overhead while maintaining model adaptability in non-stationary environments?”
“How can federated learning models utilize both temporal and spatial dimensions in client data to enhance the accuracy and timeliness of drift detection?”
“What challenges do researchers face in constructing Federated Learning models in the presence of concept drift?”
7. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
| Abbreviation | Full Form |
|---|---|
| AI | Artificial Intelligence |
| CD | Concept Drift |
| CL | Continual Learning |
| DML | Distributed Machine Learning |
| FDAL | Federated Drift-Aware Learning |
| FL | Federated Learning |
| IID | Independent and Identically Distributed |
| IoT | Internet of Things |
| ML | Machine Learning |
| P2P | Peer-to-Peer |
| PRISMA | Preferred Reporting Items for Systematic Reviews and Meta-Analyses |
| SGD | Stochastic Gradient Descent |
| FedAvg | Federated Averaging Algorithm |
| FLASH | Federated Learning Adaptive Shift Optimizer |
| FACFed | Federated Adaptation for Fairness and Concept Drift-Aware Stream Classification |
| FedConD | Federated Concept Drift Framework |
| CDA-FedAvg | Concept-Drift-Aware Federated Averaging |
| FLARE | Federated Learning Adaptive Reconfiguration |
| SFLEDS | Semi-supervised Federated Learning on Evolving Data Streams |
| FedNN | Federated Neural Network |
| FedRepo | Federated Repository Framework |
| ITS | Intelligent Transportation Systems |
| MAE | Mean Absolute Error |
| RMSE | Root Mean Squared Error |
| ROC-AUC | Receiver Operating Characteristic—Area Under the Curve |
| WN | Weight Normalization |
| AGN | Adaptive Group Normalization |
| Ref. | Year | Research Question | Approach/Mechanism Used | Application/Domain | Type of Concept Drift | Limitations, Research Gaps, and Future Work |
|---|---|---|---|---|---|---|
| [67] | 2021 | How can federated and continual learning effectively handle concept drift with non-IID and nonstationary data among clients over time? | The approach extends the Federated Averaging (FedAvg) algorithm with concept drift detection and adaptation. | Human activity recognition using service robots and smart devices such as smartphones within a federated learning framework. | Gradual and sudden concept drift | The experimental setup is unrealistic, assuming uniform data acquisition among users, which may not reflect real-world scenarios. There is limited research on non-IID data streams and concept drift in federated learning. Future work could address these gaps by exploring more realistic data scenarios and drift detection methods. |
| [68] | 2021 | How can Federated Learning algorithms be adapted to effectively handle non-stationary data-generating processes affected by concept drift? | The paper introduces a novel Federated Learning algorithm, Adaptive-FedAvg, which incorporates an adaptive step size to handle concept drift in data-generating processes. | The application domain is image classification. | Class-introduction and class-swap concept drift | The paper lacks a detailed theoretical analysis of the Adaptive-FedAvg algorithm's convergence properties, a noted limitation. Current FL literature lacks algorithms for non-stationary data with concept drift, a gap this paper aims to fill. Future work will focus on analyzing the algorithm's convergence under stationarity assumptions. |
| [69] | 2023 | How can federated learning effectively adapt to distributed concept drift with heterogeneous data across time and clients? | The paper proposes clustering algorithms for reacting to drifts based on local drift detection and hierarchical clustering. The study focuses on federated learning under distributed concept drift | The paper is applicable to domains where federated learning is used, such as IoT and distributed systems, though specific domains are not explicitly mentioned in the contexts. | The paper addresses distributed concept drift, which can be staggered in time and space across clients | The paper lacks a theoretical analysis of the Adaptive-FedAVG algorithm’s convergence properties, a noted limitation. It also addresses a gap in FL literature on algorithms for non-stationary data with concept drift. Future work will include a theoretical analysis of the algorithm’s convergence under stationarity assumptions. |
| [70] | 2023 | How can federated learning models adapt to concept drift while addressing statistical heterogeneity in distributed data environments? | It uses a two-pronged approach: client-side early-stopping training and server-side drift-aware adaptive optimization | The research is applicable in the domain of Federated Learning (FL), which can be used in various fields like IoT and mobile networks | sudden and incremental concept drifts | FLASH did not perform as well as ORACLE, which has prior knowledge of concept drifts. The paper identifies a gap in existing adaptive optimization methods’ ability to quickly adapt to concept drift, which FLASH aims to address. It also suggests further empirical evaluations and improvements in generalized and personalized accuracy for federated learning with concept drift. |
| [65] | 2023 | How does concept drift impact federated learning models, and what features influence model accuracy and convergence in FL systems? | The study focuses on federated learning (FL) and examines the impact of concept drift on FL models by categorizing it into temporal and spatial dimensions and analyzing their effects on model performance. | The research is applied in the domain of federated learning, which is a distributed machine learning approach. It is relevant to any domain using FL with streaming data | The paper categorizes concept drift into temporal and spatial dimensions, considering factors like form, speed, severity, coverage, and synchronism. | The paper highlights uncertainty over whether concept drift should be detected locally or globally, given the differing performances, and acknowledges the challenge of quantitatively measuring features on real-world data. It emphasizes the need for targeted approaches to different types of drift, suggesting that focusing solely on global or local performance is insufficient. Future research should develop solutions for concept drift in FL by exploring the impact of temporal and spatial features and improving the measurement of proposed features on real-world data. |
| [66] | 2022 | How can federated and continual learning models be adapted to effectively handle concept drift in multi-device environments? | It involves a distribution-based algorithm for drift detection using a confidence metric to quantify dissimilarity between historical and new data distributions. | Human Activity Recognition (Internet of Things, IoT) | Sudden, gradual, and recurring drifts | The paper acknowledges that the experimental scenario is not entirely realistic, assuming uniform data acquisition across users, which may not align with real-world conditions. It calls for further exploration of concept drift where clients may label the same pattern differently, and highlights the challenge of addressing both temporal and spatial dimensions simultaneously. Future research will extend the framework for federated and continual learning to improve adaptability in both dimensions, and the authors plan to expand their experiments to applications beyond smartphones. |
| [71] | 2021 | How can concept drift be detected in federated networked systems using lightweight and scalable techniques to maintain system performance? | It uses dimensionality reduction through Principal Component Analysis (PCA) and clustering via K-Means to detect concept drift. | Intelligent Transportation Systems (ITS) | The paper addresses concept drift in general | The paper calls for further exploration of thresholding schemes and deeper network architectures. It identifies a gap in developing lightweight, efficient drift detection techniques for resource-constrained environments such as Multi-access Edge Computing (MEC). Future work will focus on concept drift detection frameworks for tasks such as multi-class classification and regression, evaluating deeper network architectures, different thresholding schemes, and scalability with larger network sizes. |
| [72] | 2021 | How can asynchronous federated learning adapt to concept drift in sensor data to maintain model performance and reduce communication costs? | FedConD, a novel approach that detects drift using an adaptive mechanism based on historical performance, adapts by adjusting the local regularization parameter, and employs a communication strategy to select local updates and accelerate model convergence. | Internet of Things (IoT) | Sudden, gradual, and incremental drift | Traditional concept drift techniques are unsuitable for federated learning due to device heterogeneity, highlighting the need for adaptive algorithms. Developing efficient drift detection and adaptation strategies in federated learning is a potential area for future research. |
| [73] | 2021 | How can asynchronous federated learning adapt to concept drift in sensor data to maintain model performance and reduce communication costs? | The paper proposes FAC-Fed, which detects drift adaptively, adjusts the local regularization parameter, and uses a communication strategy to accelerate model convergence. | Real-time distributed data streams (e.g., stock market platforms, e-commerce websites, and telemedicine web platforms) | Continuous concept drift | Limitations include the complexity of federated learning setups and the computational overhead of continuous drift detection. The paper is the first to address fairness-aware federated adaptation for stream classification, highlighting a gap in combining fairness, federated learning, and concept drift handling. Future areas include developing more efficient drift detection methods, expanding to other domains, and enhancing computational efficiency. |
| [74] | 2023 | How can federated learning effectively handle concept drift in distributed data streams while preserving privacy across multiple clients? | It uses prototype-based learning and a metric-learning-based prototype transformation technique. | Distributed data stream mining | Sudden, gradual, incremental, and recurrent drift | The proposed algorithm is limited to fully supervised data streams and lacks support for semi-supervised settings. Existing algorithms struggle with handling concept drift from multiple sources in distributed data streams, and there is limited exploration of drift adaptation in distributed learning. The authors plan to extend the algorithm to semi-supervised settings with limited or delayed labels. |
| [75] | 2023 | How can federated learning frameworks be enhanced to detect and adapt to concept drift for improved generalization performance? | The paper presents a multiscale framework combining FedAvg and FedOMD with non-stationary detection and adaptation, using shorter training horizons and randomized training schedules. | The research is situated within the domain of Federated Learning, which is a part of artificial intelligence research | Not Available | Challenges include the complexity of implementing multiscale algorithms and maintaining privacy in federated learning. The paper notes that existing FL methods assume stationary data, which is unrealistic, highlighting a gap in managing non-stationary environments. Future research could refine detection and adaptation techniques for complex concept drift and explore the framework’s application in other domains. |
| [76] | 2023 | How can concept drift be detected and mitigated in federated learning-based IoT deployments to maintain model performance? | The paper introduces a novel lightweight dual-scheduler FL framework called FLARE, which conditionally transfers training data and deploys models based on observing the model’s training behavior and inference statistics | Internet of Things (IoT) | sudden concept drift | The framework was primarily tested for abrupt drift; further experiments are needed for gradual or incremental drifts. Additional research is required to detect drifts effectively while maintaining lightweight performance in resource-constrained settings. The paper highlights the need for adaptive thresholding schemes and automated optimization techniques to generalize across datasets. Future work includes refining these techniques and expanding the framework to other datasets. |
| [77] | 2023 | How can semi-supervised federated learning effectively handle evolving data streams with label scarcity and concept drift in a privacy-preserving manner? | The paper uses a prototype-based method for semi-supervised federated learning, incorporating micro-clustering and probabilistic inter-client server consistency matching to handle concept drift and label scarcity | Internet of Things (IoT) | incremental, gradual, and sudden concept drifts | Limitations include handling diverse concept drift types and the computational overhead of maintaining micro-clusters. Gaps involve the need for more testing on real-world datasets and exploring additional drift types. Future work could focus on improving the model’s scalability and robustness. |
| [78] | 2023 | How can semi-supervised federated learning effectively handle evolving data streams with label scarcity and concept drift in a privacy-preserving manner? | The paper uses a prototype-based method for semi-supervised federated learning, incorporating micro-clustering and probabilistic inter-client server consistency matching to handle concept drift and label scarcity | Internet of Things (IoT) | incremental, gradual, and sudden concept drifts | Managing concept drift in federated learning requires balancing detection frequency and cost, as frequent checks can be inefficient and may miss local changes affecting only subsets of clients. Research is limited on non-deep learning algorithms for drift adaptation, emphasizing the need for explicit detection methods, diverse dataset testing, robustness, and personalized client-specific solutions. |
| [79] | 2022 | How can local and global drifts in federated learning be harmonized to improve performance on heterogeneous medical image datasets? | The paper introduces a novel harmonizing strategy called HarmoFL, which involves amplitude normalization and weight perturbation to address local and global drifts | medical domain, specifically focusing on federated learning for medical image analysis | non-iid feature shifts | The paper highlights the need for further exploration of harmonizing strategies that can be generalized across different types of non-iid data and federated learning scenarios. Future work could involve extending the HarmoFL framework to other domains beyond medical imaging and exploring its applicability to other types of non-iid challenges in federated learning |
| [80] | 2024 | How can concept drift be effectively monitored in continuous federated learning platforms with dynamic client participation to sustain model performance? | It employs error-based and data-based drift detection approaches to monitor concept drift in continuous federated learning (CFL) platforms. | Continuous federated learning, a paradigm in distributed machine learning. | Sudden, gradual, incremental, and reoccurring drift | The findings are based on a single dataset and may not generalize to others. The study considers only one error-based and one data-based drift detection approach and lacks a comparison of their effectiveness and detection speed. Future work will evaluate different models, data types, and detection approaches and aims to develop a framework that combines error-based and data-based methods for dynamic adaptation. |
| [81] | 2024 | How can federated learning models effectively address concept drift in heterogeneous client data to improve model accuracy and convergence? | It employs Weight Normalization (WN) and Adaptive Group Normalization (AGN) to address concept drift in federated learning. | Applicable to domains where data is collected under varying conditions, such as industrial and medical fields. | Local drift | While AGN prevents accuracy degradation, it does not improve accuracy, requiring further refinement for concept drift scenarios. Existing FL methods struggle with concept drift, resulting in slow and unstable convergence, underscoring the need for more robust solutions to handle client data heterogeneity. The study aims to encourage further research testing methods under diverse heterogeneity and enhancing the practicality of FL methods. Improving accuracy in concept drift scenarios remains a future research focus. |
| [82] | 2024 | How can unsupervised methods detect sudden data drift in federated learning environments while maintaining data privacy and accuracy? | The paper uses federated fuzzy c-means clustering; the federated fuzzy Davies-Bouldin index is used to estimate changes in data distributions. | The study focuses on federated learning environments, which are applicable in domains with data privacy concerns. | Sudden drift, both local and global | The method is sensitive to parameter choices such as the acceptability threshold and the number of clusters, and has a low detection rate when few data points in a batch are affected by drift. Fuzzy c-means clustering may struggle with high-dimensional data, leading to a poor initial model. There are few options for unsupervised drift detection that differentiate between local and global drift in federated settings, and the field of federated drift detection is still developing. Future work will address parameter sensitivity, evaluate the method in real-world scenarios, and explore detecting concept drift in supervised learning systems. |
| [83] | 2023 | How can federated learning effectively mitigate concept drift in distributed environments without compromising model performance and privacy? | The paper uses a federated learning methodology called FedRepo, which involves: Random Forest (RF) models for ensemble learning, Clustering to group clients with similar performance patterns, and Particle Swarm Optimization (PSO) for clustering optimization | the domain of electricity consumption forecasting, which is part of the Internet of Things (IoT) | Concept Drift in Distributed Context | The approach may require significant computational resources for maintaining and updating the model repository dynamically. The paper identifies that concept drift detection and mitigation during the inference phase is not widely explored in federated learning contexts. Future plans include studying and evaluating the FedRepo methodology in other distributed scenarios and benchmarking its customization and adaptability against other strategies |
| [84] | 2021 | How can on-device federated learning be effectively implemented for cooperative model updates between edge devices to handle concept drift? | It combines Online Sequential Extreme Learning Machine (OS-ELM) and autoencoder for anomaly detection. The Elastic Extreme Learning Machine (E2LM) is used to merge intermediate training results from multiple devices. | Internet of Things (IoT) | local drift | A limitation mentioned is the restricted amount of training data available for each edge device due to the distributed nature of the system. Future work could involve further optimizing the cooperative model update algorithm and exploring its application in other domains or with different types of neural networks. |
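Many of the detection mechanisms surveyed above (e.g., the error-based monitoring in [72,80]) share a common client-side pattern: track the local error rate and flag drift when recent errors deviate significantly from a historical baseline. The following is a minimal, hypothetical sketch of that pattern in Python; the class name, window sizes, and threshold rule are illustrative and not taken from any one paper:

```python
from collections import deque
from statistics import mean, pstdev

class ErrorRateDriftDetector:
    """Signals drift when the recent error rate deviates from a
    historical baseline (a simplified, DDM-style heuristic)."""

    def __init__(self, window: int = 50, sensitivity: float = 3.0):
        self.reference = deque(maxlen=window)    # long-term error history
        self.recent = deque(maxlen=window // 5)  # short-term error window
        self.sensitivity = sensitivity

    def update(self, error: float) -> bool:
        """Feed a per-sample error (0.0 = correct, 1.0 = wrong).
        Returns True when drift is suspected."""
        self.recent.append(error)
        if len(self.reference) < self.reference.maxlen:
            self.reference.append(error)  # still building the baseline
            return False
        mu = mean(self.reference)
        sigma = pstdev(self.reference) or 1e-8
        # flag drift if the short-window mean exceeds the baseline mean
        # by `sensitivity` standard errors
        if mean(self.recent) > mu + self.sensitivity * sigma / len(self.recent) ** 0.5:
            self.reference.clear()  # reset so adaptation can restart
            self.recent.clear()
            return True
        self.reference.append(error)
        return False
```

On drift, a client following this pattern would typically trigger local retraining or request a model refresh from the server before resuming.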
| Ref. | Dataset Name | Dataset Description | Link of the Dataset | Publicly/Private | Metric Used/Performance |
|---|---|---|---|---|---|
| [67] | Human Activity Recognition | | https://www.utwente.nl/en/eemcs/ps/ (accessed on 8 November 2025) | Public | |
| [68] | MNIST and CIFAR-10 | | MNIST: https://rasbt.github.io/mlxtend/user_guide/data/loadlocal_mnist/ (accessed on 8 November 2025); CIFAR-10: N/A | MNIST (Public), CIFAR-10 (Private) | |
| [69] | Real: FMoW. Synthetic: SINE, CIRCLE, SEA, MNIST | | https://github.com/microsoft/FedDrift (accessed on 8 November 2025) | Public | |
| [70] | EMNIST, Shakespeare, Stackoverflow, CIFAR10 | | Not Available | | |
| [65] | Hyperplane data generator; Sine data generator | | https://moa.cms.waikato.ac.nz/ (accessed on 8 November 2025) | Public | |
| [66] | Human Activity Recognition | | https://www.utwente.nl/en/eemcs/ps/ (accessed on 8 November 2025) | Public | |
| [71] | MNIST digits | | | Private | |
| [72] | FitRec, Air Quality, ExtraSensory, Fashion-MNIST, CIFAR-10 | | https://www.kaggle.com/datasets/tientd95/fitrec-dataset; https://github.com/ceshine/kddcup2018; http://extrasensory.ucsd.edu/; https://github.com/zalandoresearch/fashion-mnist; https://www.cs.toronto.edu/~kriz/cifar.html (all accessed on 8 November 2025) | Public | |
| [73] | Bank Marketing (Bank M.), Law School (Law S.), Default, Adult Census (Adult C.) | | | Private | |
| [74] | Real: Forest Covtype, Electricity, Shuttle, Occupancy, GSD, SLDD, KDDcup99. Synthetic: CR4, CRE4V2, GEA, R2C2D, Random | | https://moa.cms.waikato.ac.nz/datasets/; https://archive.ics.uci.edu/datasets; https://sites.google.com/site/nonstationaryarchive/datasets; https://moa.cms.waikato.ac.nz/details/classification/streams/ (all accessed on 8 November 2025) | Public | |
| [75] | LIBSVM collection: covtype and mnist | covtype: 7 classes, 581,012 training samples, no separate test set, 54 features. mnist: 10 classes, 60,000 training samples, 10,000 test samples, 780 features | https://www.csie.ntu.edu.tw/%7Ecjlin/libsvmtools/datasets/ (accessed on 8 November 2025) | Public | |
| [76] | MNIST-C (Corrupted MNIST) | Provides 15 corruption types on handwritten digits, well suited for simulating drift; Zigzag, Canny edges, and Glass blur were selected | https://github.com/google-research/mnist-c (accessed on 8 November 2025) | Public | |
| [77] | Real: KDDcup99, Shuttle, Forest Cover Type (FCT), Electricity, Gas Sensor Array Drift (GSD). Synthetic: CRE4V2, CR4, GEAR2C2D, FG2C2D | | https://github.com/mvisionai/FedLimited; https://archive.ics.uci.edu/datasets; https://sites.google.com/site/nonstationaryarchive/datasets; https://moa.cms.waikato.ac.nz/datasets/ (all accessed on 8 November 2025) | Public | |
| [78] | Synthetic dataset for training; Walking Recognition Dataset (WRD) for testing | The authors developed an Android app to continuously log inertial data (accelerometer and gyroscope). WRD is a fully labeled dataset with recordings from 77 people, yielding nearly 70,000 training patterns and 8000 testing patterns after feature extraction | | Private | |
| [79] | Breast cancer histology image classification; histology nuclei segmentation (MoNuSAC2020, MoNuSAC2018, TNBC); prostate MRI segmentation | | Not Available | | |
| [80] | Uber Fares Dataset | | https://www.kaggle.com/datasets/yasserh/uber-fares-dataset (accessed on 8 November 2025) | Public | |
| [81] | CIFAR10-C, Digit, Fairface, Office-Home, PACS, VLCS, DomainNet | | Not Available | | |
| [82] | Four generated scenarios: A.1, A.2, B.1, B.2 | A.1: no global and no local drift. A.2: no global, but local drift. B.1: sudden global drift due to a previously unseen distribution. B.2: sudden global drift due to a disappearing distribution | | Private | |
| [83] | UK Power Networks-led Low Carbon London project | Consists of 5567 households in London, a balanced sample representative of the Greater London population, with 30 min granularity between November 2011 and February 2014 | https://data.london.gov.uk/dataset (accessed on 8 November 2025) | Public | |
| [84] | UAH-DriveSet; Smartphone HAR; MNIST | UAH-DriveSet includes driving histories of six drivers simulating aggressive, drowsy, and normal patterns. Smartphone HAR records activities of 30 volunteers across six activities (walking, walking upstairs, walking downstairs, sitting, standing, laying). MNIST contains handwritten digits 0 to 9 | https://rasbt.github.io/mlxtend/user_guide/data/loadlocal_mnist/ (accessed on 8 November 2025) | Public | |
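Several of the entries above evaluate on synthetic stream generators (SINE, SEA, hyperplane) available through MOA. A SEA-style stream with sudden drift can be sketched in plain Python; the three-feature, x1 + x2 > θ formulation follows the standard SEA concept, but the block sizes and threshold values here are illustrative:

```python
import random

def sea_stream(n: int, thresholds=(8.0, 9.0, 7.0, 9.5), seed: int = 0):
    """Yield (features, label) pairs from a SEA-style stream:
    three uniform features in [0, 10]; the label depends only on
    x1 + x2 versus a threshold that changes abruptly between equal-sized
    concept blocks, producing sudden concept drift at each boundary."""
    rng = random.Random(seed)
    block = n // len(thresholds)  # samples per concept
    for i in range(n):
        theta = thresholds[min(i // block, len(thresholds) - 1)]
        x = [rng.uniform(0.0, 10.0) for _ in range(3)]
        yield x, int(x[0] + x[1] > theta)
```

Each block boundary is a ground-truth drift point, which makes streams like this convenient for measuring detection delay and false-alarm rates of the detectors surveyed above.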
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Mahdi, O.A.; Pardede, E.; Bevinakoppa, S.; Ali, N. Federated Learning Under Concept Drift: A Systematic Survey of Foundations, Innovations, and Future Research Directions. Electronics 2025, 14, 4480. https://doi.org/10.3390/electronics14224480

