Search Results (451)

Search Parameters:
Keywords = source data privacy

53 pages, 2892 KB  
Review
Federated Learning in Edge Computing: Vulnerabilities, Attacks, and Defenses—A Survey
by Sahar Saleh Alhawas and Murad A. Rassam
Sensors 2026, 26(4), 1275; https://doi.org/10.3390/s26041275 - 15 Feb 2026
Viewed by 339
Abstract
Federated Learning (FL), a distributed machine learning framework, enables collaborative model training across multiple devices without sharing raw data, thereby preserving privacy and reducing communication costs. When combined with Edge Computing (EC), FL brings computations closer to data sources, enabling low-latency, real-time decision-making in resource-constrained environments. However, this decentralization introduces several vulnerabilities, including data poisoning, backdoor attacks, inference leaks, and Byzantine behaviors, which are worsened by the heterogeneity of edge devices and their intermittent connectivity. This survey presents a comprehensive review of the intersection of FL and EC, focusing on vulnerabilities, attack vectors, and defense mechanisms. We analyze existing methods for robust aggregation, anomaly detection, differential privacy, and secure aggregation, with a focus on their feasibility within edge environments. Additionally, we identify open research challenges, such as scalability, resilience to heterogeneity, and energy-efficient defenses, and provide insights into the evolving landscape of FL in edge computing. This review aims to inform future research on enhancing the security, privacy, and efficiency of FL systems deployed in real-world edge environments. Full article
(This article belongs to the Section Internet of Things)
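The robust aggregation the survey reviews can be illustrated with a minimal sketch: coordinate-wise median aggregation tolerates a minority of poisoned client updates, whereas plain FedAvg-style averaging does not. The update values and client count below are illustrative, not taken from the survey.

```python
from statistics import mean, median

def fedavg(updates):
    """Plain averaging: coordinate-wise mean of client model updates."""
    return [mean(coords) for coords in zip(*updates)]

def robust_aggregate(updates):
    """Coordinate-wise median: a classic Byzantine-robust alternative."""
    return [median(coords) for coords in zip(*updates)]

# Three honest clients agree on roughly [1.0, 2.0]; one attacker sends
# a poisoned update with extreme coordinates.
updates = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [100.0, -100.0]]

print(fedavg(updates))            # mean is dragged far off by the attacker
print(robust_aggregate(updates))  # median stays near the honest consensus
```

A single malicious client shifts the mean arbitrarily, while the median needs a majority of clients to be compromised before it moves, which is why median-style rules recur in the defenses surveyed.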
21 pages, 551 KB  
Article
Agentic RAG for Maritime AIoT: Natural Language Access to Structured Data
by Oxana Sachenkova, Melker Andreasson, Dongzhu Tan and Alisa Lincke
Sensors 2026, 26(4), 1227; https://doi.org/10.3390/s26041227 - 13 Feb 2026
Viewed by 223
Abstract
Maritime operations are increasingly reliant on sensor data to drive efficiency and enhance decision-making. However, despite rapid advances in large language models, including expanded context windows and stronger generative capabilities, critical industrial settings still require secure, role-constrained access to enterprise data and explicit limitation of model context. Retrieval-Augmented Generation (RAG) remains essential to enforce data minimization, preserve privacy, support verifiability, and meet regulatory obligations by retrieving only permissioned, provenance-tracked slices of information at query time. However, current RAG solutions lack robust validation protocols for numerical accuracy for high-stakes industrial applications. This paper introduces Lighthouse Bot, a novel Agentic RAG system specifically designed to provide natural-language access to complex maritime sensor data, including time-series and relational sensor data. The system addresses a critical need for verifiable autonomous data analysis within the Artificial Intelligence of Things (AIoT) domain, which we explore through a case study on optimizing ferry operations. We present a detailed architecture that integrates a Large Language Model with a specialized database and coding agents to transform natural language into executable tasks, enabling core AIoT capabilities such as generating Python code for time-series analysis, executing complex SQL queries on relational sensor databases, and automating workflows, while keeping sensitive data outside the prompt and ensuring auditable, policy-aligned tool use. To evaluate performance, we designed a test suite of 24 questions with ground-truth answers, categorized by query complexity (simple, moderate, complex) and data interaction type (retrieval, aggregation, analysis). 
Our results show robust, controlled data access with high factual fidelity: the proprietary Claude 3.7 achieved close to 90% overall factual correctness, while the open-source Qwen 72B achieved 66% overall and 99% on simple retrieval and aggregation queries. These findings underscore the need for a secure limited-context RAG in maritime AIoT and the potential for cost-effective automation of routine exploratory analyses. Full article
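The agent pattern described here, an LLM translating a natural-language question into an executable SQL query while the row data itself never enters the prompt, can be sketched as follows. The LLM call is stubbed, and the table and column names are invented for illustration; they are not from the Lighthouse Bot system.

```python
import sqlite3

def llm_to_sql(question: str) -> str:
    """Stub for the LLM agent: maps a question to SQL. A real system
    would prompt a model with the schema only, never the row data."""
    if "average speed" in question:
        return "SELECT AVG(speed_knots) FROM ais_readings"
    raise ValueError("unsupported question")

# Hypothetical sensor table; only its schema would ever reach the model.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE ais_readings (ts TEXT, speed_knots REAL)")
db.executemany("INSERT INTO ais_readings VALUES (?, ?)",
               [("t1", 10.0), ("t2", 14.0), ("t3", 12.0)])

sql = llm_to_sql("What is the average speed of the ferry?")
(answer,) = db.execute(sql).fetchone()
print(answer)  # the aggregation runs in the database, not in the prompt
```

Keeping computation in the database is what enforces the data-minimization and auditability properties the abstract emphasizes: the model emits a query, and the sensitive values stay server-side.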
29 pages, 2919 KB  
Article
A Model-Driven Engineering Approach to AI-Powered Healthcare Platforms
by Mira Raheem, Neamat Eltazi, Michael Papazoglou, Bernd Krämer and Amal Elgammal
Informatics 2026, 13(2), 32; https://doi.org/10.3390/informatics13020032 - 11 Feb 2026
Viewed by 224
Abstract
Artificial intelligence (AI) has the potential to transform healthcare by supporting more accurate diagnoses and personalized treatments. However, its adoption in practice remains constrained by fragmented data sources, strict privacy rules, and the technical complexity of building reliable clinical systems. To address these challenges, we introduce a model-driven engineering (MDE) framework designed specifically for healthcare AI. The framework relies on formal metamodels, domain-specific languages (DSLs), and automated transformations to move from high-level specifications to running software. At its core is the Medical Interoperability Language (MILA), a graphical DSL that enables clinicians and data scientists to define queries and machine learning pipelines using shared ontologies. When combined with a federated learning architecture, MILA allows institutions to collaborate without exchanging raw patient data, ensuring semantic consistency across sites while preserving privacy. We evaluate this approach in a multi-center cancer immunotherapy study. The generated pipelines delivered strong predictive performance, with best-performing models achieving up to 98.5% accuracy on selected prediction tasks, while substantially reducing manual coding effort. These findings suggest that MDE principles—metamodeling, semantic integration, and automated code generation—can provide a practical path toward interoperable, reproducible, and reliable digital health platforms. Full article
(This article belongs to the Section Health Informatics)
23 pages, 473 KB  
Article
Zero-Knowledge Proof Extensions for Digital Product Passports in Sustainability Claims Reporting and Verifications
by Chibuzor Udokwu and Stefan Craß
Electronics 2026, 15(4), 745; https://doi.org/10.3390/electronics15040745 - 10 Feb 2026
Viewed by 161
Abstract
Digital product passports outline information about a product’s lifecycle, circularity, and sustainability-related data. Sustainability data contains claims about carbon footprint, recycled material composition, ethical sourcing of production materials, etc. Also, upcoming regulatory directives require companies to disclose this type of information. However, current sustainability reporting practices face challenges, such as greenwashing, where companies make incorrect claims that are difficult to verify. There is also a challenge of disclosing sensitive production information when other stakeholders, such as consumers or other economic operators, wish to verify sustainability claims independently. Zero-knowledge proofs (ZKPs) provide a cryptographic system for verifying statements without revealing sensitive information. The goal of this research paper is to explore ZKP cryptography, trust models, and implementation concepts for extending DPP capability in privacy-aware reporting and verification of sustainability claims in products. To achieve this goal, first, formal representations of sustainability claims are provided. Then, a data matrix and trust model for generating proofs are developed. An interaction sequence is provided to show different components for various proof generation and verification scenarios for sustainability claims. Lastly, the paper provides a circuit template for the proof generation of an example claim and a credential structure for their input data validation. The proposed approach is assessed using a scenario-based evaluation to check the performance metrics for data credential verification and proof generation for verifying material composition in a product. Full article
18 pages, 285 KB  
Review
Large Language Models in the Assessment and Care of Internet Gaming Disorder
by Athanasios Kranas and Vassilios S. Verykios
AI Med. 2026, 1(1), 6; https://doi.org/10.3390/aimed1010006 - 9 Feb 2026
Viewed by 275
Abstract
Internet Gaming Disorder (IGD), recognized in the International Classification of Diseases (ICD-11), affects millions—especially adolescents and young adults—and poses challenges that invite scalable innovations in care. This narrative review examined how Large Language Models (LLMs) could support IGD prevention, assessment, treatment, and research. We conducted targeted searches of PubMed, Scopus, Google Scholar, and IEEE Xplore for 2010–October 2025, supplemented by backward/forward citation chasing. English, peer-reviewed clinical, methodological, and review work was prioritized. As a narrative review, we did not apply PRISMA or perform quantitative synthesis. In total, we synthesized over 50 sources. We analyzed peer-reviewed, IGD-specific AI/ML studies with explicit reporting of training approaches, validation/performance, dataset size, and model openness. While preliminary improvements have been observed in digital-health trials for depression, anxiety, and substance use, we emphasize that no IGD-specific LLM therapeutic trials exist to date; thus, evidence regarding treatment efficacy discussed here is extrapolated from these adjacent disorders. Evidence spans transformer-embedding text screening (r ≈ 0.48) and multimodal classification using EEG or fNIRS (accuracy ≈ 71–88%). Sample sizes ranged from 40 to 417 participants. Notably, most implementations remain research-only, lacking public code or data. Principal concerns include privacy and data governance, algorithmic bias, inconsistent crisis-escalation performance, and a nascent clinical evidence base. We conclude that LLMs may augment—but should not replace—human clinicians; future potential lies in hybrid human–AI pathways, multimodal integrations with wearables and gaming APIs, and rigorous prospective trials to establish safety, effectiveness, and equity in IGD care. Full article
25 pages, 6594 KB  
Article
Blockchain-Enabled Microgrid IoT with Accurate Predictions of Renewable Energy and Electricity Load Using LevySSA-LSTM-GRU
by Yuting Sun, Zhipeng Chang, Jianan Yu and Zongxiang Chen
Sustainability 2026, 18(3), 1653; https://doi.org/10.3390/su18031653 - 5 Feb 2026
Viewed by 193
Abstract
The smart microgrid is promising for providing a more affordable, efficient, and sustainable energy solution, with increasing energy production from distributed renewable sources and diverse household electricity usage across large numbers of connected smart devices. Accurate prediction of the household electricity load and renewable energy production plays a significant role in achieving optimized efficiency of the microgrid. Meanwhile, the privacy and security of data sharing over the smart grid are crucial. This paper proposes a blockchain-enabled microgrid Internet of Things (MIoT) with accurate predictions of renewable energy production and household electricity load. The blockchain framework can guarantee the privacy and security of data sharing over the microgrid. An improved model that stacks long short-term memory (LSTM) and gated recurrent units (GRUs) is proposed for energy generation and electricity load prediction using historical microgrid data and weather forecasting data. The sparrow search algorithm optimized by Levy flights (LevySSA) is used to optimize the hyperparameters of the stacked LSTM-GRU method. The experimental results verify the accuracy and robustness of the proposed method in the prediction of electricity load and renewable energy production for effective smart microgrid operation. For PV forecasting, the proposed LevySSA-LSTM-GRU achieves nRMSE = 0.0535, nMAE = 0.0455, and R2 = 0.9898, outperforming the strongest baseline. For load forecasting, averaged over four test intervals, it yields nRMSE = 0.1034, nMAE = 0.0836, with R2 = 0.9340, demonstrating consistent superiority compared with conventional baseline models. Overall, the proposed framework enables secure data sharing and high-accuracy forecasting, offering strong potential to support real-time energy management and operational optimization in smart microgrids. Full article
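The Levy-flight perturbation used to improve the sparrow search algorithm can be sketched with Mantegna's algorithm, which draws heavy-tailed step lengths. The stability parameter `beta` and the update rule in the comment are illustrative defaults, not values taken from the paper.

```python
import math
import random

def levy_step(beta: float = 1.5) -> float:
    """One Levy-distributed step length via Mantegna's algorithm."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
               (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
               ) ** (1 / beta)
    u = random.gauss(0.0, sigma_u)
    v = random.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

# In a LevySSA-style optimizer, a candidate position x would be perturbed as
#   x_new = x + alpha * levy_step()
# so most moves are small, but occasional long jumps help escape local optima.
random.seed(0)
steps = [levy_step() for _ in range(1000)]
print(max(abs(s) for s in steps))  # the heavy tail produces rare large jumps
```

The mix of many small steps with rare long jumps is what distinguishes Levy flights from Gaussian mutation when searching a hyperparameter space.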
34 pages, 2216 KB  
Review
Big Data Analytics and AI for Consumer Behavior in Digital Marketing: Applications, Synthetic and Dark Data, and Future Directions
by Leonidas Theodorakopoulos, Alexandra Theodoropoulou and Christos Klavdianos
Big Data Cogn. Comput. 2026, 10(2), 46; https://doi.org/10.3390/bdcc10020046 - 2 Feb 2026
Viewed by 894
Abstract
In the big data era, understanding and influencing consumer behavior in digital marketing increasingly relies on large-scale data and AI-driven analytics. This narrative, concept-driven review examines how big data technologies and machine learning reshape consumer behavior analysis across key decision-making areas. After outlining the theoretical foundations of consumer behavior in digital settings and the main data and AI capabilities available to marketers, this paper discusses five application domains: personalized marketing and recommender systems, dynamic pricing, customer relationship management, data-driven product development and fraud detection. For each domain, it highlights how algorithmic models affect targeting, prediction, consumer experience and perceived fairness. This review then turns to synthetic data as a privacy-oriented way to support model development, experimentation and scenario analysis, and to dark data as a largely underused source of behavioral insight in the form of logs, service interactions and other unstructured records. A discussion section integrates these strands, outlines implications for digital marketing practice and identifies research needs related to validation, governance and consumer trust. Finally, this paper sketches future directions, including deeper integration of AI in real-time decision systems, increased use of edge computing, stronger consumer participation in data use, clearer ethical frameworks and exploratory work on quantum methods. Full article
(This article belongs to the Section Big Data)
25 pages, 1979 KB  
Article
Classifying and Predicting Household Energy Consumption Using Data Analytics and Machine Learning
by David Cordon, Antonio Pita and Angel A. Juan
Algorithms 2026, 19(2), 114; https://doi.org/10.3390/a19020114 - 1 Feb 2026
Viewed by 315
Abstract
Growing pressure on electricity grids and the increasing availability of smart meter data have intensified the need for accurate, interpretable, and scalable methods to analyze and forecast household electricity consumption. In this context, this study presents a general, data-agnostic methodology for predicting and classifying household energy consumption. The proposed workflow unifies data preparation, feature engineering, and machine learning techniques (including clustering, classification, regression, and time series forecasting) within a single interpretable pipeline that supports actionable insights. Rather than proposing new prediction algorithms, this work contributes a fully reproducible, end-to-end methodological pipeline that enables the controlled evaluation of the impact of contextual variables, customer segmentation, and cold-start conditions on household energy forecasting. A distinctive aspect of the pipeline is the explicit use of household- and dwelling-level contextual variables to derive customer typologies via clustering and to enrich forecasting models. The models are evaluated for predictive accuracy, reliability under varying conditions, and suitability for operational use. The results show that incorporating contextual variables and clustering significantly improves forecasting accuracy, particularly in cold-start scenarios where no historical consumption data are available. Although numerous public datasets of residential electricity consumption exist, they rarely provide, in an openly accessible form, both detailed load histories and rich contextual attributes, while many are subject to privacy or licensing restrictions. To ensure full reproducibility and to enable controlled experiments where contextual variables can be switched on and off, the experiments are conducted on a synthetically generated dataset that reproduces realistic behavior and seasonal usage patterns. 
However, the proposed methodology is independent of the specific data source and can be directly applied to any real or synthetic dataset with similar structure. The approach enables applications such as short- and long-term demand forecasting, estimation of household energy costs, and forecasting demand for new customers. These findings demonstrate that the proposed pipeline provides a transparent and effective framework for end-to-end analysis of household electricity consumption. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
44 pages, 2025 KB  
Review
Precision Farming with Smart Sensors: Current State, Challenges and Future Outlook
by Bonface O. Manono, Boniface Mwami, Sylvester Mutavi and Faith Nzilu
Sensors 2026, 26(3), 882; https://doi.org/10.3390/s26030882 - 29 Jan 2026
Cited by 3 | Viewed by 1232
Abstract
The agricultural sector, a vital industry for human survival and a primary source of food and raw materials, faces increasing pressure due to global population growth and environmental strains. Productivity, efficiency, and sustainability constraints are preventing traditional farming methods from adequately meeting the growing demand for food. Precision farming has emerged as a transformative paradigm to address these issues. It integrates advanced technologies to improve decision making, optimize yield, and conserve resources. This approach leverages technologies such as wireless sensor networks, the Internet of Things (IoT), robotics, drones, artificial intelligence (AI), and cloud computing to provide effective and cost-efficient agricultural services. Smart sensor technologies are foundational to precision farming. They offer crucial information regarding soil conditions, plant growth, and environmental factors in real time. This review explores the status, challenges, and prospects of smart sensor technologies in precision farming. The integration of smart sensors with the IoT and AI has significantly transformed how agricultural data is collected, analyzed, and utilized to optimize yield, conserve resources, and enhance overall farm efficiency. The review delves into various types of smart sensors used, their applications, and emerging technologies that promise to further innovate data acquisition and decision making in agriculture. Despite progress, challenges persist. They include sensor calibration, data privacy, interoperability, and adoption barriers. To fully realize the potential of smart sensors in ensuring global food security and promoting sustainable farming, the challenges need to be addressed. Full article
(This article belongs to the Section Smart Agriculture)
20 pages, 646 KB  
Article
From Framework to Reliable Practice: End-User Perspectives on Social Robots in Public Spaces
by Samson Ogheneovo Oruma, Ricardo Colomo-Palacios and Vasileios Gkioulos
Systems 2026, 14(2), 137; https://doi.org/10.3390/systems14020137 - 29 Jan 2026
Viewed by 233
Abstract
As social robots increasingly enter public environments, their acceptance depends not only on technical robustness but also on ethical integrity, accessibility, transparency, and consistent system behaviour across diverse users. This paper reports an in situ pilot deployment of an ARI social robot functioning as a university receptionist, designed and implemented in alignment with the SecuRoPS framework for secure, ethical, and reliable social robot deployment. Thirty-five students and staff interacted with the robot in a real public setting and provided structured feedback on safety, privacy, usability, accessibility, ethical transparency, and perceived reliability. The results indicate strong user confidence in physical safety, data protection, and regulatory compliance while revealing persistent challenges related to accessibility and interaction dynamics. These findings show that reliability in public-facing robotic systems extends beyond fault-free operation to include equitable and consistent user experience across contexts. Beyond reporting empirical outcomes, the study contributes in three key ways. First, it demonstrates a reproducible method for operationalising lifecycle governance frameworks in real-world deployments. Second, it provides new empirical insights into how trust, accessibility, and transparency are experienced by end users in public spaces. Third, it delivers a publicly available, open-source GitHub repository containing reusable templates for ARI robot applications developed using the PAL Robotics ARI SDK (v23.12), lowering technical entry barriers and supporting reproducibility. By integrating empirical evaluation with practical system artefacts, this work advances research on reliable intelligent environments and provides actionable guidance for the responsible deployment of social robots in public spaces. Full article
17 pages, 1882 KB  
Article
Metadata-Based Privacy Assessment for Mobile mHealth
by Alejandro Pérez-Fuente, M. Mercedes Martínez-González, Amador Aparicio and Pablo A. Criado-Lozano
Sensors 2026, 26(3), 870; https://doi.org/10.3390/s26030870 - 28 Jan 2026
Viewed by 336
Abstract
The widespread adoption of mobile health applications has increased the volume of sensitive personal and physiological data processed through interconnected devices. Ensuring privacy compliance in this context remains a challenge, as existing app stores and privacy labeling systems rely heavily on self-declared information. App-PI is a data-driven ecosystem designed to offer end users tools they can easily manage, and privacy researchers structured and reliable app metadata. It automates the collection, analysis, and visualization of privacy-related metadata from mobile applications. Heterogeneous data sources are integrated into a unified repository (App-PIMD), enabling the empirical assessment of privacy risks. The data-flow design is critical to ensuring the quality both of the data used to assess privacy impact and of the privacy indicators offered to end users. The approach is demonstrated on a popular mHealth application, showing how careful data-flow design can turn documents and files created for consumption by an operating system into a set of data and tools ready for the true recipients of health apps: people. Full article
(This article belongs to the Special Issue Internet of Things, Big Data and Smart Systems II)
38 pages, 6181 KB  
Article
An AIoT-Based Framework for Automated English-Speaking Assessment: Architecture, Benchmarking, and Reliability Analysis of Open-Source ASR
by Paniti Netinant, Rerkchai Fooprateepsiri, Ajjima Rukhiran and Meennapa Rukhiran
Informatics 2026, 13(2), 19; https://doi.org/10.3390/informatics13020019 - 26 Jan 2026
Viewed by 530
Abstract
The emergence of low-cost edge devices has enabled the integration of automatic speech recognition (ASR) into IoT environments, creating new opportunities for real-time language assessment. However, achieving reliable performance on resource-constrained hardware remains a significant challenge, especially on the Artificial Internet of Things (AIoT). This study presents an AIoT-based framework for automated English-speaking assessment that integrates architecture and system design, ASR benchmarking, and reliability analysis on edge devices. The proposed AIoT-oriented architecture incorporates a lightweight scoring framework capable of analyzing pronunciation, fluency, prosody, and CEFR-aligned speaking proficiency within an automated assessment system. Seven open-source ASR models—four Whisper variants (tiny, base, small, and medium) and three Vosk models—were systematically benchmarked in terms of recognition accuracy, inference latency, and computational efficiency. Experimental results indicate that Whisper-medium deployed on the Raspberry Pi 5 achieved the strongest overall performance, reducing inference latency by 42–48% compared with the Raspberry Pi 4 and attaining the lowest Word Error Rate (WER) of 6.8%. In contrast, smaller models such as Whisper-tiny, with a WER of 26.7%, exhibited two- to threefold higher scoring variability, demonstrating how recognition errors propagate into automated assessment reliability. System-level testing revealed that the Raspberry Pi 5 can sustain near real-time processing with approximately 58% CPU utilization and around 1.2 GB of memory, whereas the Raspberry Pi 4 frequently approaches practical operational limits under comparable workloads. Validation using real learner speech data (approximately 100 sessions) confirmed that the proposed system delivers accurate, portable, and privacy-preserving speaking assessment using low-power edge hardware. 
Overall, this work introduces a practical AIoT-based assessment framework, provides a comprehensive benchmark of open-source ASR models on edge platforms, and offers empirical insights into the trade-offs among recognition accuracy, inference latency, and scoring stability in edge-based ASR deployments. Full article
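The Word Error Rate used to benchmark the Whisper and Vosk models is conventionally computed as the word-level edit distance between reference and hypothesis transcripts, divided by the number of reference words. A minimal sketch of that standard metric (not code from the paper):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + insertions + deletions) / ref words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over word sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i          # deleting all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j          # inserting all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution ("sat" -> "sit") and one deletion ("the") in six words.
print(wer("the cat sat on the mat", "the cat sit on mat"))
```

On this scale, the reported 6.8% WER for Whisper-medium means roughly one word error per fifteen reference words, versus about one in four for Whisper-tiny at 26.7%.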
25 pages, 681 KB  
Systematic Review
A Systematic Review of Topic Modeling Techniques for Electronic Health Records
by Iqra Mehmood, Zoya Zahra, Sarah Iqbal, Ayman Qahmash and Ijaz Hussain
Healthcare 2026, 14(2), 282; https://doi.org/10.3390/healthcare14020282 - 22 Jan 2026
Viewed by 399
Abstract
Background: Electronic Health Records (EHRs) are a rich source of clinical information used for patient monitoring, disease progression analysis, and treatment outcome assessment. However, their large scale, heterogeneity, and temporal characteristics make them difficult to analyze. Topic modeling has emerged as an effective method to extract latent structures, detect disease characteristics, and trace patient trajectories in EHRs. Recent neural and transformer-based approaches such as BERTopic have significantly improved coherence, scalability, and domain adaptability compared to earlier probabilistic models. Methods: This Systematic Literature Review (SLR) examines topic modeling and its variants applied to EHR data over the past decade. We follow the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to identify, screen, and select relevant studies. The reviewed techniques span traditional probabilistic models, neural embedding-based methods, and temporal extensions designed for pathway and sequence modeling in clinical data. Results: The synthesis covers trends in publication patterns, dataset usage, application domains, and methodological contributions. The reviewed literature demonstrates strengths across different modeling families, while also highlighting challenges related to scalability, interpretability, temporal complexity, and privacy when analyzing large-scale EHRs. Conclusions: Topic modeling continues to play a central role in understanding temporal patterns and latent structures in EHRs. This review also outlines future possibilities for integrating topic modeling with Agentic AI and large language models to enhance clinical decision-making. Overall, this SLR provides researchers and practitioners with a consolidated foundation on temporal topic modeling in EHRs and its potential to advance data-driven healthcare. Full article
(This article belongs to the Special Issue AI-Driven Healthcare Insights)
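The latent-structure idea behind such topic models can be illustrated with a toy two-topic mixture-of-unigrams model fitted by EM. This is a hypothetical, stdlib-only sketch on invented snippets, not the BERTopic or LDA implementations the review surveys; the documents, vocabulary, and two-topic setup are made up for illustration.

```python
import math, collections

# Toy mixture-of-unigrams topic model (illustrative only, not BERTopic/LDA):
# two latent "topics" over a handful of synthetic clinical-note snippets.
docs = [
    "diabetes glucose insulin glucose",
    "insulin glucose diabetes",
    "fracture xray cast fracture",
    "cast fracture xray",
]
vocab = sorted({w for d in docs for w in d.split()})
K = 2
# Initialise topic-word distributions asymmetrically so EM can break symmetry.
phi = [{w: 1.0 + 0.1 * ((i + k) % 2) for i, w in enumerate(vocab)} for k in range(K)]
for k in range(K):
    z = sum(phi[k].values())
    phi[k] = {w: v / z for w, v in phi[k].items()}
pi = [0.5, 0.5]

for _ in range(30):  # EM iterations
    # E-step: posterior topic responsibility for each document.
    resp = []
    for d in docs:
        logp = [math.log(pi[k]) + sum(math.log(phi[k][w]) for w in d.split())
                for k in range(K)]
        m = max(logp)
        p = [math.exp(l - m) for l in logp]
        s = sum(p)
        resp.append([x / s for x in p])
    # M-step: re-estimate mixing weights and (smoothed) topic-word probabilities.
    pi = [sum(r[k] for r in resp) / len(docs) for k in range(K)]
    for k in range(K):
        counts = collections.Counter()
        for d, r in zip(docs, resp):
            for w in d.split():
                counts[w] += r[k]
        z = sum(counts.values())
        phi[k] = {w: (counts[w] + 1e-6) / (z + 1e-6 * len(vocab)) for w in vocab}

# The two diabetes snippets should land in one topic, the fracture snippets in the other.
labels = [max(range(K), key=lambda k: r[k]) for r in resp]
print(labels)
```

The same document-to-latent-topic assignment is what the surveyed models perform at scale, with richer representations (word embeddings, transformers) replacing the raw unigram counts.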
22 pages, 401 KB  
Article
Federated Learning for Intrusion Detection Under Class Imbalance: A Multi-Domain Ablation Study with Per-Client SMOTE
by Atike Demirbaş Paray and Murat Aydos
Appl. Sci. 2026, 16(2), 801; https://doi.org/10.3390/app16020801 - 13 Jan 2026
Viewed by 303
Abstract
Federated learning (FL) enables privacy-preserving collaboration for Network Intrusion Detection Systems (NIDSs), but its effectiveness under heterogeneous traffic, severe class imbalance, and domain shift remains insufficiently characterized. We evaluate FL in two settings: (i) single-domain training on CICIDS-2017, InSDN/OVS, and 5G-NIDD with cross-domain testing, and (ii) multi-domain training that learns a unified model across enterprise and Software-Defined Network (SDN) traffic. Using consistent preprocessing and controlled ablations over balancing strategy, loss function, and client sampling, we find that dataset structure (class separability) largely determines single-domain FL gains. On datasets with lower separability, FL with Per-Client Synthetic Minority Over-sampling Technique (SMOTE) substantially improves Macro-F1 over centralized baselines, while well-separated datasets show limited benefit. However, single-domain models transfer poorly, degrading sharply when tested on traffic from a different domain. To mitigate this, we combine multi-domain FL with AutoEncoder pretraining and achieve 77% Macro-F1 across environments, demonstrating that FL can learn domain-invariant representations when trained on diverse traffic sources. Overall, our results indicate that Per-Client SMOTE is the preferred balancing strategy for federated NIDS, and that multi-domain training is often necessary when deployment environments differ from training data. Full article
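The per-client balancing step can be sketched with the classic SMOTE interpolation rule: each synthetic minority sample lies on the segment between a minority point and one of its k nearest minority neighbours, generated locally before federated aggregation. This is a pure-Python illustration on invented data; the helper name `smote_per_client` is hypothetical, and real pipelines typically use imbalanced-learn's `SMOTE`.

```python
import random

def smote_per_client(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by interpolating between a
    minority point and one of its k nearest minority neighbours (SMOTE idea).
    Illustrative sketch only; minority is a list of equal-length tuples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x within the minority class (squared Euclidean).
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Each client balances its own local data before training its model update:
client_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority_count = 10
new = smote_per_client(client_minority, majority_count - len(client_minority))
print(len(client_minority) + len(new))  # balanced minority-class size
```

Because interpolation stays inside the convex hull of each client's own minority points, no raw samples cross client boundaries, which is what keeps the balancing step compatible with FL's privacy model.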
25 pages, 540 KB  
Article
Pricing Incentive Mechanisms for Medical Data Sharing in the Internet of Things: A Three-Party Stackelberg Game Approach
by Dexin Zhu, Zhiqiang Zhou, Huanjie Zhang, Yang Chen, Yuanbo Li and Jun Zheng
Sensors 2026, 26(2), 488; https://doi.org/10.3390/s26020488 - 12 Jan 2026
Viewed by 372
Abstract
In the context of the rapid growth of the Internet of Things and mobile health services, sensors and smart wearable devices are continuously collecting and uploading dynamic health data. Together with the long-term accumulated electronic medical records and multi-source heterogeneous clinical data from healthcare institutions, these data form the cornerstone of intelligent healthcare. For medical data sharing in particular, previous studies have mainly focused on privacy protection and secure data transmission, while relatively few have addressed the issue of incentive mechanisms. However, relying solely on technical means is insufficient to motivate individuals to share their data. To address this challenge, this paper proposes a three-party Stackelberg-game-based incentive mechanism for medical data sharing. The mechanism captures the hierarchical interactions among the intermediator, electronic device users, and data consumers. In this framework, the intermediator acts as the leader, setting the transaction fee; electronic device users serve as the first-level followers, determining the data price; and data consumers function as the second-level followers, deciding on the purchase volume. A social network externality is incorporated into the model to reflect the diffusion effect of data demand, and the optimal strategies and system equilibrium are derived through backward induction. Theoretical analysis and numerical experiments demonstrate that the proposed mechanism effectively enhances users’ willingness to share data and improves the overall system utility, achieving a balanced benefit among the cloud platform, electronic device users, and data consumers. This study not only enriches the game-theoretic modeling approaches to medical data sharing but also provides practical insights for designing incentive mechanisms in IoT-based healthcare systems. Full article
(This article belongs to the Section Biomedical Sensors)
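The backward-induction logic can be sketched on a stylized linear-quadratic version of such a three-level game. The functional forms below are hypothetical (the paper's actual model also includes the social-network externality, omitted here): the consumer's best response is solved first, substituted into the user's pricing problem, and that in turn into the intermediator's fee choice.

```python
# Stylized backward induction for a three-level Stackelberg game
# (hypothetical linear-quadratic payoffs; v = consumer valuation, c = cost slope):
#   Consumer:      max_q (v - p)*q - c*q**2/2   ->  q*(p) = (v - p)/c
#   Device user:   max_p (p - f)*q*(p)          ->  p*(f) = (v + f)/2
#   Intermediator: max_f f*q*(p*(f))            ->  f*    = v/2

def equilibrium(v, c):
    f = v / 2          # leader: intermediator's transaction fee
    p = (v + f) / 2    # first-level follower: user's data price
    q = (v - p) / c    # second-level follower: consumer's purchase volume
    return f, p, q

f, p, q = equilibrium(v=8.0, c=1.0)
print(f, p, q)  # 4.0 6.0 2.0
```

Each closed form comes from the first-order condition of the stage below it, which is exactly the backward-induction order the abstract describes (volume, then price, then fee).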