Article

Scalable IoT-Based Architecture for Continuous Monitoring of Patients at Home: Design and Technical Validation

Department of Computer Systems and Technologies, Technical University Gabrovo, 5300 Gabrovo, Bulgaria
Computers 2026, 15(3), 144; https://doi.org/10.3390/computers15030144
Submission received: 14 January 2026 / Revised: 16 February 2026 / Accepted: 18 February 2026 / Published: 1 March 2026
(This article belongs to the Section Internet of Things (IoT) and Industrial IoT)

Abstract

This article presents a scalable IoT-based architecture for continuous and passive monitoring of human behavior in home environments, designed as a technical foundation for future dementia risk assessment systems. The architecture addresses three fundamental challenges: achieving room-level spatial localization without privacy-invasive methods, balancing temporal resolution with bandwidth efficiency in continuous data streams, and enabling multi-institutional model development under GDPR constraints. The system integrates (1) wearable BLE sensors with infrared room-level localization; (2) edge computing gateways with local preprocessing and machine learning; (3) a three-channel data architecture that simultaneously achieves full 1 s temporal resolution for machine learning training, low-latency real-time visualization, and 41.2% network bandwidth reduction; and (4) a federated learning framework enabling collaborative model development without data sharing between institutions. Technical validation in two apartments (three participants, 7 days) demonstrated: 97.6% room-level localization accuracy using infrared beacons; less than 7 s end-to-end latency for 99.5% of critical events; and 98.5% deduplication accuracy in multi-gateway configurations. Federated learning simulation demonstrates algorithmic convergence (84.3% IID, 79.8% non-IID) and workflow feasibility, establishing a foundation for future production deployment. Cost analysis shows approximately €490 for initial implementation and approximately €55 monthly operation, representing substantially lower costs than existing research systems. The work establishes architectural and technical feasibility, as well as system-level economic viability, of continuous home monitoring for behavioral analysis within the evaluated residential scenarios. 
Clinical validation of diagnostic capabilities through longitudinal studies with validated cognitive assessments and patients with mild cognitive impairment remains to be studied in future work.

1. Introduction

1.1. Context and Problem

Dementia is a progressive neurological syndrome characterized by cognitive decline, memory loss, spatial disorientation, and behavioral changes that seriously impair quality of life. The World Health Organization [1] projects 152 million people with dementia by 2050. Alzheimer’s disease accounts for 60–70% of cases, followed by vascular dementia (15–20%), Lewy body dementia (10–15%), and frontotemporal dementia (5–10%) [2].
Disease progression advances from mild cognitive impairment (MCI) through mild, moderate, and severe dementia. Critically, MCI represents a transitional phase with maximum therapeutic efficacy before irreversible neurodegeneration occurs. Recent studies demonstrate that tau protein fibrillation—a characteristic pathological process in Alzheimer’s disease—proceeds through mandatory intermediate clustering stages that are potentially reversible through targeted intervention [3]. This discovery establishes clear neurobiological rationale for early detection: a therapeutic window exists before irreversible fibrils form, but only if cognitive decline is identified during the preclinical phase.
Current diagnostic approaches face limitations hindering effective early detection at the population level. Traditional neuropsychological assessments such as the Mini-Mental State Examination (MMSE) and Montreal Cognitive Assessment (MoCA) rely on episodic clinical testing with subjectivity and sensitivity to educational and cultural factors [4]. Although biomarker-based methods—including amyloid positron emission tomography (PET), cerebrospinal fluid (CSF) analysis, and structural magnetic resonance imaging (MRI)—provide objective measurements, their invasiveness, prohibitive cost, and limited availability preclude mass screening [5]. In most cases, dementia is diagnosed only after significant irreversible neurodegeneration has occurred, when therapeutic interventions demonstrate limited efficacy [6].
Research shows that behavioral and physiological changes occur months to years before clinical symptoms become apparent. Minor changes in gait patterns, reduced physical activity, deficits in spatial navigation, and circadian rhythm disturbances are measurable digital biomarkers of early cognitive decline [7]. Wearable sensor technologies enable continuous, objective, and unobtrusive detection of these behavioral patterns in natural home environments.
Despite significant technological advances, several critical factors hinder the translation of sensor-based monitoring into clinical practice. Existing systems operate in controlled laboratory environments rather than real home settings, limiting validity and scalability [8]. Most approaches require active patient participation through periodic cognitive testing or device interaction, creating adherence challenges [9]. Spatial localization methods based on Wi-Fi or Bluetooth Low Energy (BLE) triangulation achieve only zone-level accuracy (typically 3–5 m), insufficient for detecting room-specific disorientation patterns. Continuous sensor data transmission generates significant network traffic and cloud storage costs, undermining the economic viability of large-scale deployment. Finally, regulatory restrictions under the General Data Protection Regulation (GDPR) prohibit centralized aggregation of personal health data across institutions, impeding collaborative model development from diverse populations.
Modern architectural solutions in edge computing and federated learning offer potential solutions to these challenges. Edge computing enables local data preprocessing and machine learning on resource-limited devices, reducing latency, bandwidth requirements, and privacy risks [10]. Federated learning frameworks enable collaborative model training across multiple institutions without sharing raw data, enabling GDPR-compliant multi-center research [11,12]. However, these technologies have not yet been integrated into a comprehensive architecture specifically designed for long-term home behavioral monitoring.
This article presents the design and technical validation of a scalable Internet of Things (IoT) architecture for continuous, passive behavioral monitoring in home environments. The architecture integrates wearable sensors, room-level infrared localization, dual peripheral gateways with local preprocessing, intelligent multi-threaded data compression, and a federated learning framework. This development focuses exclusively on architectural design and validation of technical feasibility. Clinical validation of diagnostic or prognostic capabilities requires longitudinal studies with validated cognitive assessments and remains future work beyond this article’s scope.

1.2. Research Questions

This work addresses three fundamental research questions:
RQ1: Can room-level spatial localization be achieved in home environments with greater than 95% accuracy using privacy-preserving, infrastructure-light methods?
RF-based methods such as Wi-Fi RTT and BLE RSSI achieve only zone-level accuracy (typically 3–5 m), insufficient for detecting room-specific disorientation patterns characteristic of early dementia. High-precision methods such as Ultra-Wideband require expensive infrastructure (€150–300 per anchor point) and extensive calibration, hindering large-scale deployment. Camera-based systems raise severe privacy concerns unacceptable in home environments. No validated solution exists that simultaneously achieves room-level precision, low infrastructure cost, and privacy preservation for unobtrusive long-term home deployment.
RQ2: Is it architecturally feasible to achieve simultaneous low-latency event delivery (under 10 s), full temporal resolution (1 s granularity), and significant bandwidth reduction (over 40%) in continuous home monitoring systems?
Current systems make binary architectural choices. Either they transmit raw sensor data, achieving full temporal resolution but incurring high bandwidth costs and cloud storage expenses that become prohibitive at scale (typically €80–150 per patient monthly for continuous transmission), or they aggregate and compress data locally, reducing bandwidth requirements but irreversibly losing the temporal information needed for deep learning models that capture long-term behavioral dependencies. No existing architecture has demonstrated that this fundamental trade-off can be avoided through intelligent stream separation and selective compression.
RQ3: Can federated learning enable privacy-preserving multi-institutional model development for continuous behavioral monitoring while complying with GDPR data localization requirements?
Dementia diagnosis requires training on diverse populations to avoid demographic bias and achieve generalizability. However, GDPR Article 45 prohibits cross-border transfer of raw health data without adequacy decisions or standard contractual clauses that are difficult to establish for research purposes. Federated learning has been extensively validated for medical imaging and electronic health record data, but its applicability to high-frequency sensor streams with heterogeneous temporal patterns across institutions remains undemonstrated. The challenge is whether federated averaging can achieve model convergence on behavioral time-series data with acceptable communication overheads and without compromising local privacy guarantees.

1.3. Scientific Contributions

This work makes three primary scientific contributions to the field of ambient assisted living and continuous health monitoring:
Contribution 1: A novel three-stream data architecture that solves the temporal resolution–bandwidth efficiency paradox.
We present and validate a data pipeline architecture that simultaneously achieves three previously conflicting requirements: (1) full 1 s temporal resolution for training deep recurrent neural networks on behavioral sequences; (2) low-latency real-time visualization (under 50 milliseconds query response) for clinical dashboards; and (3) 41.2% network bandwidth reduction compared to raw data transmission. The architecture employs intelligent stream separation, where Stream 1 provides lossy statistical aggregates optimized for visualization, Stream 2 uses differential encoding plus gzip compression to preserve complete lossless event sequences, and Stream 3 delivers enriched critical alerts with sub-7 s end-to-end latency. Previous systems faced a fundamental trade-off between preserving full temporal information (required for machine learning) and reducing transmission costs (required for economic viability). Our empirical validation demonstrates that this trade-off can be avoided through selective compression tailored to specific use cases, with no single stream attempting to serve all purposes.
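To make the Stream 2 approach concrete, the sketch below shows how differential timestamp encoding followed by gzip can losslessly shrink a 1 Hz event stream. The record layout and field sizes are assumptions chosen for illustration, not the system's actual wire format.

```python
import gzip
import random
import struct

# Simulated 1 Hz event stream: (unix_timestamp, room_id, activity_level)
random.seed(0)
t0 = 1_700_000_000
events = [(t0 + i, random.randint(1, 5), random.randint(0, 100))
          for i in range(3600)]  # one hour of 1 s samples

# Raw encoding: 4-byte timestamp + 1-byte room + 1-byte activity per sample
raw = b"".join(struct.pack("<IBB", t, r, a) for t, r, a in events)

# Differential encoding: full first record, then 1-byte timestamp deltas
first = struct.pack("<IBB", *events[0])
deltas = b"".join(
    struct.pack("<BBB", t - prev_t, r, a)
    for (prev_t, _, _), (t, r, a) in zip(events, events[1:])
)
compressed = gzip.compress(first + deltas)

reduction = 100 * (1 - len(compressed) / len(raw))
print(f"raw: {len(raw)} B, diff+gzip: {len(compressed)} B, "
      f"reduction: {reduction:.1f}%")
```

Because the deltas are fully invertible, the original 1 s event sequence can be reconstructed exactly, which is what preserves the temporal resolution needed for model training.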
Contribution 2: Quantitative demonstration that infrared room-level localization significantly outperforms RF-based methods for behavioral monitoring applications.
We provide the first direct empirical comparison showing that infrared beacon-based room-level localization achieves 97.6% accuracy versus approximately 85% typical accuracy for BLE-only approaches in identical home environments. More critically, we demonstrate through geometric modeling, Monte Carlo simulation, and real-world validation that infrared technology provides a qualitative advantage. The physical inability of infrared light to penetrate walls and doors eliminates the ambiguity at room boundaries that affects all RF-based methods, which is precisely where spatial disorientation patterns manifest in early dementia. This makes infrared localization uniquely suited for detecting room-specific behavioral anomalies despite comparable aggregate accuracy to best-case BLE fingerprinting scenarios that require extensive calibration and dense beacon deployment.
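A toy Monte Carlo makes the boundary argument concrete: near a shared wall, RF "strongest beacon" classification is noisy, while an infrared beacon in the adjacent room is physically undetectable through the wall. The path-loss exponent, shadowing noise, and wall attenuation below are illustrative assumptions, not measurements from this study.

```python
import math
import random

random.seed(42)

def ble_rssi(dist_m, wall_db=0.0):
    # Log-distance path-loss model with Gaussian shadowing (assumed sigma = 3 dB)
    return -40 - 20 * math.log10(dist_m) - wall_db + random.gauss(0, 3)

TRIALS = 10_000
ble_errors = 0
for _ in range(TRIALS):
    # Person stands in room A, 0.5 m from the shared wall.
    rssi_a = ble_rssi(4.0)             # beacon in the same room
    rssi_b = ble_rssi(4.5, wall_db=5)  # beacon in adjacent room, one wall away
    if rssi_b > rssi_a:                # strongest-beacon rule picks wrong room
        ble_errors += 1

# Infrared: the adjacent room's beacon cannot be received through the wall,
# so the strongest-beacon rule cannot err at this position.
ir_errors = 0

print(f"BLE boundary error rate: {ble_errors / TRIALS:.1%}, "
      f"IR: {ir_errors / TRIALS:.1%}")
```

Even with only 5 dB of assumed wall attenuation, a non-trivial fraction of BLE decisions flip at the boundary, while the infrared error contribution at this position is structurally zero.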
Contribution 3: We provide a simulation-based evaluation of a federated learning orchestration pipeline for continuous behavioral sensor streams, demonstrating convergence and collaborative performance under controlled IID and non-IID data distributions. The goal is not deployment validation but the assessment of architectural feasibility and methodological constraints for future real-world multi-institutional studies.
Although validated in a simulated environment rather than production deployment, this establishes that the communication pattern is feasible and that behavioral sensor data does not exhibit the same non-IID challenges that plague federated learning on medical images.
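For reference, the core of federated averaging is a dataset-size-weighted mean of locally trained parameters. The sketch below shows this aggregation step on flat weight vectors; the client weights and dataset sizes are invented for illustration and do not correspond to the paper's models.

```python
def fed_avg(client_weights, client_sizes):
    """Per-parameter weighted average across clients (flat weight vectors)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Three simulated institutions, each holding a locally trained 4-parameter model
clients = [[0.2, -1.0, 0.5, 3.0],
           [0.4, -0.8, 0.1, 2.0],
           [0.0, -1.2, 0.9, 1.0]]
sizes = [1200, 800, 400]  # local dataset sizes drive the weighting

global_model = fed_avg(clients, sizes)
print(global_model)
```

Only these aggregated parameters cross institutional boundaries in each round; the raw behavioral records that produced them never leave the local site, which is what makes the pattern GDPR-compatible.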

1.4. Technical Objectives and Validation Scope

The main objective is to design and technically validate a cost-effective architecture for continuous home monitoring that preserves privacy and serves as a foundation for future dementia risk assessment systems. This is a technology assessment study, not a clinical trial. We explicitly validate technical feasibility and architectural performance, while clinical validation of diagnostic or prognostic capabilities remains to be studied in future work, requiring longitudinal studies with validated cognitive assessments and patient cohorts including individuals with mild cognitive impairment.
Our specific technical objectives include the following:
  • Designing sensor infrastructure providing room-level localization accuracy above 95% using infrared technology.
  • Developing a three-stream data architecture with end-to-end latency below 10 s for critical events, bandwidth efficiency (over 70% reduction), and preservation of full temporal resolution (1 s granularity).
  • Implementing edge computing with local machine learning on specialized hardware for real-time anomaly detection without cloud dependency.
  • Analyzing architectural feasibility of federated learning for joint model development across multiple institutions.
  • Validating technical performance through real-world residential deployment and analyzing economic viability through cost modeling.
Technical validation involves three healthy volunteers continuously monitored for seven days in two residential apartments with different layouts and sizes. All performance metrics (localization accuracy, latency, and bandwidth reduction) are technical measurements of system behavior, not clinical outcomes or diagnostic accuracy.
The remainder of the paper is organized as follows. Section 2 reviews related work in sensor-based health monitoring, localization technologies, edge computing architectures, and federated learning. Section 3 presents the system architecture, including hardware components, software, and the three-way data channel. Section 4 describes the federated learning design. Section 5 describes technical validation, the experimental setup, and the results. Section 6 discusses architectural advantages, current limitations, and the path to future clinical implementations. Section 7 concludes with contributions and directions for future research.

2. Related Work

2.1. Clinical Context and Sensor-Based Approaches

Sensor monitoring for early diagnosis of dementia has been widely documented. Addae et al. [13] analyze IoT systems, wearable sensors, and ML approaches, with the most effective ML models achieving 96% accuracy but with serious limitations: small samples (average 30–37 participants), generalization problems, and ethical challenges. Thaliath and Pillai [14] demonstrate that non-cognitive sleep disturbances, circadian rhythm disturbances, and motor changes contribute significantly to quality-of-life deterioration in Alzheimer’s disease, with behavioral and psychological symptoms affecting up to 90% of dementia patients.
Ghayvat and Gope [15] evaluated the SAMEDR system in 12 dementia patient homes and 20 healthy adult homes over 43 weeks with heterogeneous sensor networks, achieving specificity of 0.93, F1-score = 0.93, and Matthews correlation coefficient = 0.88. Dementia patients showed significantly greater variability in daily activity patterns (especially sleep, agitation, cooking, and hygiene). Deters et al. [16] highlighted methodological fragmentation in agitation prediction systems, with personalized models using multimodal sensor networks achieving AUC of 0.90 for motor agitation and 87–91% accuracy for verbal agitation.
These findings underscore the need for a stable, validated architecture rather than isolated proof-of-concept systems.

2.2. Sensor Modalities and Architectures

Anikwe et al. [17] systematically reviewed eighty-five studies (2014–2021), demonstrating that heterogeneous sensor systems combining ECG, PPG, accelerometers, SpO2, and temperature sensors are most effective, achieving over 99% accuracy in activity recognition and 91–99.9% accuracy in detecting cardiac abnormalities based on ECG. Challenges such as high energy consumption, privacy and security issues, and limited model interpretability remain.
Gabrielli et al. [18] developed the Pulse AI system, combining wearable and ambient sensors for personalized real-time health anomaly detection. Using a finely tuned UniTS model, they achieved mean F1 = 0.821 ± 0.049—an approximately 22% improvement over the second-best method. In a real-world application with six adult patients over 3 months, 93.75% of 32 detected anomalies were confirmed as clinically significant.
Assaad et al. [19] presented the first multimodal biosensor device integrating 14 sensors for simultaneous real-time monitoring of eighteen health parameters. Through sensor fusion with ESP32 microcontrollers and edge computing (Jetson Xavier NX), they achieved over 90% accuracy for some parameters and over 85% for all, with total power consumption ~3.8 W. Teoh et al. [20] analyzed 69 studies, demonstrating that multimodal models outperform unimodal approaches; in Alzheimer’s detection, accuracy and AUC exceed 90% when combining imaging and clinical data.
Johnson [21] demonstrated that approaches using only environmental sensors achieve 89% sensitivity for fall detection without wearable devices. The system identifies falls by analyzing body height changes within 10 frames (~0.5 s), achieving 91% sensitivity for forward/backward falls, 92% for side falls, and 75% for collapse falls. Bijlani et al. [22] proposed Graph Barlow Twins for detecting adverse events in dementia care, achieving 81% recall and 88% generalizability across three independent cohorts with low computational costs.

2.3. Indoor Localization Technologies

Obeidat et al. [23] present a comprehensive comparison of spatial localization technologies critical for detecting disorientation in early dementia. Ultra-Wideband (UWB) achieves the highest accuracy of 0.01–0.2 m. Visible Light Communication (VLC) provides a similar accuracy of ~0.1 m but requires line of sight. Ultrasonic systems achieve 0.01–1 m but are sensitive to reflected signals. Wi-Fi RSS fingerprinting reaches 1–5 m, while Bluetooth/BLE solutions reach 2–5 m.
Leitch et al. [24] provide a critical comparison of localization technologies: UWB systems achieve 0.03–0.30 m accuracy, often below 10 cm. Wi-Fi accuracy varies significantly—~2.4 m for RSS-based analysis, 0.17–7.6 m for RSS fingerprinting, up to 0.09 m for CSI fingerprinting (requires expensive calibration), and 0.5–2.1 m for Wi-Fi RTT (IEEE 802.11 FTM, sensitive to NLoS). BLE methods show values of 0.6–4.9 m for RSS and 0.1–3.7 m for RSS fingerprinting, with newer BLE AoA techniques promising 0.7 m. Machine learning reduces errors by 30–95% across all technologies: Wi-Fi CSI from 4.8 m to 2.39 m and BLE RSS from 16.6 m to 5.5 m.
Casha [25] confirms these conclusions in a comprehensive review. Hybrid systems combining UWB, Wi-Fi/BLE, and IMU achieve the best balance between accuracy, reliability, and cost, which is particularly important for applications requiring sub-1 m accuracy with low latency.

Room-Level Localization

Biehl et al. [26] demonstrate that Wi-Fi RTT significantly outperforms BLE RSSI for room-level localization: with 11 RTT beacons, they achieve RMSE = 1.275 m and room-level accuracy and precision of 97.99% and F1 = 93.86%. With 22 BLE beacons, results are weaker (F1 = 89.17%, RMSE ≈ 1.53 m). At equal beacon density (11), RTT outperforms BLE by 11.59%. García-Paterna et al. [27] evaluate BLE room-level localization with kNN classification (k = 5) in two real scenarios. In a residential environment (10 rooms, 160 m2, and six beacons), they achieve 97.6% accuracy (laptop) and 87.7% (Raspberry Pi) with all beacons; even with three beacons, accuracy remains at 85–88% (laptop) and 74–76% (Raspberry Pi). In a university environment (16 zones, 970 m2, up to 10 beacons), they achieve 92–93% accuracy (laptop) and 88% (Raspberry Pi). Errors concentrate between adjacent rooms.
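The kNN fingerprinting approach described above can be sketched on synthetic data as follows. Beacon counts, noise levels, and room profiles here are illustrative assumptions, not the cited datasets.

```python
import random
from collections import Counter

random.seed(1)
ROOMS = ["kitchen", "bedroom", "living", "bath"]
BEACONS = 6

# Each room has a characteristic mean RSSI per beacon (dBm);
# observed samples add Gaussian shadowing noise.
profile = {r: [random.uniform(-90, -50) for _ in range(BEACONS)] for r in ROOMS}

def sample(room):
    return [m + random.gauss(0, 4) for m in profile[room]]

# Offline phase: collect labeled fingerprints per room
train = [(sample(r), r) for r in ROOMS for _ in range(50)]

def knn_predict(x, k=5):
    # Majority vote among the k nearest fingerprints (squared Euclidean)
    nearest = sorted(train, key=lambda s: sum((a - b) ** 2
                                              for a, b in zip(s[0], x)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

correct = sum(knn_predict(sample(r)) == r for r in ROOMS for _ in range(25))
print(f"accuracy: {correct / 100:.0%}")
```

As in the cited studies, misclassifications in such a scheme concentrate between rooms whose fingerprints overlap, which is why accuracy degrades at boundaries and with fewer beacons.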
Chen et al. [28] propose solar-powered BLE tags. Using the Dempster–Shafer + fingerprinting (DSFP) algorithm, they achieve over 99% localization accuracy in a real-world environment (seven rooms, ~2000 m2)—one anchor and 10 tags per room. The analysis shows that DSFP corrects errors in “unclear” border areas where DS alone fails.
Karabey Aksakalli and Bayındır [29] introduce a contextual feature—room-to-room transition time—integrated with Wi-Fi fingerprinting for room-level prediction. Testing in three residential environments (130–195 m2, ~3000 samples each) shows that adding transition time leads to statistically significant improvements: with a Wide Neural Network, accuracy increases from ~82% to 94.7%, and with Ensemble Bagged Trees, it reaches 93.1%, with +10–12% improvements in the most favorable scenarios. Transition time is a strong feature significantly improving room-level accuracy, especially for models capturing nonlinear spatiotemporal dependencies.
Tegou et al. [30] present a low-cost BLE system for medical applications with extremely easy installation requiring no floor plans or technical skills. In a residential environment (80.4 m2, five rooms) with five BLE beacons, they achieve 93.75% room identification accuracy (12 errors between adjacent rooms). Increasing the number of beacons from five to eight improves accuracy to 95.31% (nine errors). Despite NLOS conditions, human body attenuation, and Wi-Fi/Bluetooth interference, the system maintains over 93% accuracy.
Localization using infrared light remains understudied in the latest health monitoring literature, despite its potential for perfect room differentiation due to the inability of infrared signals to penetrate walls and doors.

2.4. Edge Computing for Health Monitoring

Alzu’Bi et al. [31] provide a comprehensive review of 1609 publications (2015–2022), showing that edge computing achieves significantly lower latency (milliseconds versus seconds/minutes in cloud computing) and meets critical requirements for latency, throughput, and privacy for real-time health monitoring. Edge computing introduces new risks related to limited device resources and physical attacks.
Islam et al. [32] quantify latency in a hybrid fog–edge architecture for IoMT systems. Experiments with 250,000 IoMT records (2020–2023) show up to 70% latency reduction (20–50 ms hybrid vs. ~100 ms cloud-only), 30% lower power consumption (~50 mW vs. 100 mW cloud), 60% traffic savings (2 MB/s vs. 10 MB/s to cloud), and 84% faster threat detection with mean time ~30 ms vs. ~100 ms for cloud models.
Rancea et al. [33] conducted a systematic review of 72 publications (2020–2024), classifying applications into privacy and security (39%), AI-based optimization (35%), and edge offloading and load balancing (26%). Edge computing significantly reduces latency and network load while improving security through local processing—some solutions demonstrate response times under 2 s (e.g., fall detection) and up to 99% accuracy for AI-based diagnostics. Critical challenges remain, such as device heterogeneity, resource constraints, lack of standards, and the need for effective coordination.
Although edge computing demonstrates clear advantages for latency-sensitive applications, most health monitoring systems still transmit raw sensor data to the cloud, missing opportunities for intelligent compression and local inference. The integration of lightweight ML models on edge devices for real-time anomaly detection remains under-explored in the context of dementia monitoring.

2.5. Federated Learning for Privacy-Preserving Healthcare

Federated learning (FL) is emerging as a promising technique for collaborative model training without centralized processing of sensitive health data. Ali et al. [34] systematically reviewed 100 publications (2020–2024), showing rapid growth from 722 publications (2020) to 5975 (2023). FL use in healthcare systems remains low (~2% vs. 47% for computer science). Quantitative results show 94.36% accuracy for Independent and Identically Distributed (IID) data and 78.4% for non-IID data in cervical cancer diagnosis. The AUROC ranges between 0.679 and 0.818 for acute kidney injury (AKI) and 0.659–0.861 for sepsis in scenarios with electronic health records from multiple hospitals. Risks remain, including data leakage, model theft, and bias; technical limitations also persist, such as non-IID data, communication costs, and computational complexity.
Pati et al. [35] and Dhade and Shirke [36] present in-depth quantitative comparisons of privacy-preserving aggregation algorithms. Standard FedAvg achieves 82.74% server accuracy but significantly lower overall client accuracy (71.22%). FedMA improves local accuracy to 93.80% but performs poorly globally, with only 60.77% accuracy. The best compromise between local and global accuracy is demonstrated by a homomorphic encryption approach—with 97.6% local accuracy, 94.3% server accuracy—albeit at higher computational costs.
Most FL research focuses on medical images and electronic health record data. FL applications for continuous behavioral monitoring with heterogeneous sensor streams remain unexplored. Real-world implementation studies across institutions are scarce, with most developments limited to simulations.

2.6. Existing Home Monitoring Systems

Several research systems specialize in dementia detection through comprehensive home monitoring. Each has different architectural solutions and limitations.
The ORCATECH platform [37] transforms traditional clinical assessment of cognitive decline through continuous, unobtrusive home monitoring. Through sensor networks and software tools, ORCATECH collects long-term data on key daily behaviors—walking, sleeping, computer use, medication intake, and social engagement—related to cognitive functioning. The main advantage is the transition from episodic clinical “snapshots” to rich longitudinal profiles of individual changes, enabling earlier detection of mild cognitive impairment. Gothard et al. [38] studied participant self-installation of ORCATECH. Analysis of seven participants (average age 73.9 ± 7.1 years; five with MCI) demonstrated feasibility: all successfully self-installed devices with average installation time 20.0 ± 13.7 min for MCI and 23.7 ± 10.9 min for cognitively healthy individuals, showing that MCI is not a significant obstacle. However, architectural details, accuracy, and cost structure are not fully disclosed in the available literature. Studies using ORCATECH data [39] show that algorithms for predicting transition to mild cognitive impairment can achieve approximately 84% classification accuracy using LSTM models trained with long-term daily activity data.
SENDA [8] describes a comprehensive protocol for a cohort study with 240 participants followed up to four times at 8-month intervals over three years. The system combines multimodal assessments covering cognitive (MoCA and CERAD-Plus), motor (gait, balance, and fine motor skills), sensory (vibratory sensitivity, vision, and hearing), and neurophysiological (EEG) indicators. Despite this complexity, SENDA requires active participant involvement in periodic tests and cognitive assessments conducted via tablets, creating significant adherence challenges particularly pronounced in populations with cognitive impairment. As a study protocol, no published quantitative results currently evaluate system effectiveness.
Critical gaps in existing systems are as follows: (1) most require active participation (cognitive tests and device interaction); (2) episodic data collection misses subtle daily behavioral changes; (3) no system implements federated learning for privacy-preserving multi-institutional model development; and (4) exact costs remain undisclosed, hindering economic feasibility assessments.

2.7. Gap Analysis

Table 1 positions the present work relative to existing systems in terms of critical architectural and validation dimensions.
This work advances the state of the art through the following:
  • Three-stream data architecture—a new pipeline system simultaneously meets requirements for real-time visualization (1 min aggregates), machine learning model training (lossless event logs with 1 s resolution), and low latency for critical events (under 7 s end-to-end). The architecture achieves 41.2% bandwidth reduction while preserving complete temporal information.
  • Edge ML on standard hardware—local inference on ESP32-S3 microcontrollers with latency under 50 ms is demonstrated, enabling critical event detection (falls) without cloud infrastructure dependency. This provides a real-time response and represents a significant advantage over existing systems.
  • Federated training architecture—a monitoring system designed for inter-institutional collaboration while strictly maintaining confidentiality is presented. The approach addresses GDPR-imposed restrictions. Although validated only in a simulated environment, it establishes conceptual and technical foundations for future real-world implementation.
  • Economic feasibility—estimated costs of €490 for initial implementation and €55 monthly operation are substantially lower than existing research systems, making the solution suitable for large-scale implementation in real clinical and social settings.
Unlike ORCATECH and SENDA, which have demonstrated clinical value through long-term studies with large cohorts, the present work focuses on technical and architectural foundations. The main contribution is architectural design and the demonstration of technical feasibility, rather than validation of diagnostic or prognostic capabilities.

3. System Architecture

3.1. Architectural Requirements

The design of a continuous home monitoring system requires balancing technical, economic, and regulatory constraints. Six categories of requirements have been defined to guide architectural decisions.

3.1.1. Functional Requirements

The system must provide continuous monitoring without interruptions throughout the 24 h cycle, with latency from detection to notification in critical events (falls) not exceeding 10 s. Spatial localization must achieve room-level accuracy (over 95%) for reliable detection of disorientation patterns. Biomarker extraction should cover at least motor activity, spatial behavior, and circadian rhythms.

3.1.2. Technical Performance Requirements

Wearable devices must provide a minimum of three years of autonomy with infrared scanning enabled, eliminating the need for periodic charging or battery replacement. Data compression at the peripheral gateway must achieve at least 70% reduction without the loss of clinically relevant information, and in multi-gateway configurations, deduplication accuracy must exceed 98%.
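The multi-gateway deduplication requirement can be met with an idempotent key on each received advertisement. The sketch below assumes events carry a device identifier and a monotonically increasing sequence number; the field names are hypothetical, not the system's actual schema.

```python
# Two gateways in range may both report the same BLE advertisement; keying on
# (device_id, seq) lets the backend drop the duplicate idempotently.
def deduplicate(events):
    seen = set()
    unique = []
    for e in events:
        key = (e["device_id"], e["seq"])
        if key not in seen:
            seen.add(key)
            unique.append(e)  # first report wins; later copies are dropped
    return unique

packets = [
    {"device_id": "badge-01", "seq": 17, "gateway": "gw-a"},
    {"device_id": "badge-01", "seq": 17, "gateway": "gw-b"},  # duplicate
    {"device_id": "badge-01", "seq": 18, "gateway": "gw-a"},
]
unique = deduplicate(packets)
```

In practice, residual duplicates arise only when the identifying fields themselves are corrupted or missing, which is one plausible source of the small gap between such a scheme and 100% deduplication accuracy.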

3.1.3. Confidentiality and Security Requirements

All data must be encrypted using Transport Layer Security (TLS) 1.3. Federated learning must ensure that raw patient data does not leave institutional boundaries.

3.1.4. Regulatory Requirements

The system must comply with GDPR requirements for processing personal health data, including rights of access, erasure, and portability. Participant consent must be informed, specific, and freely given. Detailed audit logs must be maintained for all data operations. For future certification, the system must be compatible with the European Union’s Medical Device Regulation (MDR).

3.1.5. Economic Viability Requirements

Initial installation costs should not exceed €500 per household, including all hardware components and configuration. Monthly operating costs (cloud infrastructure, storage, and computing) must remain below €60 per patient to ensure economic sustainability at scale.

3.1.6. Clinical Applicability Requirements

The system must function completely passively after initial installation, without requiring active patient participation for data collection. Installation must be possible by non-professional users or family members in under 20 min. All alarm events must contain rich context (pre-event, during event, and post-event) to aid clinical interpretation and decision-making.

3.2. Used Hardware

The architecture uses hybrid sensor infrastructure combining wearable devices (smart badges or Asset Tag 2), stationary infrared beacons (Beam Mini 2, Kontakt.io, New York, NY, USA), and peripheral computing gateways (ESP32-S3-BOX-3, Espressif Systems, Shanghai, China). The necessary hardware is shown in Figure 1.

3.2.1. Wearable Sensor Devices

Industrial wearable devices from Kontakt.io (USA) have been selected and validated for patient tracking in hospital environments. Devices are available in two form factors: a smart badge (65 × 107 × 7.5 mm, 28–32 g) for clothing clips and Asset Tag 2 (49 × 49 × 15 mm, 40 g) for wrists. Both provide IP65 protection against dust and water, operating at 0–55 °C and 10–90% relative humidity.
The built-in lithium-ion battery provides 3–4 years of autonomy depending on the operating mode. With an infrared receiver and BLE transmission active at 0.1 s (telemetry) and 1 s (location), typical autonomy is ~3 years. The battery is non-replaceable, eliminating the risk of incorrect replacement and preserving housing hermeticity.
Sensor equipment includes a three-axis accelerometer for motor activity and gait analysis, a digital temperature sensor with ±0.25 °C accuracy for monitoring temperature and circadian rhythms, an infrared photodiode, and two programmable buttons (blue and red) for emergency alarms. A Bluetooth Low Energy 5.0 module provides communication up to 70 m in open space and 15–20 m in typical home environments.
Room-level localization is achieved through Beam Mini 2 infrared beacons emitting modulated signals containing unique room identifiers every 1 s. The wearable device decodes the identifier and includes it in the location packet transmitted over the BLE channel. A critical advantage is that infrared light does not penetrate walls and doors, ensuring unambiguous room identification.

3.2.2. Edge Computing Gateways

The edge computing gateway is implemented using Espressif Systems’ ESP32-S3-BOX-3, a multifunctional device based on an ESP32-S3 microcontroller designed for home automation systems and sensor networks.
The main module includes a dual-core Tensilica Xtensa LX7 processor (TSMC, Phoenix, AZ, USA) with clock speed up to 240 MHz, vector instructions for accelerating machine learning operations, Wi-Fi 4 (802.11 b/g/n) with WPA3 support, and Bluetooth 5.0. External 16 MB SPI flash memory is used for program code and temporary data, and 8 MB PSRAM for buffering sensor measurements and machine learning model parameters. The device features a 4-inch IPS LCD touchscreen display (320 × 240 pixels) for status visualization, local alarms, and configuration. Typical power consumption is 0.5–1 W depending on load and the modules used.
A significant advantage is support for machine learning through ESP-IDF and TensorFlow Lite Micro, allowing neural networks up to 50 KB to execute directly on the microcontroller for anomaly detection with latency of ~50–80 ms. Local inference eliminates the need to send all raw data to the cloud during critical events such as falls.
The gateway operates on mains power via USB-C (5 V/2 A), and a built-in 18650 Li-ion battery (3 Ah, 3.7 V) serves as backup power during outages. With consumption of 0.5–0.7 W, the battery provides 13–18 h of autonomous operation, sufficient to bridge typical power interruptions.
The architecture provides for two peripheral gateways in configurations with large living spaces or multi-story dwellings. This offers (1) better coverage through overlapping BLE zones, eliminating dead spots; (2) continued operation while the battery of one gateway is being replaced; and (3) higher reliability through redundancy. The two-gateway configuration results in partial data duplication in overlapping zones, requiring intelligent deduplication, which is discussed in Section 3.5.

3.3. Software Architecture

The software architecture follows an edge-to-cloud processing model, providing local processing for latency-critical tasks, and using cloud infrastructure for long-term data storage and model training.

3.3.1. Edge Gateway Software

The edge computing gateway runs specialized software developed on ESP-IDF (version 5.x) using FreeRTOS for real-time multitasking. The software organizes five main tasks coordinated through an event-group mechanism, ensuring atomic synchronization and reliable communication between the individual system components.
The BLE scanning task actively scans BLE channels at 100 ms intervals with a 75 ms window, achieving a ~75% duty cycle, providing an optimal balance between packet detection reliability and energy efficiency. Scan lifecycle events are processed via callbacks on the Generic Access Profile (GAP) with minimal latency. The parser extracts device identifiers for GAP AD types 0x08 (abbreviated local name) and 0x09 (full local name), applying buffer overflow protection through explicit length checking. Filtering restricts processing to devices with identifier “DEMENTIA” only.
The feature extraction task accumulates raw measurements in 60 s windows and calculates descriptive statistics. For the accelerometer mean, standard deviation, minimum, maximum, median, number of zero crossings, number of peaks, and total signal energy values are calculated. Orientation is assessed by analyzing the gravity vector and determining the percentage of time in upright, lying, and transitional states. Temperature data is aggregated by mean, standard deviation, range, and linear regression slope. Spatial characteristics include dominant room, entry time, cumulative stay duration, number of room transitions, and list of rooms visited.
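As an illustration, the accelerometer descriptors for one 60 s window can be sketched as follows (a Python sketch of the statistics only; the actual gateway implements this in C, and the function and field names here are illustrative):

```python
import statistics

def extract_accel_features(samples):
    """Per-window descriptors for one accelerometer axis (illustrative)."""
    mean = statistics.fmean(samples)
    centered = [s - mean for s in samples]
    # Zero crossings: sign changes of the mean-centered signal.
    zero_crossings = sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)
    # Peaks: samples strictly greater than both neighbors.
    peaks = sum(1 for a, b, c in zip(samples, samples[1:], samples[2:])
                if b > a and b > c)
    return {
        "mean": mean,
        "std": statistics.pstdev(samples),
        "min": min(samples),
        "max": max(samples),
        "median": statistics.median(samples),
        "zero_crossings": zero_crossings,
        "peaks": peaks,
        "energy": sum(s * s for s in samples),  # total signal energy
    }
```

The same pattern extends to the temperature (mean, range, regression slope) and spatial (dominant room, transition count) descriptors.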
A local machine learning task implements fall detection using an LSTM model. Input data comes from the accelerometer at a 10 Hz sampling rate. With a 3 s time window (30 time steps) and a single-layer LSTM with 16 hidden units, followed by a dense layer with eight neurons and a sigmoid output neuron, the model requires ~1.4 KB of flash memory at INT8 quantization and ~50 KB of SRAM, including the TensorFlow Lite Micro runtime and working tensor arena. Inference time is under 40 ms, enabling real-time execution, with average power consumption of ~2 mW at 3.3 V. Although the lower sampling frequency limits sensitivity to very fast transient movements, the configuration is sufficient for reliable detection of classic fall scenarios and represents a good compromise between accuracy, memory, and energy efficiency for embedded applications.
The data publishing task serializes processed data in JSON format compatible with the RFC 7159 specification and publishes it to the MQTT message broker with a hierarchical topic structure in the following format:
institution_id/patient_id/stream_type.
Different Quality of Service (QoS) levels are used: QoS 1 (at least once) for aggregates and events, and QoS 2 (exactly once) for critical alarms to prevent duplicate emergency notifications.
The connectivity management task implements a finite state machine that automatically restores Wi-Fi connection when interrupted, using exponential backoff to optimize retries. When connection is lost, the gateway buffers data locally in flash memory for up to two hours and transmits it immediately after connectivity restoration, ensuring data collection continuity.
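The retry policy can be sketched as exponential backoff with jitter (the base delay, cap, and jitter strategy below are assumptions for illustration; the firmware's actual constants are not specified in the text):

```python
import random

def backoff_schedule(base=1.0, cap=60.0, attempts=6):
    """Exponential backoff with full jitter: the delay ceiling grows as
    base * 2^attempt, capped, and a random fraction of it is drawn so
    that multiple gateways do not retry in lockstep. (Sketch only; the
    gateway implements its reconnect state machine in C.)"""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(random.uniform(0, ceiling))
    return delays
```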
The total gateway software size is ~2.5 MB, leaving 13.5 MB free for future extensions and functionalities.

3.3.2. Cloud Infrastructure

The cloud infrastructure is built as isolated instances at the institutional level, where each hospital or research center operates its own environment without sharing computing resources or data (see Figure 2). This isolation model ensures compliance with GDPR requirements regarding the cross-border transfer of health data.
The MQTT broker is implemented with RabbitMQ (MQTT plugin), provided as a managed service through CloudAMQP. The broker uses virtual hosts for logical isolation between institutions, access control lists (ACLs) for enforcing permissions, and mutual authentication with TLS 1.3. Message queues ensure reliability and data preservation during temporary connection interruptions.
The cloud service is implemented as a Python (ver. 3.14) microservice using asyncio for concurrent MQTT stream processing, utilizing paho-mqtt (MQTT client), influxdb-client (SDK for InfluxDB), pymongo (MongoDB driver), redis-py (Redis client), and pydantic libraries for data validation. The service is containerized with Docker and orchestrated through Kubernetes, which provides automatic scaling. During processing, timestamps are normalized to compensate for gateway discrepancies (typically ±5–10 s); JSON structure validation and deduplication are performed for multiple gateways.
The InfluxDB time-series database records and processes aggregated descriptors with 1 min resolution, stored for seven days for monitoring with time tag indexing, ensuring request latency below 50 ms. InfluxQL automatically lowers data resolution for aggregated or historical metric requests.
The MongoDB document database stores compressed event logs, critical alarms, hourly aggregates, and daily summaries. Sensitive attributes are protected by field-level encryption (MongoDB CSFLE).
Redis serves as in-memory cache for keys, supporting the TTL mechanism for automatic deletion of stale records. Atomic check-and-set operations ensure reliable critical alarm deduplication, preventing duplication even during parallel event processing.
The real-time visualization module is accessible via the REST interface (FastAPI) and WebSocket endpoint, providing real-time data updates. The clinical dashboard displays current and short-term trends in aggregated metrics, current location of monitored objects, and critical event history.

3.4. Three-Tier Data Architecture

A key architectural decision is to split the information flow into three specialized channels, optimized for different purposes. This approach avoids unnecessary storage duplication while preserving the necessary information for each specific use case.

3.4.1. Stream 1: Aggregated Statistical Descriptors

The first stream transforms 180 individual measurements (120 telemetry + 60 location packets) for a 60 s window into a single JSON object containing statistical descriptors. The resulting size of ~2000 bytes represents a compression ratio of 38.8% (61.2% reduction) compared to raw data (5160 bytes/minute). Wearable devices send accelerometer data at 10 Hz frequency, used by the local fall detector. Telemetry data transmitted to the cloud has five times lower frequency—2 Hz.
Importantly, this compression is lossy in terms of time resolution: individual 1 s measurements are collapsed into 1 min descriptors with no possibility of recovering the raw data. Such a compromise is justified for applications requiring trend analysis and real-time visualization but unsuitable for machine learning requiring the complete time sequence of signals.
Stream 1 data is stored in InfluxDB for seven days, with configuration optimized for clinical dashboards requiring latency below 50 ms when visualizing current and short-term trends. After this period, data is automatically deleted as it is no longer needed.

3.4.2. Stream 2: Compressed Event Logs

The second stream stores the complete time sequence of discrete measurements with 1 s resolution using differential encoding and gzip compression. Differential encoding exploits temporal correlation between successive measurements: the first event in the window is recorded with all fields, while subsequent ones contain only fields changed from the previous state, reducing stored data without information loss.
After applying differential encoding, which reduces size from 5160 bytes to ~3345 bytes, the event array is compressed with gzip, achieving a further 70–75% reduction by exploiting repetitiveness in JSON syntax. The final size of ~1036 bytes (including 200 bytes of metadata) corresponds to a compression ratio of ~20.1%, i.e., a 79.9% reduction. Critically, this compression is lossless—the original 1 s time sequence can be fully restored through decompression and reverse differential encoding.
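The encode-compress round trip can be sketched in Python (illustrative field layout; the gateway performs this step in firmware):

```python
import gzip
import json

def diff_encode(events):
    """First event in full; later events keep only fields that changed."""
    encoded, prev = [], {}
    for ev in events:
        encoded.append({k: v for k, v in ev.items() if prev.get(k) != v})
        prev = ev
    return encoded

def diff_decode(encoded):
    """Reverse differential encoding by accumulating field state."""
    decoded, state = [], {}
    for delta in encoded:
        state = {**state, **delta}
        decoded.append(dict(state))
    return decoded

def compress(events):
    return gzip.compress(json.dumps(diff_encode(events)).encode())

def decompress(blob):
    return diff_decode(json.loads(gzip.decompress(blob).decode()))
```

Because decoding replays every delta in order, the 1 s sequence is recovered exactly, which is what makes Stream 2 suitable for training sequence models.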
Compressed logs are decoded upon cloud arrival and stored in MongoDB with indexing by patient and time interval. This stream provides the full temporal resolution needed to train deep recurrent neural networks requiring sequential input data to capture long-term dependencies. When anomalies are detected through aggregated metrics, clinicians can request detailed logs for a specific time window to obtain precise event visualization.

3.4.3. Stream 3: Critical Real-Time Alerts

The third stream is a high-priority channel for events requiring an immediate response. It bypasses 60 s buffering and provides immediate MQTT publication with QoS level 2, ensuring exactly-once delivery. Critical events include pressing either button (blue or red), fall detection, and prolonged inactivity exceeding two hours during daytime. Each alert contains contextual data about the state immediately before and after the incident, including a 10 s pre-event window and a 60 s post-event window.
The alert package structure includes a unique identifier, a timestamp with millisecond precision, a severity level (low, medium, high, and critical), event type (button_press, fall_detected, and prolonged_inactivity), and confidence score from the machine learning model. Contextual information covers event location (room_id, room_name, and stay duration), pre-event indicators (average and maximum activity, orientation, and movement presence), event parameters (impact magnitude, duration, and peak accelerations along three axes), and post-event indicators (average activity, immobility duration, orientation, and temperature change).
The alert packet size varies between 400 and 600 bytes, with event frequency typically under five per patient per day. Latency from detection at the peripheral gateway to notification delivery is under 7 s in 99.5% of cases, ensuring a timely response in emergency situations.
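Assembly of such a packet can be sketched as follows (the field names are illustrative, not the system's exact schema):

```python
import json
import time
import uuid

def build_alert(event_type, severity, confidence, location, pre, event, post):
    """Assemble an alert packet with the context described above.
    Field names are hypothetical; the real schema is institution-defined."""
    packet = {
        "alert_id": str(uuid.uuid4()),
        "timestamp_ms": int(time.time() * 1000),
        "severity": severity,        # low | medium | high | critical
        "event_type": event_type,    # button_press | fall_detected | prolonged_inactivity
        "confidence": confidence,    # ML model confidence score
        "location": location,        # room_id, room_name, stay duration
        "pre_event": pre,            # 10 s window before the incident
        "event": event,              # impact magnitude, peak accelerations
        "post_event": post,          # 60 s window after the incident
    }
    return json.dumps(packet).encode()
```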

3.4.4. Derivative Aggregates

Hourly aggregates are generated by an automated task running at five minutes past each hour, extracting the event records from MongoDB (Stream 2) for the previous hour. The task decompresses the gzip payload, parses JSON structures, and calculates statistical moments directly from the raw 1 s sequences, including average motor activity values, dominant room, total transitions, and temperature trends. The result is serialized as a BSON document in the hourly_aggregates collection in MongoDB with a 90-day retention period. Aggregate size is 2.5–3 KB per hour.
This process transforms detailed event logs into a compact intermediate representation optimized for computational efficiency in model training. Hourly aggregates amount to 2160 documents per patient for 90 days, compared to 129,600 minute-level event records, providing pre-computed features directly applicable to classical algorithms such as Random Forest and Gradient Boosting without additional decompression and feature extraction operations.
Daily summaries are generated by automated nightly tasks running at 00:05, extracting previous 24 one-hour aggregates from the hourly_aggregates collection and calculating daily statistics, including total activity, room occupancy distribution, sleep parameters, and circadian rhythm amplitude. The result is stored as a single BSON document of 2–3 KB in the daily_summaries collection with unlimited storage duration.
Daily summaries perform three key functions: (1) they provide data for building personalized baseline models for each patient, requiring a minimum of 30 days of reference history; (2) they enable the creation of clinical reports and visualization of longitudinal trends over months and years; and (3) they provide training data for long-term predictive models analyzing cognitive decline trajectories over multi-month or multi-year periods.
Table 2 summarizes all data flows and their characteristics.
The total storage used for one patient over 90 days is approximately 5.7 GB. Of this, InfluxDB stores 288 MB of “hot” data for the last 7 days, while MongoDB stores up to 5.4 GB, dominated by event logs, with hourly aggregates taking 6.5 MB and critical alerts and daily summaries together taking under 1 MB.

3.5. Deduplication with Multiple Gateways

The architecture uses a pair of edge computing gateways to ensure normal operation in large apartments or multi-story houses. The dual-gateway configuration results in systematic data redundancy, with 15–20% spatial overlap of BLE coverage areas, typical for stairwells and open spaces.

3.5.1. Deduplication of Aggregated Descriptors (Stream 1)

In configurations with multiple gateways in overlapping areas, the two gateways receive different subsets of BLE packets due to variations in radio frequency propagation. Therefore, cloud deduplication cannot rely on content identity but must determine which gateway provides better coverage for that minute.
The algorithm uses the average received signal strength (RSSI mean) as the primary quality indicator, combined with the number of packets received and a summary assessment of data completeness. The deduplication key is formed from a wearable device MAC address and a minute-level timestamp.
{badge_mac}_agg_{timestamp_minute}.
Upon first entry, the aggregate is recorded in InfluxDB and a record is created in Redis with 120 s TTL, containing an average RSSI value, gateway ID, and number of packets received. Upon second arrival with the same key, the system retrieves a cached RSSI value and compares it with the current one. If the current value is higher, the old record in InfluxDB is replaced with the new aggregate and the cache is updated. If it is lower, the aggregate is discarded. After TTL expires, Redis automatically deletes the key and frees memory for the next minute interval.
This approach ensures that exactly one aggregate record per minute is written to InfluxDB for each patient, always with the best coverage quality.
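A minimal sketch of this quality-based deduplication, with plain dictionaries standing in for Redis and InfluxDB (TTL expiry omitted; the function signature is illustrative):

```python
cache = {}   # stands in for Redis; 120 s TTL handling omitted in this sketch
store = {}   # stands in for the InfluxDB record per deduplication key

def ingest_aggregate(badge_mac, minute, gateway_id, rssi_mean, aggregate):
    """Keep, for each badge/minute key, the aggregate from the gateway
    with the stronger mean RSSI."""
    key = f"{badge_mac}_agg_{minute}"
    cached = cache.get(key)
    if cached is None or rssi_mean > cached["rssi_mean"]:
        cache[key] = {"rssi_mean": rssi_mean, "gateway_id": gateway_id}
        store[key] = aggregate      # first write, or replace the weaker record
        return "written"
    return "discarded"
```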

3.5.2. Merging Event Logs (Stream 2)

Unlike aggregates, event logs from different gateways are complementary rather than redundant. For example, gateway A may receive 172 out of 180 packets with gaps in certain seconds, while gateway B receives 176 out of 180 packets with different losses. Neither log is better; combining them provides maximum time coverage. The system uses a “store and merge on demand” model. During the reception phase, logs from all gateways are stored without deduplication, each marked with a gateway ID and RSSI statistics and recorded as a separate document in MongoDB, indexed by
{badge_mac, timestamp_minute, gateway_id}.
Upon request, all logs for a relevant time window are retrieved and a unified timeline is constructed by combining them by timestamp. For each second, t ∈ [0, 59], if data exists from only one gateway, it is used directly. When data comes from two gateways, merging is performed at the field level: non-null values are preferred, and in case of conflict, the value with the higher RSSI is selected. The final result is a sorted, consolidated event sequence, with a typical improvement from 170–175 measurements per minute to 178–180, recovering partially lost data.
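The field-level merge can be sketched as follows (hypothetical log layout mapping second offsets to a fields dict plus an RSSI value; None stands for a missing field):

```python
def merge_logs(log_a, log_b):
    """Merge two per-minute event logs keyed by second offset t in [0, 59].
    Each log maps t -> {"rssi": float, "fields": {...}} (illustrative layout).
    Non-null field values are preferred; on conflict the higher-RSSI
    gateway's value wins."""
    merged = {}
    for t in sorted(set(log_a) | set(log_b)):
        a, b = log_a.get(t), log_b.get(t)
        if a is None or b is None:
            merged[t] = dict((a or b)["fields"])  # only one gateway saw this second
            continue
        stronger, weaker = (a, b) if a["rssi"] >= b["rssi"] else (b, a)
        row = dict(weaker["fields"])
        for k, v in stronger["fields"].items():
            if v is not None:           # prefer non-null, then higher RSSI
                row[k] = v
        merged[t] = row
    return merged
```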

3.5.3. Deduplication of Critical Alarms (Stream 3)

Critical alarm events require the guarantee of single delivery to avoid duplicate notification accumulation. The deduplication key is defined as a combination of a wearable device hardware address, a timestamp with second resolution, and alarm type.
{badge_mac}_alert_{timestamp_sec}_{alert_type}.
The algorithm uses the atomic check-and-set operation in Redis. Upon first arrival, an alarm is recorded in the database, the notification chain is triggered, and the key is set with a 120 s validity period. All subsequent messages with the same key are rejected as duplicates. Choosing a 120 s timeout instead of a 60 s aggregation interval ensures resilience to network delays and temporary clock desynchronization between gateways. The combination of QoS 2 in MQTT protocol with cloud infrastructure deduplication provides double protection and high reliability when processing critical alarm events.

4. Federated Learning Design

4.1. Motivation and Requirements

Traditional centralized approaches to machine learning in healthcare require aggregating data from multiple institutions into a single repository, raising three fundamental problems. First, GDPR regulatory barriers prohibit cross-border transfer of personal health data without explicit guarantees of strong protection. Second, institutional policies often restrict sharing of raw patient data due to concerns about confidentiality, competitive advantage, and legal liability. Third, centralized repositories represent a single point of failure and an attractive target for cyberattacks.
Federated Learning (FL) offers a solution where, instead of the data moving to the model, the model moves to the data [34]. Each institution trains the model locally on its own data and shares only abstract parametric updates with a central orchestrator, which aggregates them into a global model. Raw data never leaves institutional boundaries, ensuring compliance with GDPR and institutional policies.

4.2. Architecture of the Federated Learning System

The architecture follows the classic client–server model for federated learning, where a central orchestrator coordinates the learning process without access to raw data, and local training servers (one per institution) perform the calculations on their own data.

4.2.1. Federated Learning Orchestrator

The central orchestrator is a coordination component responsible for three main functions: (1) distributing the global model to participating institutions at the beginning of each training round; (2) aggregating parametric updates received from institutions; and (3) managing the lifecycle of the federated process, including client selection, hyperparameter configuration, and convergence criteria.
The orchestrator never has access to raw data, only to the weight parameters of trained models. This ensures that even if the orchestrator is compromised, individual patient records cannot be obtained. The orchestrator is located on a virtual cloud machine, accessible via an HTTPS endpoint with two-way TLS authentication. All communications use the gRPC protocol over HTTP/2 transport with TLS 1.3 encryption and mutual certificate authentication.

4.2.2. Local Training Servers

Each participating institution maintains a local training server performing three basic operations for each federated training round:
  • Feature extraction from MongoDB hourly aggregates or event logs collections for the training set. Features include statistical moments of motor activity, spatial indicators, circadian parameters, and behavioral anomalies (episodes of long-term immobility and nocturnal activity).
  • Local model training on the institutional dataset for a fixed number of epochs (typically 5–10). A deep recurrent neural network (LSTM) architecture with two layers of 64 and 32 units, dropout layers with probability 0.3 for regularization, and a dense output layer for risk classification are used. The model is trained with the Adam optimizer (learning rate = 0.001) and binary cross-entropy loss.
  • Parametric update calculation: After local training, the server calculates the difference between the local weights w_i and the global weights w:
    Δw_i = w_i − w.
  • This parametric update (typically ~200 KB for the LSTM model) is serialized and sent to the orchestrator via a secure gRPC channel.

4.2.3. Federated Averaging Algorithm

The orchestrator aggregates parametric updates using the classic federated averaging (FedAvg) algorithm. The global model is updated according to the following formula:
w_new = w + Σ_{i=1}^{K} (n_i / N) · Δw_i,
where K is the number of participating institutions, n_i is the number of training examples at institution i, and N = Σ_{i=1}^{K} n_i is the total number of examples.
Weighted averaging ensures that institutions with larger datasets have proportionally greater influence on the global model, ensuring fair representation. However, this approach can be vulnerable to highly unbalanced distributions or the presence of malicious actors. To protect against Byzantine attacks (malicious or compromised clients sending invalid updates), the architecture provides an extension with robust aggregation methods such as Krum or Trimmed Mean, which exclude statistical outliers before averaging. However, these methods are not implemented in current proof-of-concept implementations.
The process is repeated for a fixed number of rounds or until convergence is achieved, defined as a relative change in validation loss below 2%.
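The FedAvg update rule can be sketched in plain Python over flat weight vectors (a real implementation aggregates per-layer tensors, typically as NumPy arrays):

```python
def fedavg(global_weights, updates, counts):
    """One FedAvg round: w_new = w + sum_i (n_i / N) * delta_w_i.
    `updates` holds each institution's parametric update delta_w_i,
    `counts` the corresponding number of training examples n_i."""
    total = sum(counts)                       # N, total examples
    new_weights = list(global_weights)
    for delta, n in zip(updates, counts):
        for j, d in enumerate(delta):
            new_weights[j] += (n / total) * d # weight by dataset size
    return new_weights
```

Weighting by n_i / N is what gives larger institutions proportionally more influence, as described above.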

4.2.4. Model Deployment and Update Mechanism

After each federated learning round, the orchestrator distributes the updated global model to all participating institutions. The global model executes on institutional cloud servers. Edge gateways continue running the lightweight fall detection model (single-layer LSTM, 16 units, Section 3.3.1) for real-time inference. The federated-trained model (two-layer LSTM, 64 + 32 units) performs behavioral analysis offline on cloud infrastructure with sufficient computational resources. All institutions receive identical global models, ensuring consistent diagnostic behavior but applying it only to locally stored patient data, maintaining privacy guarantees.

4.3. Implementation with the Flower Framework

For practical implementation of federated learning, the Flower framework is used—a modular, open-source framework that abstracts FL coordination without being tied to a specific machine learning library [41].

4.3.1. Architectural Advantages of Flower

Flower offers four key advantages for the current architecture. First, its framework-independent design allows models implemented in TensorFlow (2.20.0), PyTorch (2.9.0), scikit-learn (1.8.0), or other libraries to be unified through a common client–server interface. Second, it strictly separates coordination from training: the server component only manages orchestration, while each institution retains full control over local data, training, and evaluation. Third, it offers efficient communication through the high-performance gRPC protocol, compatible with standard hospital network constraints. Fourth, through built-in fault tolerance, temporarily unavailable clients are excluded from the current round without compromising the global process.

4.3.2. Client Implementation

Each institution implements a client component by extending the base class fl.client.NumPyClient, which defines three mandatory methods:
  • get_parameters() retrieves the current parameters of the local model as a list of NumPy arrays. This operation is called at the beginning of each round, when the orchestrator wants to obtain the current state of the local model.
  • fit(parameters, config) accepts global parameters from the orchestrator, updates the local model, performs local training for a configured number of epochs, and returns updated parameters along with metadata (number of examples used and local loss). Only parameter updates leave the institution—raw data and gradients remain local.
  • evaluate(parameters, config) evaluates the global model on a local validation set and returns quality metrics (accuracy and loss). This allows the orchestrator to track global performance without accessing the data.
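The three methods can be illustrated on a toy one-parameter model (mean estimation by gradient descent). The sketch below omits the flwr dependency and NumPy arrays, and simplifies the return signatures, but mirrors the NumPyClient contract; the config keys epochs and lr are assumptions:

```python
class LocalClient:
    """Schematic of the three NumPyClient methods, without flwr."""

    def __init__(self, data):
        self.data = data      # raw data stays inside the institution
        self.w = 0.0

    def get_parameters(self):
        return [self.w]

    def fit(self, parameters, config):
        self.w = parameters[0]                      # adopt the global model
        for _ in range(config.get("epochs", 5)):    # local training epochs
            grad = sum(self.w - x for x in self.data) / len(self.data)
            self.w -= config.get("lr", 0.5) * grad
        # Only updated parameters and metadata leave the institution.
        return [self.w], {"num_examples": len(self.data)}

    def evaluate(self, parameters, config):
        loss = sum((parameters[0] - x) ** 2 for x in self.data) / len(self.data)
        return {"loss": loss}
```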

4.3.3. Server Configuration

The orchestrator is configured via fl.server.start_server(), with parameters for aggregation strategy, minimum number of clients, number of rounds, and convergence criteria. For the current architecture, the FedAvg strategy is used, implementing the standard federated averaging described in Section 4.2.3. Flower also supports alternative strategies such as FedProx (for heterogeneous clients), FedOpt (with adaptive optimizers), and customized strategies for specific applications.

4.4. Confidentiality Guarantees

Federated learning provides multi-layered confidentiality protection through a combination of architectural, cryptographic, and algorithmic mechanisms.

4.4.1. Data Locality

Raw patient data never leaves institutional boundaries. All training data remains in an institution’s local MongoDB instance, with network isolation (firewall rules allowing only outbound gRPC connections to orchestrator) and role-based access control (RBAC). Even orchestrator administrators lack the technical means to access raw data.

4.4.2. Transport-Level Encryption

All communications between institutions and the orchestrator use TLS 1.3 with mutual certificate authentication. Each institution has a unique X.509 certificate issued by a trusted certification authority (CA), which the orchestrator validates before accepting data. This prevents man-in-the-middle attacks and ensures participant authenticity. Parametric updates are transmitted as serialized tensors over the TLS-encrypted channel, ensuring confidentiality and integrity during transport.

4.4.3. Protection Against Byzantine Attacks

Malicious or compromised clients may send invalid parametric updates to sabotage the global model. The architecture provides three levels of protection:
  • Statistical filtering excludes updates that are statistical outliers by calculating the Euclidean distance between each pair of updates and rejecting updates with a median distance above a certain threshold.
  • Robust aggregation uses algorithms such as Krum (which selects updates with the smallest sum of distances to nearest neighbors) or Trimmed Mean (which excludes the most extreme α% updates before averaging) instead of direct averaging.
  • The reputation system tracks historical client performance and reduces weights of institutions whose updates consistently worsen global validation accuracy.
These mechanisms are not implemented in the current proof-of-concept but are planned for future expansion in multi-institutional deployment.

5. Technical Validation

5.1. Scope and Objectives of Validation

The technical validation of the proposed system, called Monitoring INtelligence for Dementia Guarded by Ubiquitous Ambient RecorDing (MINDGUARD), aims to demonstrate the technical feasibility of the architecture and evaluate key system metrics determining practical applicability. Clinical validation is explicitly outside the scope of this work and remains an important future task.
Technical validation focuses on six evaluation dimensions:
  • Localization accuracy—verification that the infrared system achieves room-level localization accuracy.
  • BLE communication characteristics—measurement of packet delivery reliability, RSSI, and coverage.
  • Compression efficiency—validation of network traffic reduction while maintaining information completeness.
  • Deduplication performance—assessment of accuracy in multi-gateway configurations.
  • End-to-end latency—verification of the sub-10 s requirement for critical event delivery.
  • Architectural feasibility of federated learning—demonstration of federated learning in a simulated environment.
Table 3 maps each research question from Section 1.2 to the corresponding system component, validation methodology, performance targets, and achieved results.
The system is considered successful if it achieves the projected goals in all six dimensions in real home environments.

5.2. Experimental Setup

5.2.1. Test Environment

The system was installed and evaluated in two apartments with different layouts and sizes. The first apartment (95 m²) includes a kitchen, living room, two bedrooms, a bathroom, and a hallway. The second apartment (52 m²) has a more compact layout, with the living room combined with the kitchen, plus a bedroom and a bathroom. Both apartments have Wi-Fi access.

5.2.2. Hardware Implementation

Each participant carries one Smart Badge 3 (Kontakt.io), configured to transmit telemetry data every 0.1 s and location data every 1 s at TxPower = −4 dBm. The logical name of all devices is “DEMENTIA”.
Six Beam Mini 2 beacons are installed in the first apartment (one each for the kitchen, living room, both bedrooms, and bathroom, and two for the L-shaped corridor). Four beacons are installed in the second apartment (living room–kitchen, bedroom, bathroom, and corridor). Each beacon is configured to transmit a unique room identifier every 1 s (the minimum possible interval) with a transmission angle θ = 110°.
Two ESP32-S3-BOX-3 gateways are installed in each apartment, positioned so that their BLE coverage areas overlap by 15–20%. Gateways are initially configured by the institution’s IT department via a web interface. The information to be entered includes the institution ID, patient ID, wearable device ID, the address and credentials for the institution’s MQTT broker, and the room IDs and names.
Gateways are connected to the home Wi-Fi router and communicate with the cloud infrastructure via a TLS 1.3-encrypted MQTT channel. When a gateway starts in a home environment without stored Wi-Fi credentials, the software activates Soft Access Point (SoftAP) mode. In this mode, the gateway creates its own wireless network with a program-defined name and password. Information on how to connect to the gateway’s wireless network is shown on the display: network name, IP address, and access password. The patient (or a family member) must connect to this temporary network from a phone, tablet, or computer and open the gateway’s configuration page in a browser. The web server at this address displays a page with a form for entering the SSID and password of the home network (Supplementary Materials/Figure S1). Once the user enters and confirms the data, the device saves it in non-volatile memory (NVS), terminates SoftAP mode, switches to Station mode, and connects to the home Wi-Fi network.

5.2.3. Participants

Three healthy volunteers (aged 28, 58, and 89) participated in the technical validation. The participants were selected to represent different age groups and activity patterns, testing system robustness across a variety of user behaviors. Each participant wore a smart badge continuously for 7 days, including at night, and was instructed to maintain normal daily routines. All participants provided written informed consent to participate in this technology assessment study. A 7-day period was chosen to validate (1) hardware reliability and (2) the day-to-day variability in sensor data.

5.2.4. Ground Truth Collection and Validation Methodology

Participants used a custom mobile app (Supplementary Materials/Figure S2) displaying interactive floorplans of apartments. When entering a room, the participant tapped the corresponding room on the display. The application automatically marked the previous room as exited and transmitted a timestamped event to the cloud MQTT broker via home Wi-Fi. The backend service assigned an authoritative timestamp upon message reception, ensuring synchronization with badge telemetry processed by the same infrastructure.
Two participants (aged 28 and 58) conducted 48 h continuous monitoring over a weekend period in two apartments. For each second, we compared badge-detected room against ground truth from app logs. Transition periods (1.5 s window around each logged entry) were excluded from evaluation to account for physical doorway traversal time and IR beacon detection latency. Overall accuracy was calculated as the ratio of correctly classified seconds to total classified seconds. The achieved localization accuracy was 97.2% for Apartment 1 (167,787 correctly classified seconds out of 172,620 total) and 98.0% for Apartment 2 (169,212 out of 172,665 total), yielding a combined accuracy of 97.6% across both environments.
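The per-second comparison against app-logged ground truth, with transition windows excluded, can be sketched as follows. The room names, event timestamps, and the ±1.5 s exclusion handling are illustrative simplifications of the described methodology.

```python
def room_accuracy(pred, truth_events, total_seconds, exclusion=1.5):
    """Per-second localization accuracy against app-logged ground truth.

    pred: dict mapping second -> room detected by the badge pipeline
    truth_events: chronological list of (timestamp_s, room) entry events
    Seconds within +/- `exclusion` of any logged entry are skipped,
    mirroring the transition-window exclusion described above.
    """
    correct = classified = 0
    for t in range(total_seconds):
        if any(abs(t - ts) <= exclusion for ts, _ in truth_events):
            continue  # inside a doorway-transition window
        current = None  # ground truth = room of the most recent entry
        for ts, room in truth_events:
            if ts <= t:
                current = room
        if current is None:
            continue  # before the first logged entry
        classified += 1
        correct += pred.get(t) == current
    return correct, classified

# Hypothetical 20 s trace: kitchen from t=0, hall from t=10,
# one misclassified second at t=5.
pred = {t: ("kitchen" if t < 10 else "hall") for t in range(20)}
pred[5] = "hall"
correct, classified = room_accuracy(pred, [(0, "kitchen"), (10, "hall")], 20)
```

Overall accuracy is then `correct / classified`, computed only over seconds outside the transition windows.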
Table 4 presents detailed classification outcomes for Apartment 1 over the 48 h monitoring period, showing actual versus predicted room assignments for all 172,620 classified seconds.
The confusion matrix reveals several systematic error patterns consistent with apartment floor plan topology. All 4833 misclassification events (2.80% of total classified time) occurred exclusively between spatially adjacent rooms sharing physical doorways or open passages. Zero errors were observed between non-adjacent rooms such as Kitchen and Bedroom1, or Bathroom and Bedroom2, confirming that misclassifications result from boundary zone IR signal ambiguity during brief threshold-crossing moments rather than fundamental system confusion about distant room locations.
The Living Room to Bedroom2 connection via a glass sliding door produced an elevated bidirectional error rate, with 289 s misclassified as Bedroom2 when the participant was in the Living Room, and 245 s showing a reverse error. This 534 s total represents 0.87% of the combined occupancy time in these two rooms and constitutes the only error category attributable to door material properties rather than transition timing effects. The glass door allows partial infrared penetration, creating occasional ambiguity absent with solid wooden doors elsewhere in the apartment.
The corridor demonstrated the lowest per-room accuracy at 93.03%, which is expected given its function as an L-shaped transition space connecting four other rooms (Kitchen, Living Room, Bedroom1, and Bathroom). Detailed analysis of corridor misclassifications shows 2122 errors distributed across the boundaries with Kitchen (529 s), Living Room (1158 s), Bedroom1 (325 s), and Bathroom (110 s), representing brief moments when the participant stood at the threshold between the corridor and an adjacent room. The corridor never exhibited false classifications to Bedroom2, consistent with the floor plan, which shows no direct corridor-to-Bedroom2 doorway; all Bedroom2 access occurs through the Living Room glass door.
Rooms typically occupied with closed doors during use demonstrated the highest classification accuracy. Bedroom1 achieved 99.50% accuracy (291 s misclassified out of 58,157 total, all errors at corridor boundary during entry/exit). Bathroom achieved 96.80% accuracy (252 s misclassified out of 7880 total), with errors occurring primarily during brief door-opening moments. Kitchen achieved 97.07% accuracy (429 s misclassified, all at the corridor boundary). This pattern confirms infrared’s fundamental advantage when physical barriers completely block cross-room signal propagation, with accuracy degradation occurring primarily at open doorways and transition zones.
Inter-apartment comparison shows that Apartment 2 achieved 98.0% accuracy compared to Apartment 1’s 97.20%, representing a 0.8 percentage point improvement. This difference is primarily attributable to Apartment 2’s simpler linear corridor layout and exclusive use of wooden doors throughout, eliminating the glass door IR leakage observed in Apartment 1. Two-proportion z-test confirms that this difference is statistically significant (z = 3.12, p = 0.002), validating the hypothesis that door material composition affects boundary zone classification accuracy.
Statistical comparison against published BLE-based localization methods using McNemar’s test shows that the infrared approach (97.6% combined accuracy) significantly outperforms the typical embedded-system BLE baseline of 87.7% reported by García-Paterna et al. [27] for Raspberry Pi deployment (χ2 = 47.3, p < 0.001). This performance advantage stems from infrared’s physical inability to penetrate solid barriers, eliminating the cross-room signal ambiguity inherent in radio frequency propagation.

5.3. Technical Results

5.3.1. Accuracy of Infrared Localization

Theoretical Model
A geometric and probabilistic model of the IR system was developed for a quantitative assessment of localization accuracy. We assume a rectangular room with width W and length L, with the IR beacon mounted in the center of the ceiling at height H. The beam formed by the beacon is a cone with apex angle θ, giving a coverage radius on the floor of
$$R_{\mathrm{IR}} = H \tan\frac{\theta}{2}.$$
A point $(x, y)$ on the floor is within the IR coverage if
$$\sqrt{\left(x - W/2\right)^2 + \left(y - L/2\right)^2} \le R_{\mathrm{IR}}.$$
For each point, the probability of successful detection during a single beacon transmission in a line-of-sight (LOS) scenario, $P_{\mathrm{LOS}} \in [0, 1]$, is defined. This is followed by a simulation of a discrete-time process with transmission interval $\Delta t$. The probability that the point is validly detected is
$$P_{\mathrm{valid}} = 1 - (1 - P_{\mathrm{LOS}})^{N_{\min}},$$
where $N_{\min}$ represents a temporal consistency mechanism requiring $N_{\min}$ consecutive IR detections to validate localization. In a single-room configuration, $N_{\min} = 1$ is optimal, as it maximizes sensitivity and minimizes false negatives. The use of $N_{\min} > 1$ is justified in multi-room configurations, where the probability of infrared signal overlap is higher.
The probability of direct visibility is assumed to be $P_{\mathrm{LOS}} = 0.85$. This value corresponds to typical conditions for the propagation of infrared light in enclosed spaces, accounting for partial signal obstructions such as the user’s body and furniture.
The simulation includes Time-to-Live (TTL) logic to retain position estimates for a period $T_{\mathrm{TTL}}$ after a valid detection. At each step $\Delta t$, the TTL is updated as
$$\mathrm{TTL}(t + \Delta t) = \max\{\mathrm{TTL}(t) - \Delta t,\ 0\}.$$
The localization estimate at a given time step is
$$\hat{R}(t) = \begin{cases} 1, & \text{if } \mathrm{TTL}(t) > 0, \\ 0, & \text{otherwise.} \end{cases}$$
Monte Carlo Analysis
To assess room-level localization accuracy, Monte Carlo simulation is used, which approximates multidimensional expectations by statistically averaging system response to randomly selected user positions, dwell times, and stochastic infrared detection events.
Assessment is performed in the following sequence:
  • Generate $N_{\mathrm{trials}}$ random initial positions $(x_i, y_i)$ uniformly in the room.
  • For each position, simulate an exponentially distributed dwell time $T_{\mathrm{stay}} \sim \mathrm{Exp}(\lambda = 1/\tau_{\mathrm{mean}})$, where $\tau_{\mathrm{mean}}$ is the average dwell time.
  • Discretize time with $\Delta t$ and apply the $N_{\min}$ and TTL logic.
  • Estimate accuracy as the fraction of time during which $\hat{R}(t)$ coincides with the actual visitor position.
The accuracy of correct localization is the mathematical expectation
$$\mathrm{Accuracy} = \mathbb{E}[f(\omega)] = P(\mathrm{correct}),$$
where the indicator function $f(\omega)$ takes the value 1 when the estimated room corresponds to the actual one, and 0 otherwise. For programmatic calculation, the following estimator is used:
$$\mathrm{Accuracy} = \frac{\sum_{i,t} \mathbf{1}\{\hat{R}_i(t) = R_{\mathrm{true}}\}}{\sum_{i,t} 1}.$$
Simulation analysis for Kontakt.io’s Beam Mini 2 IR beacons ($\theta$ = 110°, $N_{\min}$ = 2) shows that, with the minimum possible transmission interval of 1 s, the probability of correct localization is approximately 94% (see Figure 3a). With $N_{\min}$ = 1, accuracy increases to 99.3%. The dependence of accuracy on the radiation angle is shown in Figure 3b.
Maximum accuracy is achieved with beacon angles θ > 100° in rooms up to 25 m². Larger rooms require the installation of two beacons. Simulations show an average localization accuracy for adjacent rooms connected by a door of 91% for $N_{\min}$ = 2 and 96% for $N_{\min}$ = 1. The accuracy decrease in boundary zones is ~3%.
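The Monte Carlo procedure above can be sketched as follows. This is a simplified single-room implementation of the described model; parameters mirror the text where stated (θ = 110°, $P_{\mathrm{LOS}}$ = 0.85, Δt = 1 s) and are otherwise illustrative.

```python
import math
import random

def simulate_accuracy(W=5.0, L=5.0, H=2.5, theta_deg=110, p_los=0.85,
                      n_min=1, ttl=3.0, dt=1.0, dwell_mean=60.0,
                      n_trials=500, seed=42):
    """Monte Carlo estimate of single-room localization accuracy:
    random uniform positions, exponential dwell times, Bernoulli IR
    detections inside the beacon cone, N_min consistency, and TTL."""
    rng = random.Random(seed)
    r_ir = H * math.tan(math.radians(theta_deg) / 2)  # coverage radius
    localized = total = 0
    for _ in range(n_trials):
        x, y = rng.uniform(0, W), rng.uniform(0, L)
        covered = math.hypot(x - W / 2, y - L / 2) <= r_ir
        t_stay = rng.expovariate(1.0 / dwell_mean)
        streak, ttl_left, t = 0, 0.0, 0.0
        while t < t_stay:
            detected = covered and rng.random() < p_los
            streak = streak + 1 if detected else 0
            if streak >= n_min:
                ttl_left = ttl  # valid detection refreshes the TTL
            localized += ttl_left > 0
            ttl_left = max(ttl_left - dt, 0.0)
            total += 1
            t += dt
    return localized / total

acc = simulate_accuracy()
```

Sweeping `theta_deg` or `n_min` in this sketch reproduces the qualitative trends reported above: wider beam angles and $N_{\min}$ = 1 increase the fraction of validly localized time.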
Experimental Validation
The infrared localization system was validated by comparing automatically detected locations with reference data collected via the mobile application. Participants saw the floor plan of the apartment on the screen and were instructed to mark the room they were entering by quickly tapping twice on its outline.
Under optimal conditions (badge worn openly, direct or reflected visibility to the IR beacon), the system achieved 97.6% room identification accuracy, corresponding to the model. Problems arise during room transitions—the badge’s proprietary software detects a new room after ~1.5 s.
When the device is partially covered (thick outer clothing or placed in a pocket), the proprietary software may detect a lack of room identifiers (−1). In real home environments, this occurs 4–8% of the time. In these cases, the software assumes that patients are in the last recognized room, ensuring location data continuity.
Infrared technology exceeds typical accuracy of ~85% for BLE-only approaches [24] due to the inability of IR light to penetrate walls and doors, eliminating the main limitation of radio frequency methods.
Comparative Analysis with Published Methods
To position our results within the existing literature, we compare our achieved room-level accuracy against published studies using identical or comparable evaluation metrics. Table 5 summarizes key performance indicators across different localization technologies deployed in residential or similar indoor environments.
The proposed method achieves high room-level accuracy (97.6%), comparable to the best results from existing technologies such as BLE+kNN and Wi-Fi RTT, while using less equipment in smaller deployment areas. In comparison, BLE and Wi-Fi solutions often require more beacons and complex calibration, and remain sensitive to obstacles.

5.3.2. BLE Communication

The BLE communication channel was analyzed by measuring the Received Signal Strength Indicator (RSSI) at different distances and the frequency of successfully delivered packets (Supplementary Materials/Figure S3). Measurements were performed in a brick apartment with an area of 95 m² (four rooms, bathroom, and hallway). The edge gateway was placed in the apartment’s center. Smart badges were programmed to transmit with power TxPower = −4 dBm.
The results obtained are presented in Table 6.
RSSI values show an expected inverse relationship with distance. The path loss exponent is close to the theoretical value for free space (n ≈ 2.0) in direct line of sight, while in obstructed conditions, it reaches n ≈ 3.5–4.0, typical for home environments with brick walls and furniture.
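For reference, the path loss exponent n can be estimated from RSSI–distance pairs by a least-squares fit of the log-distance model. The measurement values below are synthetic, constructed to follow n = 2 exactly; they are not the measurements from Table 6.

```python
import math

def path_loss_fit(samples, d0=1.0):
    """Least-squares fit of the log-distance model
    RSSI(d) = RSSI(d0) - 10 * n * log10(d / d0),
    returning (RSSI at d0, path loss exponent n)."""
    xs = [-10 * math.log10(d / d0) for d, _ in samples]
    ys = [rssi for _, rssi in samples]
    m = len(samples)
    mx, my = sum(xs) / m, sum(ys) / m
    n = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - n * mx, n

# Synthetic LOS measurements (distance_m, RSSI_dBm) following n = 2:
samples = [(1, -60.00), (2, -66.02), (4, -72.04), (8, -78.06)]
rssi_d0, n_fit = path_loss_fit(samples)
```

Fitting the same model to obstructed-path measurements would yield the larger exponents (n ≈ 3.5–4.0) reported above.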
The effective range of BLE communication in home environments is approximately 15 m through one wall, beyond which packet loss increases dramatically to above 20%. If a badge can be more than 15 m from a gateway in a home, or the signal must pass through two or more brick walls, the use of a second gateway is mandatory.
With a configured transmission interval of 100 ms for telemetry packets and 1000 ms for location packets, the measured Packet Delivery Rate (PDR) is 92.8% (7.2% loss). The PDR was calculated by comparing the expected packet count (2 badges × 48 h × 3600 s × 11 packets/s = 3,801,600 total expected) against actual received packets logged by gateways (3,528,086 received). Temporal analysis using the autocorrelation function shows that losses are uniformly distributed (ACF < 0.1 for lags > 1 s) rather than clustered, confirming stable channel performance. The 7.2% loss includes RF interference from home Wi-Fi networks on an overlapping 2.4 GHz spectrum, brief signal blockage when participant body orientation shields badge antenna, and transmission power reduction when the badge enters stationary mode after detecting prolonged immobility. Despite this baseline loss rate, localization accuracy remains high (97.6%) because IR beacon detection operates independently at 1 s intervals, requiring only a single successful BLE packet per second to update room location, with the system implementing temporal persistence, preventing spurious changes from isolated packet losses.

5.3.3. Effectiveness of the Three-Channel Architecture

The proposed three-channel architecture achieves a significant reduction in network traffic while maintaining information integrity (see Table 7).
The baseline for comparison is the raw BLE data for a 60 s window: 120 telemetry packets (average 29 bytes) and 60 location packets (average 28 bytes), generating a total of 5160 bytes/minute of uncompressed information.
Stream 1 transforms 180 individual measurements into a single JSON object containing statistical descriptors (Supplementary Materials/Figure S4) with a compression ratio of 38.8% (61.2% reduction). Critically, this compression is lossy with respect to temporal resolution—individual 1 s measurements are irreversibly aggregated into minute-level descriptors.
Stream 2 implements a two-stage compression strategy: differential encoding reduces size from 5160 bytes to an average of 3345 bytes, after which gzip achieves 70–75% additional compression (Supplementary Materials/Figure S5). The final size of 1036 bytes represents a compression ratio of 20.1% (79.9% reduction). This compression is lossless—original 1 s sequence is fully recoverable.
Stream 3 does not implement compression but enriches the BLE packet (29 bytes) with rich context (~600 bytes). The increase is justified by the low frequency (typically under 10 events/day) and the criticality of detailed information for clinical interpretation (Supplementary Materials/Figure S6).
Total network traffic reduction is 41.2% (Streams 1 + 2 combined: 3036 bytes vs. 5160 bytes baseline).
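A minimal sketch of the Stream 2 two-stage compression (differential encoding followed by gzip). The JSON serialization of the deltas is an illustrative choice, not necessarily the on-wire format; the key property is that the transform is lossless.

```python
import gzip
import json

def compress_stream2(samples):
    """Stream 2 sketch: differential encoding, then gzip (lossless)."""
    deltas = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
    return gzip.compress(json.dumps(deltas).encode())

def decompress_stream2(blob):
    """Inverse transform: gunzip, then cumulative sum of the deltas."""
    deltas = json.loads(gzip.decompress(blob))
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out

# Hypothetical 60 s of slowly varying integer readings (e.g., RSSI):
seq = [-60, -60, -61, -61, -60] * 12
blob = compress_stream2(seq)
restored = decompress_stream2(blob)
```

Differential encoding makes slowly varying sensor sequences highly repetitive, which is what lets gzip achieve the additional 70–75% reduction reported above.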

5.3.4. Deduplication Performance

Deduplication algorithms were tested in Apartment 1 (95 m²), where two peripheral gateways were positioned with approximately 20% overlap in their BLE coverage areas. Each gateway independently receives BLE packets from the wearable device and generates three streams, leading to duplication when the participant moves within an overlap zone.
Aggregate deduplication (Stream 1): The quality-based selection algorithm, using the average RSSI value, achieves 98.5% accuracy in selecting the higher-quality aggregate. False positive errors (a valid aggregate incorrectly marked as a duplicate) are observed in 1.5% of cases, typically when two gateways report very close RSSI values (difference below 3 dBm). False negative errors (a duplicate not recognized) were not observed.
Merging event logs (Stream 2): When the participant moves in the overlap zone, gateway A receives an average of 96.7% of packets and gateway B an average of 97.2%, with uncorrelated losses. After merging the two logs on demand, completeness increases to 99.9%, a 3.2 percentage point improvement. The additional latency of a 60 s window request is 15 ms.
Alarm deduplication (Stream 3): The exactly-once delivery mechanism was tested by simulating critical events (pressing the SOS button) with the participant positioned in the overlap zone. Both gateways detect the event and generate alert packets 90 ms apart. A Redis atomic check-and-set operation ensures that only the first alert to arrive is processed, with the second discarded as a duplicate. Accuracy is 100%.
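The exactly-once semantics can be sketched as follows. In the deployed system, a Redis atomic check-and-set (SET with the NX option) provides this guarantee across services; here a lock-protected in-memory store stands in for Redis, and the event-key scheme is hypothetical.

```python
import threading

class AlarmDeduplicator:
    """First-writer-wins deduplication for critical alerts.

    In the deployed system this role is played by a Redis atomic
    check-and-set (SET key value NX EX ttl); a lock-protected dict
    stands in for the shared store in this sketch."""

    def __init__(self):
        self._claimed = {}
        self._lock = threading.Lock()

    def try_claim(self, event_key, gateway_id):
        """True only for the first gateway to report this event."""
        with self._lock:
            if event_key in self._claimed:
                return False
            self._claimed[event_key] = gateway_id
            return True

# Both gateways report the same SOS press ~90 ms apart
# (the event-key scheme below is hypothetical):
dedup = AlarmDeduplicator()
first = dedup.try_claim("badge7:sos:1740824100", "gateway-A")
second = dedup.try_claim("badge7:sos:1740824100", "gateway-B")
```

The first claim succeeds and triggers notification; the second returns False and is discarded, matching the 100% deduplication accuracy reported above.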

5.3.5. End-to-End Latency

The total end-to-end latency from the sensor to cloud notification was analyzed for two scenarios: non-critical aggregated data and critical alarm events.
Non-critical data (Stream 1 and Stream 2): Latency consists of the following: BLE scan cycle (96 ± 12 ms), buffering and feature extraction (60,000 ms fixed), MQTT publishing (15–25 ms), network propagation (50–110 ms depending on internet connectivity), MQTT broker processing (5–10 ms), and cloud ingestion service (10–20 ms). Overall latency is dominated by 60 s buffering, which is an architectural decision rather than a limitation.
Critical data (Stream 3): Latency for critical events bypasses buffering and consists of local ML detection at the edge computing gateway (60–90 ms), immediate MQTT publication with QoS 2 (15–25 ms), network propagation (50–110 ms), MQTT broker (5–10 ms), and a cloud alarm service (10–20 ms). Measured latency from detection at the edge computing gateway to notification delivery is under 7 s in 99.5% of cases (95th percentile: 6.8 s), with an average latency of 4.2 s. This ensures a timely response in emergency situations, such as falls or SOS button pressing.
Edge ML inference: Local anomaly detection (falls) using TensorFlow Lite Micro model on ESP32-S3 demonstrates inference latency below 50 ms, enabling real-time classification without cloud dependency.

5.3.6. Federated Learning: Simulated Proof-of-Concept and Limitations

The objective of this section is not to claim real-world federated deployment, but to investigate whether the proposed architectural workflow remains stable under realistic heterogeneity assumptions.
Experimental Design and Data Partitioning
The federated learning evaluation addresses two fundamental questions: (1) Does federated training achieve comparable accuracy to centralized training? (2) How does data heterogeneity across institutions affect convergence and performance?
The training data consists of accelerometer signals (10 Hz and three-axis) from 10 healthy volunteers performing supervised fall scenarios (forward, backward, side, and controlled descent) and normal daily activities over 7 days. Continuous streams were segmented into 3 s windows (30 measurements). To expand the training dataset while preserving the individual dynamics of the fall, time-series augmentation techniques were used via the tsaug library. From a total of 2000 real fall segments, additional segments were generated, doubling the training set through carefully controlled transformations, including light noise, minimal scaling, baseline drift, and limited time shift. The goal of this approach was to increase data diversity without disrupting the individual fall style, movement speed, and characteristic impact shape for each subject. This preserves the physiological and kinematic behavior of the participants, which is critical for training the LSTM model to recognize falls with high sensitivity and specificity, while reducing the risk of overfitting to specific examples.
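The augmentation strategy can be sketched with plain NumPy operations mirroring the tsaug transformations named above (light noise, minimal scaling, baseline drift, limited time shift). The magnitudes and window contents below are illustrative, not the actual augmentation parameters.

```python
import numpy as np

def augment_window(window, rng):
    """One augmented copy of a 3 s accelerometer window (shape (30, 3)):
    minimal scaling, light noise, baseline drift, limited time shift.
    The spike shape and timing of the original fall are preserved."""
    n = len(window)
    out = window * rng.uniform(0.95, 1.05)                 # minimal scaling
    out = out + rng.normal(0.0, 0.01, size=out.shape)      # light noise
    drift = np.linspace(0.0, rng.uniform(-0.05, 0.05), n)  # baseline drift
    out = out + drift[:, None]
    return np.roll(out, rng.integers(-2, 3), axis=0)       # time shift

rng = np.random.default_rng(0)
window = np.zeros((30, 3))
window[15] = [0.0, 0.0, 3.0]  # synthetic impact spike at mid-window
augmented = augment_window(window, rng)
```

Because all transformations are small and local, the characteristic impact shape survives augmentation, which is the stated requirement for preserving individual fall dynamics.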
We evaluate three training configurations:
  • Centralized (baseline). All data from ten subjects is pooled and trained centrally, representing optimal performance with full data access. Data split subject-wise into 70% training, 15% validation, and 15% testing to prevent subject leakage and ensure realistic generalization.
  • Federated IID. To establish a controlled baseline scenario, the ten subjects were randomly distributed across three institutions while preserving age-group balance (over 55 years: four subjects; 35–55 years: three subjects; under 35 years: three subjects), resulting in institutional proportions of four, three, and three subjects. Each institution received a demographically mixed sample, ensuring representativeness of different age profiles and associated movement patterns during falls and daily activities. This allocation reduced statistical heterogeneity between institutions, approximating an IID setting.
  • Federated Non-IID. This setup introduces both demographic and label-skew heterogeneity to simulate real-world complexity. The three institutions differ in age distribution and fall prevalence: institution 1 (>55 years) represents a high-risk elderly cohort (30% falls), institution 2 (35–55 years) includes moderate-risk adults (25% falls), and institution 3 (<35 years) comprises low-risk young adults (20% falls). This age-stratified non-IID structure reflects real-life scenarios in which institution 1 may be a geriatric clinic (high fall rate), institution 2 a general hospital (moderate fall rate), and institution 3 a community health center (low frequency). This configuration introduces both biomechanical differences (age-related movement patterns) and statistical differences (varying fall rates), representing the most challenging scenario for federated learning.
All federated configurations employed an identical LSTM architecture, consisting of two layers with 64 and 32 units, a dropout rate of 0.3, and a sigmoid output layer. Core hyperparameters were kept constant across institutions: an Adam optimizer with a learning rate of 0.001, binary cross-entropy loss, and a batch size of 32. Federated averaging (FedAvg) aggregated local models weighted by the number of samples per institution (Equation 2). Convergence was defined as a validation loss change of less than 2% over two consecutive rounds.
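The FedAvg aggregation step, weighting each institution's parameters by its sample count, can be sketched as follows. The layer shapes and sample counts are illustrative stand-ins for the LSTM weights described above.

```python
import numpy as np

def fedavg(local_weights, n_samples):
    """FedAvg: per-layer average of institutional weights, each
    institution weighted by its number of training samples."""
    total = sum(n_samples)
    n_layers = len(local_weights[0])
    return [sum(w[i] * (n / total) for w, n in zip(local_weights, n_samples))
            for i in range(n_layers)]

# Three simulated institutions (sample counts are illustrative):
w1 = [np.full((2, 2), 1.0), np.full(2, 1.0)]
w2 = [np.full((2, 2), 2.0), np.full(2, 2.0)]
w3 = [np.full((2, 2), 3.0), np.full(2, 3.0)]
global_w = fedavg([w1, w2, w3], n_samples=[400, 300, 300])
# Weighted mean: (400*1 + 300*2 + 300*3) / 1000 = 1.9 per parameter
```

Each federated round consists of local training at every institution followed by one such aggregation at the orchestrator.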
Comparative Results
Table 8 presents performance comparisons across all training configurations. All metrics were computed on held-out test sets never seen during training.
The results in Table 8 demonstrate the fundamental viability of federated learning as an alternative to the centralized approach, while also quantifying the impact of statistical heterogeneity on model convergence and performance. The reported results should be interpreted as an evaluation of algorithmic and architectural plausibility, rather than as evidence of deployment readiness. In each federated round, approximately 200 KB of model weights are transmitted per institution, resulting in ~8.2 MB of total communication for convergence in the most challenging combined non-IID scenario (14 rounds × 3 institutions × 200 KB). Data heterogeneity increases the number of rounds required for convergence, from 5 rounds in the IID scenario to 14 rounds in the combined non-IID scenario. This indicates that statistical and demographic differences among institutions slow convergence, highlighting the trade-off between realism and communication efficiency in federated learning experiments.
The centralized model achieves 90.8% accuracy and serves as an upper bound on performance with full data access. With federated IID training, where subjects are demographically balanced across the three institutions, we observe a decline of 6.5 percentage points to 84.3% accuracy. This deficit stems from data fragmentation and aggregation noise introduced by the process of averaging local models. However, convergence is achieved in just five rounds with minimal communication cost, validating the effectiveness of the FedAvg algorithm on homogeneous data.
The introduction of combined heterogeneity in the non-IID scenario leads to further degradation to 79.8% accuracy and requires 14 rounds for convergence. This result quantitatively confirms the hypothesis that age stratification and class imbalance create conflicting gradient updates that slow down global optimization. The increasing standard deviation to ±2.1% indicates increased stochasticity in training under strong statistical heterogeneity. Critically, however, the federated model outperforms the best local institutional model by 2.7 percentage points, experimentally proving the fundamental advantage of cooperative learning over isolated learning, even under unfavorable data distribution conditions.
The experiments successfully confirm the main thesis that decentralized learning preserves the basic functionality of the model while ensuring the protection of personal data.
Critical Limitations and Production Requirements
This evaluation validates architectural feasibility and algorithmic robustness under data heterogeneity in a controlled simulation environment. However, this is not a production-ready deployment. Substantial gaps remain before real multi-institutional implementation:
  • Simulation constraints. All experiments were conducted on a laptop, Dell Precision 3581 (Intel Core i7-13700H, 14 cores; 32 GB DDR5; 512 GB SSD), where three institutions were simulated through Python processes. Communication was performed over localhost connections rather than through production-grade network infrastructure. A real-world deployment would require institutional firewall traversal and NAT configuration, the establishment of VPN tunnels or trusted cloud intermediaries, comprehensive hospital IT security audits and penetration testing, as well as bandwidth management under real network constraints instead of localhost communication.
  • Security and privacy gaps. No Byzantine fault tolerance was implemented. A production-grade federated learning system would require robust aggregation algorithms such as Krum and Trimmed Mean to detect and exclude malicious model updates, differential privacy mechanisms that provide formal (ε, δ)-differential privacy guarantees beyond simple parameter aggregation, secure multi-party computation to enable aggregation without revealing individual updates to the orchestrator, and defenses against model inversion attacks that could otherwise allow reconstruction of training data from model parameters. These security mechanisms introduce additional computational overheads and may further reduce model accuracy, effects that have not been quantified here.
  • Regulatory and legal requirements. A real-world deployment would require establishing a comprehensive legal framework with data processing agreements between participating institutions, a process that typically involves 6–12 months of negotiation. Each site needs IRB approval supported by standardized informed consent procedures. In addition, a lawful basis under GDPR Article 6 would have to be defined, most likely “public interest” or “legitimate interest,” together with a completed Data Protection Impact Assessment (DPIA) for every participating institution. The process would also require the appointment of Data Protection Officers, implementation of formal audit procedures, and the adoption of Standard Contractual Clauses for any cross-border data transfers. Establishing real three-hospital federated deployment requires partnership agreements, IT security audits, firewall configurations, and a 12–18-month timeline with an estimated cost of €200,000–350,000.
The contribution of this work is demonstrating that federated averaging can converge on continuous behavioral time-series sensor data under realistic data heterogeneity, with quantified accuracy–privacy–communication trade-offs. This addresses the fundamental algorithmic and workflow integration question, while acknowledging that production deployment requires additional infrastructure, security, and regulatory validation beyond simulation scope.
Production deployment across actual hospital networks represents a distinct engineering, regulatory, and organizational challenge beyond the scope of current technical validation. The results establish sufficient promise to justify pursuing real multi-institutional partnerships as the next research phase, but do not claim readiness for immediate clinical deployment.

5.3.7. Real-Time Visualization of Data from Wearable Devices

To validate the communication channels between wearable devices and the MQTT message broker, and to evaluate the functional capabilities of the edge computing gateway, a real-time data visualization service was developed. The service provides continuous monitoring of key sensor biomarkers extracted from location and telemetry packets generated by wearable devices.
Figure 4 shows the results of a real-time visualization service for key behavioral and physiological indicators.
Figure 4 shows that in the absence of data from an infrared localization beacon, the edge computing gateway transmits a zero value for the room_id parameter. In the example shown, the patient’s transition from the living room (room_id = 9) to the kitchen (room_id = 1) is recorded, passing through a corridor where the IR beacon was deliberately turned off. During this interval, the system detects room_id = 0, interpreted as a lack of localization coverage. The ingestion software analyzes such cases of interrupted IR coverage and applies logic for contextual interpretation of spatial data, in most cases replacing missing room_id values with the last valid measured value.
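The last-valid-value substitution applied by the ingestion software can be sketched as a simple forward fill. This is a simplification: the production logic applies additional contextual rules beyond carrying the last valid room forward.

```python
def fill_room_ids(stream, initial=None):
    """Replace missing readings (room_id == 0) with the last valid room."""
    filled, last = [], initial
    for rid in stream:
        if rid != 0:
            last = rid
        filled.append(last if last is not None else 0)
    return filled

# Living room (9) -> corridor without beacon coverage (0) -> kitchen (1):
raw = [9, 9, 0, 0, 0, 1, 1]
filled = fill_room_ids(raw)  # → [9, 9, 9, 9, 9, 1, 1]
```

In the Figure 4 scenario, the zero readings recorded in the beacon-free corridor would thus be attributed to the living room until the kitchen beacon is detected.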

6. Discussion

6.1. Key Technical Contributions

Experimental validation demonstrates that the proposed architecture successfully achieves all six project objectives in a real home environment, establishing a technical foundation for future clinical applications.
Infrared localization at the room level achieves 97.6% experimental accuracy, exceeding the design goal of 95%. This represents a significant improvement over the typical accuracy of ~85% for BLE-only approaches [24,27]. This precision is critical for detecting spatial disorientation, an early and specific sign of dementia.
The three-channel data architecture achieves a 41.2% overall reduction in network traffic while maintaining full 1 s temporal resolution for model training and minute-level resolution for real-time visualization. This hybrid approach balances competing requirements: Stream 1 provides aggregated descriptors for clinical dashboards with latency below 50 ms; Stream 2 stores losslessly compressed sequences for ML; and Stream 3 delivers critical alarms with end-to-end latency below 7 s for 99.5% of events. This design avoids the completeness-versus-efficiency trade-off characteristic of single-channel architectures.
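A minimal sketch of how one 60 s window can be split into the three channels (illustrative only; the field names, the aggregate set, and the 2.5 g fall threshold are assumptions, not the system’s actual schema):

```python
import gzip
import json
import statistics


def build_streams(window: list[dict]) -> dict[str, bytes]:
    """Split one 60 s window of 1 Hz samples into the three channels."""
    accel = [s["accel_mag"] for s in window]

    # Stream 1: minute-level statistical descriptors for dashboards
    stream1 = json.dumps({
        "mean": round(statistics.mean(accel), 3),
        "max": max(accel),
        "room_id": window[-1]["room_id"],
    }).encode()

    # Stream 2: lossless gzip of the full 1 s sequence, preserved for ML training
    stream2 = gzip.compress(json.dumps(window).encode())

    # Stream 3: critical alerts, emitted only when a threshold is crossed
    alerts = [s for s in window if s["accel_mag"] > 2.5]  # assumed fall threshold
    stream3 = json.dumps({"alerts": alerts}).encode() if alerts else b""

    return {"stream1": stream1, "stream2": stream2, "stream3": stream3}


window = [{"ts": t, "accel_mag": 1.0, "room_id": 9} for t in range(60)]
out = build_streams(window)
```

The key property is that Stream 2 is a lossless round-trip of the raw window, so downsampling for the dashboard (Stream 1) never sacrifices the 1 s resolution needed for model training.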
Edge computing with local ML demonstrates inference latency below 50 ms for fall detection, eliminating cloud dependency for critical events. This represents a qualitative change from systems sending all raw data for cloud processing, where aggregation cycles lead to 30–60 s delays.
Deduplication across multiple gateways achieves 98.5% accuracy for aggregates, 99.9% completeness after log merging, and 100% reliability for critical alarms, ensuring continuous coverage in large homes without unnecessary network traffic or storage.
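The quality-based deduplication rule can be illustrated as follows (a simplified sketch; the deployed system additionally merges gateway logs after the fact, and the packet fields shown are assumptions):

```python
def deduplicate(packets: list[dict]) -> list[dict]:
    """Keep one copy per (device_id, seq); when several gateways relay the
    same packet, prefer the copy received with the strongest RSSI."""
    best: dict[tuple, dict] = {}
    for p in packets:
        key = (p["device_id"], p["seq"])
        if key not in best or p["rssi"] > best[key]["rssi"]:
            best[key] = p
    return sorted(best.values(), key=lambda p: p["seq"])


packets = [
    {"device_id": "w1", "seq": 1, "rssi": -71, "gw": "gw-kitchen"},
    {"device_id": "w1", "seq": 1, "rssi": -64, "gw": "gw-living"},  # duplicate, stronger
    {"device_id": "w1", "seq": 2, "rssi": -80, "gw": "gw-kitchen"},
]
kept = deduplicate(packets)
```

Selecting by received signal strength means the surviving copy is the one observed by the gateway closest to the wearer, which also yields the most reliable telemetry.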
Federated learning, validated through controlled simulation, demonstrates algorithmic convergence and workflow feasibility under realistic data heterogeneity conditions. The 2.7 percentage point accuracy improvement (79.8% federated vs. 77.1% best isolated institutional model) provides quantitative evidence that collaborative training benefits all institutions despite severe non-IID data distributions, establishing a foundation for future production deployment.
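The aggregation step underlying these results is federated averaging (FedAvg), sketched below in plain Python for clarity (a didactic illustration; the simulation itself uses the Flower framework [41], and the institution sizes here are invented):

```python
def fed_avg(client_weights: list[list[float]], client_sizes: list[int]) -> list[float]:
    """FedAvg: average client model parameters, weighted by each
    institution's local dataset size. Only weights travel, never raw data."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n / total for w, n in zip(client_weights, client_sizes))
        for i in range(n_params)
    ]


# three simulated institutions with flattened model parameters
inst_a, inst_b, inst_c = [1.0, 1.0], [2.0, 2.0], [4.0, 4.0]
global_model = fed_avg([inst_a, inst_b, inst_c], client_sizes=[100, 100, 200])
# weighted mean per parameter: 1.0*0.25 + 2.0*0.25 + 4.0*0.50 = 2.75
```

The size weighting is what lets larger cohorts contribute proportionally more under non-IID partitions, while each institution's raw behavioral data never leaves its premises.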
Economic viability (approximately €490 for hardware and installation and approximately €55 monthly for cloud infrastructure) represents a 5–10-fold reduction compared to existing research systems, making large-scale deployment economically feasible for healthcare systems with limited budgets.

On the Necessity of System-Level Validation

The breadth of technical components validated in this work may appear excessive for a single publication. However, we argue that this is a methodological necessity. The core scientific claim—that privacy-preserving continuous monitoring is architecturally feasible for large-scale deployment—cannot be validated by demonstrating individual components in isolation.
For example, achieving 97.6% localization accuracy (RQ1) would be meaningless if the data architecture could not deliver this information with a low enough latency for clinical utility (RQ2), or if regulatory constraints prevented multi-institutional model training (RQ3). Similarly, demonstrating a 41.2% bandwidth reduction is irrelevant if the compression destroys the temporal information needed for machine learning, or if the economic model remains prohibitive at scale. The contribution is not the individual technical achievements but the demonstration that they can coexist in a coherent, economically viable system.
This distinguishes our work from prior component-level studies that validate isolated technologies without addressing system integration challenges, and positions it as a necessary bridge toward clinical translation. The holistic validation approach directly addresses the research challenge articulated in Section 1.2: proving that the fundamental trade-offs faced by existing systems (precision vs. privacy, resolution vs. bandwidth, and accuracy vs. cost) can be resolved simultaneously through intelligent architectural design.

6.2. Positioning Relative to the Current State

The proposed architecture differs from existing systems through three critical innovations that address fundamental limitations in the field.
Spatial precision. ORCATECH [37,38] uses passive IR sensors with zone-level accuracy (~3–5 m). SENDA [8] does not apply spatial tracking, relying instead on episodic motor and cognitive assessments over 8 months. Kim et al. [40] achieved an AUC of 0.99 for predicting dementia, though likewise with zone-level PIR sensors. The current system provides 97.6% room-level accuracy through IR beacons, enabling the detection of specific disorientation patterns.
Continuous passive monitoring. SENDA requires active patient participation (periodic testing with a tablet and wearing multiple devices), leading to significant compliance issues, especially in the presence of cognitive impairment. After initial installation, the current system operates fully passively, monitoring 24/7 without interruption; the participant wears a single device that requires neither charging (battery life exceeds three years) nor interaction. This eliminates the compliance constraints critical for the target population.
Protection of personal data across multiple institutions. None of the systems under consideration apply federated learning. ORCATECH and SENDA function as single-center or limited multi-center studies with centralized data storage, creating GDPR barriers for international research. The current architecture demonstrates the technical feasibility of FL for home dementia monitoring (in a simulated environment), overcoming a fundamental limitation: joint model development on diverse populations without violating regulatory requirements.

6.3. Limitations and Guidelines for Future Development

6.3.1. Current Technical Limitations

Our experiment with three participants, over 7 days, and in two apartments is a technological assessment, not a clinical study. Generalizability is limited by the small sample size, short duration, and homogeneous demographics (all cognitively healthy; similar housing infrastructure). Extended testing (over 3 months), diverse environments (over 10 homes), and larger cohorts are needed to assess long-term reliability, seasonal variations, and edge cases.
The system does not measure heart rate, blood pressure, oxygen saturation, or heart rate variability—indicators associated with vascular dementia and cardiovascular health. A minimalist sensor set was chosen to balance simplicity, long battery life, and continuous wearability. Future integration of a PPG sensor would add significant information with minimal increase in energy consumption.
The system does not track patient behavior outside the home. Frequency of outings, duration of absence, and visits to social spaces are established indicators of cognitive health. Monitoring the entrance area can detect departures from the home but cannot assess social interactions.
The federated learning evaluation demonstrates architectural feasibility under controlled simulation conditions. Critically, collaborative training achieves 79.8% accuracy—2.7 percentage points higher than the best single-institution isolated model (77.1%)—demonstrating a statistically significant benefit despite data isolation constraints. However, this validation uses artificially partitioned data from a single geographic region, unencrypted localhost communication, and lacks Byzantine fault tolerance, differential privacy mechanisms, and regulatory compliance frameworks required for real multi-institutional deployment. Section 5.3.6 details validation scope, quantifies identified limitations, and establishes requirements for transitioning from simulation to production deployment across actual hospital networks.

6.3.2. Guidelines for Clinical Implementation

Transforming the architectural prototype into a clinically validated system requires a multi-stage approach with clear intermediate goals:
  • Phase 1. Extended technical validation (6–12 months)—testing with 50–100 participants in real, diverse home environments for a minimum of three months of continuous operation, covering different home types and demographic profiles (ages 65 to over 85, different education levels and technological literacy levels). Includes systematic failure analysis with targeted provocation of edge cases (battery depletion, connectivity loss, and extreme usage patterns) and assessment of long-term hardware component reliability.
  • Phase 2. Clinical pilot study (12–18 months)—ethics committee-approved protocol with ~150 participants in three groups: 50 individuals with MCI, 50 with mild dementia, and 50 healthy controls. Includes baseline and quarterly neuropsychological assessments (MMSE, MoCA, and CERAD-Plus) and reference clinical diagnosis. Effectiveness assessment covers diagnostic indicators (sensitivity, specificity, positive and negative predictive value) for MCI classification versus healthy controls and analysis of predictive ability for MCI to dementia within 12 months.
  • Phase 3. Multi-center federated deployment (18–24 months)—real federated training between ≥5 institutions, each with a cohort of 50–100 patients. Objectives are technical integration in heterogeneous institutional IT environments, demonstration that the federated learning model achieves equal or higher accuracy versus single-institution models, official confidentiality audit confirming GDPR compliance, and proof of operational stability with continuous operation over six months without critical incidents.
  • Phase 4. Prioritized technical improvements—integration of a PPG sensor for cardiovascular monitoring, an optional GPS module for outdoor tracking (periodic mode for energy efficiency), and implementation of Byzantine fault-tolerant aggregation methods (Krum and Trimmed Mean) together with differential privacy mechanisms providing formal guarantees for data protection.

6.3.3. Open Research Questions

Optimal temporal resolution for different behavioral biomarkers is still not clearly defined: one-second measurement frequency may be excessive for analyzing circadian rhythms but insufficient for capturing fine motor abnormalities. Similarly, the minimum baseline period duration required to build reliable personalized models remains an open question—a 30-day period is a reasonable working hypothesis but may prove insufficient in rapidly progressing cases or excessive in clinically stable populations. Generalizability of developed models across different cultural and social contexts, for example between Mediterranean and Scandinavian lifestyles, requires multinational validation with heterogeneous cohorts. Finally, ethical aspects of continuous monitoring, especially in patients with advanced dementia unable to give informed consent, necessitate careful development of adequate ethical and regulatory frameworks.

6.4. Broader Impact and Applicability

The proposed architecture is disease-independent and can be directly applied to other neurodegenerative and chronic conditions requiring long-term monitoring. In Parkinson’s disease, accelerometer data enables the detection of tremors and bradykinesia, while spatial patterns reveal freezing of gait episodes. In multiple sclerosis, the system can track the progression of motor deficits and fatigue patterns. In post-stroke rehabilitation, continuous monitoring allows the assessment of the quantity and quality of physical therapy protocol compliance. This generalizability significantly increases potential impact beyond dementia.
Ethical considerations require careful balance between clinical benefit and the protection of patients’ personal rights. Continuous monitoring raises important questions related to autonomy, personal dignity, and risk of excessive or invasive surveillance. Therefore, the system should be implemented within clearly defined data management policies, specifying who has data access, how it is used in clinical decision-making, and what procedures exist for data deletion upon request. Although technological approaches such as differential privacy and federated learning address key aspects of data protection, they are insufficient alone. Adequate ethical frameworks for informed consent need development, especially in the context of progressive cognitive decline, requiring close multidisciplinary collaboration between clinicians, ethicists, and patient advocacy organizations.

7. Conclusions

This work demonstrates the architectural and technical feasibility, as well as the economic viability for small-scale deployment scenarios, of continuous, privacy-preserving home monitoring for behavioral pattern analysis through an integrated architecture combining wearable sensors, infrared localization, edge computing, and federated learning.
Experimental validation in real residential environments confirms that the proposed compression strategy can balance low latency and high information completeness without significant degradation, while quality-based deduplication enables reliable coverage in multi-gateway deployments. The economic analysis indicates a cost reduction compared to existing research platforms, supporting the potential for large-scale deployment from an infrastructure perspective. The federated learning component, while validated in a simulated environment, demonstrates the technical feasibility of privacy-preserving multi-institutional collaboration under GDPR constraints.
A key distinction between the present work and established platforms such as ORCATECH and SENDA is that the proposed system is complementary rather than competitive. While these platforms have demonstrated clinical value through long-term longitudinal studies, the present study does not claim clinical validation. The reported experiments constitute a technological and architectural assessment, not a clinical trial. The contribution of this work lies in providing a next-generation technical foundation that requires rigorous clinical evaluation to assess diagnostic and prognostic effectiveness.
If future clinical validation confirms effectiveness, the proposed architecture has the potential to support a shift from episodic clinical assessments toward continuous, home-based monitoring. The results presented here indicate that, within the evaluated scenarios, technical and economic constraints appear manageable; remaining challenges relate to clinical validation through longitudinal studies incorporating established cognitive assessments and reference neuropsychological diagnoses.
Future work will focus on clinical pilot studies and multi-institutional validation to assess real-world diagnostic value and inform subsequent regulatory pathways.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/computers15030144/s1, Figure S1: Access to the home Wi-Fi network; Figure S2: Mobile app (localizer); Figure S3: BLE packets sent by wearable devices; Figure S4: Example data from Stream 1; Figure S5: Example data from Stream 2; Figure S6: Example data from Stream 3.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the author.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. World Health Organization. Dementia. 2023. Available online: https://www.who.int/news-room/fact-sheets/detail/dementia (accessed on 5 December 2025).
  2. McKhann, G.M.; Knopman, D.S.; Chertkow, H.; Hyman, B.T.; Jack, C.R.; Kawas, C.H.; Klunk, W.E.; Koroshetz, W.J.; Manly, J.J.; Mayeux, R.; et al. The Diagnosis of Dementia Due to Alzheimer’s Disease: Recommendations from the National Institute on Aging-Alzheimer’s Association Workgroups on Diagnostic Guidelines for Alzheimer’s Disease. Alzheimer’s Dement. 2011, 7, 263–269.
  3. Takahashi, T.; Nonaka, T.; Ohtani, R.; Hasegawa, M.; Hori, Y.; Tomita, T.; Kurita, R. Hindering Tau Fibrillization by Disrupting Transient Precursor Clusters. Neurosci. Res. 2025, 220, 104968.
  4. Nasreddine, Z.S.; Phillips, N.A.; Bédirian, V.; Charbonneau, S.; Whitehead, V.; Collin, I.; Cummings, J.L.; Chertkow, H. The Montreal Cognitive Assessment, MoCA: A Brief Screening Tool for Mild Cognitive Impairment. J. Am. Geriatr. Soc. 2005, 53, 695–699.
  5. Jack, C.R.; Bennett, D.A.; Blennow, K.; Carrillo, M.C.; Dunn, B.; Haeberlein, S.B.; Holtzman, D.M.; Jagust, W.; Jessen, F.; Karlawish, J.; et al. NIA-AA Research Framework: Toward a Biological Definition of Alzheimer’s Disease. Alzheimer’s Dement. 2018, 14, 535–562.
  6. Dubois, B.; Hampel, H.; Feldman, H.H.; Scheltens, P.; Aisen, P.; Andrieu, S.; Bakardjian, H.; Benali, H.; Bertram, L.; Blennow, K.; et al. Preclinical Alzheimer’s Disease: Definition, Natural History, and Diagnostic Criteria. Alzheimer’s Dement. 2016, 12, 292–323.
  7. Kaye, J.; Mattek, N.; Dodge, H.H.; Campbell, I.; Hayes, T.; Austin, D.; Hatt, W.; Wild, K.; Jimison, H.; Pavel, M. Unobtrusive Measurement of Daily Computer Use to Detect Mild Cognitive Impairment. Alzheimer’s Dement. 2014, 10, 10–17.
  8. Müller, K.; Fröhlich, S.; Germano, A.M.C.; Kondragunta, J.; Agoitia Hurtado, M.F.D.C.; Rudisch, J.; Schmidt, D.; Hirtz, G.; Stollmann, P.; Voelcker-Rehage, C. Sensor-Based Systems for Early Detection of Dementia (SENDA): A Study Protocol for a Prospective Cohort Sequential Study. BMC Neurol. 2020, 20, 84.
  9. Jonell, P.; Moëll, B.; Håkansson, K.; Henter, G.E.; Kuchurenko, T.; Mikheeva, O.; Hagman, G.; Holleman, J.; Kivipelto, M.; Kjellström, H.; et al. Multimodal Capture of Patient Behaviour for Improved Detection of Early Dementia: Clinical Feasibility and Preliminary Results. Front. Comput. Sci. 2021, 3, 642633.
  10. Yurdem, B.; Kuzlu, M.; Gullu, M.K.; Catak, F.O.; Tabassum, M. Federated Learning: Overview, Strategies, Applications, Tools and Future Directions. Heliyon 2024, 10, e38137.
  11. Li, H.; Li, C.; Wang, J.; Yang, A.; Ma, Z.; Zhang, Z.; Hua, D. Review on Security of Federated Learning and Its Application in Healthcare. Future Gener. Comput. Syst. 2023, 144, 271–290.
  12. Hasan, M.M. Federated Learning Models for Privacy-Preserving AI in Enterprise Decision Systems. Int. J. Bus. Econ. Insights 2025, 5, 238–269.
  13. Addae, S.; Kim, J.; Smith, A.; Rajana, P.; Kang, M. Smart Solutions for Detecting, Predicting, Monitoring, and Managing Dementia in the Elderly: A Survey. IEEE Access 2024, 12, 100026–100056.
  14. Thaliath, A.; Pillai, J.A. Non-Cognitive Symptoms in Alzheimer’s Disease and Their Likely Impact on Patient Outcomes. A Scoping Review. Curr. Treat. Options Neurol. 2025, 27, 41.
  15. Ghayvat, H.; Gope, P. Smart Aging Monitoring and Early Dementia Recognition (SAMEDR): Uncovering the Hidden Wellness Parameter for Preventive Well-Being Monitoring to Categorize Cognitive Impairment and Dementia in Community-Dwelling Elderly Subjects through AI. Neural Comput. Appl. 2023, 35, 23739–23751.
  16. Deters, J.K.; Janus, S.; Silva, J.A.L.; Wörtche, H.J.; Zuidema, S.U. Sensor-Based Agitation Prediction in Institutionalized People with Dementia: A Systematic Review. Pervasive Mob. Comput. 2024, 98, 101876.
  17. Anikwe, C.V.; Friday Nweke, H.; Chukwu Ikegwu, A.; Adolphus Egwuonwu, C.; Uchenna Onu, F.; Rita Alo, U.; Wah Teh, Y. Mobile and Wearable Sensors for Data-Driven Health Monitoring System: State-of-the-Art and Future Prospect. Expert Syst. Appl. 2022, 202, 117362.
  18. Gabrielli, D.; Prenkaj, B.; Velardi, P.; Faralli, S. AI on the Pulse: Real-Time Health Anomaly Detection with Wearable and Ambient Intelligence. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management (CIKM 2025), Seoul, Republic of Korea, 10–14 November 2025; pp. 4717–4721.
  19. Assaad, R.H.; Mohammadi, M.; Poudel, O. Developing an Intelligent IoT-Enabled Wearable Multimodal Biosensing Device and Cloud-Based Digital Dashboard for Real-Time and Comprehensive Health, Physiological, Emotional, and Cognitive Monitoring Using Multi-Sensor Fusion Technologies. Sens. Actuators A Phys. 2025, 381, 116074.
  20. Teoh, J.R.; Dong, J.; Zuo, X.; Lai, K.W.; Hasikin, K.; Wu, X. Advancing Healthcare through Multimodal Data Fusion: A Comprehensive Review of Techniques and Applications. PeerJ Comput. Sci. 2024, 10, e2298.
  21. Johnson, B.B. Noninvasive Patient Monitoring with Ambient Sensors to Monitor Physical and Cognitive Health for Individuals Living with Alzheimer’s Disease. In Proceedings of the 2024 Design of Medical Devices Conference, DMD 2024, Minneapolis, MN, USA, 8–10 April 2024.
  22. Bijlani, N.; Maldonado, O.M.; Nilforooshan, R.; Barnaghi, P.; Kouchaki, S. Utilizing Graph Neural Networks for Adverse Health Detection and Personalized Decision Making in Sensor-Based Remote Monitoring for Dementia Care. Comput. Biol. Med. 2024, 183, 109287.
  23. Obeidat, H.; Shuaieb, W.; Obeidat, O.; Abd-Alhameed, R. A Review of Indoor Localization Techniques and Wireless Technologies. Wirel. Pers. Commun. 2021, 119, 289–327.
  24. Leitch, S.G.; Ahmed, Q.Z.; Abbas, W.B.; Hafeez, M.; Lazaridis, P.I.; Sureephong, P.; Alade, T. On Indoor Localization Using WiFi, BLE, UWB, and IMU Technologies. Sensors 2023, 23, 8598.
  25. Casha, O. A Comparative Analysis and Review of Indoor Positioning Systems and Technologies. In Innovations in Indoor Positioning Systems (IPS); IntechOpen: Rijeka, Croatia, 2024.
  26. Biehl, J.T.; Girgensohn, A.; Patel, M. Achieving Accurate Room-Level Indoor Location Estimation with Emerging IoT Networks. In Proceedings of the 9th International Conference on the Internet of Things, Bilbao, Spain, 22–25 October 2019.
  27. García-Paterna, P.J.; Martínez-Sala, A.S.; Sánchez-Aarnoutse, J.C. Empirical Study of a Room-Level Localization System Based on Bluetooth Low Energy Beacons. Sensors 2021, 21, 3665.
  28. Chen, Y.; Wang, Y.; Zhao, Y. A Room-Level Indoor Localization Using an Energy-Harvesting BLE Tag. Electronics 2024, 13, 4493.
  29. Karabey Aksakalli, I.; Bayındır, L. Enhancing Indoor Localization with Room-to-Room Transition Time: A Multi-Dataset Study. Appl. Sci. 2025, 15, 1985.
  30. Tegou, T.; Kalamaras, I.; Votis, K.; Tzovaras, D. A Low-Cost Room-Level Indoor Localization System with Easy Setup for Medical Applications. In Proceedings of the 2018 11th IFIP Wireless and Mobile Networking Conference, WMNC 2018, Prague, Czech Republic, 3–5 September 2018.
  31. Alzu’Bi, A.; Alomar, A.; Alkhaza’Leh, S.; Abuarqoub, A.; Hammoudeh, M. A Review of Privacy and Security of Edge Computing in Smart Healthcare Systems: Issues, Challenges, and Research Directions. Tsinghua Sci. Technol. 2024, 29, 1152–1180.
  32. Islam, U.; Alatawi, M.N.; Alqazzaz, A.; Alamro, S.; Shah, B.; Moreira, F. A Hybrid Fog-Edge Computing Architecture for Real-Time Health Monitoring in IoMT Systems with Optimized Latency and Threat Resilience. Sci. Rep. 2025, 15, 25655.
  33. Rancea, A.; Anghel, I.; Cioara, T. Edge Computing in Healthcare: Innovations, Opportunities, and Challenges. Future Internet 2024, 16, 329.
  34. Ali, M.S.; Ahsan, M.M.; Tasnim, L.; Afrin, S.; Biswas, K.; Hossain, M.M.; Ahmed, M.M.; Hashan, R.; Islam, M.K.; Raman, S. Federated Learning in Healthcare: Model Misconducts, Security, Challenges, Applications, and Future Research Directions—A Systematic Review. arXiv 2024, arXiv:2405.13832.
  35. Pati, S.; Kumar, S.; Varma, A.; Edwards, B.; Lu, C.; Qu, L.; Wang, J.J.; Lakshminarayanan, A.; Wang, S.-h.; Sheller, M.J.; et al. Privacy Preservation for Federated Learning in Health Care. Patterns 2024, 5, 100974.
  36. Dhade, P.; Shirke, P. Federated Learning for Healthcare: A Comprehensive Review. Eng. Proc. 2023, 59, 230.
  37. Lyons, B.E.; Austin, D.; Seelye, A.; Petersen, J.; Yeargers, J.; Riley, T.; Sharma, N.; Mattek, N.; Wild, K.; Dodge, H.; et al. Pervasive Computing Technologies to Continuously Assess Alzheimer’s Disease Progression and Intervention Efficacy. Front. Aging Neurosci. 2015, 7, 102.
  38. Gothard, S.; Nunnerley, M.; Rodrigues, N.; Wu, C.Y.; Mattek, N.; Hughes, A.M.; Kaye, J.A.; Beattie, Z. Study Participant Self-Installed Deployment of a Home-Based Digital Assessment Platform for Dementia Research. Alzheimer’s Dement. 2021, 17, e055724.
  39. Narasimhan, R.; Gopalan, M.; Sikkandar, M.Y.; Alassaf, A.; AlMohimeed, I.; Alhussaini, K.; Aleid, A.; Sheik, S.B. Employing Deep-Learning Approach for the Early Detection of Mild Cognitive Impairment Transitions through the Analysis of Digital Biomarkers. Sensors 2023, 23, 8867.
  40. Kim, J.; Cheon, S.; Lim, J. IoT-Based Unobtrusive Physical Activity Monitoring System for Predicting Dementia. IEEE Access 2022, 10, 26078–26089.
  41. Beutel, D.J.; Topal, T.; Mathur, A.; Qiu, X.; Fernandez-Marques, J.; Gao, Y.; Sani, L.; Li, K.H.; Parcollet, T.; de Gusmão, P.P.B.; et al. Flower: A Friendly Federated Learning Research Framework. arXiv 2022, arXiv:2007.14390.
Figure 1. Hardware components.
Figure 2. Summary block diagram of the data flow architecture.
Figure 3. Theoretical analysis of IR localization: (a) dependence of accuracy on the emission interval; (b) dependence on the emission angle of the beacon.
Figure 4. Real-time visualization of data from wearable sensors.
Table 1. Comparison of home monitoring systems for dementia assessment.

| System | Localization | Monitoring Type | Edge ML | Fed. Learning | Clinical Validation | Est. Cost | Key Limitation |
|---|---|---|---|---|---|---|---|
| ORCATECH [37,38] | Zone-level (PIR) | Passive + episodic digital | No | No | Extensive (10+ years) | High | Zone-level localization insufficient for disorientation |
| SENDA [8] | None | Active (periodic testing) | No | No | Ongoing protocol | High | Requires active participation; episodic (8 months) |
| Kim et al. [40] | Zone-level (PIR) | Passive IR only | No | No | Limited (AUC 0.99) | Medium | No wearables; zone-level only |
| Ghayvat & Gope [15] | None | Passive sensors + transfer learning | No | No | Real-world (43 weeks) | Medium | No spatial tracking |
| Bijlani et al. [22] | Ambient only | Passive graph-based | Yes | No | Real deployment (227 participants) | Medium | No wearables; no room-level |
| This work | Room-level (IR) | Passive 24/7 | Yes (ESP32-S3) | Yes (simulated) | Technical only | €490 + €55/month | No clinical validation yet |
Table 2. Summary of formats and data streams.

| Stage | Format | Frequency | Size | Retention | Purpose |
|---|---|---|---|---|---|
| BLE packets (raw) | Manufacturer-specific binary | 2 Hz (telemetry), 1 Hz (location) | 20–30 bytes/packet | Transient | - |
| Stream 1 | JSON (aggregates) | 1/min | 1.8–2.2 KB | Transient | Statistical descriptors |
| Stream 2 | JSON + gzip | 1/min | 1.3–1.7 KB (compressed) | Transient | Compressed event sequences |
| Stream 3 | JSON (alerts) | On-demand (~10/day) | 0.4–0.6 KB | Transient | Critical events |
| InfluxDB (hot data) | Line Protocol | 1 record/min | 200–300 bytes | 7 days | Real-time dashboards |
| MongoDB (event logs) | BSON (gzip blob) | 1 doc/min | 1.5–2 KB | 90 days | ML |
| MongoDB (critical alerts) | BSON | On-demand (~10/day) | 0.5–0.8 KB | Unlimited | Audit trail |
| MongoDB (hourly aggregates) | BSON | 1 doc/h | 2.5–3 KB | 90 days | Fast ML |
| MongoDB (daily summaries) | BSON | 1 doc/day | 2–3 KB | Unlimited | Baseline models, reports |
Table 3. Research questions, validation methodology, and results summary.

| Research Question | System Component | Validation Method | Target Performance | Achieved Result | Section Reference |
|---|---|---|---|---|---|
| RQ1: Room-level localization without privacy-invasive methods | IR beacons + wearable sensors | Monte Carlo simulation + real-world testing with mobile app ground truth | >95% room-level accuracy | 97.6% accuracy (97.2% Apt1, 98.0% Apt2) | Section 5.3.1 |
| RQ2: Simultaneous low-latency, full resolution, and bandwidth reduction | Three-stream data pipeline (aggregates, compressed logs, critical alerts) | Network bandwidth measurement, latency profiling, temporal resolution analysis | <10 s latency for critical events, >40% bandwidth reduction, 1 s temporal resolution preserved | <7 s end-to-end latency (99.5% of events), 41.2% bandwidth reduction, full 1 s resolution maintained | Section 5.3.3, Section 5.3.4 and Section 5.3.5 |
| RQ3: Privacy-preserving multi-institutional model development under GDPR | Federated learning orchestrator + local training servers | Simulated multi-institutional deployment with three institutions | Model convergence without raw data sharing, acceptable communication overhead | Convergence in 5 rounds for IID (~84.3% accuracy) and 14 rounds for non-IID (~79.8% accuracy) | Section 5.3.6 |
Table 4. Confusion matrix for room-level localization (Apartment 1, 48 h).

| True Room \ Detected | Kitchen | Living Room | Bedroom1 | Bedroom2 | Bathroom | Corridor | Total | Accuracy |
|---|---|---|---|---|---|---|---|---|
| Kitchen | 14,223 | 0 | 0 | 0 | 0 | 429 | 14,652 | 97.07% |
| Living Room | 0 | 30,802 | 0 | 289 | 0 | 1009 | 32,100 | 95.96% |
| Bedroom1 | 0 | 0 | 57,866 | 0 | 0 | 291 | 58,157 | 99.50% |
| Bedroom2 | 0 | 245 | 0 | 28,939 | 0 | 196 | 29,380 | 98.50% |
| Bathroom | 0 | 0 | 0 | 0 | 7628 | 252 | 7880 | 96.80% |
| Corridor | 529 | 1158 | 325 | 0 | 110 | 28,329 | 30,451 | 93.03% |
| Total | 14,752 | 32,205 | 58,191 | 29,228 | 7738 | 30,506 | 172,620 | 97.20% |
Table 5. Comparison of room-level localization methods in indoor environments.

| Study | Technology | Environment | Beacons/Anchors | Receiver Hardware | Accuracy | Key Limitations |
|---|---|---|---|---|---|---|
| García-Paterna et al. [27] | BLE RSSI + kNN | Residential (160 m², 10 rooms) | 6 | Laptop | 97.6% | Degrades to 87.7% with Raspberry Pi; 85–88% with only 3 beacons |
| García-Paterna et al. [27] | BLE RSSI + kNN | Residential (160 m², 10 rooms) | 6 | Raspberry Pi | 87.7% | Errors concentrate between adjacent rooms |
| Tegou et al. [30] | BLE RSSI | Residential (80.4 m², 5 rooms) | 5 | Custom | 93.75% | 12 errors between adjacent rooms; improves to 95.31% with 8 beacons |
| Chen et al. [28] | BLE + DSFP | Large indoor (≈2000 m², 7 rooms) | 1 anchor + 10 tags per room | Custom | >99% | Heavily instrumented (70 tags total); requires fingerprinting calibration |
| Biehl et al. [26] | Wi-Fi RTT | Office environment | 11 | Custom | 97.99% (Precision), 93.86% (F1) | Requires IEEE 802.11mc routers (€150–300/unit); sensitive to NLoS; extensive calibration |
| Karabey & Bayındır [29] | Wi-Fi fingerprinting only | Residential (130–195 m²) | Not specified | Custom | ≈82% | Baseline without temporal features |
| Karabey & Bayındır [29] | Wi-Fi + transition time | Residential (130–195 m²) | Not specified | Wide Neural Net | 94.7% | Requires machine learning; temporal modeling adds complexity |
| This work | Infrared beacons | Residential (52–95 m², multi-room) | 4–7 | ESP32-S3 | 97.6% | Requires line-of-sight to ceiling beacon; 1.5 s transition delay |
Table 6. RSSI characteristics under different propagation conditions.

| Distance | Condition | Average RSSI (dBm) | σ (dBm) | Attenuation |
|---|---|---|---|---|
| 1 m | Reference | −69 | 2 | – |
| 5 m | Line-of-sight | −70 | 3 | ~1 dB |
| 5 m | One wall | −76 | 4–5 | 6–8 dB |
| 15 m | One wall | −91 | 7 | ~22 dB |
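The attenuation column in Table 6 follows from simple differences between measurements. The snippet below (illustrative arithmetic only; variable names are mine) shows how per-wall loss is isolated by comparing same-distance readings, and how total attenuation is taken against the 1 m reference.

```python
# RSSI values taken from Table 6 (averages, in dBm).
rssi_ref_1m = -69    # reference at 1 m
rssi_los_5m = -70    # line-of-sight at 5 m
rssi_wall_5m = -76   # one wall at 5 m
rssi_wall_15m = -91  # one wall at 15 m

# Per-wall loss: compare readings at the same distance, with and without
# the wall, so distance-dependent path loss cancels out.
wall_loss = rssi_los_5m - rssi_wall_5m          # 6 dB (within the 6–8 dB range)

# Total attenuation at 15 m through one wall, relative to the 1 m reference.
total_loss_15m = rssi_ref_1m - rssi_wall_15m    # 22 dB
```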
Table 7. Characteristics of compressed data streams.

| Stream | Compression | Size | Reduction | Temporal Resolution | Lossless |
|---|---|---|---|---|---|
| Baseline (raw BLE) | None | 5160 B | – | 1 s | yes |
| Stream 1 (aggregates) | Statistical | 2000 B | 61.2% | 60 s | no |
| Stream 3 (aggregates) | Diff + gzip | 1036 B | 79.9% | 1 s | yes |
| Stream 1 (alarms) | Contextual enrichment | 500 B | On-demand | Event | yes |
| Combined (Stream1 + Stream2) | Hybrid | 3036 B | 41.2% | Hybrid | partial |
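The "Diff + gzip" scheme of Stream 3 can be sketched in a few lines: consecutive 1 s RSSI readings change little, so delta-encoding followed by gzip compresses them losslessly. The sample values and variable names below are illustrative, not the paper's code.

```python
import gzip
import itertools
import json

# One hour of synthetic 1 s RSSI readings (dBm), varying slightly around −70.
samples = [-70 + (i % 3 - 1) for i in range(3600)]

# Delta encoding: first value verbatim, then successive differences.
diffs = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

raw = json.dumps(samples).encode()
compressed = gzip.compress(json.dumps(diffs).encode())

# Lossless round trip: a cumulative sum restores the original samples.
restored = list(itertools.accumulate(diffs))
assert restored == samples

# The Combined row of Table 7: (5160 - 3036) / 5160 ≈ 41.2% reduction.
table_reduction = 1 - 3036 / 5160
```

The 41.2% figure in the Combined row is exactly the bandwidth reduction quoted in the abstract, i.e., the 3036 B hybrid payload against the 5160 B raw baseline.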
Table 8. Centralized vs. federated training performance comparison.

| Training Mode | Accuracy (%) | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Centralized | 90.8 ± 0.9 | 0.92 ± 0.02 | 0.88 ± 0.02 | 0.90 ± 0.01 |
| Federated (IID) | 84.3 ± 1.4 | 0.86 ± 0.03 | 0.77 ± 0.03 | 0.81 ± 0.02 |
| Federated (non-IID) | 79.8 ± 2.1 | 0.79 ± 0.04 | 0.71 ± 0.04 | 0.75 ± 0.03 |
| Best local institution model | 77.1 ± 2.0 | 0.75 ± 0.05 | 0.69 ± 0.05 | 0.72 ± 0.04 |
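The federated rows in Table 8 rest on weighted model averaging across institutions. Below is a minimal FedAvg-style sketch under simplifying assumptions: weights are plain lists rather than real network tensors, and the client values and sample counts are hypothetical.

```python
def fed_avg(local_weights, sample_counts):
    """Weighted average of per-institution weight vectors (FedAvg-style):
    each client contributes proportionally to its local sample count."""
    total = sum(sample_counts)
    dim = len(local_weights[0])
    return [
        sum(w[i] * n for w, n in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]

# Three hypothetical institutions with different data volumes.
clients = [[0.2, 0.4], [0.0, 0.6], [0.4, 0.2]]
counts = [100, 300, 600]
global_w = fed_avg(clients, counts)  # ≈ [0.26, 0.34]
```

This weighting is why non-IID data degrades accuracy (79.8% vs. 84.3% IID in Table 8): when clients' local distributions differ, their locally optimal weights pull the average away from any single client's optimum.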

Share and Cite

MDPI and ACS Style

Ivanov, R. Scalable IoT-Based Architecture for Continuous Monitoring of Patients at Home: Design and Technical Validation. Computers 2026, 15, 144. https://doi.org/10.3390/computers15030144
