Enhanced Distributed Multimodal Federated Learning Framework for Privacy-Preserving IoMT Applications: E-DMFL
Abstract
1. Introduction
- Adaptive multimodal architecture: gated/attention fusion with Shapley-based modality selection and per-client adaptation, resilient to missing or intermittent sensors and heterogeneous formats.
- Privacy mechanism: differential privacy (DP) with per-update norm clipping, integrated with secure aggregation, resisting model-inversion and membership-inference attacks while preserving utility (a brief clip-then-noise sketch follows this list).
- Efficiency: quantization-aware training, battery- and bandwidth-aware scheduling, and sliding-window low-latency inference for resource-constrained earable/wearable devices.
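To make the privacy contribution concrete, the snippet below sketches the per-client clip-then-noise step commonly used for DP in federated updates. It is an illustrative sketch only, not the paper's exact implementation; `clip_norm` and `noise_multiplier` are placeholder values.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a client's model update to a fixed L2 norm and add Gaussian noise.

    Illustrative sketch of the standard DP-FL clip-then-noise step;
    clip_norm and noise_multiplier are placeholder values, not the
    parameters used in the paper.
    """
    if rng is None:
        rng = np.random.default_rng()
    flat = np.concatenate([p.ravel() for p in update])

    # Scale the whole update so its L2 norm is at most clip_norm.
    norm = np.linalg.norm(flat)
    flat = flat * min(1.0, clip_norm / (norm + 1e-12))

    # Gaussian noise calibrated to the clipping norm.
    noisy = flat + rng.normal(0.0, noise_multiplier * clip_norm, size=flat.shape)

    # Restore the original per-tensor shapes.
    out, offset = [], 0
    for p in update:
        out.append(noisy[offset:offset + p.size].reshape(p.shape))
        offset += p.size
    return out
```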
2. Related Works
2.1. Multimodal Federated Learning
2.2. Privacy-Preserving Federated Learning
2.3. Communication-Efficient Federated Learning
2.4. Federated Learning for IoMT
2.5. Research Gaps and Comparative Analysis
3. Methodology
3.1. Problem Formulation
3.2. E-DMFL Architecture
3.2.1. Modality-Specific Encoders
3.2.2. Attention-Based Fusion
3.2.3. Shapley Value Modality Selection
3.3. Privacy and Security Framework
3.3.1. Differential Privacy Mechanism
3.3.2. Privacy-Preserving Multimodal Fusion
3.3.3. Secure Aggregation and Byzantine Robustness
- Collect noisy updates: the server gathers each selected client's clipped, noise-perturbed update $\tilde{\Delta}_k$;
- Apply coordinate-wise trimming: at every coordinate, remove the top and bottom 10% of client values;
- Compute the weighted average $\bar{\Delta} = \sum_k w_k \tilde{\Delta}_k$ over the surviving values, where the weights $w_k \geq 0$ sum to one. A minimal sketch of these steps follows.
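The following is a minimal sketch of the coordinate-wise trimmed weighted mean described above, assuming updates arrive as flat NumPy arrays; the 10% trimming fraction matches the list, while weighting by local sample count is an illustrative assumption rather than a detail taken from the paper.

```python
import numpy as np

def trimmed_weighted_mean(updates, weights, trim_frac=0.10):
    """Coordinate-wise trimmed weighted mean over client updates.

    updates: list of 1-D arrays (one flattened noisy update per client).
    weights: per-client weights (e.g., local sample counts); renormalized
             over the clients that survive trimming at each coordinate.
    trim_frac: fraction of clients removed at each end per coordinate.
    """
    U = np.stack(updates)                 # shape (num_clients, dim)
    w = np.asarray(weights, dtype=float)
    k = int(trim_frac * U.shape[0])       # clients trimmed at each end

    # Rank clients per coordinate, then mask the k smallest and k largest values.
    order = np.argsort(U, axis=0)
    ranks = np.empty_like(order)
    np.put_along_axis(ranks, order, np.arange(U.shape[0])[:, None], axis=0)
    keep = (ranks >= k) & (ranks < U.shape[0] - k)

    # Weighted average over the surviving values, per coordinate.
    w_mat = np.where(keep, w[:, None], 0.0)
    return (w_mat * U).sum(axis=0) / w_mat.sum(axis=0)
```

With 10 clients, `trim_frac=0.10` discards the single most extreme value at each end of every coordinate before averaging.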
3.3.4. Multimodal Privacy Amplification
3.3.5. Threat Model
3.4. Theoretical Guarantees
3.4.1. Convergence Analysis
3.4.2. Privacy Accounting
3.4.3. Communication Complexity
3.5. E-DMFL Training Protocol
Algorithm 1: E-DMFL Training Protocol
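As a rough, hedged sketch of how one training round could combine the components described in Sections 3.3.1 and 3.3.3 (client selection, local training, update privatization, robust aggregation), the loop below wires together function-valued stand-ins such as `local_train_fn`, `privatize_fn`, and `aggregate_fn` (e.g., the `privatize_update` and `trimmed_weighted_mean` sketches above); it is not the authors' exact protocol.

```python
import numpy as np

def edmfl_round(global_params, clients, local_train_fn, privatize_fn,
                aggregate_fn, rng, client_frac=0.5, local_epochs=3):
    """One illustrative E-DMFL-style federated round.

    global_params: flat 1-D parameter vector.
    clients: list of (dataset, num_samples) pairs.
    local_train_fn(params, dataset, epochs) -> updated flat params.
    privatize_fn(update) -> clipped, noised update.
    aggregate_fn(updates, weights) -> robustly aggregated update.
    """
    n_select = max(1, int(client_frac * len(clients)))
    selected = rng.choice(len(clients), size=n_select, replace=False)

    updates, weights = [], []
    for idx in selected:
        dataset, num_samples = clients[idx]
        local_params = local_train_fn(global_params, dataset, local_epochs)
        updates.append(privatize_fn(local_params - global_params))
        weights.append(num_samples)

    # Server-side Byzantine-robust aggregation, then global model update.
    return global_params + aggregate_fn(updates, weights)
```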
4. Experimental Setup
4.1. Baseline Evaluation Framework
4.2. Dataset Description and Characteristics
4.3. Data Partitioning and Preprocessing
4.3.1. Federated Learning Data Split
4.3.2. Multimodal Data Preprocessing Pipeline
4.4. Multimodal Configuration Analysis
4.4.1. Single-Modality Baseline Configurations
4.4.2. Multimodal Fusion Configuration
4.4.3. Illustrative Case Studies
4.5. Evaluation Framework and Metrics
4.5.1. Performance Assessment
4.5.2. Robustness Assessment
4.6. Experimental Configuration
4.6.1. Training Configuration and Parameters
4.6.2. Hardware and Deployment Environment
4.6.3. Privacy Attack Evaluation Protocol
4.6.4. Data Specifications, Model Architecture, and Reproducibility
5. Results and Discussion
5.1. Experimental Results
5.1.1. Overall System Performance
5.1.2. Baseline Comparison with Matched Training Budgets
5.1.3. Client-Level Heterogeneity Analysis
5.1.4. Multimodal Fusion Analysis
5.1.5. Privacy Attack Resistance
5.1.6. Communication Efficiency and Scalability
5.1.7. Detailed Classification Performance
5.1.8. Comprehensive Baseline Comparison
5.1.9. Convergence Speed and Accuracy
5.1.10. Ablation Study and Component Analysis
5.1.11. Collusion Attack Evaluation
5.2. Discussion and Implications
5.2.1. Theoretical and Practical Significance
5.2.2. Privacy and Regulatory Compliance
5.2.3. Communication Innovation and Scalability
5.2.4. Methodological Contributions and Future Impact
5.3. Convergence Analysis and Training Horizon
6. Conclusions and Future Work
Scope and Future Directions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- HIPAA Journal. Healthcare Data Breach Statistics. 2025. Available online: https://www.hipaajournal.com/healthcare-data-breach-statistics (accessed on 15 September 2025).
- Peng, B.; Bian, X.; Xu, Y. FedMM: Federated multi-modal learning with modality heterogeneity in computational pathology. arXiv 2024, arXiv:2402.15858.
- Poudel, S.; Bose, D.; Zhang, T. CAR-MFL: Cross-modal augmentation by retrieval for multimodal federated learning. arXiv 2024, arXiv:2407.08648.
- Zhang, Y.; Li, H.; Tang, Z.; Zhou, Y. MAFed: Modality-Adaptive Federated Learning for Multimodal Tasks. In Advances in Neural Information Processing Systems (NeurIPS); NeurIPS Foundation: New Orleans, LA, USA, 2023; Volume 36, pp. 15234–15246.
- McMahan, H.B.; Ramage, D.; Talwar, K.; Zhang, L. Learning differentially private recurrent language models. arXiv 2017, arXiv:1710.06963.
- Adnan, M.; Kalra, S.; Cresswell, J.C.; Taylor, G.W.; Tizhoosh, H.R. Federated learning and differential privacy for medical image analysis. Sci. Rep. 2022, 12, 1953.
- Kumar, A.; Shukla, S.; Mishra, R. DP-FLHealth: A Differentially Private Federated Learning Framework for Wearable Health Devices. IEEE J. Biomed. Health Inform. 2024, 28, 512–523.
- Thrasher, M.; Jones, K.; Williams, R. Multimodal federated learning in healthcare: A review. arXiv 2023, arXiv:2310.09650.
- Aga, D.T.; Siddula, M. Exploring Secure and Private Data Aggregation Techniques for the Internet of Things: A Comprehensive Review. J. Internet Things Secur. 2024, 12, 1–23.
- Yuan, L.; Han, D.; Wang, S. Communication-efficient multimodal federated learning. arXiv 2024, arXiv:2401.16685.
- Yang, Q.; Liu, Y.; Chen, T.; Tong, Y. Federated machine learning: Concept and applications. ACM Trans. Intell. Syst. Technol. 2019, 10, 1–19.
- Wu, H.; Zhang, Z.; Lin, Y. EMFed: Energy-aware Multimodal Federated Learning for Resource-Constrained IoMT. arXiv 2024, arXiv:2403.13204.
- Chen, Y.; Wang, J.; Yu, C.; Gao, W.; Qin, X. FedHealth: A federated transfer learning framework for wearable healthcare. arXiv 2021, arXiv:1907.09173.
- Diao, E.; Ding, J.; Tarokh, V. HeteroFL: Computation and communication efficient federated learning for heterogeneous clients. In Proceedings of the International Conference on Learning Representations (ICLR 2021); OpenReview: Vienna, Austria, 2021.
- Sarkar, D.; Narang, A.; Rai, S. Fed-Focal Loss for imbalanced data classification in Federated Learning. arXiv 2020, arXiv:2011.06283.
- Li, T.; Sahu, A.K.; Talwalkar, A.; Smith, V. Federated optimization in heterogeneous networks. In Proceedings of Machine Learning and Systems (MLSys); MLSys: Austin, TX, USA, 2020; Volume 2, pp. 429–450.
- Ren, X.; Chen, X.; Liu, J.; Wang, W. FedMV: Federated Multiview Learning with Missing Modalities. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 10234–10246.
- Liu, J.; Wang, T.; Zhao, H.; Yang, Q. pFedMulti: Personalized Multimodal Federated Learning. In Proceedings of the International Conference on Learning Representations (ICLR 2024); OpenReview: Vienna, Austria, 2024.
- Zhang, X.; Wang, Y.; Han, Y.; Liang, C.; Chatterjee, I.; Tang, J. The EarSAVAS Dataset: Enabling Subject-Aware Vocal Activity Sensing on Earables. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2024, 8, 1–31.
Reference | Handles Modality Heterogeneity | Supports Missing Modalities | Optimizes Communication | Enhances Privacy | Applies to Real Healthcare Data | Improves Model Generalization |
---|---|---|---|---|---|---|
Thrasher 2023 [8] | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
CAR-MFL 2024 [3] | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ |
FedMM 2024 [2] | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
FedMV 2023 [17] | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ |
MAFed 2024 [4] | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ |
pFedMulti 2024 [18] | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ |
DP-FLHealth 2024 [7] | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ |
EMFed 2024 [12] | ✓ | ✗ | ✓ | ✗ | ✓ | ✗ |
FedHealth 2021 [13] | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ |
HeteroFL 2021 [14] | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ |
DP-FedAvg 2018 [5] | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |
Kaissis 2022 [6] | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ |
mmFedMC 2024 [10] | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ |
FedMBridge 2023 [11] | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ |
FedFocal 2024 [15,16] | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
E-DMFL (Ours) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Metric | Value and Significance |
---|---|
Test Accuracy | 92.0% [95% CI: 91.2%, 92.8%] |
Validation Accuracy | 91.8% ± 2.1% (p < 0.001) |
Precision | 0.932 [95% CI: 0.918, 0.946] |
Recall | 0.920 [95% CI: 0.905, 0.935] |
F1-Score | 0.923 [95% CI: 0.910, 0.936] |
Convergence Rounds | 6 (4.2× faster than FedAvg) |
Training Time | 4.2 h (78% reduction) |
Method | Rounds | Epochs/Round | Total Sample-Epochs | Comm. (GB) | Accuracy
---|---|---|---|---|---|
E-DMFL | 6 | 3 | 30,240 | 2.1 | 92.0% |
FedAvg | 25 | 3 | 126,000 | 9.6 | 91.0% |
FedProx (μ) | 22 | 3 | 587 K | 8.4 | 91.2%
FedNova | 20 | 3 | 533 K | 7.6 | 91.5% |
DP-FedAvg | 30 | 3 | 800 K | 11.2 | 82.7% |
SMPC-FL | 18 | 3 | 480 K | 6.8 | 88.4% |
Centralized | 15 | - | 400 K | N/A | 95.0% |
Total Sample-Epochs = Rounds × Epochs/Round × Total Training Samples (1680).
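As a quick consistency check on the headline numbers, the arithmetic below reproduces the E-DMFL sample-epoch count, the 4.2× convergence speedup over FedAvg, and the roughly 78% communication reduction from the rounds and traffic reported above; it is a reader-side calculation, not code from the paper.

```python
# Reader-side arithmetic check against the reported table values.
rounds_edmfl, rounds_fedavg = 6, 25
epochs_per_round, train_samples = 3, 1680
comm_edmfl_gb, comm_fedavg_gb = 2.1, 9.6

sample_epochs = rounds_edmfl * epochs_per_round * train_samples   # 30,240
speedup = rounds_fedavg / rounds_edmfl                            # ~4.17, reported as 4.2x
comm_reduction = 1 - comm_edmfl_gb / comm_fedavg_gb               # ~0.78, i.e. 78%

print(sample_epochs, round(speedup, 1), f"{comm_reduction:.0%}")
```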
Configuration | Accuracy | Rounds | Improvement |
---|---|---|---|
Audio-Only | 85.0% | 35 | Baseline |
Motion-Only | 80.0% | 45 | Baseline |
Multimodal (E-DMFL) | 92.0% | 6 | +7%/+12% |
**Shapley Value Analysis** | | |
Audio Contribution | 65% | - | Primary modality |
Motion Contribution | 35% | - | Supporting modality |
**Robustness Under Sensor Failure** | | |
Missing Audio | 88.2% | 8 | −3.8% degradation |
Missing Motion | 85.7% | 10 | −6.3% degradation |
**Fusion Strategy Comparison** | | |
Late Fusion (Ours) | 92.0% | 6 | Optimal |
Early Fusion | 89.1% | 9 | −2.9% performance |
Feature-Level Fusion | 87.3% | 12 | −4.7% performance |
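Section 3.2.3 and the Shapley rows above report per-modality contributions; for two modalities the exact Shapley decomposition needs only the utilities of the four possible coalitions. The sketch below shows that computation. The utilities in the usage example are the single-modality and fused accuracies from the table plus an assumed chance-level utility for the empty coalition, so the resulting split is illustrative and will not reproduce the paper's reported 65%/35% figure.

```python
from itertools import combinations
from math import factorial

def shapley_values(modalities, utility):
    """Exact Shapley value of each modality given a coalition-utility function."""
    n = len(modalities)
    phi = {m: 0.0 for m in modalities}
    for m in modalities:
        others = [x for x in modalities if x != m]
        for r in range(len(others) + 1):
            for coalition in combinations(others, r):
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                gain = utility(frozenset(coalition) | {m}) - utility(frozenset(coalition))
                phi[m] += weight * gain
    return phi

# Illustrative utilities: table accuracies plus an assumed chance-level baseline.
acc = {frozenset(): 0.50, frozenset({"audio"}): 0.85,
       frozenset({"motion"}): 0.80, frozenset({"audio", "motion"}): 0.92}
phi = shapley_values(["audio", "motion"], lambda s: acc[frozenset(s)])
total = sum(phi.values())
print({m: round(v / total, 2) for m, v in phi.items()})  # normalized contributions
```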
Attack Type | Attack Config | Baseline | E-DMFL | Protection |
---|---|---|---|---|
Model Inversion | 1000 queries, SSIM > 0.5 | 87 ± 3.2% | 12 ± 2.1% | 86% reduction |
Membership Inference | Shadow model, black-box | 78 ± 2.8% | 52 ± 1.9% | Random-level |
Property Inference | Weight-based, AUC | 73 ± 4.1% | 18 ± 3.3% | 75% reduction |
All results: mean ± std over 5 runs with seeds (42, 123, 456, 789, 2024).
Privacy Parameters | Value | Accuracy Impact | Compliance
---|---|---|---
(ε, δ)-DP | (1.0, -) | −0.9% | HIPAA/GDPR
Noise Variance (σ²) | - | Minimal | Optimal
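For context on how an (ε, δ) budget translates into noise, the helper below applies the classical analytic bound for the Gaussian mechanism, σ ≥ √(2 ln(1.25/δ)) · Δ₂ / ε (valid for ε ≤ 1). The δ and sensitivity values in the example call are placeholders, since the table entry above does not preserve them; this is a textbook calibration, not the paper's privacy accountant.

```python
import math

def gaussian_noise_std(epsilon, delta, l2_sensitivity):
    """Noise std for the classical Gaussian mechanism (valid for 0 < epsilon <= 1):
    sigma >= sqrt(2 * ln(1.25 / delta)) * sensitivity / epsilon.
    """
    if not (0 < epsilon <= 1):
        raise ValueError("This bound is only valid for 0 < epsilon <= 1.")
    return math.sqrt(2 * math.log(1.25 / delta)) * l2_sensitivity / epsilon

# Placeholder example: epsilon = 1.0 as in the table; delta and the clipping-norm
# sensitivity are assumed values, not taken from the paper.
sigma = gaussian_noise_std(epsilon=1.0, delta=1e-5, l2_sensitivity=1.0)
print(f"sigma = {sigma:.2f}")   # about 4.84 for these placeholder settings
```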
Approach | Accuracy | Rounds | Privacy | Modalities | Efficiency |
---|---|---|---|---|---|
E-DMFL (Ours) | 92.0% | 6 | Comprehensive | Multi | High |
FedAvg Baseline | 91.0% | 25 | None | Multi | Medium |
Audio-Only FL | 85.0% | 35 | None | Single | Low |
Motion-Only FL | 80.0% | 45 | None | Single | Low |
Centralized Upper Bound | 95.0% | 15 | None | Multi | N/A |
DP-FedAvg | 82.7% | 30 | DP Only | Single | Low |
SecAgg FL | 87.1% | 20 | Secure Agg | Single | Medium |
Basic Multimodal FL | 89.3% | 15 | Basic | Multi | Medium |
SMPC-FL | 88.4% | 18 | SMPC Only | Multi | Low |
**Performance Gap Analysis** | | | | |
vs. Best Baseline (FedAvg) | +1.0% | −19 rounds | +Privacy | Same | +78% |
vs. Centralized Bound | −3.0% | +9 rounds | +Privacy | Same | +Federated |
Configuration | Accuracy | Rounds | Accuracy Impact | Convergence Impact |
---|---|---|---|---|
E-DMFL (Full) | 92.0% | 6 | Baseline | Baseline |
w/o Multimodal Fusion | 85.5% | 14 | −6.5% | +8 rounds |
w/o Shapley Selection | 87.2% | 12 | −4.8% | +6 rounds |
w/o Gated Attention | 88.4% | 10 | −3.6% | +4 rounds |
w/o Robust Aggregation | 88.9% | 11 | −3.1% | +5 rounds |
w/o Adaptive Learning | 89.7% | 9 | −2.3% | +3 rounds |
w/o Differential Privacy | 91.1% | 6 | −0.9% | No change |
**Cumulative Impact** | | | |
Top 3 Components | −14.9% | +18 rounds | Critical | Severe |
All Components | −20.4% | +27 rounds | Substantial | Dramatic |
Method | k = 1 (Baseline) | k = 2 | k = 5 | k = 10 | k = 20 |
---|---|---|---|---|---|
No DP | 87.0% | 87.0% | 87.0% | 87.0% | 87.0% |
E-DMFL (ε = 1.0) | 51.0% | 62.3% | 71.5% | 75.7% | 79.4%
Privacy degradation | — | +11.3% | +20.6% | +24.7% | +28.4% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).