Sensors
  • Article
  • Open Access

1 January 2026

GRU-Based Deep Multimodal Fusion of Speech and Head-IMU Signals in Mixed Reality for Parkinson’s Disease Detection

1 Department of Measurement and Electronics, AGH University of Krakow, 30-059 Krakow, Poland
2 Department of Neurology, Andrzej Frycz Modrzewski Krakow University, 30-705 Krakow, Poland
* Author to whom correspondence should be addressed.
Sensors 2026, 26(1), 269; https://doi.org/10.3390/s26010269 (registering DOI)
This article belongs to the Special Issue Intelligent Biomedical Systems: The Convergence of Sensors, Signal Processing, and Machine Learning

Abstract

Parkinson’s disease (PD) alters both speech and movement, yet most automated assessments still treat these signals separately. We examined whether combining voice with head motion improves discrimination between patients with PD and healthy controls (HC). Acoustic and inertial signals were recorded synchronously with a HoloLens 2 headset from 165 participants (72 PD, 93 HC) following a standardized mixed-reality (MR) protocol. We benchmarked single-modality models against fusion strategies under 5-fold stratified cross-validation. Voice alone performed well (pooled AUC ≈ 0.865), whereas the inertial channel alone was near chance (AUC ≈ 0.497). Fusion provided a modest but repeatable improvement: gated early fusion achieved the highest AUC (≈ 0.875), and cross-attention fusion was comparable (≈ 0.873). Gains were task-dependent: speech-dominated tasks were already well captured by audio, whereas tasks that embed movement benefited from the complementary inertial data. The proposed MR capture proved feasible within a single session and showed that motion acts as a conditional improvement factor rather than a standalone predictor. The results outline a practical path to multimodal screening and monitoring for PD that preserves the reliability of acoustic biomarkers while integrating kinematic features when they matter.
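To make the evaluation terms concrete, the following is a minimal sketch of how a pooled AUC under 5-fold stratified cross-validation can be computed. It is an illustration under stated assumptions, not the authors' pipeline: the helper names (`stratified_folds`, `auc`, `pooled_cv_auc`), the nearest-centroid scorer, and the one-dimensional synthetic features are all hypothetical. Only the class counts (72 PD, 93 HC) and the number of folds come from the abstract, and "pooled" is assumed here to mean a single AUC computed over the concatenated out-of-fold scores.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class round-robin
    return folds

def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pooled_cv_auc(labels, features, fit, predict, k=5, seed=0):
    """Gather out-of-fold scores from all folds, then compute one AUC over them."""
    oof = [None] * len(labels)
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        for i in test_idx:
            oof[i] = predict(model, features[i])
    return auc(labels, oof)

# Synthetic placeholder data (NOT study data): 72 PD (label 1), 93 HC (label 0).
rng = random.Random(1)
labels = [1] * 72 + [0] * 93
features = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

def fit(xs, ys):
    """Hypothetical stand-in model: per-class means of a 1-D feature."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)

def predict(model, x):
    """Higher score means closer to the PD centroid than to the HC centroid."""
    mu_pos, mu_neg = model
    return abs(x - mu_neg) - abs(x - mu_pos)

demo_auc = pooled_cv_auc(labels, features, fit, predict)
print(f"pooled out-of-fold AUC on synthetic data: {demo_auc:.3f}")
```

Pooling the out-of-fold scores before computing the AUC yields one estimate over all 165 participants, which can differ slightly from the average of five per-fold AUCs. In a real comparison, the `fit` and `predict` callables would be replaced by the audio-only, inertial-only, or fusion model being evaluated.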
