Search Results (17)

Search Parameters:
Keywords = multimodal depression classification

20 pages, 3151 KB  
Article
MMDD: A Multimodal Multitask Dynamic Disentanglement Framework for Robust Major Depressive Disorder Diagnosis Across Neuroimaging Sites
by Qiongpu Chen, Peishan Dai, Kaineng Huang, Ting Hu and Shenghui Liao
Diagnostics 2025, 15(23), 3089; https://doi.org/10.3390/diagnostics15233089 - 4 Dec 2025
Viewed by 323
Abstract
Background/Objectives: Major Depressive Disorder (MDD) is a severe psychiatric disorder, and effective, efficient automated diagnostic approaches are urgently needed. Traditional methods for assessing MDD face three key challenges: reliance on predefined features, inadequate handling of multi-site data heterogeneity, and suboptimal feature fusion. To address these issues, this study proposes the Multimodal Multitask Dynamic Disentanglement (MMDD) Framework. Methods: The MMDD Framework has three core innovations. First, it adopts a dual-pathway feature extraction architecture combining a 3D ResNet for modeling gray matter volume (GMV) data and an LSTM–Transformer for processing time series data. Second, it includes a Bidirectional Cross-Attention Fusion (BCAF) mechanism for dynamic feature alignment and complementary integration. Third, it uses a Gradient Reversal Layer-based Multitask Learning (GRL-MTL) strategy for enhancing the model’s domain generalization capability. Results: MMDD achieved 77.76% classification accuracy on the REST-meta-MDD dataset. Ablation studies confirmed that both the BCAF mechanism and the GRL-MTL strategy played critical roles: the former optimized multimodal fusion, while the latter effectively mitigated site-related heterogeneity. Through interpretability analysis, we identified distinct neurobiological patterns: time-series abnormalities were primarily localized to subcortical hubs and the cerebellum, whereas GMV abnormalities mainly involved higher-order cognitive and emotion-regulation cortices. Notably, the middle cingulate gyrus showed consistent abnormalities across both imaging modalities. Conclusions: This study makes two major contributions. First, we develop a robust and generalizable computational framework for objective MDD diagnosis by effectively leveraging multimodal data. Second, we provide data-driven insights into MDD’s distinct neuropathological processes, thereby advancing our understanding of the disorder. Full article
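The gradient reversal layer at the core of the GRL-MTL strategy is compact enough to sketch. Below is a minimal PyTorch illustration, not the authors' code: the layer acts as the identity in the forward pass but flips (and scales) gradients on the way back, so a site-classification head trains the shared encoder to discard site-specific information. The names `GradReverse`, `SiteAdversary`, and `lambda_` are ours.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lambda on backward."""
    @staticmethod
    def forward(ctx, x, lambda_):
        ctx.lambda_ = lambda_
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambda_ * grad_output, None

class SiteAdversary(nn.Module):
    """Predicts the acquisition site from fused features through a reversed gradient."""
    def __init__(self, feat_dim: int, n_sites: int, lambda_: float = 1.0):
        super().__init__()
        self.lambda_ = lambda_
        self.head = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, n_sites))

    def forward(self, features):
        reversed_feats = GradReverse.apply(features, self.lambda_)
        return self.head(reversed_feats)

# Sketch of the multitask idea: the diagnosis head trains normally, while the
# site head's gradient is sign-flipped before it reaches the shared encoder.
features = torch.randn(8, 128, requires_grad=True)      # fused multimodal features
site_logits = SiteAdversary(128, n_sites=23)(features)
loss_site = nn.functional.cross_entropy(site_logits, torch.randint(0, 23, (8,)))
loss_site.backward()   # encoder gradients now push toward site-invariance
```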

24 pages, 1716 KB  
Article
Multi-Modal Decentralized Hybrid Learning for Early Parkinson’s Detection Using Voice Biomarkers and Contrastive Speech Embeddings
by Khaled M. Alhawiti
Sensors 2025, 25(22), 6959; https://doi.org/10.3390/s25226959 - 14 Nov 2025
Viewed by 671
Abstract
Millions worldwide are affected by Parkinson’s disease, with the World Health Organization highlighting its growing prevalence. Early neuromotor speech impairments make voice analysis a promising tool for detecting Parkinson’s, aided by advances in deep speech embeddings. However, existing approaches often rely on either handcrafted acoustic features or opaque deep representations, limiting diagnostic performance and interpretability. To address this, we propose a multi-modal decentralized hybrid learning framework that combines structured voice biomarkers from the UCI Parkinson’s dataset (195 sustained-phonation samples from 31 subjects) with contrastive speech embeddings derived from the DAIC-WOZ corpus (189 interview recordings originally collected for depression detection) using Wav2Vec 2.0. The system employs an early fusion strategy followed by a dense neural classifier optimized for binary classification. By integrating both clinically interpretable and semantically rich features, the model captures complementary phonatory and affective patterns relevant to early-stage Parkinson’s detection. Extensive evaluation demonstrates that the proposed method achieves an accuracy of 96.2% and an AUC of 97.1%, outperforming unimodal and baseline fusion models. SHAP-based analysis confirms that a subset of features has disproportionately high discriminative value, enhancing interpretability. Overall, the proposed framework establishes a promising pathway toward data-driven, non-invasive screening for neurodegenerative conditions through voice analysis. Full article
(This article belongs to the Special Issue Blockchain Technology for Internet of Things)
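Early fusion as described here amounts to concatenating the two feature views before any classifier sees them. A minimal sketch with synthetic placeholders standing in for the UCI biomarkers and mean-pooled Wav2Vec 2.0 embeddings; shapes and hyperparameters are assumptions, not the paper's:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Hypothetical shapes: 22 handcrafted voice biomarkers per sample (as in the
# UCI Parkinson's dataset) and 768-dim speech embeddings (e.g., mean-pooled
# Wav2Vec 2.0 frames). Random data stands in for real recordings.
rng = np.random.default_rng(0)
X_biomarkers = rng.normal(size=(195, 22))
X_embeddings = rng.normal(size=(195, 768))
y = rng.integers(0, 2, size=195)           # 1 = Parkinson's, 0 = healthy

# Early fusion: scale each modality, then concatenate into a single vector
# before the classifier is ever involved.
X_fused = np.hstack([
    StandardScaler().fit_transform(X_biomarkers),
    StandardScaler().fit_transform(X_embeddings),
])

clf = MLPClassifier(hidden_layer_sizes=(128, 32), max_iter=300, random_state=0)
clf.fit(X_fused, y)
print(f"train accuracy: {clf.score(X_fused, y):.3f}")
```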

18 pages, 3052 KB  
Article
Classifying Major Depressive Disorder Using Multimodal MRI Data: A Personalized Federated Algorithm
by Zhipeng Fan, Jingrui Xu, Jianpo Su and Dewen Hu
Brain Sci. 2025, 15(10), 1081; https://doi.org/10.3390/brainsci15101081 - 6 Oct 2025
Viewed by 964
Abstract
Background: Neuroimaging-based diagnostic approaches are of critical importance for the accurate diagnosis and treatment of major depressive disorder (MDD). However, multisite neuroimaging data often exhibit substantial heterogeneity in terms of scanner protocols and population characteristics. Moreover, concerns over data ownership, security, and privacy make raw MRI datasets from multiple sites inaccessible, posing significant challenges to the development of robust diagnostic models. Federated learning (FL) offers a privacy-preserving solution to facilitate collaborative model training across sites without sharing raw data. Methods: In this study, we propose the personalized Federated Gradient Matching and Contrastive Optimization (pF-GMCO) algorithm to address domain shift and support scalable MDD classification using multimodal MRI. Our method incorporates gradient matching based on cosine similarity to weight contributions from different sites adaptively, contrastive learning to promote client-specific model optimization, and multimodal compact bilinear (MCB) pooling to effectively integrate structural MRI (sMRI) and functional MRI (fMRI) features. Results and Conclusions: Evaluated on the REST-meta-MDD dataset with 2293 subjects from 23 sites, pF-GMCO achieved an accuracy of 79.07%, demonstrating superior performance and interpretability. This work provides an effective and privacy-aware framework for multisite MDD diagnosis using federated learning. Full article
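One plausible reading of cosine-similarity gradient matching in federated aggregation is to weight each site's update by how well it agrees with the consensus direction. A toy sketch under that assumption, not the pF-GMCO implementation:

```python
import numpy as np

def cosine(u, v, eps=1e-12):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))

def aggregate(global_w, client_ws):
    """Weight each site's update by its cosine similarity to the mean update."""
    updates = [w - global_w for w in client_ws]
    mean_update = np.mean(updates, axis=0)
    sims = np.array([max(cosine(u, mean_update), 0.0) for u in updates])
    if sims.sum() == 0:                       # fall back to plain averaging
        sims = np.ones(len(updates))
    weights = sims / sims.sum()
    return global_w + sum(a * u for a, u in zip(weights, updates))

# Toy round: three sites, one pushing in a conflicting (heterogeneous) direction.
global_w = np.zeros(4)
clients = [np.array([1.0, 1.0, 0, 0]),
           np.array([0.9, 1.1, 0, 0]),
           np.array([-1.0, -1.0, 0, 0])]
print(aggregate(global_w, clients))  # dominated by the two agreeing sites
```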

12 pages, 878 KB  
Communication
Depression Recognition Using Daily Wearable-Derived Physiological Data
by Xinyu Shui, Hao Xu, Shuping Tan and Dan Zhang
Sensors 2025, 25(2), 567; https://doi.org/10.3390/s25020567 - 19 Jan 2025
Cited by 8 | Viewed by 6995
Abstract
The objective identification of depression using physiological data has emerged as a significant research focus within the field of psychiatry. The advancement of wearable physiological measurement devices has opened new avenues for the identification of individuals with depression in everyday-life contexts. Compared to other objective measurement methods, wearables offer the potential for continuous, unobtrusive monitoring, which can capture subtle physiological changes indicative of depressive states. The present study leverages multimodal wristband devices to collect data from fifty-eight participants clinically diagnosed with depression during their normal daytime activities over six hours. Data collected include pulse wave, skin conductance, and triaxial acceleration. For comparison, we also utilized data from fifty-eight matched healthy controls from a publicly available dataset, collected using the same devices over equivalent durations. Our aim was to identify depressive individuals through the analysis of multimodal physiological measurements derived from wearable devices in daily life scenarios. We extracted static features such as the mean, variance, skewness, and kurtosis of physiological indicators like heart rate, skin conductance, and acceleration, as well as autoregressive coefficients of these signals reflecting the temporal dynamics. Utilizing a Random Forest algorithm, we distinguished depressive from non-depressive individuals with classification accuracies of 90.0%, 84.7%, 80.1%, and 76.0% on data aggregated over 6 h, 2 h, 30 min, and 5 min segments, respectively. Our results demonstrate the feasibility of using daily wearable-derived physiological data for depression recognition. The achieved classification accuracies suggest that this approach could be integrated into clinical settings for the early detection and monitoring of depressive symptoms. Future work will explore the potential of these methods for personalized interventions and real-time monitoring, offering a promising avenue for enhancing mental health care through the integration of wearable technology. Full article
(This article belongs to the Special Issue Wearable Technologies and Sensors for Healthcare and Wellbeing)
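The described feature pipeline (static moments plus autoregressive coefficients per signal, fed to a Random Forest) can be sketched compactly. Synthetic data stands in for the wristband signals; the AR order, segment length, and channel set are assumptions:

```python
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.ensemble import RandomForestClassifier

def ar_coeffs(x, order=4):
    """Least-squares autoregressive coefficients capturing temporal dynamics."""
    X = np.column_stack([x[i:len(x) - order + i] for i in range(order)])
    return np.linalg.lstsq(X, x[order:], rcond=None)[0]

def segment_features(signal):
    """Static moments plus AR coefficients for one physiological channel."""
    return np.concatenate([
        [signal.mean(), signal.var(), skew(signal), kurtosis(signal)],
        ar_coeffs(signal),
    ])

rng = np.random.default_rng(1)
# Hypothetical 5-min segments of heart rate, skin conductance, and
# acceleration magnitude for 58 depressed + 58 control participants.
X = np.array([
    np.concatenate([segment_features(rng.normal(size=300)) for _ in range(3)])
    for _ in range(116)
])
y = rng.integers(0, 2, size=116)   # 1 = depressed, 0 = healthy control

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(f"train accuracy: {clf.score(X, y):.2f}")
```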

18 pages, 2813 KB  
Article
Multimodal Data Fusion for Depression Detection Approach
by Mariia Nykoniuk, Oleh Basystiuk, Nataliya Shakhovska and Nataliia Melnykova
Computation 2025, 13(1), 9; https://doi.org/10.3390/computation13010009 - 2 Jan 2025
Cited by 15 | Viewed by 9103
Abstract
Depression is one of the most common mental health disorders in the world, affecting millions of people. Early detection of depression is crucial for effective medical intervention. Multimodal networks can greatly assist in the detection of depression, especially in situations in which patients are not always aware of or able to express their symptoms. By analyzing text and audio data, such networks are able to automatically identify patterns in speech and behavior that indicate a depressive state. In this study, we propose two multimodal information fusion networks: early and late fusion. These networks were developed using convolutional neural network (CNN) layers to learn local patterns, a bidirectional LSTM (Bi-LSTM) to process sequences, and a self-attention mechanism to improve focus on key parts of the data. The DAIC-WOZ and EDAIC-WOZ datasets were used for the experiments. The experiments compared the precision, recall, F1-score, and accuracy metrics for early and late multimodal data fusion and found that the early information fusion multimodal network achieved higher classification accuracy. On the test dataset, this network achieved an F1-score of 0.79 and an overall classification accuracy of 0.86, indicating its effectiveness in detecting depression. Full article
(This article belongs to the Special Issue Artificial Intelligence Applications in Public Health: 2nd Edition)
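A minimal PyTorch sketch of the early-fusion variant as described (CNN for local patterns, Bi-LSTM for sequences, self-attention for salient frames). All dimensions are illustrative; the real model's layer sizes and audio/text front-ends may differ:

```python
import torch
from torch import nn

class EarlyFusionNet(nn.Module):
    """Early fusion: concatenate per-frame audio and text features, then
    CNN -> Bi-LSTM -> self-attention -> classifier."""
    def __init__(self, audio_dim=40, text_dim=300, hidden=64):
        super().__init__()
        self.conv = nn.Conv1d(audio_dim + text_dim, hidden, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=4, batch_first=True)
        self.out = nn.Linear(2 * hidden, 2)   # depressed vs. not depressed

    def forward(self, audio, text):
        x = torch.cat([audio, text], dim=-1)            # (B, T, audio+text): fuse early
        x = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        x, _ = self.lstm(x)
        x, _ = self.attn(x, x, x)                       # self-attention over time
        return self.out(x.mean(dim=1))                  # pool over the sequence

logits = EarlyFusionNet()(torch.randn(4, 100, 40), torch.randn(4, 100, 300))
print(logits.shape)   # torch.Size([4, 2])
```

A late-fusion counterpart would run each modality through its own branch and merge only the branch outputs (or their decisions) at the end.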

20 pages, 2139 KB  
Article
Hypergraph Neural Network for Multimodal Depression Recognition
by Xiaolong Li, Yang Dong, Yunfei Yi, Zhixun Liang and Shuqi Yan
Electronics 2024, 13(22), 4544; https://doi.org/10.3390/electronics13224544 - 19 Nov 2024
Cited by 4 | Viewed by 2581
Abstract
Deep learning-based approaches for automatic depression recognition offer advantages of low cost and high efficiency. However, depression symptoms are challenging to detect and vary significantly between individuals. Traditional deep learning methods often struggle to capture and model these nuanced features effectively, leading to lower recognition accuracy. This paper introduces a novel multimodal depression recognition method, HYNMDR, which utilizes hypergraphs to represent the complex, high-order relationships among patients with depression. HYNMDR comprises two primary components: a temporal embedding module and a hypergraph classification module. The temporal embedding module employs a temporal convolutional network and a negative sampling loss function based on Euclidean distance to extract feature embeddings from unimodal and cross-modal long time-series data. To capture the unique ways in which depression may manifest in certain feature elements, the hypergraph classification module introduces a threshold segmentation-based hyperedge construction method. This is the first attempt to apply hypergraph neural networks to multimodal depression recognition. Experimental evaluations on the DAIC-WOZ and E-DAIC datasets demonstrate that HYNMDR outperforms existing methods in automatic depression monitoring, achieving an F1 score of 91.1% and an accuracy of 94.0%. Full article
(This article belongs to the Special Issue Digital Intelligence Technology and Applications)
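Threshold-based hyperedge construction plus one step of the standard spectral hypergraph convolution can be sketched in NumPy. This is our simplified reading (one hyperedge per feature element, a fixed percentile threshold), not the HYNMDR code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))       # 20 patients x 8 embedded feature elements

# Threshold-segmentation hyperedge construction (simplified): one hyperedge
# per feature element, connecting every patient whose value for that element
# exceeds a percentile threshold.
H = (X > np.percentile(X, 75, axis=0)).astype(float)    # incidence: nodes x edges

# One spectral hypergraph convolution step:
#   X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Theta
W = np.eye(H.shape[1])                          # uniform hyperedge weights
dv = H.sum(axis=1).clip(min=1)                  # node degrees
de = H.sum(axis=0).clip(min=1)                  # hyperedge degrees
Dv_isqrt = np.diag(1 / np.sqrt(dv))
De_inv = np.diag(1 / de)
Theta = rng.normal(size=(8, 4))                 # learnable projection in a real model
X_next = Dv_isqrt @ H @ W @ De_inv @ H.T @ Dv_isqrt @ X @ Theta
print(X_next.shape)   # (20, 4): smoothed, projected patient embeddings
```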

18 pages, 2001 KB  
Article
Multimodal Fusion of EEG and Audio Spectrogram for Major Depressive Disorder Recognition Using Modified DenseNet121
by Musyyab Yousufi, Robertas Damaševičius and Rytis Maskeliūnas
Brain Sci. 2024, 14(10), 1018; https://doi.org/10.3390/brainsci14101018 - 15 Oct 2024
Cited by 5 | Viewed by 6857
Abstract
Background/Objectives: This study investigates the classification of Major Depressive Disorder (MDD) using electroencephalography (EEG) Short-Time Fourier-Transform (STFT) spectrograms and audio Mel-spectrogram data of 52 subjects. The objective is to develop a multimodal classification model that integrates audio and EEG data to accurately identify depressive tendencies. Methods: We utilized the Multimodal open dataset for Mental Disorder Analysis (MODMA) and trained a pre-trained DenseNet121 model using transfer learning. Features from both the EEG and audio modalities were extracted and concatenated before being passed through the final classification layer. Additionally, an ablation study was conducted on both datasets separately. Results: The proposed multimodal classification model demonstrated superior performance compared to existing methods, achieving an accuracy of 97.53%, precision of 98.20%, F1 score of 97.76%, and recall of 97.32%. A confusion matrix was also used to evaluate the model’s effectiveness. Conclusions: The paper presents a robust multimodal classification approach that outperforms state-of-the-art methods with potential application in clinical diagnostics for depression assessment. Full article
(This article belongs to the Special Issue Computational Intelligence and Brain Plasticity)
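A sketch of the described pipeline in PyTorch: a pre-trained DenseNet121 backbone embeds each spectrogram, and the pooled features are concatenated before the final classification layer. Whether the backbone is shared across modalities is our assumption; the paper may use separate branches:

```python
import torch
from torch import nn
from torchvision import models

class DualSpectrogramNet(nn.Module):
    """DenseNet121 features for EEG STFT and audio Mel-spectrograms,
    concatenated before the classifier. Illustrative, not the authors' code."""
    def __init__(self):
        super().__init__()
        backbone = models.densenet121(weights="DEFAULT")  # downloads ImageNet weights
        self.features = backbone.features                 # conv feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(2 * 1024, 2)          # MDD vs. healthy

    def embed(self, spec):                                # spec: (B, 3, H, W)
        return self.pool(self.features(spec)).flatten(1)  # (B, 1024)

    def forward(self, eeg_spec, audio_spec):
        fused = torch.cat([self.embed(eeg_spec), self.embed(audio_spec)], dim=1)
        return self.classifier(fused)

model = DualSpectrogramNet()
out = model(torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224))
print(out.shape)   # torch.Size([2, 2])
```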

15 pages, 5909 KB  
Article
Abnormality in Peripheral and Brain Iron Contents and the Relationship with Grey Matter Volumes in Major Depressive Disorder
by Wenjia Liang, Bo Zhou, Zhongyan Miao, Xi Liu and Shuwei Liu
Nutrients 2024, 16(13), 2073; https://doi.org/10.3390/nu16132073 - 28 Jun 2024
Cited by 5 | Viewed by 2637
Abstract
Major depressive disorder (MDD) is a prevalent mental illness globally, yet its etiology remains largely elusive. Recent interest in the scientific community has focused on the correlation between the disruption of iron homeostasis and MDD. Prior studies have revealed anomalous levels of iron in both peripheral blood and the brain of MDD patients; however, these findings are not consistent. This study involved 95 MDD patients aged 18–35 and 66 sex- and age-matched healthy controls (HCs) who underwent 3D-T1 and quantitative susceptibility mapping (QSM) sequence scans to assess grey matter volume (GMV) and brain iron concentration, respectively. Plasma ferritin (pF) levels were measured with an enzyme-linked immunosorbent assay (ELISA) in a subset of 49 MDD patients and 41 HCs from whom blood samples were collected. We hypothesize that morphological brain changes in MDD patients are related to abnormal regulation of iron levels in the brain and periphery. The multimodal canonical correlation analysis plus joint independent component analysis (MCCA+jICA) algorithm was used to investigate the covariation patterns between brain iron concentration and GMV. The “MCCA+jICA” results showed that the QSM values in the bilateral globus pallidus and caudate nucleus of MDD patients were lower than in HCs, whereas in the bilateral thalamus and putamen they were higher. The GMV values of these brain regions showed a significant positive correlation with QSM. The GMV values of the bilateral putamen were increased in MDD patients compared with HCs, while a small portion of the thalamus showed reduced GMV values. Furthermore, the region of interest (ROI)-based comparison results in the basal ganglia structures align with the outcomes of the “MCCA+jICA” analysis. The ELISA results indicated that pF levels in MDD patients were higher than those in HCs, and correlation analysis revealed that the increase in pF was positively correlated with the iron content in the left thalamus. Finally, used as classification features in a support vector machine (SVM) model, the covariation patterns obtained from the “MCCA+jICA” analysis effectively differentiated MDD patients from HCs. Our findings indicate that elevated peripheral ferritin in MDD patients may disrupt the normal metabolism of iron in the brain, leading to abnormal changes in brain iron levels and GMV. Full article
(This article belongs to the Section Micronutrients and Human Health)
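MCCA+jICA itself is involved, but a simplified two-modality analogue conveys the idea: find maximally covarying QSM/GMV components, then feed the joint loadings to an SVM, mirroring the paper's final classification step. Everything below is synthetic, and scikit-learn's `CCA` stands in for the full decomposition:

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 161                                     # 95 MDD + 66 HC, as in the study
qsm = rng.normal(size=(n, 30))              # regional brain-iron (QSM) values
gmv = 0.5 * qsm + rng.normal(size=(n, 30))  # GMV values covarying with QSM
y = np.r_[np.ones(95), np.zeros(66)].astype(int)

# Maximally covarying QSM/GMV components (a two-modality stand-in for
# MCCA+jICA), then classify subjects on the joint loadings with an SVM.
cca = CCA(n_components=5)
qsm_c, gmv_c = cca.fit_transform(qsm, gmv)
features = np.hstack([qsm_c, gmv_c])

print(cross_val_score(SVC(kernel="linear"), features, y, cv=5).mean())
```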

51 pages, 795 KB  
Review
A Comprehensive Review on Synergy of Multi-Modal Data and AI Technologies in Medical Diagnosis
by Xi Xu, Jianqiang Li, Zhichao Zhu, Linna Zhao, Huina Wang, Changwei Song, Yining Chen, Qing Zhao, Jijiang Yang and Yan Pei
Bioengineering 2024, 11(3), 219; https://doi.org/10.3390/bioengineering11030219 - 25 Feb 2024
Cited by 102 | Viewed by 21423
Abstract
Disease diagnosis represents a critical and arduous endeavor within the medical field. Artificial intelligence (AI) techniques, spanning from machine learning and deep learning to large model paradigms, stand poised to significantly augment physicians in rendering more evidence-based decisions, thus presenting a pioneering solution for clinical practice. Traditionally, the amalgamation of diverse medical data modalities (e.g., image, text, speech, genetic data, physiological signals) is imperative to facilitate a comprehensive disease analysis, a topic of burgeoning interest among both researchers and clinicians in recent times. Hence, there exists a pressing need to synthesize the latest strides in multi-modal data and AI technologies in the realm of medical diagnosis. In this paper, we narrow our focus to five specific disorders (Alzheimer’s disease, breast cancer, depression, heart disease, epilepsy), elucidating advanced endeavors in their diagnosis and treatment through the lens of artificial intelligence. Our survey not only delineates detailed diagnostic methodologies across varying modalities but also underscores commonly utilized public datasets, the intricacies of feature engineering, prevalent classification models, and envisaged challenges for future endeavors. In essence, our research endeavors to contribute to the advancement of diagnostic methodologies, furnishing invaluable insights for clinical decision making. Full article
(This article belongs to the Special Issue Biomedical Application of Big Data and Artificial Intelligence)

14 pages, 616 KB  
Article
Mood Disorder Severity and Subtype Classification Using Multimodal Deep Neural Network Models
by Joo Hun Yoo, Harim Jeong, Ji Hyun An and Tai-Myoung Chung
Sensors 2024, 24(2), 715; https://doi.org/10.3390/s24020715 - 22 Jan 2024
Cited by 6 | Viewed by 4009
Abstract
The subtype diagnosis and severity classification of mood disorders have traditionally been made through the judgment of psychiatrists supported by validated assessment tools. Recently, however, many studies have used biomarker data collected from subjects to assist in diagnosis; most rely on heart rate variability (HRV) data, which reflect the balance of the autonomic nervous system, and perform classification through statistical analysis. In this research, three mood disorder severity or subtype classification algorithms are presented through multimodal analysis of collected heart-related variables and hidden features extracted from the time- and frequency-domain variables of HRV. Comparing the classification performance of the statistical analyses widely used in existing major depressive disorder (MDD), anxiety disorder (AD), and bipolar disorder (BD) classification studies with the multimodal deep neural network analysis newly proposed in this study, it was confirmed that the severity or subtype classification accuracy for each disorder improved by 0.118, 0.231, and 0.125 on average. The study confirms that deep learning analysis of biomarker data such as HRV can serve as a primary identification and diagnosis aid for mental disorders, and that it can help psychiatrists make objective diagnoses by indicating not only the diagnosed disease but also the current mood state. Full article
(This article belongs to the Special Issue Advanced Machine Intelligence for Biomedical Signal Processing)
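The HRV time- and frequency-domain variables mentioned here are standard and straightforward to compute. A sketch of typical descriptors (SDNN, RMSSD, LF/HF band powers); the exact feature set used in the paper may differ:

```python
import numpy as np
from scipy.signal import welch

def hrv_features(rr_ms, fs_interp=4.0):
    """Standard HRV descriptors from an RR-interval series (milliseconds)."""
    rr = np.asarray(rr_ms, dtype=float)
    sdnn = rr.std()                                    # time domain
    rmssd = np.sqrt(np.mean(np.diff(rr) ** 2))
    # Resample RR intervals onto a uniform grid for spectral analysis.
    t = np.cumsum(rr) / 1000.0
    grid = np.arange(t[0], t[-1], 1.0 / fs_interp)
    rr_even = np.interp(grid, t, rr)
    f, pxx = welch(rr_even - rr_even.mean(), fs=fs_interp,
                   nperseg=min(256, len(grid)))
    df = f[1] - f[0]                                   # frequency domain
    lf = pxx[(f >= 0.04) & (f < 0.15)].sum() * df
    hf = pxx[(f >= 0.15) & (f < 0.40)].sum() * df
    return {"SDNN": sdnn, "RMSSD": rmssd, "LF": lf, "HF": hf, "LF/HF": lf / hf}

rr = 800 + 50 * np.random.default_rng(0).standard_normal(600)  # synthetic RR series
print(hrv_features(rr))
```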

37 pages, 3956 KB  
Article
Towards Personalised Mood Prediction and Explanation for Depression from Biophysical Data
by Sobhan Chatterjee, Jyoti Mishra, Frederick Sundram and Partha Roop
Sensors 2024, 24(1), 164; https://doi.org/10.3390/s24010164 - 27 Dec 2023
Cited by 8 | Viewed by 3595
Abstract
Digital health applications using Artificial Intelligence (AI) are a promising opportunity to address the widening gap between available resources and mental health needs globally. Increasingly, passively acquired data from wearables are augmented with carefully selected active data from depressed individuals to develop Machine Learning (ML) models of depression based on mood scores. However, most ML models are black box in nature, and hence the outputs are not explainable. Depression is also multimodal, and the reasons for depression may vary significantly between individuals. Explainable and personalised models will thus be beneficial to clinicians to determine the main features that lead to a decline in the mood state of a depressed individual, thus enabling suitable personalised therapy. This is currently lacking. Therefore, this study presents a methodology for developing personalised and accurate Deep Learning (DL)-based predictive mood models for depression, along with novel methods for identifying the key facets that lead to the exacerbation of depressive symptoms. We illustrate our approach by using an existing multimodal dataset containing longitudinal Ecological Momentary Assessments of depression, lifestyle data from wearables and neurocognitive assessments for 14 mildly to moderately depressed participants over one month. We develop classification- and regression-based DL models to predict participants’ current mood scores—a discrete score given to a participant based on the severity of their depressive symptoms. The models are trained inside eight different evolutionary-algorithm-based optimisation schemes that optimise the model parameters for maximum predictive performance. A five-fold cross-validation scheme is used to verify the DL model’s predictive performance against 10 classical ML-based models, with a model error as low as 6% for some participants. We use the best model from the optimisation process to extract indicators, using SHAP, ALE and Anchors from the explainable AI literature to explain why certain predictions are made and how they affect mood. These feature insights can assist health professionals in incorporating personalised interventions into a depressed individual’s treatment regimen. Full article
(This article belongs to the Special Issue Use of Smart Wearable Sensors and AI Methods in Providing P4 Medicine)
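SHAP attribution on a fitted mood-score regressor can be sketched with the model-agnostic `KernelExplainer`. Everything below (model, features, data) is synthetic and illustrative; ALE and Anchors would come from separate libraries:

```python
import numpy as np
import shap                                    # pip install shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Hypothetical daily features: sleep duration, step count, HRV, screen time.
X = rng.normal(size=(200, 4))
mood = 2 * X[:, 0] - X[:, 2] + rng.normal(scale=0.3, size=200)  # synthetic mood score

model = GradientBoostingRegressor().fit(X, mood)

# Model-agnostic SHAP values: how much each feature pushed a day's predicted
# mood away from the average prediction over the background sample.
explainer = shap.KernelExplainer(model.predict, shap.sample(X, 50))
shap_values = explainer.shap_values(X[:5])
print(np.round(shap_values, 2))    # one attribution per feature per day
```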

20 pages, 6790 KB  
Article
Cross-Silo, Privacy-Preserving, and Lightweight Federated Multimodal System for the Identification of Major Depressive Disorder Using Audio and Electroencephalogram
by Chetna Gupta, Vikas Khullar, Nitin Goyal, Kirti Saini, Ritu Baniwal, Sushil Kumar and Rashi Rastogi
Diagnostics 2024, 14(1), 43; https://doi.org/10.3390/diagnostics14010043 - 25 Dec 2023
Cited by 15 | Viewed by 2691
Abstract
In this day and age, depression is still one of the biggest problems in the world. If left untreated, it can lead to suicidal thoughts and attempts. There is a need for proper diagnoses of Major Depressive Disorder (MDD) and evaluation of the early stages to stop the side effects. Early detection is critical to identify a variety of serious conditions. In order to provide safe and effective protection to MDD patients, it is crucial to automate diagnoses and make decision-making tools widely available. Although there are various classification systems for the diagnosis of MDD, no reliable, secure method that meets these requirements has been established to date. In this paper, a federated deep learning-based multimodal system for MDD classification using electroencephalography (EEG) and audio datasets is presented while meeting data privacy requirements. The performance of the federated learning (FL) model was tested on independent and identically distributed (IID) and non-IID data. The study began by extracting features with several pre-trained models and ultimately selected bidirectional long short-term memory (Bi-LSTM) as the base model, as it had the highest validation accuracy of 91% on audio data, compared to 85% and 89% for a convolutional neural network and an LSTM, respectively. The Bi-LSTM model also achieved a validation accuracy of 98.9% for EEG data. The FL method was then used to perform experiments on IID and non-IID datasets. The FL-based multimodal model achieved an exceptional training and validation accuracy of 99.9% when trained and evaluated on both IID and non-IID datasets. These results show that the FL multimodal system performs almost as well as the Bi-LSTM multimodal system and emphasize its suitability for processing IID and non-IID data. Several clients were found to perform better than conventional pre-trained models in a multimodal framework for federated learning using EEG and audio datasets. The proposed framework stands out from other classification techniques for MDD due to its special features, such as multimodality and data privacy for edge machines with limited resources. Due to these additional features, the framework concept is the most suitable alternative approach for the early classification of MDD patients. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
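The federated backbone of such a system is typically plain federated averaging: clients train locally on their EEG/audio data and share only model weights. A minimal sketch (not the paper's code) of one size-weighted aggregation round:

```python
import copy
import torch
from torch import nn

def fed_avg(client_states, client_sizes):
    """Federated averaging: size-weighted mean of client model parameters,
    so raw EEG/audio never leaves the client; only weights are shared."""
    total = sum(client_sizes)
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(s[key] * (n / total)
                       for s, n in zip(client_states, client_sizes))
    return avg

# Toy round with three clients holding different amounts of local data.
def make_model():
    return nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2))

clients = [make_model().state_dict() for _ in range(3)]
server = make_model()
server.load_state_dict(fed_avg(clients, client_sizes=[120, 60, 20]))
print("aggregation round complete")
```

Non-IID experiments then amount to giving each client a skewed slice of the data (for example, different subjects or class proportions) and repeating the same aggregation.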

24 pages, 1451 KB  
Review
Scoping Review on the Multimodal Classification of Depression and Experimental Study on Existing Multimodal Models
by Umut Arioz, Urška Smrke, Nejc Plohl and Izidor Mlakar
Diagnostics 2022, 12(11), 2683; https://doi.org/10.3390/diagnostics12112683 - 3 Nov 2022
Cited by 18 | Viewed by 5790
Abstract
Depression is a prevalent comorbidity in patients with severe physical disorders, such as cancer, stroke, and coronary diseases. Although it can significantly impact the course of the primary disease, the signs of depression are often underestimated and overlooked. The aim of this paper was to review algorithms for the automatic, uniform, and multimodal classification of signs of depression from human conversations and to evaluate their accuracy. For the scoping review, the PRISMA guidelines for scoping reviews were followed. In the scoping review, the search yielded 1095 papers, out of which 20 papers (8.26%) included more than two modalities, and three of those papers provided code. Within the scope of this review, support vector machine (SVM), random forest (RF), and long short-term memory network (LSTM; with gated and non-gated recurrent units) models, as well as different combinations of features, were identified as the most widely researched techniques. We tested the models using the DAIC-WOZ dataset (original training dataset) and using the SymptomMedia dataset to further assess their reliability and dependency on the nature of the training datasets. The best performance was obtained by the LSTM with gated recurrent units (F1-score of 0.64 for the DAIC-WOZ dataset). However, with a drop to an F1-score of 0.56 for the SymptomMedia dataset, the method also appears to be the most data-dependent. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

37 pages, 2486 KB  
Review
Electrocardiogram-Based Emotion Recognition Systems and Their Applications in Healthcare—A Review
by Muhammad Anas Hasnul, Nor Azlina Ab. Aziz, Salem Alelyani, Mohamed Mohana and Azlan Abd. Aziz
Sensors 2021, 21(15), 5015; https://doi.org/10.3390/s21155015 - 23 Jul 2021
Cited by 162 | Viewed by 17165
Abstract
Affective computing is a field of study that integrates human affects and emotions with artificial intelligence into systems or devices. A system or device with affective computing is beneficial for the mental health and wellbeing of individuals who are stressed, anguished, or depressed. Emotion recognition systems are an important technology that enables affective computing. Currently, there are many ways to build an emotion recognition system using various techniques and algorithms. This review paper focuses on emotion recognition research that adopted electrocardiograms (ECGs) as a unimodal approach as well as part of a multimodal approach for emotion recognition systems. Critical observations of data collection, pre-processing, feature extraction, feature selection and dimensionality reduction, classification, and validation are conducted. This paper also highlights the architectures with accuracies above 90%. The available ECG-inclusive affective databases are also reviewed, and a popularity analysis is presented. Additionally, the benefit of emotion recognition systems towards healthcare systems is also reviewed here. Based on the literature reviewed, a thorough discussion of the subject matter and future work is presented. The findings presented here are beneficial for prospective researchers looking into the summary of previous works conducted in the field of ECG-based emotion recognition systems, for identifying gaps in the area, and for developing and designing future applications of emotion recognition systems, especially in improving healthcare. Full article
(This article belongs to the Special Issue Computer Aided Diagnosis Sensors)

17 pages, 810 KB  
Article
Multi-Modal Adaptive Fusion Transformer Network for the Estimation of Depression Level
by Hao Sun, Jiaqing Liu, Shurong Chai, Zhaolin Qiu, Lanfen Lin, Xinyin Huang and Yenwei Chen
Sensors 2021, 21(14), 4764; https://doi.org/10.3390/s21144764 - 12 Jul 2021
Cited by 78 | Viewed by 7996
Abstract
Depression is a severe psychological condition that affects millions of people worldwide. As depression has received more attention in recent years, it has become imperative to develop automatic methods for detecting depression. Although numerous machine learning methods have been proposed for estimating the levels of depression via audio, visual, and audiovisual emotion sensing, several challenges still exist. For example, it is difficult to extract long-term temporal context information from long sequences of audio and visual data, and it is also difficult to select and fuse useful multi-modal information or features effectively. In addition, incorporating other information or tasks to enhance estimation accuracy remains a challenge. In this study, we propose a multi-modal adaptive fusion transformer network for estimating the levels of depression. Transformer-based models have achieved state-of-the-art performance in language understanding and sequence modeling; thus, the proposed transformer-based network is utilized to extract long-term temporal context information from uni-modal audio and visual data in our work. This is the first transformer-based approach for depression detection. We also propose an adaptive fusion method for adaptively fusing useful multi-modal features. Furthermore, inspired by current multi-task learning work, we incorporate an auxiliary task (depression classification) to enhance the main task of depression level regression (estimation). The effectiveness of the proposed method has been validated on a public dataset (AVEC 2019 Detecting Depression with AI Sub-challenge) in terms of PHQ-8 scores. Experimental results indicate that the proposed method achieves better performance than current state-of-the-art methods, with a concordance correlation coefficient (CCC) of 0.733 on AVEC 2019, which is 6.2% higher than that of the previous state-of-the-art method (CCC = 0.696). Full article
(This article belongs to the Special Issue Artificial Intelligence and Internet of Things in Healthcare Systems)
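The reported metric, the concordance correlation coefficient, is a fixed formula worth having at hand: Pearson correlation penalized by mean and variance mismatch between predictions and targets. A small NumPy implementation with synthetic PHQ-8 scores:

```python
import numpy as np

def ccc(y_true, y_pred):
    """Concordance correlation coefficient, the AVEC metric for PHQ-8 regression:
    2*cov(t, p) / (var(t) + var(p) + (mean(t) - mean(p))^2)."""
    mu_t, mu_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = ((y_true - mu_t) * (y_pred - mu_p)).mean()
    return 2 * cov / (var_t + var_p + (mu_t - mu_p) ** 2)

rng = np.random.default_rng(0)
phq8 = rng.integers(0, 25, size=100).astype(float)   # true severity scores
pred = phq8 + rng.normal(scale=3.0, size=100)        # hypothetical model output
print(f"CCC = {ccc(phq8, pred):.3f}")
```

Unlike plain Pearson correlation, CCC drops when predictions are systematically shifted or rescaled, which is why it is the standard choice for depression-severity regression.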
