Large Language Models for Depression Detection: A Review with Prospects of Incomplete Multimodality
Highlights
- What are the main findings?
  - Depression is a growing global mental health burden, and depression recognition has advanced considerably through unimodal analyses, multimodal analyses, and large language model (LLM)-based methods.
  - However, research on incomplete multimodal data in real-world clinical settings remains limited, which is a key gap in current depression-recognition methodologies.
- What are the implications of the main findings?
  - Future research should develop incomplete multimodal learning strategies that handle missing or noisy modalities in real-world clinical scenarios.
  - LLM-driven adaptive and knowledge-aware frameworks will be essential for building robust, clinically reliable, and interpretable depression-recognition systems.
Abstract
1. Introduction
2. Depression Assessment Scale
3. Depression Dataset
| Dataset | Subjects | Annotation | Language | Modality (A = audio, V = video, T = text) | Public/Private | Year |
|---|---|---|---|---|---|---|
| AVEC2013 [41] | 292 | BDI-II | German | A + V | Public | 2013 |
| AVEC2014 [51] | 292 | BDI-II | German | A + V | Public | 2014 |
| AVEC2017 [53] | - | PHQ-8 | English | - | Public | 2017 |
| AVEC2019 [54] | - | PHQ-8 | English | - | Public | 2019 |
| DAIC-WoZ [43] | 110 | PHQ-9 | English | A + V + T | Public | 2014 |
| Rochester [62] | 27 | Manual annotation | - | V | Private | 2015 |
| CHI-MEI [52] | 53 | DSSS, HAMD | - | V | Private | 2016 |
| Pittsburgh [55] | 57 | DSM-IV, HAMD > 15 | English | A + V | Public | 2018 |
| BD [56] | 46 | DSM-V | - | A + V | Public | 2018 |
| MODMA [44] | 163 | PHQ-9 | Chinese | A + EEG | Public | 2020 |
| Blackdog [58] | 80 | Task completion | English | A + V | Private | 2009 |
| DEPAC [57] | 571 | Task completion | English | A | Public | 2023 |
| MMDA [59] | 524 | Interview and task completion | Chinese | A + V + T | Public | 2022 |
| CMDC [60] | 78 | Semi-structured interview | Chinese | A + V + T | Public | 2023 |
| EATD [61] | 162 | Semi-structured interview | Chinese | A + T | Public | 2022 |
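
As a convenience for working with the table above, the following minimal sketch encodes a subset of its rows and selects the public corpora that provide at least two modalities. Such corpora are one plausible starting point for experiments that simulate a missing modality. The `Dataset` class and the helper names `parse_modalities` and `multimodal_public` are hypothetical and introduced here only for illustration; the row values are copied from the table, with "-" treated as "no modality listed".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    modalities: frozenset  # modality codes such as {"A", "V", "T"}
    public: bool


def parse_modalities(cell: str) -> frozenset:
    """Turn a table cell such as 'A + V + T' into a set of modality codes."""
    if cell.strip() == "-":
        return frozenset()  # the table lists no modality for this corpus
    return frozenset(part.strip() for part in cell.split("+"))


# (name, modality cell, availability cell), copied from a subset of table rows.
ROWS = [
    ("AVEC2013", "A + V", "Public"),
    ("AVEC2017", "-", "Public"),
    ("DAIC-WoZ", "A + V + T", "Public"),
    ("Rochester", "V", "Private"),
    ("MODMA", "A + EEG", "Public"),
    ("Blackdog", "A + V", "Private"),
    ("DEPAC", "A", "Public"),
    ("CMDC", "A + V + T", "Public"),
    ("EATD", "A + T", "Public"),
]

DATASETS = [Dataset(n, parse_modalities(m), a == "Public") for n, m, a in ROWS]


def multimodal_public(datasets, min_modalities=2):
    """Names of public datasets that offer at least `min_modalities` modalities."""
    return [
        d.name
        for d in datasets
        if d.public and len(d.modalities) >= min_modalities
    ]


print(multimodal_public(DATASETS))
# ['AVEC2013', 'DAIC-WoZ', 'MODMA', 'CMDC', 'EATD']
```

Raising `min_modalities` to 3 narrows the selection to the corpora listed with audio, video, and text together.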
4. Depression Detection
4.1. Unimodal Depression Detection
4.1.1. Visual Modality
4.1.2. Audio Modality
4.1.3. Text Modality
4.1.4. EEG Modality
4.2. Multimodal Depression Detection
4.2.1. Physiological Signal and Behavioral Modality Fusion
4.2.2. Multi-Behavioral Modality Fusion
5. Detection of Depression Using LLMs
5.1. Applications of Basic LLMs in Depression Detection
5.2. Optimized LLMs for Depression Detection
5.2.1. Prompt Engineering
5.2.2. Fine-Tuning and Performance Optimization
5.3. Innovations of MLLMs for Depression Detection
5.3.1. Shallow Feature Concatenation
5.3.2. Deep Semantic Fusion
6. Incomplete Multimodal Learning for Depression Detection: Challenges and Prospects
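
To make the missing-modality setting concrete, here is a minimal sketch of one very simple baseline: averaging the feature vectors of whichever modalities are present (masked mean fusion). It is not a method taken from the works surveyed in this section, and it is far simpler than the generative and joint learning approaches discussed below. The function name `fuse_available`, the modality keys, and the toy feature values are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

Vector = List[float]


def fuse_available(features: Dict[str, Optional[Vector]]) -> Vector:
    """Average the feature vectors of the modalities that are present.

    A modality whose value is None is treated as missing and skipped,
    so the fused vector keeps the same dimension however many
    modalities actually arrive.
    """
    present = [vec for vec in features.values() if vec is not None]
    if not present:
        raise ValueError("at least one modality must be available")
    dim = len(present[0])
    if any(len(vec) != dim for vec in present):
        raise ValueError("all modality vectors must share one dimension")
    return [sum(vec[i] for vec in present) / len(present) for i in range(dim)]


# Toy per-modality features (hypothetical values, already projected to 2-D).
complete = {"audio": [1.0, 2.0], "video": [4.0, 0.0], "text": [1.0, 4.0]}
no_video = {"audio": [1.0, 2.0], "video": None, "text": [1.0, 4.0]}

print(fuse_available(complete))  # [2.0, 2.0]
print(fuse_available(no_video))  # [1.0, 3.0]
```

The fused vector shifts when the video features are absent, which illustrates why a downstream classifier trained only on complete data may degrade under missing modalities.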
6.1. Incomplete Modal Sentiment Analysis
6.1.1. Generative Methods
6.1.2. Joint Learning Methods
6.2. Incomplete Modality Depression Detection Methods
6.3. Future Research Prospect
6.4. Limitations of the Study and Clinical Translation Considerations
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| LLM | Large language model |
| MLLM | Multimodal large language model |
| ML | Machine learning |
| AI | Artificial intelligence |
| DL | Deep learning |
| CNN | Convolutional neural network |
References
- Xiang, A.H.; Martinez, M.P.; Chow, T.; Carter, S.A.; Negriff, S.; Velasquez, B.; Spitzer, J.; Zuberbuhler, J.C.; Zucker, A.; Kumar, S. Depression and anxiety among US children and young adults. JAMA Netw. Open 2024, 7, e2436906.
- Penninx, B.W.; Lamers, F.; Jansen, R.; Berk, M.; Khandaker, G.M.; De Picker, L.; Milaneschi, Y. Immuno-metabolic depression: From concept to implementation. Lancet Reg. Health 2025, 48, 101166.
- Zhang, Y.; Jia, X.; Yang, Y.; Sun, N.; Shi, S.; Wang, W. Change in the global burden of depression from 1990–2019 and its prediction for 2030. J. Psychiatr. Res. 2024, 178, 16–22.
- Zafar, A.; Aftab, D.; Qureshi, R.; Wang, Y.; Yan, H. Multi-explainable temporalNet: An interpretable multimodal approach using temporal convolutional network for user-level depression detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 2258–2265.
- Yin, H.; Wardenaar, K.J.; Xu, G.; Tian, H.; Schoevers, R.A. Help-seeking behaviors among Chinese people with mental disorders: A cross-sectional study. BMC Psychiatry 2019, 19, 373.
- Wang, Y.; Lin, Z.; Yang, C.; Zhou, Y.; Yang, Y. Automatic depression recognition with an ensemble of multimodal spatio-temporal routing features. IEEE Trans. Affect. Comput. 2025, 16, 1855–1872.
- Niu, M.; Zhao, Z.; Tao, J.; Li, Y.; Schuller, B.W. Selective element and two orders vectorization networks for automatic depression severity diagnosis via facial changes. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 8065–8077.
- He, M.; Bakker, E.M.; Lew, M.S. DPD (DePression Detection) Net: A deep neural network for multimodal depression detection. Health Inf. Sci. Syst. 2024, 12, 53.
- Fan, H.; Zhang, X.; Xu, Y.; Fang, J.; Zhang, S.; Zhao, X.; Yu, J. Transformer-based multimodal feature enhancement networks for multimodal depression detection integrating video, audio and remote photoplethysmograph signals. Inf. Fusion 2024, 104, 102161.
- Chen, Z.; Wang, D.; Lou, L.; Zhang, S.; Zhao, X.; Jiang, S.; Yu, J.; Xiao, J. Text-guided multimodal depression detection via cross-modal feature reconstruction and decomposition. Inf. Fusion 2025, 117, 102861.
- Zhou, L.; Hu, B.; Guan, Z.H. MDRA: A multimodal depression risk assessment model using audio and text. IEEE Signal Process. Lett. 2025, 32, 2045–2049.
- Fröhlich, H.; Balling, R.; Beerenwinkel, N.; Kohlbacher, O.; Kumar, S.; Lengauer, T.; Maathuis, M.H.; Moreau, Y.; Murphy, S.A.; Przytycka, T.M.; et al. From hype to reality: Data science enabling personalized medicine. BMC Med. 2018, 16, 150.
- Brunn, M.; Diefenbacher, A.; Courtet, P.; Genieys, W. The future is knocking: How artificial intelligence will fundamentally change psychiatry. Acad. Psychiatry 2020, 44, 461–466.
- Squires, M.; Tao, X.; Elangovan, S.; Gururajan, R.; Zhou, X.; Acharya, U.R.; Li, Y. Deep learning and machine learning in psychiatry: A survey of current progress in depression detection, diagnosis and treatment. Brain Inform. 2023, 10, 10.
- Jin, Y.; Liu, J.; Li, P.; Wang, B.; Yan, Y.; Zhang, H.; Ni, C.; Wang, J.; Li, Y.; Bu, Y.; et al. The applications of large language models in mental health: Scoping review. J. Med. Internet Res. 2025, 27, e69284.
- Hasib, K.M.; Islam, M.R.; Sakib, S.; Akbar, M.A.; Razzak, I.; Alam, M.S. Depression detection from social networks data based on machine learning and deep learning techniques: An interrogative survey. IEEE Trans. Comput. Soc. Syst. 2023, 10, 1568–1586.
- Tahir, W.B.; Khalid, S.; Almutairi, S.; Abohashrh, M.; Memon, S.A.; Khan, J. Depression detection in social media: A comprehensive review of machine learning and deep learning techniques. IEEE Access 2025, 13, 12789–12818.
- Bzdok, D.; Meyer-Lindenberg, A. Machine learning for precision psychiatry: Opportunities and challenges. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 2018, 3, 223–230.
- Pinto, S.J.; Parente, M. Comprehensive review of depression detection techniques based on machine learning approach. Soft Comput. 2024, 28, 10701–10725.
- Thieme, A.; Belgrave, D.; Doherty, G. Machine learning in mental health: A systematic review of the HCI literature to support the development of effective and implementable ML systems. ACM Trans. Comput.-Hum. Interact. 2020, 27, 1–53.
- Yu, J.; Xue, A.; Redei, E.; Bagheri, N. A support vector machine model provides an accurate transcript-level-based diagnostic for major depressive disorder. Transl. Psychiatry 2016, 6, e931.
- Sun, G.; Shinba, T.; Kirimoto, T.; Matsui, T. An objective screening method for major depressive disorder using logistic regression analysis of heart rate variability data obtained in a mental task paradigm. Front. Psychiatry 2016, 7, 180.
- Yang, L.; Jiang, D.; He, L.; Pei, E.; Oveneke, M.C.; Sahli, H. Decision tree based depression classification from audio video and language information. In Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge; Association for Computing Machinery: New York, NY, USA, 2016; pp. 89–96.
- Sarkar, A.; Singh, A.; Chakraborty, R. A deep learning-based comparative study to track mental depression from EEG data. Neurosci. Inform. 2022, 2, 100039.
- Sau, A.; Bhakta, I. Artificial neural network (ANN) model to predict depression among geriatric population at a slum in Kolkata, India. J. Clin. Diagn. Res. 2017, 11, VC01.
- Allahyari, E. Predicting elderly depression: An artificial neural network model. Iran. J. Psychiatry Behav. Sci. 2019, 13, e98497.
- Jiang, H.; Hu, B.; Liu, Z.; Wang, G.; Zhang, L.; Li, X.; Kang, H. Detecting depression using an ensemble logistic regression model based on multiple speech features. Comput. Math. Methods Med. 2018, 2018, 6508319.
- Saeedi, M.; Saeedi, A.; Maghsoudi, A. Major depressive disorder assessment via enhanced k-nearest neighbor method and EEG signals. Phys. Eng. Sci. Med. 2020, 43, 1007–1018.
- Sau, A.; Bhakta, I. Screening of anxiety and depression among seafarers using machine learning technology. Inform. Med. Unlocked 2019, 16, 100228.
- Lin, L.; Chen, X.; Shen, Y.; Zhang, L. Towards automatic depression detection: A BiLSTM/1D CNN-based model. Appl. Sci. 2020, 10, 8701.
- Rejaibi, E.; Komaty, A.; Meriaudeau, F.; Agrebi, S.; Othmani, A. MFCC-based recurrent neural network for automatic clinical depression recognition and assessment from speech. Biomed. Signal Process. Control 2022, 71, 103107.
- Geng, X.F.; Xu, J.H. Application of autoencoder in depression diagnosis. DEStech Trans. Comput. Sci. Eng. 2017, 146–151.
- Bauer, B.; Norel, R.; Leow, A.; Rached, Z.A.; Wen, B.; Cecchi, G. Using large language models to understand suicidality in a social media–based taxonomy of mental health disorders: Linguistic analysis of Reddit posts. JMIR Ment. Health 2024, 11, e57234.
- Lan, X.; Han, Z.; Cheng, Y.; Sheng, L.; Feng, J.; Gao, C.; Li, Y. Depression detection on social media with large language models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track; Association for Computational Linguistics: Stroudsburg, PA, USA, 2025; pp. 2155–2171.
- Hu, Y.; Zhang, S.; Dang, T.; Jia, H.; Salim, F.D.; Hu, W.; Quigley, A.J. Exploring large-scale language models to evaluate EEG-based multimodal data for mental health. In Proceedings of the Companion of the 2024 on ACM International Joint Conference on Pervasive and Ubiquitous Computing; Association for Computing Machinery: New York, NY, USA, 2024; pp. 412–417.
- McCoy, T.H.; Castro, V.M.; Perlis, R.H. Estimating depression severity in narrative clinical notes using large language models. J. Affect. Disord. 2025, 381, 270–274.
- Xin, A.W.; Nielson, D.M.; Krause, K.R.; Fiorini, G.; Midgley, N.; Pereira, F.; Lossio-Ventura, J.A. Using large language models to detect outcomes in qualitative studies of adolescent depression. J. Am. Med. Inform. Assoc. 2026, 33, 79–89.
- Liu, J.; Ding, P.; Chen, J. DepGLM: Depression degree recognition on social media based on large language models. Digit. Health 2025, 11, 20552076251408281.
- Doraiswamy, P.M.; Blease, C.; Bodner, K. Artificial intelligence and the future of psychiatry: Insights from a global physician survey. Artif. Intell. Med. 2020, 102, 101753.
- Wu, P.; Wang, R.; Lin, H.; Zhang, F.; Tu, J.; Sun, M. Automatic depression recognition by intelligent speech signal processing: A systematic survey. CAAI Trans. Intell. Technol. 2023, 8, 701–711.
- Valstar, M.; Schuller, B.; Smith, K.; Eyben, F.; Jiang, B.; Bilakhia, S.; Schnieder, S.; Cowie, R.; Pantic, M. AVEC 2013: The continuous audio/visual emotion and depression recognition challenge. In Proceedings of the 3rd ACM International Workshop on Audio/Visual Emotion Challenge; Association for Computing Machinery: New York, NY, USA, 2013; pp. 3–10.
- Beck, A.T.; Steer, R.A.; Brown, G. Beck depression inventory–II. In APA PsycTests; American Psychological Association: Washington, DC, USA, 1996.
- Gratch, J.; Artstein, R.; Lucas, G.M.; Stratou, G.; Scherer, S.; Nazarian, A.; Wood, R.; Boberg, J.; DeVault, D.; Marsella, S.; et al. The distress analysis interview corpus of human and computer interviews. In Proceedings of the LREC, Reykjavik, Iceland, 26–31 May 2014; Volume 14, pp. 3123–3128.
- Cai, H.; Yuan, Z.; Gao, Y.; Sun, S.; Li, N.; Tian, F.; Xiao, H.; Li, J.; Yang, Z.; Li, X.; et al. A multi-modal open dataset for mental-disorder analysis. Sci. Data 2022, 9, 178.
- American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, 5th ed.; American Psychiatric Publishing: Arlington, VA, USA, 2013.
- Kroenke, K.; Strine, T.W.; Spitzer, R.L.; Williams, J.B.; Berry, J.T.; Mokdad, A.H. The PHQ-8 as a measure of current depression in the general population. J. Affect. Disord. 2009, 114, 163–173.
- Kroenke, K.; Spitzer, R.L.; Williams, J.B. The PHQ-9: Validity of a brief depression severity measure. J. Gen. Intern. Med. 2001, 16, 606–613.
- Hamilton, M. The Hamilton rating scale for depression. In Assessment of Depression; Springer: Berlin/Heidelberg, Germany, 1986; pp. 143–152.
- Rush, A.J.; Trivedi, M.H.; Ibrahim, H.M.; Carmody, T.J.; Arnow, B.; Klein, D.N.; Markowitz, J.C.; Ninan, P.T.; Kornstein, S.; Manber, R.; et al. The 16-Item Quick Inventory of Depressive Symptomatology (QIDS), clinician rating (QIDS-C), and self-report (QIDS-SR): A psychometric evaluation in patients with chronic major depression. Biol. Psychiatry 2003, 54, 573–583.
- Montgomery, S.A.; Åsberg, M. A new depression scale designed to be sensitive to change. Br. J. Psychiatry 1979, 134, 382–389.
- Valstar, M.; Schuller, B.; Smith, K.; Almaev, T.; Eyben, F.; Krajewski, J.; Cowie, R.; Pantic, M. AVEC 2014: 3D dimensional affect and depression recognition challenge. In Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge; Association for Computing Machinery: New York, NY, USA, 2014; pp. 3–10.
- Huang, K.Y.; Wu, C.H.; Kuo, Y.T.; Jang, F.L. Unipolar depression vs. bipolar disorder: An elicitation-based approach to short-term detection of mood disorder. In Proceedings of the Interspeech 2016, San Francisco, CA, USA, 8–12 September 2016; pp. 1452–1456.
- Ringeval, F.; Schuller, B.; Valstar, M.; Gratch, J.; Cowie, R.; Scherer, S.; Mozgai, S.; Cummins, N.; Schmitt, M.; Pantic, M. AVEC 2017: Real-life depression, and affect recognition workshop and challenge. In Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge; Association for Computing Machinery: New York, NY, USA, 2017; pp. 3–9.
- Ringeval, F.; Schuller, B.; Valstar, M.; Cummins, N.; Cowie, R.; Tavabi, L.; Schmitt, M.; Alisamir, S.; Amiriparian, S.; Messner, E.M.; et al. AVEC 2019 workshop and challenge: State-of-mind, detecting depression with AI, and cross-cultural affect recognition. In Proceedings of the 9th International on Audio/Visual Emotion Challenge and Workshop; Association for Computing Machinery: New York, NY, USA, 2019; pp. 3–12.
- Dibeklioğlu, H.; Hammal, Z.; Cohn, J.F. Dynamic multimodal measurement of depression severity using deep autoencoding. IEEE J. Biomed. Health Inform. 2017, 22, 525–536.
- Çiftçi, E.; Kaya, H.; Güleç, H.; Salah, A.A. The Turkish audio-visual bipolar disorder corpus. In Proceedings of the 2018 First Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia); IEEE: Piscataway, NJ, USA, 2018; pp. 1–6.
- Tasnim, M.; Ehghaghi, M.; Diep, B.; Novikova, J. DEPAC: A corpus for depression and anxiety detection from speech. In Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 1–16.
- Alghowinem, S.; Goecke, R.; Wagner, M.; Epps, J.; Breakspear, M.; Parker, G. From joyous to clinically depressed: Mood detection using spontaneous speech. In Proceedings of the FLAIRS, Marco Island, FL, USA, 23–25 May 2012.
- Jiang, Y.; Zhang, Z.; Sun, X. MMDA: A multimodal dataset for depression and anxiety detection. In Proceedings of the International Conference on Pattern Recognition; Springer: Berlin/Heidelberg, Germany, 2022; pp. 691–702.
- Zou, B.; Han, J.; Wang, Y.; Liu, R.; Zhao, S.; Feng, L.; Lyu, X.; Ma, H. Semi-structural interview-based Chinese multimodal depression corpus towards automatic preliminary screening of depressive disorders. IEEE Trans. Affect. Comput. 2022, 14, 2823–2838.
- Shen, Y.; Yang, H.; Lin, L. Automatic depression detection: An emotional audio-textual corpus and a GRU/BiLSTM-based model. In Proceedings of the ICASSP 2022–2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); IEEE: Piscataway, NJ, USA, 2022; pp. 6247–6251.
- Zhou, D.; Luo, J.; Silenzio, V.; Zhou, Y.; Hu, J.; Currier, G.; Kautz, H. Tackling mental health by integrating unobtrusive multimodal sensing. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Washington, DC, USA, 2015; Volume 29.
- Das, A.K.; Naskar, R. A deep learning model for depression detection based on MFCC and CNN generated spectrogram features. Biomed. Signal Process. Control 2024, 90, 105898.
- Huang, X.; Wang, F.; Gao, Y.; Liao, Y.; Zhang, W.; Zhang, L.; Xu, Z. Depression recognition using voice-based pre-training model. Sci. Rep. 2024, 14, 12734.
- Wu, W.; Zhang, C.; Woodland, P.C. Self-supervised representations in speech-based depression detection. In Proceedings of the ICASSP 2023–2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); IEEE: Piscataway, NJ, USA, 2023; pp. 1–5.
- Lyu, S.; Ren, X.; Du, Y.; Zhao, N. Detecting depression of Chinese microblog users via text analysis: Combining Linguistic Inquiry Word Count (LIWC) with culture and suicide related lexicons. Front. Psychiatry 2023, 14, 1121583.
- Fu, G.; Yu, Y.; Ye, J.; Zheng, Y.; Li, W.; Cui, N.; Wang, Q. A method for diagnosing depression: Facial expression mimicry is evaluated by facial expression recognition. J. Affect. Disord. 2023, 323, 809–818.
- Liu, Z.; Yuan, X.; Li, Y.; Shangguan, Z.; Zhou, L.; Hu, B. PRA-Net: Part-and-Relation Attention Network for depression recognition from facial expression. Comput. Biol. Med. 2023, 157, 106589.
- Pan, Y.; Shang, Y.; Liu, T.; Shao, Z.; Guo, G.; Ding, H.; Hu, Q. Spatial–temporal attention network for depression recognition from facial videos. Expert Syst. Appl. 2024, 237, 121410.
- Song, S.; Luo, Y.; Tumer, T.; Fu, C.; Valstar, M.; Gunes, H. Loss relaxation strategy for noisy facial video-based automatic depression recognition. ACM Trans. Comput. Healthc. 2024, 5, 1–24.
- Xi, Y.; Chen, Y.; Meng, T.; Lan, Z.; Zhang, L. Depression detection based on the temporal-spatial-frequency feature fusion of EEG. Biomed. Signal Process. Control 2025, 100, 106930.
- Ying, M.; Shao, X.; Zhu, J.; Zhao, Q.; Li, X.; Hu, B. EDT: An EEG-based attention model for feature learning and depression recognition. Biomed. Signal Process. Control 2024, 93, 106182.
- Zhang, Z.; Meng, Q.; Jin, L.; Wang, H.; Hou, H. A novel EEG-based graph convolution network for depression detection: Incorporating secondary subject partitioning and attention mechanism. Expert Syst. Appl. 2024, 239, 122356.
- Wang, Y.; Qu, T.; Zhu, W.; Wang, Q.; Cao, Y.; Gui, R. A hybrid model using multimodal feature perception and multiple cross-attention fusion for depressive episodes detection. Inf. Fusion 2025, 124, 103354.
- Hamid, D.S.B.A.; Goyal, S.; Bedi, P. Integration of deep learning for improved diagnosis of depression using EEG and facial features. Mater. Today Proc. 2023, 80, 1965–1969.
- Kumar, P.; Misra, S.; Shao, Z.; Zhu, B.; Raman, B.; Li, X. Multimodal interpretable depression analysis using visual, physiological, audio and textual data. In Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); IEEE: Piscataway, NJ, USA, 2025; pp. 5305–5315.
- Li, X.; Yi, X.; Lu, L.; Wang, H.; Zheng, Y.; Han, M.; Wang, Q. TSFFM: Depression detection based on latent association of facial and body expressions. Comput. Biol. Med. 2024, 168, 107805.
- Liu, Y.; Li, X.; Wang, M.; Bi, J.; Lin, S.; Wang, Q.; Yu, Y.; Ye, J.; Zheng, Y. Multimodal depression recognition and analysis: Facial expression and body posture changes via emotional stimuli. J. Affect. Disord. 2025, 381, 44–54.
- Iyortsuun, N.K.; Kim, S.H.; Yang, H.J.; Kim, S.W.; Jhon, M. Additive cross-modal attention network (ACMA) for depression detection based on audio and textual features. IEEE Access 2024, 12, 20479–20489.
- Zhang, W.; Mao, K.; Chen, J. A multimodal approach for detection and assessment of depression using text, audio and video. Phenomics 2024, 4, 234.
- Zhang, X.; Li, B.; Qi, G. A novel multimodal depression diagnosis approach utilizing a new hybrid fusion method. Biomed. Signal Process. Control 2024, 96, 106552.
- Liu, J.; Shang, Y.; Yang, M.; Shao, Z.; Lu, J.; Liu, T. Mfmamba: A multimodal fusion state space model for depression recognition. In Proceedings of the ICASSP 2025–2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); IEEE: Piscataway, NJ, USA, 2025; pp. 1–5.
- Yang, S.; Liu, S.; Nie, G.; Wang, L.; Wang, T.; You, J.; Cambria, E. Fine-grained multimodal fusion for depression assisted recognition based on hierarchical knowledge-enhanced prompt learning. Expert Syst. Appl. 2025, 291, 128532.
- Nepal, S.; Pillai, A.; Wang, W.; Griffin, T.; Collins, A.C.; Heinz, M.; Lekkas, D.; Mirjafari, S.; Nemesure, M.; Price, G.; et al. MoodCapture: Depression detection using in-the-wild smartphone images. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems; Association for Computing Machinery: New York, NY, USA, 2024; pp. 1–18.
- Pan, Y.; Shang, Y.; Wang, W.; Shao, Z.; Han, Z.; Liu, T.; Guo, G.; Ding, H. Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech. Biomed. Signal Process. Control 2024, 89, 105704.
- Menne, F.; Dörr, F.; Schräder, J.; Tröger, J.; Habel, U.; König, A.; Wagels, L. The voice of depression: Speech features as biomarkers for major depressive disorder. BMC Psychiatry 2024, 24, 794.
- Li, Y.; Yang, X.; Zhao, M.; Wang, Z.; Yao, Y.; Qian, W.; Qi, S. FPT-Former: A Flexible Parallel Transformer of Recognizing Depression by Using Audiovisual Expert-Knowledge-Based Multimodal Measures. Int. J. Intell. Syst. 2024, 2024, 1564574.
- Xian, L.; Ni, J.; Wang, M. Leveraging Large Language Models for Cost-Effective, Multilingual Depression Detection and Severity Assessment. arXiv 2025, arXiv:2504.04891.
- Lho, S.K.; Park, S.C.; Lee, H.; Oh, D.Y.; Kim, H.; Jang, S.; Jung, H.Y.; Yoo, S.Y.; Park, S.M.; Lee, J.Y. Large language models and text embeddings for detecting depression and suicide in patient narratives. JAMA Netw. Open 2025, 8, e2511922.
- Sezgin, E.; Chekeni, F.; Lee, J.; Keim, S. Clinical accuracy of large language models and Google search responses to postpartum depression questions: Cross-sectional study. J. Med. Internet Res. 2023, 25, e49240.
- Teferra, B.G.; Perivolaris, A.; Hsiang, W.N.; Sidharta, C.K.; Rueda, A.; Parkington, K.; Wu, Y.; Soni, A.; Samavi, R.; Jetly, R.; et al. Leveraging large language models for automated depression screening. PLoS Digit. Health 2025, 4, e0000943.
- Kim, J.; Ma, S.P.; Chen, M.L.; Galatzer-Levy, I.R.; Torous, J.; van Roessel, P.J.; Sharp, C.; Pfeffer, M.A.; Rodriguez, C.I.; Linos, E.; et al. Optimizing large language models for detecting symptoms of depression/anxiety in chronic diseases patient communications. npj Digit. Med. 2025, 8, 580.
- Wang, Y.; Inkpen, D.; Gamaarachchige, P.K. Explainable depression detection using large language models on social media data. In Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024); Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 108–126.
- Weber, S.; Deperrois, N.; Heun, R.; Frühschütz, L.; Monn, A.; Homan, S.; Häfliger, A.; Seifritz, E.; Kowatsch, T.; MULTICAST consortium; et al. Using a fine-tuned large language model for symptom-based depression evaluation. npj Digit. Med. 2025, 8, 598.
- Perez-Toro, P.A.; Dineley, J.; Iniesta, R.; Zhang, Y.; Matcham, F.; Siddi, S.; Lamers, F.; Haro, J.M.; Penninx, B.W.; Folarin, A.A.; et al. Exploring biases related to the use of large language models in a multilingual depression corpus. Sci. Rep. 2025, 15, 36197.
- Shao, Z.; Wang, X.; Liu, Z.; Wang, C.; Subbalakshmi, K. Systematic Evaluation of Machine-Generated Reasoning and PHQ-9 Labeling for Depression Detection Using Large Language Models. arXiv 2025, arXiv:2505.17119.
- Liu, J.M.; Gao, M.; Sabour, S.; Chen, Z.; Huang, M.; Lee, T.M. Enhanced large language models for effective screening of depression and anxiety. Commun. Med. 2025, 5, 457.
- Xu, S.; Yan, Y.; Ding, Y.; Li, F.; Zhang, S.; Tang, H.; Luo, C.; Li, Y.; Liu, H.; Mei, Y.; et al. Identifying psychiatric manifestations in outpatients with depression and anxiety: A large language model-based approach. npj Ment. Health Res. 2025, 4, 63.
- Shah, S.M.; Gillani, S.A.; Baig, M.S.A.; Saleem, M.A.; Siddiqui, M.H. Advancing depression detection on social media platforms through fine-tuned large language models. Online Soc. Netw. Media 2025, 46, 100311.
- Li, Y.; Shao, S.; Milling, M.; Schuller, B.W. Large language models for depression recognition in spoken language integrating psychological knowledge. Front. Comput. Sci. 2025, 7, 1629725.
- Zhang, X.; Liu, H.; Xu, K.; Zhang, Q.; Liu, D.; Ahmed, B.; Epps, J. When LLMs meets acoustic landmarks: An efficient approach to integrate speech into large language models for depression detection. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 146–158.
- Jin, Y.; Chen, X.; Hong, X.; Wang, M.; Niu, W.; Liu, A.; Li, Y.; Bu, Y.; Wang, Y. Depression screening with textual and audio features based on large language models and machine learning. J. Affect. Disord. 2025, 395, 120644.
- Sadeghi, M.; Richer, R.; Egger, B.; Schindler-Gmelch, L.; Rupp, L.H.; Rahimi, F.; Berking, M.; Eskofier, B.M. Harnessing multimodal approaches for depression detection using large language models and facial expressions. npj Ment. Health Res. 2024, 3, 66.
- Zhang, W.; Chen, J.; Zhu, E.; Cheng, W.; Li, Y.; Li, Y.; Wang, Y.J. MLlm-DR: Towards Explainable Depression Recognition with MultiModal Large Language Models. In ACM Transactions on Multimedia Computing, Communications and Applications; Association for Computing Machinery: New York, NY, USA, 2025.
- Tank, C.; Pol, S.; Katoch, V.; Mehta, S.; Anand, A.; Shah, R.R. Depression detection and analysis using large language models on textual and audio-visual modalities. arXiv 2024, arXiv:2407.06125.
- Zhao, X.; Shen, Y.; Jiang, Y.; Wang, Z.; Liu, J.; Cheng, M.H.; Oliveira, G.C.; Desimone, R.; Dwyer, D.; Ge, Z. It hears, it sees too: Multi-modal LLM for depression detection by integrating visual understanding into audio language models. arXiv 2025, arXiv:2511.19877.
- Pan, Y.; Jiang, J.; Jiang, K.; Liu, X. Disentangled-multimodal privileged knowledge distillation for depression recognition with incomplete multimodal data. In Proceedings of the 32nd ACM International Conference on Multimedia; Association for Computing Machinery: New York, NY, USA, 2024; pp. 5712–5721.
- Cai, L.; Wang, Z.; Gao, H.; Shen, D.; Ji, S. Deep adversarial learning for multi-modality missing data completion. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; Association for Computing Machinery: New York, NY, USA, 2018; pp. 1158–1166.
- Zhao, J.; Li, R.; Jin, Q. Missing modality imagination network for emotion recognition with uncertain missing modalities. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers); Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 2608–2618.
- Zhou, T.; Canu, S.; Vera, P.; Ruan, S. Feature-enhanced generation and multi-modality fusion based deep neural network for brain tumor segmentation with missing MR modalities. Neurocomputing 2021, 466, 102–112.
- Zhang, C.; Cui, Y.; Han, Z.; Zhou, J.T.; Fu, H.; Hu, Q. Deep partial multi-view learning. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 2402–2415.
- Tran, L.; Liu, X.; Zhou, J.; Jin, R. Missing modalities imputation via cascaded residual autoencoder. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1405–1414.
- Wang, Y.; Li, Y.; Cui, Z. Incomplete multimodality-diffused emotion recognition. Adv. Neural Inf. Process. Syst. 2023, 36, 17117–17128.
- Han, J.; Zhang, Z.; Ren, Z.; Schuller, B. Implicit fusion by joint audiovisual training for emotion recognition in mono modality. In Proceedings of the ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); IEEE: Piscataway, NJ, USA, 2019; pp. 5861–5865.
- Luo, W.; Xu, M.; Lai, H. Multimodal reconstruct and align net for missing modality problem in sentiment analysis. In Proceedings of the International Conference on Multimedia Modeling; Springer: Cham, Switzerland, 2023; pp. 411–422.
- Yuan, Z.; Li, W.; Xu, H.; Yu, W. Transformer-based feature reconstruction network for robust multimodal sentiment analysis. In Proceedings of the 29th ACM International Conference on Multimedia; Association for Computing Machinery: New York, NY, USA, 2021; pp. 4400–4407.
- Pham, H.; Liang, P.P.; Manzini, T.; Morency, L.P.; Póczos, B. Found in translation: Learning robust joint representations by cyclic translations between modalities. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Washington, DC, USA, 2019; Volume 33, pp. 6892–6899.
- Lian, Z.; Chen, L.; Sun, L.; Liu, B.; Tao, J. GCNet: Graph completion network for incomplete multimodal learning in conversation. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 8419–8432.
- Liu, Z.; Zhou, B.; Chu, D.; Sun, Y.; Meng, L. Modality translation-based multimodal sentiment analysis under uncertain missing modalities. Inf. Fusion 2024, 101, 101973.


| Scale | Interview | Self Assessment | Normal | Mild | Moderate | Severe | Very Severe |
|---|---|---|---|---|---|---|---|
| BDI-II [42] | | √ | 0–13 | 14–19 | 20–28 | 29–63 | — |
| PHQ-8 [46] | | √ | 0–4 | 5–9 | 10–14 | 15–19 | 20–24 |
| PHQ-9 [47] | | √ | 0–4 | 5–9 | 10–14 | 15–19 | 20–27 |
| HAMD [48] | √ | | 0–7 | 8–13 | 14–18 | 19–22 | ≥23 |
| QIDS [49] | | √ | 0–5 | 6–10 | 11–15 | 16–20 | ≥21 |
| MADRS [50] | √ | | 0–11 | 12–22 | 23–30 | 31–35 | ≥36 |
| DSM–IV [45] | √ | 11–15 | 16–20 | ≥21 |
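
The cut-off scores in the scale table above amount to a lookup from a total score to a severity band. A minimal Python sketch of that lookup follows, assuming the band labels are the table's column headings and using only the rows whose ranges are unambiguous (BDI-II, PHQ-8, PHQ-9, HAMD, QIDS, MADRS). The names `CUTOFFS`, `MAX_SCORE`, and `severity_band` are hypothetical and introduced here for illustration; they do not come from the reviewed studies, and cut-offs used in individual studies may differ from those tabulated.

```python
from bisect import bisect_right

# Band labels follow the column headings of the scale table.
SEVERITY_LABELS = ["normal", "mild", "moderate", "severe", "very severe"]

# Lowest score of each band above "normal", copied from the scale table.
# BDI-II has no "very severe" band in the table, so it lists three cut-offs.
CUTOFFS = {
    "BDI-II": [14, 20, 29],
    "PHQ-8": [5, 10, 15, 20],
    "PHQ-9": [5, 10, 15, 20],
    "HAMD": [8, 14, 19, 23],
    "QIDS": [6, 11, 16, 21],
    "MADRS": [12, 23, 31, 36],
}

# Maximum total score, only where the table gives a closed top range.
MAX_SCORE = {"BDI-II": 63, "PHQ-8": 24, "PHQ-9": 27}


def severity_band(scale: str, score: int) -> str:
    """Return the severity label for a total score on the given scale."""
    if scale not in CUTOFFS:
        raise KeyError(f"no cut-offs tabulated for scale {scale!r}")
    if score < 0:
        raise ValueError("score must be non-negative")
    upper = MAX_SCORE.get(scale)
    if upper is not None and score > upper:
        raise ValueError(f"{scale} scores cannot exceed {upper}")
    # bisect_right counts how many cut-offs the score has reached,
    # which is exactly the index of its band.
    return SEVERITY_LABELS[bisect_right(CUTOFFS[scale], score)]


if __name__ == "__main__":
    for scale, score in [("PHQ-9", 12), ("BDI-II", 29), ("HAMD", 7)]:
        print(scale, score, severity_band(scale, score))
```

Keeping the cut-offs as data rather than chained conditionals makes it straightforward to swap in the thresholds a particular dataset or study uses.
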

| Paradigm | Subcategory | Representative Studies |
|---|---|---|
| Unimodal | Audio | Arnab Kumar Das et al. [63]; Huang et al. [64]; Wu et al. [65] |
| Unimodal | Text | Lyu et al. [66] |
| Unimodal | Visual | Fu et al. [67]; Liu et al. [68]; Pan et al. [69]; Song et al. [70] |
| Unimodal | EEG | Xi et al. [71]; Ying et al. [72]; Zhang et al. [73] |
| Multimodal | Physiology & Behavior | Wang et al. [74]; Hamid et al. [75]; Puneet Kumar et al. [76] |
| Multimodal | Multi-behavioral Modality | Li et al. [77]; Liu et al. [78]; Iyortsuun et al. [79]; Zhang et al. [80] |
| Multimodal | Fusion Strategy Optimization | Zhang et al. [81]; Liu et al. [82]; Yang et al. [83] |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Dai, A.; Shi, W.; Gu, X.; Huang, L. Large Language Models for Depression Detection: A Review with Prospects of Incomplete Multimodality. Brain Sci. 2026, 16, 593. https://doi.org/10.3390/brainsci16060593

