Depression Detection Method Based on Multi-Modal Multi-Layer Collaborative Perception Attention Mechanism of Symmetric Structure
Abstract
1. Introduction
- Existing studies generally focus on single-modal analysis and make insufficient use of multi-modal data, which limits feature extraction.
- Some existing attention mechanisms do not fully exploit data from different modalities. In depression detection in particular, only single-modality data is considered and feature information such as gender is ignored, which limits detection accuracy.
- Existing models have complex structures and heavy parameter counts, resulting in high computational cost [1].
- A depression detection model using a symmetric-structure multi-modal multi-layer collaborative perception attention mechanism is proposed. The model incorporates multi-modal data, including emotional and gender characteristics, to systematically investigate their differential impacts on depression.
- A multi-head attention module based on multi-layer perception is constructed, and an interactive attention module is introduced. Together they enable the model to focus on the dynamic evolution of emotional states, establish deep associations between emotion, gender information, and depression features, and fully explore the relationships among emotion, gender, and depression, thereby extracting more salient depression features.
- We adopt a symmetric parallel structure and a lightweight design, including parallel dilated convolution and a parallel multi-layer perceptron multi-head attention mechanism, which reduces computational complexity and facilitates the capture of cross-modal information (an illustrative sketch of the parallel dilated convolution follows this list). We conducted comprehensive tests on the publicly available and challenging AVEC 2014 dataset and compared against the state-of-the-art HMTL-IMHAFF model: prediction accuracy increased by 0.0308, the F1-score reached 0.892, and the Kappa coefficient reached 0.837.
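To make the lightweight design concrete, the following is a minimal PyTorch sketch of a parallel dilated-convolution block. The channel count, dilation rates, and fusion by summation are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class ParallelDilatedConv(nn.Module):
    """Sketch of a lightweight parallel dilated-convolution block.
    Each branch sees the same input and uses a different dilation rate
    to enlarge the receptive field without extra parameters; branch
    outputs are fused by summation (an assumption, not the paper's code)."""
    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel_size=3,
                      padding=d, dilation=d)       # padding=d keeps length
            for d in dilations
        ])
        self.bn = nn.BatchNorm1d(channels)
        self.act = nn.SELU()  # SeLU appears in the paper's abbreviation list

    def forward(self, x):  # x: (batch, channels, time)
        out = sum(branch(x) for branch in self.branches)
        return self.act(self.bn(out))

# quick shape check
block = ParallelDilatedConv(64)
print(block(torch.randn(2, 64, 100)).shape)  # torch.Size([2, 64, 100])
```

Because dilation enlarges the receptive field without adding weights, the parallel branches cover short- and long-range temporal context at the parameter cost of three ordinary 3 × 3 convolutions.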
2. Related Work
2.1. Traditional Depression Detection Methods
2.2. Depression Detection Method Based on Deep Learning
2.3. Depression Detection Method Based on Attention Mechanism
3. Methodology
3.1. Overview
3.2. Feature Extraction Module
3.3. Multi-Layer Perceptron-Based Multi-Head Attention Mechanism Module
3.4. Interactive Multi-Head Attention Mechanism Module
3.5. BiLSTM Module
4. Experiments
4.1. Dataset
4.2. Experimental Environment and Evaluation Indicators
4.3. Results and Analysis
- (1) Accuracy comparison. Previous results have shown that deep learning methods improve markedly on traditional methods [1], so this paper compares against relatively recent models. As shown in Table 6, with a training proportion of 80%, our model achieved an accuracy of 0.861, improvements of 0.0308, 0.047, and 0.052 over HMTL-IMHAFF [1], MMFAN [23], and STFN [38], respectively. As shown in Table 7, the prediction accuracy was 0.665 for males and 0.673 for females, improvements of 0.0213 and 0.016 over HMTL-IMHAFF [1] (a minimal sketch showing how the reported metrics are computed follows Table 7). Further analysis shows that HMTL-IMHAFF [1] adopts interactive multi-head attention feature fusion (IMHAFF) together with a two-layer multi-task learning framework to analyze the intrinsic associations among emotion, gender, and depression, while MMFAN [23] combines an attention model with multi-modal input, extracting facial and voice features from enhanced audiovisual sequence data to assess depression severity. Both methods strengthen feature fusion and modeling capability through advanced architectures. However, although HMTL-IMHAFF [1] uses an attention mechanism, it does not fully model the relationships between different features, so it cannot extract the most discriminative features, which limits its accuracy. Our model fully exploits multi-modal data, enhances feature learning, and extracts both depression and gender features. Moreover, by combining the multi-layer perceptron multi-head attention mechanism, interactive MHA, and BiLSTM, it attends to the gender-difference information between men and women, explores the relationships among emotion, gender, and depression, and extracts more salient depression features. The accuracy of our model is therefore higher.
| Model | Accuracy | F1-Score | Kappa |
|---|---|---|---|
| MMFAN [23] | 0.814 | 0.798 | 0.731 |
| HMTL-IMHAFF [1] | 0.8302 | 0.8732 | 0.815 |
| Bi-LSTM + CNN [34] | 0.726 | 0.703 | 0.665 |
| BERT-BiLSTM [39] | 0.787 | 0.763 | 0.712 |
| CNN + MFCC + spectrogram [40] | 0.765 | 0.749 | 0.706 |
| LSTM + MHA [41] | 0.697 | 0.663 | 0.605 |
| CNN + LSTM [9] | 0.746 | 0.710 | 0.663 |
| STFN [38] | 0.809 | 0.781 | 0.725 |
| Ours | 0.861 | 0.892 | 0.837 |
| Model | Accuracy | F1-Score | Accuracy (Female) | Accuracy (Male) |
|---|---|---|---|---|
| HMTL-IMHAFF [1] | 0.8302 | 0.7432 | 0.6570 | 0.6437 |
| Ours | 0.861 | 0.775 | 0.673 | 0.665 |
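For reference, the three evaluation metrics reported in Tables 6 and 7 can be computed as in the following sketch. The label vectors are hypothetical placeholders, and the weighted F1 averaging is an assumption, since the averaging scheme is not stated here; the paper's numbers come from the AVEC 2014 test split.

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

# Hypothetical ground-truth and predicted depression-level labels.
y_true = [0, 1, 1, 0, 2, 1, 0, 2]
y_pred = [0, 1, 0, 0, 2, 1, 0, 2]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1 (weighted):", f1_score(y_true, y_pred, average="weighted"))
print("Kappa:", cohen_kappa_score(y_true, y_pred))
```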
- (2) Effectiveness of the attention mechanism. Further analysis shows that HMTL-IMHAFF [1] improves accuracy by 0.0162 over MMFAN [23], and our model improves by a further 0.0308 over HMTL-IMHAFF [1]. MMFAN [23] uses self-attention and channel attention mechanisms to extract the facial features of depression patients, but it lacks feature interaction between the two attention types. HMTL-IMHAFF [1] adopts an interactive multi-head attention mechanism that emphasizes information interaction among the attention heads, enhancing the representation and the generalization ability needed to explore the in-depth relationships among gender, emotion, and depression and to obtain an enhanced depression feature representation. Our model not only adopts a multi-layer perceptron multi-head attention mechanism but also introduces interactive MHA (see the sketch after this list); the collaboration of the two attention mechanisms gives the model clear advantages in capturing both local and global depression features, which significantly improves accuracy. Different attention mechanisms therefore affect the model differently, and the 0.0308 accuracy gain over HMTL-IMHAFF [1] strongly supports the effectiveness of our attention design.
- (3) Reduction in the number of model parameters. Our method obtains an F1-score of 0.892 on the depression detection task, an increase of 0.0188 over HMTL-IMHAFF [1]. The improved F1-score indicates that our model effectively alleviates the bias caused by data imbalance in depression recognition. In addition, our model's Kappa coefficient is 0.837, 0.022 higher than that of HMTL-IMHAFF [1]; this coefficient reflects how far the predictions exceed chance agreement and shows that they are more reliable. Further analysis reveals that HMTL-IMHAFF [1] employs a traditional one-dimensional CNN for feature extraction and interactive multi-head attention (IMHA) for feature fusion. In contrast, our model adopts a parameter-efficient symmetric structure together with the co-design of the multi-layer perceptron multi-head attention mechanism and the BiLSTM module to strengthen the capture of key features, reducing the parameter count while maintaining strong representational ability. The experimental results show that, through the innovative design of the symmetric structure, the multi-layer perceptron multi-head attention mechanism, and the BiLSTM module, our model converges rapidly and outperforms existing models in both predictive precision and computational efficiency for depression assessment, offering a new technological paradigm for depression detection.
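As referenced in item (2), the following is a minimal PyTorch sketch of interactive (cross) multi-head attention between two feature streams, in the spirit of the mechanism described above. The dimensions, head count, and residual fusion are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class InteractiveMHA(nn.Module):
    """Sketch of interactive multi-head attention between two streams,
    e.g. an emotion stream and a gender stream: each stream queries the
    other so that information flows in both directions."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.a_to_b = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.b_to_a = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, a, b):  # a, b: (batch, seq, dim)
        a_enriched, _ = self.a_to_b(query=a, key=b, value=b)
        b_enriched, _ = self.b_to_a(query=b, key=a, value=a)
        return a + a_enriched, b + b_enriched  # residual fusion

emotion = torch.randn(2, 10, 128)   # dummy emotion features
gender = torch.randn(2, 10, 128)    # dummy gender features
e_out, g_out = InteractiveMHA(128)(emotion, gender)
print(e_out.shape, g_out.shape)     # torch.Size([2, 10, 128]) each
```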
4.4. Ablation Studies
4.4.1. The Influence of Gender on Prediction Accuracy
4.4.2. Impact of the Attention Mechanism on Prediction Accuracy
4.4.3. Influence of BiLSTM on Prediction Accuracy
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AMM | Attentive multi-modal multi-task learning framework |
| CNN | Convolutional neural network |
| Conv | Convolution |
| BN | Batch normalization |
| LSTM | Long short-term memory |
| BiLSTM | Bidirectional long short-term memory |
| SMMCA | Symmetric structure multi-modal multi-layer collaborative perception attention |
| PHQ-9 | Patient Health Questionnaire-9 |
| PPG | Photoplethysmography |
| ECG | Electrocardiogram |
| EDA | Electrodermal activity |
| BDI | Beck Depression Inventory |
| RADS-2 | Reynolds Adolescent Depression Scale, second edition |
| STA-DRN | Spatial–temporal attention depression recognition network |
| BERT | Bidirectional Encoder Representations from Transformers |
| MHA | Multi-head attention |
| SeLU | Scaled Exponential Linear Unit |
| Word2Vec | Word to Vector |
| Avg | Average pooling |
| MSA | Multi-head self-attention |
| MLP | Multi-layer perceptron |
| W-MSA | Window-based multi-head self-attention |
| SW-MSA | Shifted-window multi-head self-attention |
| GAP | Global average pooling |
| MMFAN | Multi-modal fused-attention network |
| STFN | Spatial–temporal feature network |
| MFCC | Mel-frequency cepstral coefficients |
| CNN-BiLSTM | Convolutional neural network with bidirectional long short-term memory |
| HMTL-IMHAFF | Hierarchical multi-task learning framework based on interactive multi-head attention feature fusion |
References
- Xing, Y.; He, R.; Zhang, C.; Tan, P. Hierarchical Multi-Task Learning Based on Interactive Multi-Head Attention Feature Fusion for Speech Depression Recognition. IEEE Access 2025, 13, 51208–51219. [Google Scholar] [CrossRef]
- Brookman, R.; Kalashnikova, M.; Conti, J.; Rattanasone, N.; Grant, K.; Demuth, K.; Burnham, D. Maternal depression affects infants’ lexical processing abilities in the second year of life. Brain Sci. 2020, 10, 977. [Google Scholar] [CrossRef]
- Luo, L.; Yuan, J.; Wu, C.; Wang, Y.; Zhu, R.; Xu, H.; Zhang, L.; Zhang, Z. Predictors of Depression among Chinese College Students: A Machine Learning Approach. BMC Public Health 2025, 25, 470. [Google Scholar] [CrossRef] [PubMed]
- Giannakakis, G.; Grigoriadis, D.; Giannakaki, K.; Simantiraki, O.; Roniotes, A.; Tsiknakis, M. Review on psychological stress detection using biosignals. IEEE Trans. Affect. Comput. 2019, 13, 440–460. [Google Scholar] [CrossRef]
- Schwartz, M.S.; Andrasik, F. Biofeedback: A Practitioner’s Guide; Guilford Press: New York, NY, USA, 2017; pp. 68–113. [Google Scholar]
- Marriwala, N.; Chaudhuri, D. Hybrid Model for Depression Detection Using Deep Learning. Meas. Sens. 2023, 25, 100587. [Google Scholar] [CrossRef]
- Zhang, Y.; Li, X.; Rong, L.; Tiwari, P. Multi-task learning for jointly detecting depression and emotion. In Proceedings of the 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Houston, TX, USA, 9–12 December 2021; pp. 3142–3149. [Google Scholar] [CrossRef]
- Niu, M.; Tao, J.; Liu, B.; Huang, J.; Lian, Z. Multimodal spatiotemporal representation for automatic depression level detection. IEEE Trans. Affect. Comput. 2023, 14, 294–307. [Google Scholar] [CrossRef]
- Verma, A.; Jain, P.; Kumar, T. An effective depression diagnostic system using speech signal analysis through deep learning methods. Int. J. Artif. Intell. Tools 2023, 32, 2340004. [Google Scholar] [CrossRef]
- Von Glischinski, M.; von Brachel, R.; Hirschfeld, G. How “depressed” is “depressed”? A systematic review and diagnostic meta-analysis of the optimal cut-off points of the revised Beck Depression Inventory (BDI-II). Qual. Life Res. 2019, 28, 1111–1118. [Google Scholar] [CrossRef]
- Ramos-Vera, C.; Quispe-Callo, G.; Bashualdo-Delgado, M.; Vallejos-Saldarriaga, J.; Santillán, J. Factorial and network structure of the Reynolds Adolescent Depression Scale (RADS-2) in Peruvian adolescents. PLoS ONE 2023, 18, e0286081. [Google Scholar] [CrossRef]
- Kraepelin, E. Manic-Depressive Insanity and Paranoia; E & S Livingstone: London, UK, 1921; pp. 4–9. [Google Scholar]
- He, Y.; Liang, F.; Wang, Y.; Wei, Y.; Ma, T. Advances in the Application of Wearable Devices in Depression Monitoring and Intervention. Chin. J. Med. Devices 2024, 48, 407–412. [Google Scholar] [CrossRef]
- Li, M.; Li, J.; Chen, Y.; Hu, B. Detecting Stress Levels in College Students Using Affective Pulse Signals and Deep Learning. IEEE Trans. Affect. Comput. 2025, 16, 1942–1954. [Google Scholar] [CrossRef]
- Zhao, J.; Su, W.; Jia, J. Depression Detection Algorithm Combining Prosody and Sparse Face Recognition. Clust. Comput. 2019, 22, 7873–7884. [Google Scholar] [CrossRef]
- Amanat, A.; Rizwan, M.; Javed, A.R.; Alsaqour, R.; Pandya, S.; Uddin, M. Deep Learning for Depression Detection from Textual Data. Electronics 2022, 11, 676. [Google Scholar] [CrossRef]
- Wongkoblap, A.; Vadillo, M.; Curcin, V. Depression Detection of Twitter Posters using Deep Learning with Anaphora Resolution: Algorithm Development and Validation. JMIR Ment. Health, 2021; in press. [Google Scholar] [CrossRef]
- Al Jazaery, M.; Guo, G. Video-based depression level analysis by encoding deep spatiotemporal features. IEEE Trans. Affect. Comput. 2021, 12, 262–268. [Google Scholar] [CrossRef]
- He, L.; Niu, M.; Tiwari, P.; Matin, P.; Su, R.; Jiang, J.; Guo, C.; Wang, H.; Ding, S.; Wang, Z.; et al. Deep Learning for Depression Recognition Using Audio-Visual Cues: A Review. Inf. Fusion 2022, 80, 56–86. [Google Scholar] [CrossRef]
- Jan, A.; Meng, M.; Gaus, F.; Zhang, F. Artificial intelligent system for automatic depression level analysis through visual and vocal expressions. IEEE Trans. Cognit. Develop. Syst. 2018, 10, 668–680. [Google Scholar] [CrossRef]
- Bhatt, D.; Patel, C.; Talsania, H.; Patel, J.; Vaghela, R.; Pandya, S.; Modi, K.; Ghayvat, H. CNN Variants for Computer Vision: History, Architecture, Application, Challenges and Future Scope. Electronics 2021, 10, 2470. [Google Scholar] [CrossRef]
- Fan, H.; Zhang, X.; Xu, Y.; Fang, J.; Zhang, S.; Zhao, X.; Yu, J. Transformer-based multimodal feature enhancement networks for multimodal depression detection integrating video, audio and remote photoplethysmograph signals. Inf. Fusion 2024, 104, 102161. [Google Scholar] [CrossRef]
- Zhou, Y.; Yu, X.; Huang, Z.; Palati, F.; Zhao, Z.; He, Z. Multi-Modal Fusion Attention Network for Depression Level Recognition Based on Enhanced Audio-Visual Cues. IEEE Access 2025, 13, 37913–37923. [Google Scholar] [CrossRef]
- Zhang, X.; Li, B.; Qi, G. A novel multimodal depression diagnosis approach utilizing a new hybrid fusion method. Biomed. Signal Process. Control 2024, 96, 106552. [Google Scholar] [CrossRef]
- Mahayossanunt, Y.; Nupairoj, N.; Hemrungrojn, S.; Vateekul, P. Explainable depression detection based on facial expression using LSTM on attentional intermediate feature fusion with label smoothing. Sensors 2023, 23, 9402. [Google Scholar] [CrossRef]
- Thekkekara, J.P.; Yongchareon, S.; Lesaputri, V. Attention-based CNN-BiLSTM model for depression detection from social media text. Expert Syst. Appl. 2024, 249, 123834. [Google Scholar] [CrossRef]
- Botalb, A.; Moinuddin, M.; Al-Saggaf, U.M.; Ali, S.S.A. Contrasting Convolutional Neural Network (CNN) with Multi-Layer Perceptron (MLP) for Big Data Analysis. In Proceedings of the 2018 International Conference on Intelligent and Advanced System (ICIAS), Kuala Lumpur, Malaysia, 13–14 August 2018; pp. 1–5. [Google Scholar] [CrossRef]
- AbdelRaouf, H.; Abouyoussef, M.; Ibrahem, M.I. An Innovative Approach for Human Activity Recognition Based on a Multi-Head Attention Mechanism. In Proceedings of the 2024 International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, 18–20 December 2024; pp. 1559–1563. [Google Scholar] [CrossRef]
- Hameed, Z.; Garcia-Zapirain, B. Sentiment Classification Using a Single-Layered BiLSTM Model. IEEE Access 2020, 8, 73992–74001. [Google Scholar] [CrossRef]
- Xu, C.; Zhu, G.; Shu, J. A Combination of Lie Group Machine Learning and Deep Learning for Remote Sensing Scene Classification Using Multi-Layer Heterogeneous Feature Extraction and Fusion. Remote Sens. 2022, 14, 1445. [Google Scholar] [CrossRef]
- Zhang, Y.; Li, T.; Li, C.; Zhou, X. A Novel Driver Distraction Detection Method Based on Masked Image Modeling for Self-Supervised Learning. IEEE IoT J. 2024, 11, 6056–6071. [Google Scholar] [CrossRef]
- Desai, M.; Shah, M. Anatomy of Breast Cancer Detection and Diagnosis Using Multilayer Perceptron Neural Network (MLP) and Convolutional Neural Network (CNN). Clin. Health Inform. 2021, 4, 1–11. [Google Scholar] [CrossRef]
- Xu, C.; Shu, J.; Zhu, G. Adversarial Remote Sensing Scene Classification Based on Lie Group Feature Learning. Remote Sens. 2023, 15, 914. [Google Scholar] [CrossRef]
- Jo, A.-H.; Kwak, K.-C. Diagnosis of Depression Based on Four-Stream Model of Bi-LSTM and CNN From Audio and Text Information. IEEE Access 2022, 10, 134113–134135. [Google Scholar] [CrossRef]
- Lin, L.; Chen, X.; Shen, Y.; Zhang, L. Towards automatic depression detection: A BiLSTM/1D CNN-based model. Appl. Sci. 2020, 10, 8701. [Google Scholar] [CrossRef]
- Valstar, M.; Schuller, B.; Smith, K.; Almaev, T.; Eyben, F.; Krajewski, J.; Cowie, R.; Pantic, M. AVEC 2014: 3D dimensional affect and depression recognition challenge. In Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge, Orlando, FL, USA, 7 November 2014; pp. 3–10. [Google Scholar] [CrossRef]
- Niu, M.; Zhao, Z.; Tao, J.; Li, Y.; Schuller, B.W. Dual attention and element recalibration networks for automatic depression level prediction. IEEE Trans. Affect. Comput. 2022, 14, 1954–1965. [Google Scholar] [CrossRef]
- Han, Z.; Shang, Y.; Shao, Z.; Liu, J.; Guo, G.; Liu, T.; Ding, H.; Hu, Q. Spatial–temporal feature network for speech-based depression recognition. IEEE Trans. Cognit. Develop. Syst. 2024, 1, 308–318. [Google Scholar] [CrossRef]
- Cao, X.; Zakaria, L.Q. Integrating Bert With CNN and BiLSTM for Explainable Detection of Depression in Social Media Contents. IEEE Access 2024, 12, 161203–161212. [Google Scholar] [CrossRef]
- Das, A.K.; Naskar, R. A deep learning model for depression detection based on MFCC and CNN generated spectrogram features. Biomed. Signal Process. Control 2024, 90, 105898. [Google Scholar] [CrossRef]
- Zhao, Y.; Liang, Z.; Du, J.; Zhang, L.; Liu, C.; Zhao, L. Multi-head attention-based long short-term memory for depression detection from speech. Front. Neurorobotics 2021, 15, 684037. [Google Scholar] [CrossRef] [PubMed]
| Method | Kernel Size | Input Channels | Output Channels | Layer | Parameters | Total (M) |
|---|---|---|---|---|---|---|
| Ordinary | 3 × 3 | 1024 | 1024 | Conv1 | 1024 × 1024 × 3 × 3 = 9,437,184 | 28,311,552 ≈ 28.3 |
| | | | | Conv2 | 1024 × 1024 × 3 × 3 = 9,437,184 | |
| | | | | Conv3 | 1024 × 1024 × 3 × 3 = 9,437,184 | |
| | 5 × 5 | 1024 | 1024 | Conv1 | 1024 × 1024 × 5 × 5 = 26,214,400 | 78,643,200 ≈ 78.6 |
| | | | | Conv2 | 1024 × 1024 × 5 × 5 = 26,214,400 | |
| | | | | Conv3 | 1024 × 1024 × 5 × 5 = 26,214,400 | |
| Parallel | 7 × 7 | 512 | 512 | Conv1–Conv3 (parallel) | 512 × 512 × 7 × 7 = 12,845,056 | 12,845,056 ≈ 12.8 |
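A quick sanity check of the per-layer parameter counts in the table above (convolution weights only, biases ignored):

```python
# Weight count of a 2-D convolution: C_in * C_out * k * k (no bias term).
def conv_params(c_in: int, c_out: int, k: int) -> int:
    return c_in * c_out * k * k

print(conv_params(1024, 1024, 3))      # 9437184 per 3 x 3 layer
print(3 * conv_params(1024, 1024, 3))  # 28311552, i.e. about 28.3 M for three layers
print(3 * conv_params(1024, 1024, 5))  # 78643200, i.e. about 78.6 M
print(conv_params(512, 512, 7))        # 12845056, i.e. about 12.8 M
```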
| Dataset | Task Type | Training/Testing Ratio | Sample Situation |
|---|---|---|---|
| AVEC 2014 | Northwind: participants read aloud a passage from the fable “The North Wind and the Sun.” | 80%/20% | Number of participants: 50. Original voice recordings: 50 (one per participant). Sample processing: each recording was preprocessed and segmented into fixed-length clips (a sketch follows this table). Average session duration: about 25 min. Input clip duration: 3 s. Participant age range: 18–63 years. Mean age ± SD: 31.5 ± 12.3 years. |
| | Freeform: participants freely responded in German to a self-selected prompt, such as “What is your favorite dish?” | 80%/20% | Same as for the Northwind task. |
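The segmentation of each recording into fixed-length 3 s clips can be sketched as follows. The sampling rate, the no-overlap policy, and dropping the trailing remainder are assumptions; the paper's exact preprocessing is not detailed here.

```python
import numpy as np

def segment_clips(signal: np.ndarray, sr: int, clip_seconds: float = 3.0):
    """Split a 1-D audio signal into fixed-length clips, dropping any
    remainder shorter than one clip."""
    clip_len = int(sr * clip_seconds)
    n_clips = len(signal) // clip_len
    return signal[: n_clips * clip_len].reshape(n_clips, clip_len)

sr = 16000                            # assumed sampling rate
recording = np.random.randn(sr * 70)  # dummy 70 s recording
clips = segment_clips(recording, sr)
print(clips.shape)                    # (23, 48000): twenty-three 3 s clips
```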
| BDI Score | Depression Level | Number of Videos | Valid Segments |
|---|---|---|---|
| 0–13 | Non-depressed | 77 | 1435 |
| 14–19 | Mild | 22 | 411 |
| 20–28 | Moderate | 26 | 484 |
| 29–64 | Severe | 25 | 466 |

| Gender | | | Total |
|---|---|---|---|
| Female | 88 | 114 | 202 |
| Male | 58 | 40 | 98 |
| Total | 146 | 154 | 300 |
| Project | Content |
|---|---|
| Processor | Intel Xeon Gold 6248R @ 3.0 GHz (Intel, Santa Clara, CA, USA) |
| Memory | 256 GB DDR4 ECC (Kingston, Fountain Valley, CA, USA) |
| Operating system | Ubuntu 20.04 LTS (Canonical, London, UK) |
| Hard disk | 2 TB NVMe SSD (RAID 0) (Western Digital, San Jose, CA, USA) |
| Software | Python 3.9.7 (Python Software Foundation, Wilmington, DE, USA) |
| GPU | NVIDIA GeForce RTX 2080 Ti (NVIDIA, Santa Clara, CA, USA) |
| Number of training epochs | 50 |
| PyTorch | 1.13.1 (Meta AI, Menlo Park, CA, USA) |
| Learning rate | 1 × 10⁻⁴ |
| Training rate | 5 × 10⁻⁵ |
| Momentum | β1 = 0.9, β2 = 0.999 |
| Weight decay | 1 × 10⁻⁴ |
| Average pooling kernel size | 2 × 2 |
| Average pooling stride | 2 |
| Padding | 0 |
| Number of filters in Conv layers | [64, 128, 256, 512] |
| Dropout rate | 0.5 |
| Stride in Conv layers | 1 |
| Weight initialization | He normal |
| Feature output dimensions | Varied per layer |
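A hedged sketch of an optimizer and initialization setup consistent with the hyperparameters in the table above (Adam with β1 = 0.9, β2 = 0.999, learning rate 1 × 10⁻⁴, weight decay 1 × 10⁻⁴, dropout 0.5, He normal initialization). The tiny network shown is a placeholder, not the authors' architecture.

```python
import torch

# Placeholder network using the table's conv/pooling settings:
# 3 x 3 conv with stride 1 and padding 0, 2 x 2 average pooling with stride 2.
model = torch.nn.Sequential(
    torch.nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=0),
    torch.nn.SELU(),
    torch.nn.AvgPool2d(kernel_size=2, stride=2),
    torch.nn.Dropout(p=0.5),  # dropout rate 0.5
)
for m in model.modules():
    if isinstance(m, torch.nn.Conv2d):
        torch.nn.init.kaiming_normal_(m.weight)  # He normal initialization

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-4,             # learning rate from the table
    betas=(0.9, 0.999),  # momentum terms beta1/beta2
    weight_decay=1e-4,   # weight decay from the table
)
```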
| Model | Accuracy | F1-Score |
|---|---|---|
| Without gender | 0.835 | 0.856 |
| Ours | 0.861 | 0.892 |
| Model | Accuracy | F1-Score |
|---|---|---|
| Without MLP-based MHA, with interactive MHA | 0.825 | 0.837 |
| With MLP-based MHA, without interactive MHA | 0.813 | 0.825 |
| Without both MLP-based MHA and interactive MHA | 0.763 | 0.781 |
| Ours | 0.861 | 0.892 |
| Model | Accuracy | F1-Score |
|---|---|---|
| Without BiLSTM | 0.829 | 0.836 |
| Ours | 0.861 | 0.892 |
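To illustrate the ablated component, here is a minimal sketch of a BiLSTM classification head of the kind Section 3.5 describes. The feature dimension, hidden size, and four-class output (matching the four BDI severity levels) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMHead(nn.Module):
    """Sketch of a BiLSTM classification head: a bidirectional LSTM over
    the fused feature sequence followed by a linear classifier."""
    def __init__(self, in_dim: int = 128, hidden: int = 64, n_classes: int = 4):
        super().__init__()
        self.bilstm = nn.LSTM(in_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)  # forward + backward states

    def forward(self, x):  # x: (batch, seq, in_dim)
        out, _ = self.bilstm(x)
        return self.fc(out[:, -1, :])  # last time step -> class logits

logits = BiLSTMHead()(torch.randn(2, 30, 128))
print(logits.shape)  # torch.Size([2, 4])
```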