This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

RUDA-2025: Depression Severity Detection Using Pre-Trained Transformers on Social Media Data

1
Instituto Politécnico Nacional (IPN), Centro de Investigación en Computación (CIC), Mexico City 07700, Mexico
2
Dipartimento di Informatica, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
*
Author to whom correspondence should be addressed.
AI 2025, 6(8), 191; https://doi.org/10.3390/ai6080191
Submission received: 2 July 2025 / Revised: 31 July 2025 / Accepted: 7 August 2025 / Published: 18 August 2025

Abstract

Depression is a serious mental health disorder affecting cognition, emotions, and behavior. It impacts over 300 million people globally, with mental health care costs exceeding $1 trillion annually. Traditional diagnostic methods are often expensive, time-consuming, stigmatizing, and difficult to access. This study leverages NLP techniques to identify depressive cues in social media posts, focusing on both standard Urdu and code-mixed Roman Urdu, which are often overlooked in existing research. To the best of our knowledge, a script-conversion and combination-based approach for Roman Urdu and Nastaliq Urdu has not been explored before. To address this gap, our study makes four key contributions. First, we created a manually annotated dataset named RUDA-2025, containing posts in code-mixed Roman Urdu and Nastaliq Urdu for both binary and multiclass classification. The binary classes are "depression" and "not depression"; for multiclass classification, the depression class is further divided into fine-grained categories (Mild, Moderate, and Severe depression), alongside "not depression". Second, we applied, for the first time, two novel techniques to the RUDA-2025 dataset: (1) a script-conversion approach that converts between code-mixed Roman Urdu and standard Urdu, and (2) a combination-based approach that merges both scripts into a single dataset to address linguistic challenges in depression assessment. Finally, we conducted 60 experiments using a combination of traditional machine learning and deep learning techniques to find the best-fit model for detecting this mental disorder. Based on our analysis, our proposed model (mBERT) with a custom attention mechanism outperformed the baseline (XGB) in the combination-based setting and in the code-mixed Roman Urdu and Nastaliq Urdu script conversions.
Keywords: depression analysis; mental health; code-mixed Roman Urdu; Nastaliq Urdu; deep learning; transfer learning; social media; Facebook; YouTube; Twitter
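
The label scheme and the combination-based setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Post` record, the `to_binary` and `combine` helpers, the lowercase label strings, and the placeholder texts are all assumptions introduced here for illustration, and the abstract does not specify how the two scripts are actually merged.

```python
from dataclasses import dataclass

# Hypothetical label names modeled on the class scheme in the abstract.
SEVERITY_LABELS = ("not depression", "mild", "moderate", "severe")


@dataclass(frozen=True)
class Post:
    text: str
    script: str    # "roman" (code-mixed Roman Urdu) or "nastaliq" (assumed tags)
    severity: str  # one of SEVERITY_LABELS


def to_binary(severity: str) -> str:
    """Collapse a fine-grained severity label into the binary scheme."""
    if severity not in SEVERITY_LABELS:
        raise ValueError(f"unknown label: {severity!r}")
    return "not depression" if severity == "not depression" else "depression"


def combine(roman_posts, nastaliq_posts):
    """Combination-based setting (assumed): concatenate both scripts."""
    return list(roman_posts) + list(nastaliq_posts)


# Placeholder texts, not real samples from RUDA-2025.
roman = [
    Post("<roman urdu post 1>", "roman", "mild"),
    Post("<roman urdu post 2>", "roman", "not depression"),
]
nastaliq = [Post("<nastaliq urdu post 1>", "nastaliq", "severe")]

combined = combine(roman, nastaliq)
binary_labels = [to_binary(p.severity) for p in combined]
```

Keeping the fine-grained label on each record and deriving the binary label on demand means one annotated dataset can serve both the binary and the multiclass task.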

Share and Cite

MDPI and ACS Style

Ahmad, M.; Basile, P.; Ullah, F.; Batyrshin, I.; Sidorov, G. RUDA-2025: Depression Severity Detection Using Pre-Trained Transformers on Social Media Data. AI 2025, 6, 191. https://doi.org/10.3390/ai6080191

AMA Style

Ahmad M, Basile P, Ullah F, Batyrshin I, Sidorov G. RUDA-2025: Depression Severity Detection Using Pre-Trained Transformers on Social Media Data. AI. 2025; 6(8):191. https://doi.org/10.3390/ai6080191

Chicago/Turabian Style

Ahmad, Muhammad, Pierpaolo Basile, Fida Ullah, Ildar Batyrshin, and Grigori Sidorov. 2025. "RUDA-2025: Depression Severity Detection Using Pre-Trained Transformers on Social Media Data" AI 6, no. 8: 191. https://doi.org/10.3390/ai6080191

APA Style

Ahmad, M., Basile, P., Ullah, F., Batyrshin, I., & Sidorov, G. (2025). RUDA-2025: Depression Severity Detection Using Pre-Trained Transformers on Social Media Data. AI, 6(8), 191. https://doi.org/10.3390/ai6080191
