Big Data and Cognitive Computing
  • Article
  • Open Access

5 September 2024

Sentiment Informed Sentence BERT-Ensemble Algorithm for Depression Detection

1 Department of Computing & Mathematics, University of Brighton, Brighton BN2 4GJ, UK
2 Department of Computing, Sheffield Hallam University, Sheffield S1 2NU, UK
* Authors to whom correspondence should be addressed.

Abstract

The World Health Organisation (WHO) revealed that approximately 280 million people worldwide suffer from depression. Yet, existing studies on early-stage depression detection using machine learning (ML) techniques are limited. Prior studies have applied a single stand-alone algorithm, which is unable to deal with data complexities, is prone to overfitting, and is limited in generalization. To this end, our paper examined the performance of several ML algorithms for early-stage depression detection using two benchmark social media datasets (D1 and D2). More specifically, we incorporated sentiment indicators to improve our model performance. Our experimental results showed that sentence bidirectional encoder representations from transformers (SBERT) numerical vectors fitted into a stacking ensemble model achieved competitive F1 scores of 69% on dataset D1 and 76% on dataset D2. Our findings suggest that utilizing sentiment indicators as an additional feature for depression detection yields improved model performance, and thus, we recommend the development of a depressive term corpus for future work.

1. Introduction

There is growing concern regarding the deteriorating mental health of the population [,]. The WHO [] reported a decline in people's mental health as the number of individuals diagnosed with depression has increased. In recent times, depression has garnered global attention due to its widespread prevalence and the significant risk it poses, particularly in terms of suicide [,,]. Depression is one of the leading causes of disability globally []. The WHO [] reported that approximately 280 million people in the world suffer from depression; its findings showed that more women are affected than men and that 5% of adults suffer from depression. The literature evidences the symptoms of depression as unhappiness, loneliness, forgetfulness, low self-esteem, thoughts of self-harm, disrupted sleep, loss of energy, changes in appetite, anxiety, reduced concentration, indecision, feelings of worthlessness, lack of interest, guilt, or hopelessness [,]. Previous studies showed that the primary causes of depression are financial issues, workplace issues, family issues, and academic performance []. Traditional approaches to depression detection rely on psychometric questions, and several depression questionnaires have been developed, for example, Beck’s Depression Inventory [], the Center for Epidemiologic Studies Depression Scale [], the Children’s Depression Inventory [], the Mood and Feelings Questionnaire [], the Patient Health Questionnaire-9 [], and the Revised Children’s Anxiety and Depression Scale []. However, this approach is labor-intensive, time-consuming, and limited to pre-defined attributes. In addition, research evidences that people with depression may not know or acknowledge that they have the illness, which influences survey outcomes for early-stage depression screening [,]. The ability to detect depression in its early stages is important for prevention, diagnosis, and treatment. Corrective treatment often comes too late, as the sufferer’s quality of life may already have been affected adversely []; this can be attributed to sufferers’ reluctance to seek medical help at the early stages of the disease, allowing their symptoms to worsen. Liu et al. [] stated that 70% of sufferers will not seek medical help for their depressive conditions. Social media, however, has proven to be a source of public opinion, providing insight into people’s mental wellness []. People obtain information and express their feelings freely on social media [,,]. Thus, user-generated content (UGC) is considered a reliable data source for developing automated depression detection models [,]. Interestingly, the rapid development of machine learning (ML) approaches has made significant contributions to the development of depression detection algorithms []. Thus, this study focuses on the use of ML techniques.
Several ML algorithms have been implemented for depression detection; so far, traditional algorithms, and predominantly single stand-alone algorithms, have been used [,,,,,,,,,]. These are unable to deal with data complexities, are prone to overfitting, and are limited in generalization, and thus the effectiveness of such models is unsatisfactory for real-world deployment []. Most importantly, a major concern with existing algorithms is generalization [,,,,]. Against this background, our aim is to develop a well-performing ML algorithm for depression detection. To this end, we propose the use of a Sentence BERT-Ensemble model for depression detection. Our main contributions can be summarised as follows.
  • We conduct an experimental comparison of several state-of-the-art (SOTA) ML algorithms for depression detection and discuss them from a scientific lens.
  • We demonstrate the use of the Sentence BERT-Ensemble model to achieve SOTA results.
  • We demonstrate that the sentiment analysis indicator is a useful external feature in depression detection.
Subsequent sections present the background knowledge of this study (Section 2), the methodology (Section 3), and the results (Section 4); Section 5 provides conclusions and recommendations.

3. Methodology

This section presents the methods applied in this study. We conducted an extensive and thorough comparison of ML algorithms, ranging from traditional ML and deep learning to transformer-based algorithms. Based on the target variables of the datasets used (discussed in Section 3.2), this study formulates the depression detection task as a multiclass classification problem. In our attempt to improve generalization and ascertain the SOTA approach, we performed several experiments using two datasets, as outlined below.
  • In the first experiment, we compared traditional ML algorithms using a term frequency-inverse document frequency (TF-IDF) vectorizer (a minimal sketch of this setup follows the list).
  • In the second experiment, we compared ML algorithms using contextual word embeddings such as BERT and SBERT.
  • Finally, we implemented sentiment analysis and used the polarity result as an explicit feature, again comparing ML algorithms using contextual word embeddings.
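To make the first experiment concrete, the following is a minimal sketch, assuming scikit-learn; the example texts, labels, and hyperparameters are illustrative placeholders rather than the study's actual configuration.

```python
# Minimal sketch of the first experiment: TF-IDF features fitted into a
# traditional classifier (logistic regression). Texts and labels are
# illustrative placeholders, not the study's data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

train_texts = ["i feel hopeless and tired", "had a great day outside"]
train_labels = ["moderate", "not depressed"]

vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
X_train = vectorizer.fit_transform(train_texts)   # sparse TF-IDF matrix

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, train_labels)

dev_texts = ["nothing matters anymore"]
X_dev = vectorizer.transform(dev_texts)           # reuse the fitted vocabulary
print(clf.predict(X_dev))
```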

3.1. Proposed Approach

This paper compared several ML approaches that have shown good performance in the depression detection literature. Based on our experiments, this study proposes the use of sentence BERT embeddings with a stacked ensemble model.
The experimental setup of our approach is shown in Figure 1. The raw data are converted into numerical word embedding vectors using models such as BERT. In parallel, sentiment analysis is performed, and the resulting numerical vectors are fed to the classifier (a stacked ensemble model) to provide the required prediction. Subsequent sections discuss the individual components of our proposed approach for depression detection, presenting the algorithms used at each stage: Section 3.1.1 and Section 3.1.2 present BERT and SBERT for the word embedding component; Section 3.1.3, Section 3.1.4, Section 3.1.5, Section 3.1.6 and Section 3.1.7 present the stacked ensemble model for the classifier component; and Section 3.1.8 presents the Afinn lexicon for the sentiment analysis component.
Figure 1. An illustration of our experimental setup.
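As a minimal end-to-end sketch of the pipeline in Figure 1, assuming the sentence-transformers and afinn packages; the model name "all-MiniLM-L6-v2" and the example texts are illustrative assumptions, not necessarily the configuration used in this study.

```python
# Sketch of Figure 1: SBERT sentence embeddings are concatenated with an
# Afinn sentiment score and fed to a classifier. Model name and texts are
# illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer
from afinn import Afinn

texts = ["i cannot sleep and feel worthless", "looking forward to the weekend"]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed SBERT model
afinn = Afinn()

embeddings = encoder.encode(texts)                       # (n, d) vectors
sentiment = np.array([[afinn.score(t)] for t in texts])  # (n, 1) feature
features = np.hstack([embeddings, sentiment])            # combined input

# 'features' would then be fitted to the stacking ensemble of Section 3.1.3.
print(features.shape)
```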

3.1.1. BERT (Bidirectional Encoder Representations from Transformers)

Given a sequence of input embeddings $X = (x_1, x_2, \ldots, x_n)$, the BERT model [] consists of an encoder stack that utilizes self-attention mechanisms. The self-attention mechanism involves three sets of weight matrices producing the Query $Q$, Key $K$, and Value $V$. These are used to compute attention scores, and the output of the self-attention layer is obtained by applying these attention scores to the values.
The self-attention mechanism is mathematically represented as:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
where $Q = XW^{Q}$, $K = XW^{K}$, $V = XW^{V}$, and $d_k$ is the dimensionality of the key vectors.
BERT stacks multiple layers of these self-attention mechanisms to capture contextual information in the input sequence. BERT uses a masked language model (MLM) objective during pre-training, where some of the input words are masked, and the model is trained to predict the masked words based on their context.
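A single-head version of this computation can be sketched as follows (NumPy, for illustration; BERT itself stacks many layers of multi-head attention with learned weights):

```python
# Scaled dot-product self-attention as in the equation above.
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n, n) attention logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # attention applied to values

n, d = 4, 8
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d))
out = self_attention(X, rng.normal(size=(d, d)),
                     rng.normal(size=(d, d)), rng.normal(size=(d, d)))
print(out.shape)  # (4, 8)
```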

3.1.2. Sentence-BERT

Sentence BERT (SBERT) [] is a variant of BERT designed to reduce the computational overhead of the BERT algorithm for regression tasks such as sentence-pair similarity. It has also been used to classify sentence pairs into labeled classes (binary or multi-class). The SBERT framework with a classification objective (used in this work) consists of a Siamese bi-encoder comprising two fine-tuned BERT models, each producing a different output vector embedding, $u$ and $v$. A third vector, $|u - v|$, the element-wise absolute difference between the two embeddings, is also obtained. The vectors are then concatenated and multiplied by a weight matrix $W$, and the output is passed to a softmax layer to generate the probabilities for each class. This is illustrated in Figure 2.
Figure 2. Classification objective framework of the SBERT model [].
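The classification objective can be sketched numerically as follows (NumPy; the dimensions and random values are illustrative, and in practice $u$ and $v$ come from the Siamese BERT encoders):

```python
# Sketch of the SBERT classification objective: the pair feature
# (u, v, |u - v|) is projected by a weight matrix and passed to softmax.
import numpy as np

def sbert_pair_probs(u, v, W):
    feat = np.concatenate([u, v, np.abs(u - v)])  # (3d,) pair representation
    logits = feat @ W                             # (num_classes,) scores
    e = np.exp(logits - logits.max())
    return e / e.sum()                            # softmax class probabilities

d, num_classes = 6, 3
rng = np.random.default_rng(1)
u, v = rng.normal(size=d), rng.normal(size=d)
W = rng.normal(size=(3 * d, num_classes))
print(sbert_pair_probs(u, v, W))
```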

3.1.3. Stacking Ensemble Model

Ensemble techniques have grown in relevance in recent years, owing to the improved performance they provide over individual models in ML tasks such as regression and classification. The approach was first introduced in the early 1990s with the idea that individually weak predictors could be combined in a specific, methodological order to produce better predictions []. The most common ensemble learning methods are bagging, boosting, and stacking. Each method varies slightly in architecture, and the choice among them depends on the nature of the ML task. For this work, we used the stacking ensemble framework.
Stacking, also known as stacked generalization, is an ensemble learning strategy that passes the predictions of selected initial learners, known as the base models, to a final classifier, known as the meta-classifier, which produces the final classification for the task. In most cases, stacking improves overall classification performance compared to the individual base models, and it has been used in classification tasks ranging from healthcare to agri-business [,,,]. As illustrated in Figure 3, stacking provides model variety, allowing all base classifiers to participate in the learning process, which in turn helps reduce bias. It also offers interpretability of the learning capacity of the base models and of how they influence the results of the selected meta-classifier. The diversity among the chosen base models helps the meta-classifier produce improved classification results, owing to the different generalization principles of the base models. This diversity is obtained by using learner models with varying learning strategies, thereby introducing a level of disagreement in pattern recognition [].
Figure 3. Illustration of stacked ensemble model.
The ensemble technique can be represented using the equation:
$$\hat{y} = f_{\mathrm{meta}}\left(y_1, y_2, \ldots, y_N\right), \qquad y_i = bm_i(x)$$
where $\hat{y}$ represents the predicted output of the meta-classifier function $f_{\mathrm{meta}}$ applied to the outputs of the $N$ selected base models $bm_i$, and $y_i$ represents the predicted output of the $i$-th base model. In the subsequent sections, we discuss the individual ML algorithms deployed.
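Before turning to the individual algorithms, a minimal sketch of such a stack, assuming scikit-learn, is shown below. The choice of logistic regression as the meta-classifier and the default hyperparameters are illustrative assumptions; the paper's exact configuration may differ.

```python
# Stacking sketch combining base learners of the kinds discussed in
# Sections 3.1.4-3.1.7 under a meta-classifier.
from sklearn.ensemble import (StackingClassifier, GradientBoostingClassifier,
                              AdaBoostClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

base_models = [
    ("gbm", GradientBoostingClassifier()),
    ("ada", AdaBoostClassifier()),
    ("mlp", MLPClassifier(max_iter=500)),
]
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-classifier
    cv=5,  # out-of-fold base predictions train the meta-level
)
# stack.fit(X_train, y_train); stack.predict(X_dev)
```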

3.1.4. Gradient Boosting

Gradient boosting is a powerful ensemble technique in ML. Unlike traditional models that learn from the data independently, boosting combines the predictions of multiple weak learners to create a single, more accurate, strong learner. The key concept is to sequentially add weak learners, each one correcting the errors of its predecessors. Mathematically, the gradient boosting algorithm can be expressed as follows []:
$$g_t(x) = E_y\left[\frac{\partial \Psi(y, f(x))}{\partial f(x)} \,\middle|\, x\right]_{f(x) = \hat{f}^{(t-1)}(x)}$$
where $\Psi(y, f(x))$ is the loss function measuring the difference between the actual value $y$ and the predicted value $f(x)$, $\hat{f}^{(t-1)}(x)$ is the prediction of the model at iteration $t-1$, and $g_t(x)$ represents the pseudo-residuals, the negative gradients of the loss function with respect to the current model predictions.
In each iteration, a new weak learner $h_t(x)$ is trained to predict these pseudo-residuals. The model is then updated by adding the new learner, scaled by a learning rate $\eta$:
$$f_t(x) = f_{t-1}(x) + \eta\, h_t(x)$$
This process continues for a predefined number of iterations, resulting in a strong predictive model that minimizes the loss function effectively.
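For illustration, the following is a minimal sketch of this loop for the squared-error loss, where the pseudo-residuals reduce to $y - f_{t-1}(x)$ (assuming scikit-learn regression trees as weak learners; the data and hyperparameters are illustrative):

```python
# Minimal gradient boosting loop for squared-error loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

f = np.full_like(y, y.mean())   # initial constant model f_0
eta = 0.1                       # learning rate
for t in range(50):
    residuals = y - f           # pseudo-residuals for squared error
    h = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    f += eta * h.predict(X)     # f_t = f_{t-1} + eta * h_t
print(np.mean((y - f) ** 2))    # training MSE shrinks over iterations
```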

3.1.5. AdaBoost

AdaBoost (Adaptive Boosting) is an ensemble learning algorithm that combines the predictions of multiple weak classifiers to create a strong classifier. It works by focusing on the training samples that are hard to classify and adjusting the weights of the weak classifiers accordingly [].
Given a training set $T = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ are the input features and $y_i \in \{-1, +1\}$ are the class labels, the AdaBoost algorithm can be described as follows:
1. Initialize the weights:
$$w_i^{(1)} = \frac{1}{N} \quad \text{for all } i = 1, 2, \ldots, N$$
2. For $t = 1$ to $T$ (number of iterations):
a. Train a weak classifier $h_t$ using the weighted training set.
b. Compute the weighted error $\epsilon_t$:
$$\epsilon_t = \frac{\sum_{i=1}^{N} w_i^{(t)}\, I\left(h_t(x_i) \neq y_i\right)}{\sum_{i=1}^{N} w_i^{(t)}}$$
where $I$ is the indicator function that returns 1 if the condition is true and 0 otherwise.
c. Compute the classifier weight:
$$\alpha_t = \frac{1}{2} \ln\left(\frac{1 - \epsilon_t}{\epsilon_t}\right)$$
d. Update the weights:
$$w_i^{(t+1)} = w_i^{(t)} \exp\left(-\alpha_t\, y_i\, h_t(x_i)\right)$$
and normalize them so that they sum to 1:
$$w_i^{(t+1)} \leftarrow \frac{w_i^{(t+1)}}{\sum_{j=1}^{N} w_j^{(t+1)}}$$
3. Final strong classifier: the strong classifier is a weighted majority vote of the $T$ weak classifiers:
$$H(x) = \mathrm{sign}\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right)$$
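The steps above can be transcribed almost directly into code. The following sketch assumes scikit-learn decision stumps as the weak classifiers and labels encoded in {−1, +1}; it is illustrative rather than the study's implementation.

```python
# Direct transcription of the AdaBoost steps above.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, T=20):
    """y must be an array of labels in {-1, +1}."""
    N = len(y)
    w = np.full(N, 1.0 / N)                 # step 1: uniform weights
    stumps, alphas = [], []
    for _ in range(T):
        h = DecisionTreeClassifier(max_depth=1)
        h.fit(X, y, sample_weight=w)        # step 2a: weighted weak learner
        pred = h.predict(X)
        eps = w[pred != y].sum() / w.sum()  # step 2b: weighted error
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-10))  # step 2c
        w = w * np.exp(-alpha * y * pred)   # step 2d: reweight samples
        w /= w.sum()                        # normalize to sum to 1
        stumps.append(h)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    votes = sum(a * h.predict(X) for h, a in zip(stumps, alphas))
    return np.sign(votes)                   # step 3: weighted majority vote
```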

3.1.6. Logistic Regression

LR is a statistical method for modeling the probability of a binary outcome based on one or more predictor variables. It is widely used for classification problems where the dependent variable is categorical []:
$$P(X) = \frac{e^{b_0 + b_1 X}}{1 + e^{b_0 + b_1 X}}$$
where $P(X)$ is the predicted output, $b_0$ is the intercept term, and $b_1$ is the coefficient of the single input value $X$. The logistic function transforms the linear combination of the predictors into a probability value between 0 and 1.
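For illustration, the logistic function above can be evaluated directly (NumPy; the coefficient values here are arbitrary):

```python
# The logistic function from the equation above, mapping the linear
# combination b0 + b1*x to a probability in (0, 1).
import numpy as np

def logistic(x, b0, b1):
    z = b0 + b1 * x
    return np.exp(z) / (1 + np.exp(z))  # equivalently 1 / (1 + exp(-z))

print(logistic(np.array([-2.0, 0.0, 2.0]), b0=0.5, b1=1.2))
```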

3.1.7. Multi-Layer Perceptron

MLP is a class of feedforward artificial neural networks. It consists of multiple layers of nodes (neurons) connected in a directed graph, where each layer is fully connected to the next one. MLPs are capable of learning complex patterns through supervised learning and are commonly used for tasks such as classification, regression, and function approximation [].
Input layer: let $X = (x_1, \ldots, x_n)$ be the input vector.
Hidden layers: for the $l$-th hidden layer, let $h_i^{(l)}$ be the activation of the $i$-th neuron, computed as:
$$h_i^{(l)} = \phi^{(l)}\left(\sum_j \omega_{ij}^{(l)}\, h_j^{(l-1)} + b_i^{(l)}\right)$$
where $\phi^{(l)}$ is the activation function for the $l$-th layer, $\omega_{ij}^{(l)}$ is the weight connecting the $j$-th neuron in the $(l-1)$-th layer to the $i$-th neuron in the $l$-th layer, $b_i^{(l)}$ is the bias term for the $i$-th neuron in the $l$-th layer, and $h_j^{(l-1)}$ is the activation of the $j$-th neuron in the $(l-1)$-th layer. For the input layer, $h_j^{(0)} = x_j$.
Output layer: let $y_k$ be the output of the $k$-th neuron in the output layer, where $L$ is the index of the output layer and $\phi^{(L)}$ is its activation function. It is computed as:
$$y_k = \phi^{(L)}\left(\sum_j \omega_{kj}^{(L)}\, h_j^{(L-1)} + b_k^{(L)}\right)$$
We distinguish $\phi^{(l)}$ and $\phi^{(L)}$ because different layers may have different activation functions.
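A forward pass through a one-hidden-layer MLP matching these equations can be sketched as follows (NumPy; the ReLU and softmax activation choices and the layer sizes are illustrative assumptions):

```python
# Forward pass of a one-hidden-layer MLP following the equations above.
import numpy as np

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(3)
x = rng.normal(size=5)                      # input layer h^(0) = x
W1, b1 = rng.normal(size=(8, 5)), np.zeros(8)
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)

h1 = relu(W1 @ x + b1)                      # hidden activations h^(1)
y = softmax(W2 @ h1 + b2)                   # output layer, phi^(L) = softmax
print(y, y.sum())                           # class probabilities summing to 1
```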

3.1.8. Sentiment Analysis

Sentiment analysis is the process of classifying text into sentiment categories such as positive and negative [,]. There are two popular approaches to performing sentiment analysis, namely, the lexicon-based approach and the ML-based approach. The former requires a corpus or dictionary to map words into sentiment categories according to their semantic orientation. The latter involves the use of labeled samples to train an ML algorithm for prediction. In this context, due to the unavailability of labeled datasets for training purposes, we consider lexicon-based sentiment analysis suitable. We adopted the Afinn lexicon, developed in [], based on its wide usage and the good performance it has shown across several sentiment analysis tasks. The Afinn lexicon consists of 2477 English words, each associated with a sentiment score ranging from −5 to −1 for negative words and from 1 to 5 for positive words. The sentiment score of a document is calculated by matching the Afinn wordlist against the document (at the word level) and aggregating the scores of the matched words into a final sentiment score. For illustration, if the final sentiment score is 1 or above, the document sentiment is positive, and it is negative if the final score is −1 or below. Table 1 below presents the statistics of Afinn.
Table 1. Afinn statistics.
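For illustration, document-level Afinn scoring can be sketched as follows (assuming the afinn Python package; treating a total score of 0 as neutral is our own convention here, as the text above only defines the positive and negative cases):

```python
# Document-level Afinn scoring as described above: sum the scores of
# matched words and map the total to a polarity label.
from afinn import Afinn  # assumes the 'afinn' package is installed

afinn = Afinn()

def polarity(text):
    score = afinn.score(text)  # aggregated word-level sentiment score
    if score >= 1:
        return "positive"
    if score <= -1:
        return "negative"
    return "neutral"           # assumed handling of a zero total

print(polarity("I feel hopeless and alone"))  # negative
```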

3.2. Datasets

This study adopted two datasets to account for the bias of algorithms towards a particular dataset. For identification purposes, we renamed the datasets D1 and D2. Subsequent sections provide a summary of each dataset.

3.2.1. Dataset 1 (D1)

The dataset D1 was crawled from Reddit [] and annotated by two domain experts with three labels: not depressed, moderate, and severe. The label “not depressed” indicates that a text shows no sign of depression, while “moderate” and “severe” indicate depressive texts, with “severe” denoting the greater strength of depression. The dataset has been used in [,,]. Specifically, it was used in the competition of the Second Workshop on Language Technology for Equality, Diversity, and Inclusion []. The authors in [] pre-processed the data by removing duplicates to create the final dataset (10,251 posts). The dataset was divided into three partitions: train (6006 posts), dev (1000 posts), and test (3245 posts). Model performance was evaluated on the dev dataset (as in previous studies) to enable fair comparison with prior research. The dataset is publicly available and can be accessed via this link: https://github.com/rafalposwiata/depression-detection-lt-edi-2022/tree/main/data (accessed on 12 November 2023). Table 2 below shows the summary of the data.
Table 2. D1 statistics.

3.2.2. Dataset 2 (D2)

The dataset D2 was prepared by the authors in []. In their study, they reconstructed the dataset of [], obtained from Reddit, into four depression severity levels covering clinical depression standards on social media. The dataset (3553 posts) was labeled manually by two human annotators, with an additional annotator employed for cases of label disagreement. The dataset was divided into two partitions: train (70%) and test (30%). Model performance was evaluated on the test set. Table 3 below shows the data statistics.
Table 3. D2 statistics.

4. Results and Discussion

This section presents the results of our experiments. Algorithm performance is reported using four metrics: accuracy (A), weighted precision (P), weighted recall (R), and weighted F1 score (F). It is worth stating that, due to class imbalance in datasets D1 and D2, this study places greater emphasis on the P, R, and F metrics. Table 4 presents the comparative experimental results of the traditional ML algorithms fitted using the TF-IDF numerical vectors.
Table 4. The evaluation result of ML algorithms.
In the D1 experimentation, results showed that SVM achieved the best outcome in terms of precision and recall, while LR achieved a better F1 score. It is worth noting that, in terms of computational time, SVM took a very long time, GBM took a moderate time, and LR and NB were fast. Overall, the traditional ML algorithms performed poorly, especially on the “severe” class; the “not depressed” class showed the best F1 score of the three classes. This is unsurprising, considering that the “severe” class is the minority class. In the D2 experimentation, results showed that LR outperformed the other models, and in general, the traditional ML algorithms achieved good results. Across both datasets (D1 and D2), LR showed a competitive performance, although performance was generally poor on D1. Thus, our results showed that the traditional ML models are unable to cater to the diversity and class imbalance in the data. We then performed the second experiment, in which we used the word embedding vectors (of BERT and SBERT) and adopted both traditional ML algorithms (LR, SVM, and GBM) and deep learning algorithms (BiLSTM and BiGRU) as classifiers. The results of the second experiment are shown in Table 5 below.
Table 5. Evaluation result of the LLMs.
In the D1 experimentation, results show that the use of word embedding vectors yields an improvement across all evaluation metrics, although the models show only marginal improvement over each other; our approach performed best in terms of precision and F1 score. In the D2 experimentation, our approach showed improvements in terms of recall and F1 score. However, it is worth stating that the improvements in model performance were marginal compared to the first experimental results presented in Table 4. This can be attributed to the proportionate distribution of the classes, except for the “minimal” class. Across both datasets (D1 and D2), our approach produced the best performance, which evidences its consistency, suitability, and efficiency.
Results from Table 6 show that incorporating SA as a feature enhanced the performance of our proposed model. Specifically, the evaluation metrics show that the performance of our proposed model improved by at least 0.02 (precision) and 0.01 (F1 score) in D1 and by at least 0.08 (precision) and 0.06 (F1 score) in D2 compared with the models without the SA feature. This improvement is consistent with the findings of Muñoz and Iglesias []. It is worth stating that the models with word embeddings (BERT and SBERT) performed better across D1 and D2 than the models with TF-IDF, owing to the ability of transformer-based word embedding models to capture the relationships and contextual representation (language understanding) of the features. Overall, our approach performed best in D1 in terms of recall and F1 score, and it showed the best results across all metrics in D2. This further evidences the robustness, consistency, and efficiency of our approach. In comparison to other studies, our approach performs better than existing results, as shown in Table 7 and Table 8 below.
Table 6. Evaluation results using LLMs (with sentiment analysis).
Table 7. Performance comparison to previous studies (D1).
Table 8. Performance comparison to previous studies (D2).

5. Conclusions

This paper aimed to develop a depression detection model using a well-performing ML algorithm. To this end, we compared several ML algorithms ranging from traditional ML to deep learning algorithms. Furthermore, we utilized TF-IDF and word embeddings from LLMs to generate the numeric vectors fitted into the ML models, and we specifically examined the influence of sentiment indicators as an external feature in the models. Our experimental results evidenced that the use of SBERT with a stacking ensemble model achieved SOTA results. Our results are beneficial to medical/healthcare stakeholders and practitioners for early depression detection, helping to make more accurate diagnoses and to monitor symptoms over time.
Theoretically, we have provided evidence that the sentiment analysis indicator is an important feature for enhancing the performance of depression detection. Secondly, we have provided evidence that word embedding models from transformer architectures yield competitive performance across several datasets and can thus be relied on as off-the-shelf components of a depression detection model. To conclude, we have presented the use of a stacked ensemble ML model with SBERT and sentiment indicators for the prediction of depression, with improved metrics compared to previous works. However, SBERT, as with any BERT model, can struggle to generate embeddings when sentences become too long. It may also struggle with context-aware tasks, where there might be subtle semantic differences between sentences. To this end, in future work, we will experiment with newer, larger language models such as GPT, LLaMA, and PaLM variants with more trainable parameters, as recent research [] has shown them to achieve better sentiment scores than BERT. We will also explore the explainability of the black-box SBERT model; for example, we need to understand how the Siamese BERT models update their weights during training, having been trained with different input sentences but using the same weights for each training epoch. Furthermore, it is worth stating that our experimental results relied on Reddit datasets; as such, we envisage evaluating our methodology with datasets from other social networks. Similarly, our future work will consider incorporating emotion recognition features into our methodology. Lastly, we recommend extending our methodology to the detection of harmful content online and the detection of academic stress.

Author Contributions

Conceptualization, B.O.; methodology, B.O., H.S. and O.S.; software, B.O.; validation, B.O. and O.S.; formal analysis, B.O.; investigation, B.O.; resources, B.O.; data curation, B.O.; writing—original draft preparation, B.O., H.S. and O.S.; writing—review and editing, B.O., H.S. and O.S.; visualization, B.O., H.S. and O.S.; supervision, B.O.; project administration, B.O. and O.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

All the data are presented in the study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Figuerêd, J.S.L.; Maia, A.L.L.; Calumby, R.T. Early depression detection in social media based on deep learning and underlying emotions. Online Soc. Netw. Media 2022, 31, 100225.
  2. Thapar, A.; Eyre, O.; Patel, V.; Brent, D. Depression in young people. Lancet 2022, 400, 617–631.
  3. World Health Organization. Depressive Disorder (Depression). 2023. Available online: https://www.who.int/en/news-room/fact-sheets/detail/depression (accessed on 27 August 2023).
  4. Cai, Y.; Wang, H.; Ye, H.; Jin, Y.; Gao, W. Depression detection on online social network with multivariate time series feature of user depressive symptoms. Expert Syst. Appl. 2023, 217, 119538.
  5. World Health Organization. Depression and Other Common Mental Disorders: Global Health Estimates; Technical Report; 2017. Available online: https://apps.who.int/iris/handle/10665/254610 (accessed on 19 September 2023).
  6. Zhang, T.; Yang, K.; Alhuzali, H.; Liu, B.; Ananiadou, S. PHQ-aware depressive symptoms identification with similarity contrastive learning on social media. Inf. Process. Manag. 2023, 60, 103417.
  7. Liang, Y.; Liu, L.; Ji, Y.; Huangfu, L.; Zeng, D.D. Identifying emotional causes of mental disorders from social media for effective intervention. Inf. Process. Manag. 2023, 60, 103407.
  8. Beck, A.T.; Ward, C.H.; Mendelson, M.; Mock, J.; Erbaugh, J. An inventory for measuring depression. Arch. Gen. Psychiatry 1961, 4, 561–571.
  9. Radloff, L.S. The use of the Center for Epidemiologic Studies Depression Scale in adolescents and young adults. J. Youth Adolesc. 1991, 20, 149–166.
  10. Kovacs, M. The Children’s Depression Inventory (CDI). Psychopharmacol. Bull. 1985, 21, 995–998.
  11. Angold, A.; Costello, E.J. Mood and Feelings Questionnaire (MFQ); Developmental Epidemiology Program, Duke University: Durham, NC, USA, 1987. Available online: https://devepi.duhs.duke.edu/measures/the-mood-andfeelings-questionnaire-mfq/ (accessed on 28 October 2023).
  12. Kroenke, K.; Spitzer, R.L.; Williams, J.B. The PHQ-9: Validity of a brief depression severity measure. J. Gen. Intern. Med. 2001, 16, 606–613.
  13. Chorpita, B.F.; Moffitt, C.E.; Gray, J. Psychometric properties of the Revised Child Anxiety and Depression Scale in a clinical sample. Behav. Res. Ther. 2005, 43, 309–322.
  14. Epstein, R.M.; Duberstein, P.R.; Feldman, M.D.; Rochlen, A.B.; Bell, R.A.; Kravitz, R.L.; Cipri, C.; Becker, J.D.; Bamonti, P.M.; Paterniti, D.A. “I didn’t know what was wrong”: How people with undiagnosed depression recognize, name and explain their distress. J. Gen. Intern. Med. 2010, 25, 954–961.
  15. Boerema, A.M.; Kleiboer, A.; Beekman, A.T.; van Zoonen, K.; Dijkshoorn, H.; Cuijpers, P. Determinants of help-seeking behavior in depression: A cross-sectional study. BMC Psychiatry 2016, 16, 78.
  16. Liu, D.; Feng, X.L.; Ahmed, F.; Shahid, M.; Guo, J. Detecting and measuring depression on social media using a machine learning approach: Systematic review. JMIR Ment. Health 2022, 9, e27244.
  17. Salas-Zárate, R.; Alor-Hernández, G.; Salas-Zárate, M.D.P.; Paredes-Valverde, M.A.; Bustos-López, M.; Sánchez-Cervantes, J.L. Detecting depression signs on social media: A systematic literature review. Healthcare 2022, 10, 291.
  18. Ogunleye, B.O. Statistical Learning Approaches to Sentiment Analysis in the Nigerian Banking Context. Ph.D. Thesis, Sheffield Hallam University, Sheffield, UK, 2021.
  19. Ogunleye, B.; Brunsdon, T.; Maswera, T.; Hirsch, L.; Gaudoin, J. Using Opinionated-Objective Terms to Improve Lexicon-Based Sentiment Analysis. In Proceedings of the International Conference on Soft Computing for Problem-Solving; Springer Nature: Singapore, 2023.
  20. Chancellor, S.; De Choudhury, M. Methods in predictive techniques for mental health status on social media: A critical review. NPJ Digit. Med. 2020, 3, 43.
  21. Pérez, A.; Parapar, J.; Barreiro, A.; López-Larrosa, S. BDI-Sen: A Sentence Dataset for Clinical Symptoms of Depression. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’23), Taipei, Taiwan, 23–27 July 2023; ACM: New York, NY, USA, 2023; pp. 2996–3006.
  22. Wang, Y.; Wang, Z.; Li, C.; Zhang, Y.; Wang, H. Online social network individual depression detection using a multitask heterogenous modality fusion approach. Inf. Sci. 2022, 609, 727–749.
  23. Islam, M.R.; Kamal, A.R.M.; Sultana, N.; Islam, R.; Moni, M.A.; Ulhaq, A. Detecting Depression Using K-Nearest Neighbors (KNN) Classification Technique. In Proceedings of the 2018 International Conference on Computer, Communication, Chemical, Material and Electronic Engineering (IC4ME2), Rajshahi, Bangladesh, 8–9 February 2018; pp. 1–4.
  24. Cohan, A.; Desmet, B.; Yates, A.; Soldaini, L.; MacAvaney, S.; Goharian, N. SMHD: A large-scale resource for exploring online language usage for multiple mental health conditions. arXiv 2018, arXiv:1806.05258.
  25. Bierbaum, J.; Lynn, M.; Yu, L. Utilizing Pattern Mining and Classification Algorithms to Identify Risk for Anxiety and Depression in the LGBTQ+ Community During the COVID-19 Pandemic. In Companion Proceedings of the Web Conference 2022 (WWW ’22 Companion), Virtual Event, Lyon, France, 25–29 April 2022; ACM: New York, NY, USA, 2022; pp. 663–672.
  26. Skaik, R.; Inkpen, D. Using Twitter Social Media for Depression Detection in the Canadian Population. In Proceedings of the 2020 3rd Artificial Intelligence and Cloud Computing Conference (AICCC 2020), Kyoto, Japan, 18–20 December 2020; ACM: New York, NY, USA, 2020.
  27. Hosseini-Saravani, S.H.; Besharati, S.; Calvo, H.; Gelbukh, A. Depression Detection in Social Media Using a Psychoanalytical Technique for Feature Extraction and a Cognitive Based Classifier. In Advances in Computational Intelligence, Proceedings of the 19th Mexican International Conference on Artificial Intelligence, MICAI 2020, Mexico City, Mexico, 12–17 October 2020; Proceedings, Part II; Springer: Berlin/Heidelberg, Germany, 2020; pp. 282–292.
  28. He, L.; Chan, J.C.W.; Wang, Z. Automatic depression recognition using CNN with attention mechanism from videos. Neurocomputing 2021, 422, 165–175.
  29. Ive, J.; Gkotsis, G.; Dutta, R.; Stewart, R.; Velupillai, S. Hierarchical neural model with attention mechanisms for the classification of social media text related to mental health. In Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic, New Orleans, LA, USA, 5 June 2018; Association for Computational Linguistics: Stroudsburg, PA, USA, 2018; pp. 69–77.
  30. Amanat, A.; Rizwan, M.; Javed, A.R.; Abdelhaq, M.; Alsaqour, R.; Pandya, S.; Uddin, M. Deep learning for depression detection from textual data. Electronics 2022, 11, 676.
  31. Almars, A.M. Attention-Based Bi-LSTM Model for Arabic Depression Classification. Comput. Mater. Contin. 2022, 71, 3092–3106.
  32. Liu, T.; Jain, D.; Rapole, S.R.; Curtis, B.; Eichstaedt, J.C.; Ungar, L.H.; Guntuku, S.C. Detecting Symptoms of Depression on Reddit. In Proceedings of the 15th ACM Web Science Conference 2023 (WebSci ’23), Austin, TX, USA, 30 April–1 May 2023; ACM: New York, NY, USA, 2023; pp. 174–183.
  33. Harrigian, K.; Aguirre, C.; Dredze, M. Do Models of Mental Health Based on Social Media Data Generalize? In Findings of the Association for Computational Linguistics: EMNLP 2020, Online, 16–20 November 2020; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 3774–3788.
  34. Ogunleye, B.; Dharmaraj, B. The use of a large language model for cyberbullying detection. Analytics 2023, 2, 694–707.
  35. Cheng, Q.; Li, T.M.; Kwok, C.L.; Zhu, T.; Yip, P.S. Assessing suicide risk and emotional distress in Chinese social media: A text mining and machine learning study. J. Med. Internet Res. 2017, 19, e243.
  36. Shrestha, A.; Tlachac, M.L.; Flores, R.; Rundensteiner, E.A. BERT Variants for Depression Screening with Typed and Transcribed Responses. In Proceedings of the 2022 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp/ISWC ’22 Adjunct), Cambridge, UK, 11–15 September 2022; ACM: New York, NY, USA, 2022; pp. 211–215.
  37. Naseem, U.; Dunn, A.G.; Kim, J.; Khushi, M. Early Identification of Depression Severity Levels on Reddit Using Ordinal Classification. In Proceedings of the ACM Web Conference 2022 (WWW ’22), Virtual Event, Lyon, France, 25–29 April 2022; ACM: New York, NY, USA, 2022; pp. 2563–2572.
  38. Monreale, A.; Iavarone, B.; Rossetto, E.; Beretta, A. Detecting Addiction, Anxiety, and Depression by Users Psychometric Profiles. In Companion Proceedings of the Web Conference 2022 (WWW ’22 Companion), Virtual Event, Lyon, France, 25–29 April 2022; ACM: New York, NY, USA, 2022; pp. 1189–1197.
  39. Sen, I.; Quercia, D.; Constantinides, M.; Montecchi, M.; Capra, L.; Scepanovic, S.; Bianchi, R. Depression at Work: Exploring Depression in Major US Companies from Online Reviews. Proc. ACM Hum.-Comput. Interact. 2022, 6, 438.
  40. Wu, J.; Wu, X.; Hua, Y.; Lin, S.; Zheng, Y.; Yang, J. Exploring Social Media for Early Detection of Depression in COVID-19 Patients. In Proceedings of the ACM Web Conference 2023 (WWW ’23), Austin, TX, USA, 30 April–4 May 2023; ACM: New York, NY, USA, 2023; pp. 3968–3977.
  41. Villatoro-Tello, E.; Ramírez-de-la-Rosa, G.; Gática-Pérez, D.; Magimai-Doss, M.; Jiménez-Salazar, H. Approximating the Mental Lexicon from Clinical Interviews as a Support Tool for Depression Detection. In Proceedings of the 2021 International Conference on Multimodal Interaction (ICMI ’21), Montréal, QC, Canada, 18–22 October 2021; ACM: New York, NY, USA, 2021; pp. 557–566.
  42. Liu, Y.; Kang, K.D.; Doe, M.J. Hadd: High-accuracy detection of depressed mood. Technologies 2022, 10, 123.
  43. Malik, A.; Shabaz, M.; Asenso, E. Machine learning based model for detecting depression during COVID-19 crisis. Sci. Afr. 2023, 20, e01716.
  44. Gallegos Salazar, L.M.; Loyola-Gonzalez, O.; Medina-Perez, M.A. An explainable approach based on emotion and sentiment features for detecting people with mental disorders on social networks. Appl. Sci. 2021, 11, 10932.
  45. Burdisso, S.G.; Errecalde, M.; Montes-y-Gómez, M. A text classification framework for simple and effective early depression detection over social media streams. Expert Syst. Appl. 2019, 133, 182–197.
  46. Trotzek, M.; Koitka, S.; Friedrich, C.M. Utilizing neural networks and linguistic metadata for early detection of depression indications in text sequences. IEEE Trans. Knowl. Data Eng. 2018, 32, 588–601.
  47. Adarsh, V.; Kumar, P.A.; Lavanya, V.; Gangadharan, G.R. Fair and explainable depression detection in social media. Inf. Process. Manag. 2023, 60, 103168.
  48. Guo, Z.; Ding, N.; Zhai, M.; Zhang, Z.; Li, Z. Leveraging domain knowledge to improve depression detection on Chinese social media. IEEE Trans. Comput. Soc. Syst. 2023, 10, 1528–1536.
  49. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
  50. Reimers, N.; Gurevych, I. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv 2019, arXiv:1908.10084.
  51. Divina, F.; Gilson, A.; Goméz-Vela, F.; García Torres, M.; Torres, J.F. Stacking ensemble learning for short-term electricity consumption forecasting. Energies 2018, 11, 949.
  52. Kwon, H.; Park, J.; Lee, Y. Stacking ensemble technique for classifying breast cancer. Healthc. Inform. Res. 2019, 25, 283–288.
  53. Rajagopal, S.; Kundapur, P.P.; Hareesha, K.S. A stacking ensemble for network intrusion detection using heterogeneous datasets. Secur. Commun. Netw. 2020, 2020, 4586875.
  54. Charoenkwan, P.; Chiangjong, W.; Nantasenamat, C.; Hasan, M.M.; Manavalan, B.; Shoombuatong, W. StackIL6: A stacking ensemble model for improving the prediction of IL-6 inducing peptides. Brief. Bioinform. 2021, 22, bbab172.
  55. Akyol, K. Stacking ensemble based deep neural networks modeling for effective epileptic seizure detection. Expert Syst. Appl. 2020, 148, 113239.
  56. Ribeiro, M.H.D.M.; dos Santos Coelho, L. Ensemble approach based on bagging, boosting and stacking for short-term prediction in agribusiness time series. Appl. Soft Comput. 2020, 86, 105837.
  57. Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot. 2013, 7, 21.
  58. Tsai, J.K.; Hung, C.H. Improving AdaBoost classifier to predict enterprise performance after COVID-19. Mathematics 2021, 9, 2215.
  59. Saini, D.; Chand, T.; Chouhan, D.K.; Prakash, M. A comparative analysis of automatic classification and grading methods for knee osteoarthritis focussing on X-ray images. Biocybern. Biomed. Eng. 2021, 41, 419–444.
  60. Grosse, R. Lecture 5: Multilayer Perceptrons. 2018. Available online: https://www.cs.toronto.edu/~mren/teach/csc411_19s/lec/lec10_notes1.pdf (accessed on 13 November 2023).
  61. Nielsen, F.Å. A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. arXiv 2011, arXiv:1103.2903.
  62. Sampath, K.; Durairaj, T. Data Set Creation and Empirical Analysis for Detecting Signs of Depression from Social Media Postings. In Computational Intelligence in Data Science: Proceedings of ICCIDS 2022, Galway, Ireland, 22–24 November 2022; IFIP Advances in Information and Communication Technology; Kalinathan, L.R.P., Kanmani, M.S.M., Eds.; Springer: Cham, Switzerland, 2022; Volume 654.
  63. Muñoz, S.; Iglesias, C.Á. Detection of the Severity Level of Depression Signs in Text Combining a Feature-Based Framework with Distributional Representations. Appl. Sci. 2023, 13, 11695.
  64. Shi, Y.; Tian, Y.; Tong, C.; Zhu, C.; Li, Q.; Zhang, M.; Zhao, W.; Liao, Y.; Zhou, P. Detect Depression from Social Networks with Sentiment Knowledge Sharing. In Proceedings of the Chinese National Conference on Social Media Processing, Anhui, China, 23–26 November 2023; Springer: Singapore, 2023; pp. 133–146.
  65. Tavchioski, I.; Robnik-Šikonja, M.; Pollak, S. Detection of depression on social networks using transformers and ensembles. arXiv 2023, arXiv:2305.05325.
  66. Poświata, R.; Perełkiewicz, M. Detecting Signs of Depression from Social Media Text using RoBERTa Pre-trained Language Models. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, Dublin, Ireland, 27 May 2022; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 276–282.
  67. Turcan, E.; McKeown, K. Dreaddit: A Reddit dataset for stress analysis in social media. arXiv 2019, arXiv:1911.00133.
  68. Ilias, L.; Mouzakitis, S.; Askounis, D. Calibration of transformer-based models for identifying stress and depression in social media. IEEE Trans. Comput. Soc. Syst. 2023, 11, 1979–1990.
  69. Shobayo, O.; Sasikumar, S.; Makkar, S.; Okoyeigbo, O. Customer Sentiments in Product Reviews: A Comparative Study with GooglePaLM. Analytics 2024, 3, 241–254.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
