Article

Application of the Random Forest Algorithm for Accurate Bipolar Disorder Classification

1 Virgen de la Luz Hospital, 16002 Cuenca, Spain
2 Medical Analysis Expert Group, Institute of Technology, University of Castilla-La Mancha, 13001 Cuenca, Spain
3 Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
4 Department of Pharmacy, General University Hospital, 46014 Valencia, Spain
* Author to whom correspondence should be addressed.
Life 2025, 15(3), 394; https://doi.org/10.3390/life15030394
Submission received: 30 December 2024 / Revised: 16 February 2025 / Accepted: 28 February 2025 / Published: 3 March 2025
(This article belongs to the Special Issue What Is New in Psychiatry and Psychopharmacology—2nd Edition)

Abstract

Bipolar disorder (BD) is a complex psychiatric condition characterized by alternating episodes of mania and depression, posing significant challenges for accurate and timely diagnosis. This study explores the use of the Random Forest (RF) algorithm as a machine learning approach to classify patients with BD and healthy controls based on electroencephalogram (EEG) data. A total of 330 participants, including euthymic BD patients and healthy controls, were analyzed. EEG recordings were processed to extract key features, including power in frequency bands and complexity metrics such as the Hurst Exponent, which measures the persistence or randomness of a time series, and Higuchi’s Fractal Dimension, which quantifies the irregularity of brain signals. The RF model demonstrated robust performance, achieving an average accuracy of 93.41%, with recall and specificity exceeding 93%. These results highlight the algorithm’s capacity to handle complex, noisy datasets while identifying key features relevant for classification. Importantly, the model provided interpretable insights into the physiological markers associated with BD, reinforcing the clinical value of EEG as a diagnostic tool. The findings suggest that RF is a reliable and accessible method for supporting the diagnosis of BD, complementing traditional clinical practices. Its ability to reduce diagnostic delays, improve classification accuracy, and optimize resource allocation makes it a promising tool for integrating artificial intelligence into psychiatric care. This study represents a significant step toward precision psychiatry, leveraging technology to improve the understanding and management of complex mental health disorders.

1. Introduction

Bipolar disorder (BD) is a complex psychiatric illness that affects millions of people worldwide and is associated with drastic changes in mood, energy, and behavior. These changes can range from episodes of extreme euphoria, known as mania, to prolonged periods of deep depression, significantly impacting patients’ quality of life as well as their social and family environments. Recent studies estimate that between 1% and 3% of the global population suffers from bipolar disorder [1,2], making it a highly relevant public health issue. Despite advances in research and diagnostic methods, bipolar disorder remains a challenge for healthcare professionals due to the diversity in its clinical presentation and the overlap of its symptoms with other mood disorders, such as major depressive disorder [3,4].
The diagnosis of bipolar disorder is a complex process influenced by multiple clinical and neurobiological factors. In addition to the core symptoms of the disorder, temperamental traits have been shown to play a key role in identifying and differentiating BD from other psychiatric disorders. Recent studies, such as that of Favaretto et al. [5], have explored the relationship between affective temperaments and mood disorders, highlighting their potential to improve diagnostic accuracy [1]. In particular, certain temperamental profiles, such as cyclothymia and hyperthymia, have been linked to greater susceptibility to BD, suggesting that integrating these dimensions with machine learning-based tools could enhance current diagnostic approaches.
The diagnosis of bipolar disorder remains a significant challenge in clinical practice due to multiple factors. First, it relies heavily on the subjective evaluation of symptoms through clinical interviews, which can lead to variations in interpretations among different professionals. Second, there is symptomatic overlap with other psychiatric disorders, such as major depressive disorder and borderline personality disorder, complicating precise differentiation. Additionally, studies have reported that the average time to reach a definitive BD diagnosis can exceed 10 years, during which patients may receive inappropriate treatments that exacerbate the condition [1,2,3]. These limitations highlight the need for objective, data-driven tools that complement traditional clinical evaluation.
The early and accurate diagnosis of bipolar disorder is crucial for improving therapeutic outcomes and preventing associated complications such as suicide, medical comorbidities, and functional impairment [1,6]. However, the current diagnostic process still heavily relies on clinical observation and the subjective interpretation of symptoms by specialists, which carries a considerable risk of errors and delays in diagnosis [7]. These limitations underscore the need to develop more objective, data-driven tools that can complement clinical evaluation and provide a more precise and efficient approach to diagnosing bipolar disorder. Studies have analyzed the similarities and differences in the classification of bipolar disorders according to the DSM-5 and the beta version of the ICD-11, highlighting the incorporation of dimensional parameters for symptom assessment and the inclusion of new course specifiers, such as mixed features [8]. A review emphasized the latest advances in the diagnosis and treatment of bipolar disorder, stressing the importance of precision psychiatry and the need for thorough evaluation of patients with depressive symptoms to identify manic or hypomanic episodes [9]. Another study provided a comprehensive review of bipolar affective disorder, addressing its prevalence, clinical presentation, and treatment options, while emphasizing the importance of tailoring treatment to each patient [10]. Research has also explored the classification, epidemiology, and etiopathogenesis of bipolar disorder, offering an integrated perspective on the disease and its clinical implications [11].
In this context, machine learning (ML) approaches have emerged as promising alternatives to improve the accuracy and speed of BD detection [12,13,14]. ML enables the automated analysis of large volumes of biomedical data, identifying subtle patterns that may be difficult to detect using traditional methods. In particular, the application of ML in electroencephalographic (EEG) signal processing has demonstrated its ability to reveal neurophysiological alterations associated with BD, providing an objective biomarker-based approach for patient classification [7,8,9]. This study explores the use of the Random Forest (RF) algorithm for the automated classification of BD patients and healthy controls, aiming to contribute to the development of complementary tools that optimize the diagnostic process in precision psychiatry.
In recent decades, data-driven medicine has experienced significant growth, driven by the increasing availability of large clinical databases and the development of artificial intelligence tools such as machine learning. Machine learning algorithms have emerged as promising tools for improving diagnostic accuracy by analyzing large volumes of clinical, neurophysiological, and behavioral data [15,16,17,18,19,20]. One study proposed a deep learning method utilizing actigraphy and electrodermal activity data obtained from wrist-worn devices to differentiate between manic and euthymic states in bipolar patients [21]. Another study employed a machine learning model based on mathematical signatures to differentiate between bipolar disorder and borderline personality disorder [22,23]. Pettorruso et al. developed a machine learning-based model to predict the response to intranasal esketamine in patients with treatment-resistant depression, achieving an accurate classification of patients who responded favorably to the treatment [24]. Similarly, Distefano et al. applied AI models to the analysis of functional magnetic resonance imaging (fMRI) data to improve the detection of schizophrenia, highlighting the potential of these tools in precision psychiatry [25].
Additionally, a system using one-dimensional convolutional neural networks has been applied to analyze intrinsic connectivity patterns in resting-state functional magnetic resonance imaging (fMRI) data [26]. Various machine learning algorithms such as Random Forest (RF) [27,28], Support Vector Machines (SVM) [29,30,31,32], k-Nearest Neighbors (KNN) [33,34,35], decision trees (DT) [36,37,38], and Gaussian Naïve Bayes (GNB) [39,40] are being used to process medical data [41,42,43], achieving high accuracy rates. These techniques have demonstrated not only their ability to identify complex patterns associated with bipolar disorder but also to provide more objective and consistent interpretations compared to traditional methods [44]. Within this context, the RF model has gained popularity due to its robustness, flexibility, and ability to handle noisy and nonlinear data. This method, based on an ensemble of decision trees, has been widely applied in various medical domains, showing outstanding performance in disease classification, clinical outcome prediction, and the selection of relevant features in biomedical studies [28,44,45].
The objective of the present study is to develop a system capable of predicting patients with bipolar disorder using the RF algorithm based on a patient database through the analysis of their electroencephalogram (EEG) data. This approach aims to address some of the main challenges associated with traditional diagnosis, such as subjectivity and delays in identifying the disorder, and to evaluate the efficacy of a predictive model based on EEG recordings. Additionally, the study seeks to identify the most influential features in the classification of patients, providing a clearer perspective on the factors associated with the development of bipolar disorder. The implementation of this model could have a significant impact on clinical practice by offering a complementary tool for the early and accurate diagnosis of bipolar disorder.

2. Materials and Methods

2.1. Materials

In this study, real EEG recordings were used to develop a predictive system for identifying patients with bipolar disorder. We analyzed the individual results of 330 participants, of whom 120 were patients with bipolar disorder (BD) and 210 were healthy controls. Diagnoses were made according to DSM-IV criteria, and patients received treatment in the Severe Mental Disorders Program (SMD-Cu) at the Department of Psychiatry of the Hospital. Each diagnosis was confirmed through the Structured Clinical Interview for DSM-IV (SCID-I). At the time of evaluation, all patients met the euthymia criteria applied in our previous studies (scores < 7 on the Hamilton Depression Rating Scale [HDRS] and < 6 on the Young Mania Rating Scale [YMRS] for at least three months prior to the study).
EEG data were recorded using a 32-channel BrainVision Recorder system (Brain Products GmbH, Gilching, Germany) with a sampling rate of 500 Hz. Electrodes were placed according to the International 10–20 System, and impedance was maintained below 10 kΩ to minimize signal distortion. To reduce the influence of spurious signals, several techniques were applied. A band-pass filter (0.5–45 Hz) based on a fourth-order Butterworth filter was used to eliminate low-frequency noise, such as movement artifacts and baseline fluctuations, as well as high-frequency components caused by electronic interference and muscular noise. Additionally, Independent Component Analysis (ICA) was used to correct muscular and ocular artifacts, allowing the identification and removal of components related to eye movements, such as blinks and saccades, as well as muscle activity. Components associated with these artifacts were visually inspected and removed before further analysis. The channels were also inspected manually to detect extreme values or persistent noise. If a defective channel was found, its signal was replaced using weighted interpolation from neighboring electrodes.
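The band-pass step described above can be sketched as follows. This is a minimal illustration using SciPy, not the authors' pipeline: the function name `bandpass_eeg`, the zero-phase (forward-backward) application, and the synthetic test signal are assumptions. The text does not say whether "fourth-order" refers to the design order or to the effective order after filtering, so the `order` argument here is simply the design order passed to `butter`.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 500.0  # sampling rate in Hz, as stated in the text


def bandpass_eeg(data, fs=FS, low=0.5, high=45.0, order=4):
    """Zero-phase Butterworth band-pass applied along the last axis (samples)."""
    # Second-order sections are numerically safer than (b, a) coefficients
    # when the lower cutoff is very small relative to the sampling rate.
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, data, axis=-1)


# Synthetic check signal (assumption): 10 Hz in-band plus 100 Hz out-of-band.
t = np.arange(0, 10, 1 / FS)
raw = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 100 * t)
filtered = bandpass_eeg(raw)
```

Forward-backward filtering avoids phase distortion but doubles the effective attenuation, which is one reason the reported order of a filter should always be read together with how it was applied.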
To ensure data uniformity before analysis, segmentation and normalization processes were applied. Signals were divided into 2-s segments with a 50% overlap to maximize data availability while preserving information. Subsequently, each EEG segment was normalized using the Z-score technique, adjusting the mean and standard deviation for each channel. This step ensured that the extracted features had a homogeneous scale and reduced the influence of individual variations in the signal.
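A minimal sketch of the segmentation and normalization step, assuming the recording is held as a NumPy array of shape (channels, samples). The function name `segment_and_zscore` and the array layout are illustrative assumptions; the window length, overlap, and per-channel Z-score follow the description above.

```python
import numpy as np


def segment_and_zscore(data, fs=500, win_sec=2.0, overlap=0.5):
    """Split (n_channels, n_samples) EEG into overlapping, z-scored windows.

    Returns an array of shape (n_segments, n_channels, win_len).
    """
    win = int(round(win_sec * fs))            # 1000 samples for 2 s at 500 Hz
    step = int(round(win * (1.0 - overlap)))  # 500 samples for 50% overlap
    n_samples = data.shape[-1]
    starts = range(0, n_samples - win + 1, step)
    segs = np.stack([data[:, s:s + win] for s in starts])
    # Z-score each channel within each segment.
    mean = segs.mean(axis=-1, keepdims=True)
    std = segs.std(axis=-1, keepdims=True)
    std[std == 0] = 1.0  # guard against flat channels
    return (segs - mean) / std
```

With these settings, a 10-second recording (5000 samples) yields 9 segments per channel.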
From the preprocessed EEG signals, different feature sets were calculated. In the temporal domain, descriptive statistics such as mean, variance, kurtosis, and skewness were extracted to capture the distribution of brain electrical activity. In the frequency domain, the Fast Fourier Transform (FFT) was applied to decompose the signal into frequency bands, including delta, theta, alpha, beta, and gamma, allowing for an evaluation of the relative power in each band. In the nonlinear domain, complexity metrics such as Higuchi’s Fractal Dimension, the Hurst Exponent, and the Lyapunov Exponent were computed, providing information on the irregularity and chaotic dynamics of brain signals.
The study was approved by the Clinical Research Ethics Committee of the Cuenca Health Area, and all participants provided informed consent after receiving a detailed explanation of the procedures involved. The study was conducted in accordance with the ethical principles outlined in the Declaration of Helsinki for medical research involving human subjects, ensuring the rights, welfare, and dignity of the participants were respected.

2.2. Model Development

In this study, the RF algorithm was employed to address the classification task of distinguishing between patients with bipolar disorder and healthy controls. RF is an ensemble model based on decision trees, designed to handle complex and nonlinear data with high robustness against noisy and imbalanced datasets [46]. This algorithm has proven particularly effective in biomedical problems due to its ability to identify relevant features and its flexibility in multivariate scenarios [33]. The RF model was trained using features extracted from EEG signals. A total of 100 decision trees were configured in the model, and the number of features considered at each split was set to the square root of the total number of features, following standard recommendations to optimize the balance between bias and variance [28,44,45].
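The configuration described above (100 trees, square root of the feature count at each split) maps directly onto scikit-learn's `RandomForestClassifier`. The sketch below is an illustration on synthetic data, not the study's code: the feature count of 12, the random data, the artificial group shift, and the fixed `random_state` are all assumptions; only the group sizes (120 BD, 210 controls) and the two hyperparameters come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_bd, n_hc, n_features = 120, 210, 12  # 12 features is a placeholder
y = np.r_[np.ones(n_bd, dtype=int), np.zeros(n_hc, dtype=int)]
X = rng.normal(size=(y.size, n_features))
X[y == 1, :3] += 1.5  # synthetic group difference so the toy task is learnable

clf = RandomForestClassifier(
    n_estimators=100,     # number of trees, as stated in the text
    max_features="sqrt",  # sqrt(n_features) candidates per split
    random_state=0,       # assumption, for reproducibility only
)
clf.fit(X, y)

# Impurity-based importances, one value per feature, summing to 1.
importances = clf.feature_importances_
```

Impurity-based importances are convenient but can favor features with many distinct values; permutation importance on held-out data is a common cross-check.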
Feature extraction represents a fundamental step in EEG signal analysis, as it enables the derivation of relevant attributes that encapsulate critical information about the underlying neurophysiological patterns in bipolar disorder. In this study, advanced techniques were implemented to extract features from EEG signals in the temporal, frequency, and nonlinear domains, with a focus on maximizing their clinical relevance and their ability to enhance the performance of predictive models.
In the temporal domain, statistical metrics describing the basic properties of EEG signals were calculated. These included the mean, variance, kurtosis, and skewness, which characterize the amplitude distribution of the signals. For instance, kurtosis is particularly useful for identifying extreme events or peaks in EEG signals, which may be associated with pathological brain states. These metrics provide an initial framework for understanding general variations in brain electrical activity between patients with bipolar disorder and healthy controls [28,43,44,45].
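The four temporal statistics can be computed for a single-channel segment as sketched below. The function name and the use of population moments with the Fisher (excess) kurtosis convention are assumptions, since the text does not specify which estimator was used.

```python
import numpy as np


def temporal_features(x):
    """Mean, variance, skewness and excess kurtosis of a 1-D segment."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    var = x.var()  # population variance (ddof=0)
    if var == 0:
        return {"mean": mu, "variance": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    z = (x - mu) / np.sqrt(var)
    return {
        "mean": mu,
        "variance": var,
        "skewness": float(np.mean(z ** 3)),
        "kurtosis": float(np.mean(z ** 4) - 3.0),  # 0 for a Gaussian
    }
```

Note that mean and variance are only informative if they are computed before any per-segment Z-scoring; after Z-scoring they are 0 and 1 by construction, whereas skewness and kurtosis are unaffected.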
In the frequency domain, the Fast Fourier Transform (FFT) was used to decompose EEG signals and analyze spectral power in the delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), and gamma (>30 Hz) bands. Each of these bands is associated with specific brain functions, such as alertness (beta), relaxation (alpha), and emotional processes (theta). Previous studies have shown that alterations in these bands may serve as potential biomarkers in psychiatric disorders, including bipolar disorder [47,48]. For example, a decrease in alpha band power and an increase in beta activity have been reported during manic episodes.
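As an illustration of the band decomposition, the sketch below computes relative band power from a plain FFT periodogram. The band edges follow the text, except for the 45 Hz upper limit on gamma, which is an assumption taken from the band-pass cutoff. The function name and the choice of an unwindowed periodogram (rather than, say, Welch averaging) are also assumptions.

```python
import numpy as np

# Band edges in Hz; the 45 Hz gamma limit is an assumption (filter cutoff).
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}


def relative_band_power(x, fs=500.0, bands=BANDS):
    """Fraction of 0.5-45 Hz spectral power falling in each band."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # remove DC offset
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2  # unnormalized periodogram
    absolute = {
        name: power[(freqs >= lo) & (freqs < hi)].sum()
        for name, (lo, hi) in bands.items()
    }
    total = sum(absolute.values())
    if total == 0:
        return {name: 0.0 for name in bands}
    return {name: float(p / total) for name, p in absolute.items()}
```

For a 2-second segment at 500 Hz the frequency resolution is 0.5 Hz, so a pure 10 Hz sinusoid lands entirely in the alpha band.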
In the nonlinear domain, advanced metrics were applied to capture the complex and chaotic dynamics of EEG signals, which are often not evident through linear analysis. Higuchi’s Fractal Dimension, the Lyapunov Exponent, and the Hurst Exponent were calculated. Higuchi’s Fractal Dimension measures the irregularity and geometric complexity of the signals, while the Lyapunov Exponent quantifies sensitivity to initial conditions, providing a measure of chaos in the system [47,48]. The Hurst Exponent, on the other hand, evaluates the signal’s tendency to maintain persistent or antipersistent behavior over time, making it useful for detecting dynamic patterns in pathological conditions [47].
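To make one of these metrics concrete, the following is a minimal sketch of Higuchi's Fractal Dimension for a 1-D signal. The function name and the default `kmax=10` are assumptions (the text does not report the value used); the Lyapunov and Hurst exponents are not shown. The estimate is the slope of log curve length against log(1/k).

```python
import numpy as np


def higuchi_fd(x, kmax=10):
    """Higuchi's Fractal Dimension of a 1-D signal (kmax is an assumption)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    ks = np.arange(1, kmax + 1)
    lengths = []
    for k in ks:
        lm = []
        for m in range(k):
            idx = np.arange(m, n, k)  # subsampled series starting at offset m
            n_int = idx.size - 1      # number of increments in that series
            if n_int < 1:
                continue
            dist = np.abs(np.diff(x[idx])).sum()
            # Higuchi's normalization of the curve length at scale k.
            lm.append(dist * (n - 1) / (n_int * k) / k)
        lengths.append(np.mean(lm))
    # Constant signals give zero length and are not handled here.
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)
    return float(slope)
```

As a sanity check, a straight line yields a dimension of 1, while white noise yields a value close to 2; EEG segments typically fall between these extremes.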
To optimize the feature set and prevent overfitting in machine learning models, a feature selection procedure based on the relative importance of attributes was implemented. Mutual information-based selection was used to identify the most discriminative features, reducing the dataset’s dimensionality and ensuring that only the most relevant variables were included. This approach not only enhances the predictive power of the models but also facilitates better interpretation of the results by highlighting the most significant biomarkers associated with bipolar disorder [44]. The combination of these multidimensional approaches ensures that the extracted features not only capture essential information about brain activity but are also robust to noise and individual variations. These features represent an optimal input set for machine learning algorithms such as RF [28,44,45,46], SVM [29,30,31,32], KNN [33,34,35], DT [36,37,38], and GNB [39,40], which require clean and representative data to achieve accurate classification.
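Mutual information-based selection of this kind could be sketched with scikit-learn as below. The number of retained features (`k=5`), the synthetic data, and the fixed `random_state` are assumptions; the text does not state how many features were kept or which estimator settings were used.

```python
from functools import partial

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
n_bd, n_hc, n_features = 120, 210, 20  # 20 candidate features is a placeholder
y = np.r_[np.ones(n_bd, dtype=int), np.zeros(n_hc, dtype=int)]
X = rng.normal(size=(y.size, n_features))
X[y == 1, :2] += 2.0  # two synthetic informative features

# mutual_info_classif uses a randomized k-NN estimator; fix its seed.
score_func = partial(mutual_info_classif, random_state=0)
selector = SelectKBest(score_func=score_func, k=5)  # k=5 is an assumption
X_selected = selector.fit_transform(X, y)
kept = np.flatnonzero(selector.get_support())  # indices of retained features
```

When selection is combined with cross-validation, fitting the selector inside each training fold (for example through a scikit-learn `Pipeline`) keeps information from the test fold out of the selection step.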
To evaluate the model’s performance, a 5-fold cross-validation scheme was used. This approach ensured that each data subset was used for both training and testing, minimizing the risk of overfitting and providing a more reliable assessment of the model [43]. The dataset was divided into two subsets, with 70% allocated for training and 30% for testing, ensuring the independence of patient groups between the sets. Key performance metrics, including Accuracy, Recall, Specificity, F1 Score, and the area under the ROC curve (AUC), were calculated to evaluate the model’s effectiveness in classifying patients.
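One way to keep patient groups independent between training and test data, when each participant contributes several EEG segments, is subject-wise (grouped) cross-validation. The sketch below uses scikit-learn's `GroupKFold` on synthetic data; the toy sizes, the variable names, and the use of `GroupKFold` itself are assumptions, since the text does not describe how independence was enforced.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
n_subjects, segs_per_subject, n_features = 30, 10, 8  # toy sizes (assumption)
subject_label = np.arange(n_subjects) % 2             # alternate BD / control
groups = np.repeat(np.arange(n_subjects), segs_per_subject)  # subject id per segment
y = np.repeat(subject_label, segs_per_subject)
X = rng.normal(size=(y.size, n_features))
X[y == 1, 0] += 1.5  # synthetic group difference

cv = GroupKFold(n_splits=5)
clf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
scores = cross_val_score(clf, X, y, groups=groups, cv=cv, scoring="accuracy")

# Every fold keeps all segments of a subject on one side of the split.
for train_idx, test_idx in cv.split(X, y, groups):
    assert not set(groups[train_idx]) & set(groups[test_idx])
```

Splitting at the segment level instead would let overlapping windows from the same person appear in both sets, which tends to inflate accuracy estimates.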

3. Results

In this study, the RF algorithm was used to address the classification task of distinguishing between patients with bipolar disorder and healthy control subjects, based on processed EEG data and extracted features. This method, based on an ensemble of decision trees, stands out for its ability to handle complex and nonlinear data, as well as its robustness against data noise. To evaluate the model’s performance, a 5-fold cross-validation scheme was implemented, ensuring independence between training and testing data and minimizing the risk of overfitting.
The proposed RF system was compared with four other classification algorithms (SVM, DT, GNB, and KNN) using a set of evaluation metrics: Accuracy, Matthews Correlation Coefficient (MCC), F1 Score, Precision, DYI, Recall, Specificity, Kappa, and AUC. The results obtained are presented in Table 1 and Table 2.
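For reference, most of the listed metrics follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are illustrative assumptions, and DYI is omitted because its formula is not given in the text.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    recall = tp / (tp + fn)       # sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: agreement beyond what the marginals predict by chance.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy, "recall": recall, "specificity": specificity,
        "precision": precision, "f1": f1, "mcc": mcc, "kappa": kappa,
    }


# Hypothetical counts, not taken from the study.
example = binary_metrics(tp=90, fn=10, tn=80, fp=20)
```

MCC and kappa both range from -1 to 1, so they are often reported as fractions rather than percentages; stating the scale explicitly avoids ambiguity when they appear next to accuracy.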
The results show that the RF algorithm outperformed the other evaluated methods, achieving an Accuracy of 93.41%, an AUC of 0.93, and the highest values for Recall (93.51%), Specificity (93.30%), F1 Score (93.13%), and MCC (82.99%). In comparison, the second-best classifier, KNN, achieved an Accuracy of 85.44% and an AUC of 0.85, while the SVM and DT algorithms showed moderate performance, with Accuracies of 83.65% and 82.35%, respectively. GNB had the lowest performance, with an Accuracy of 74.72% and an AUC of 0.75, highlighting its limitations in handling the complexity of the data. The combination of Recall and Specificity offered by RF surpasses that of the other methods, underscoring its robustness in complex classification scenarios. Furthermore, the high Kappa (83.27%) and Precision (93.75%) indicate its ability to minimize both false negatives and false positives. The proposed system also achieves an AUC of 0.93, exceeding that of KNN (0.85) by 0.08, while the other algorithms reached lower values. These results position the proposed RF system as a reliable, robust, and superior model for classifying patients with bipolar disorder.
The classification analysis of patients with bipolar disorder based on EEG signals revealed that features extracted from the temporal, frequency, and nonlinear domains significantly contribute to the model. As shown in Figure 1, in the temporal domain, kurtosis emerged as the most important metric (12%), followed by mean, variance, and skewness, which characterized the basic statistical properties of the signals. In the frequency domain, the power in the alpha (11%) and beta (9%) bands showed high relevance associated with relaxation and cognitive processing, respectively, while the delta, theta, and gamma bands provided complementary information. Nonlinear metrics, such as Higuchi’s Fractal Dimension (10%) and the Hurst (12%) and Lyapunov (10%) exponents, were crucial for capturing the complexity and chaotic dynamics of EEG signals, offering unique insights into pathological alterations. These findings highlight the need for a multidimensional approach that combines features from different domains to capture the complexity of bipolar disorder.
As shown in Figure 2, the training subsets of the model and the test subset exhibit high scores across all metrics, although they are slightly lower in the test subset. This consistency is due to the algorithm achieving an optimal level of training without incurring overfitting or underfitting. As observed in Figure 1, the RF model covers a larger area compared to the other evaluated methods, demonstrating a well-balanced model with high generalization capacity and the ability to provide accurate results with new data.
Additionally, the ROC curve was generated to represent sensitivity and specificity measures for each threshold value, aiming to evaluate the classification capabilities of the different machine learning algorithms. The results, presented in Figure 3, once again show that the proposed RF-based system covers a larger area, indicating superior predictive accuracy.
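The construction of a ROC curve and its area can be sketched in a few lines of NumPy. This is a generic illustration under assumed function names, not the code used to produce Figure 3: scores are sorted in decreasing order, each distinct score is treated as a threshold, and the area is obtained with the trapezoidal rule.

```python
import numpy as np


def roc_curve_points(y_true, scores):
    """False and true positive rates at every distinct score threshold."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="mergesort")  # descending, stable
    y_sorted = y_true[order]
    s_sorted = scores[order]
    # Keep the last index of each run of tied scores.
    distinct = np.r_[np.flatnonzero(np.diff(s_sorted)), y_sorted.size - 1]
    tps = np.cumsum(y_sorted == 1)[distinct]
    fps = np.cumsum(y_sorted == 0)[distinct]
    tpr = np.r_[0.0, tps / tps[-1]]  # sensitivity
    fpr = np.r_[0.0, fps / fps[-1]]  # 1 - specificity
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.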
Compared to other evaluated classification methods, RF demonstrated competitive performance. Its ease of interpretation and scalability make RF an attractive tool for clinical applications, where transparency in the decision-making process is crucial.
The results obtained support the use of RF as an effective approach for the automated classification of patients with bipolar disorder, showing robust performance in terms of accuracy, sensitivity, and specificity. This model not only enables faster and more objective diagnoses but also provides valuable insights into the most relevant features for identifying this condition, which could inform future developments in the field of computational psychiatry. These findings highlight the potential of machine learning-based methods to complement and enhance current diagnostic practices for complex disorders like bipolar disorder.

4. Discussion

The early and accurate detection of bipolar disorder is crucial due to the significant impact this condition has on patients, their families, and society at large. This disorder, characterized by extreme mood swings oscillating between manic and depressive episodes, affects between 1% and 3% of the global population and is associated with high rates of comorbidity, disability, and suicide risk [1]. Without an accurate diagnosis, many patients receive incorrect treatments, prolonging their suffering and increasing the economic and social costs associated with the disease [49]. For these reasons, the timely identification of bipolar disorder should be a priority for healthcare systems.
One of the main challenges in detecting bipolar disorder is the reliance on clinical interviews and the subjective interpretation of symptoms, which leads to variability in diagnoses. The overlap with other mood disorders and borderline personality disorder increases the difficulty in accurately identifying BD, often resulting in misdiagnoses and delays in the implementation of appropriate treatments. These factors have driven the search for complementary methods that integrate neurophysiological biomarkers and automated approaches to improve diagnostic accuracy. In this study, we demonstrated that the Random Forest (RF)-based model applied to EEG provides a robust tool for BD classification with an accuracy of 93.41%, outperforming other algorithms such as SVM and KNN in key metrics. However, although this EEG-based approach enhances diagnostic objectivity, its clinical implementation requires further validation, and its combination with clinical data could further enhance its utility in hospital settings.
For EEG data classification, multiple machine learning algorithms were evaluated, including Random Forest (RF), Support Vector Machines (SVM), k-Nearest Neighbors (KNN), decision trees (DT), and Gaussian Naïve Bayes (GNB). After a comparative analysis, RF was selected as the final model due to its superior performance in terms of accuracy, robustness, and interpretability. EEG signals are highly variable and prone to noise, which poses challenges for classification. RF is particularly well-suited for handling noisy and high-dimensional data, as it leverages an ensemble of decision trees, reducing overfitting and ensuring stable generalization to new datasets. Additionally, RF can capture nonlinear relationships in EEG signals without requiring complex transformations, unlike SVM with a linear kernel or GNB, which rely on specific assumptions about data distribution.
Recent studies have explored deep learning (DL) techniques, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), for EEG-based BD classification. While these models have demonstrated strong performance in various classification tasks, they require large amounts of labeled data, which can be difficult to obtain in clinical settings due to data collection challenges and ethical constraints. RF, on the other hand, is more data-efficient, performing well even with limited sample sizes, making it a more practical option for real-world medical applications. Furthermore, RF offers greater interpretability, allowing for the identification of key EEG biomarkers, such as power in the alpha and beta bands and complexity metrics like Higuchi’s Fractal Dimension and the Hurst Exponent, whereas deep learning models often act as black-box classifiers, limiting insight into their decision-making process.
Additionally, RF exhibits superior robustness to noise and computational efficiency compared to deep learning models. EEG signals are inherently noisy, and CNNs, in particular, are highly sensitive to artifacts, often requiring extensive preprocessing. Without proper augmentation and hyperparameter tuning, CNNs and RNNs may struggle to generalize well. In contrast, RF’s ensemble-based approach mitigates overfitting and ensures stable performance across varying data conditions. Moreover, RF does not require specialized hardware such as GPUs, unlike deep learning models which demand high computational resources, making RF a more accessible and scalable solution for EEG-based BD classification. While deep learning models have achieved promising results in EEG classification, our comparative analysis confirmed that RF outperformed alternative machine learning models and remained competitive with deep learning approaches, achieving a 93.41% accuracy while maintaining interpretability, computational efficiency, and robustness to noise. Given these advantages, RF was determined to be the most suitable model for this study.
Although this study focused on EEG data analysis, recent research has highlighted the relevance of temperamental factors in BD assessment [5]. The integration of these traits with machine learning approaches could yield more precise hybrid models, combining neurophysiological biomarkers with behavioral and temperamental characteristics. Future research directions should explore how the combination of these dimensions can improve the differentiation of BD from other psychiatric disorders.
The heterogeneous nature of bipolar disorder and its symptomatic overlap with other mood disorders, such as unipolar depression, complicates its clinical diagnosis. Indeed, previous studies have shown that the average time to reach an accurate diagnosis can exceed 10 years, during which patients remain at risk of developing severe complications, such as psychotic episodes, substance abuse, and functional impairment [50]. The introduction of artificial intelligence tools, such as the RF algorithm evaluated in this study, can play a transformative role by providing complementary methods to reduce diagnostic delays and improve detection accuracy.
The analysis of EEG signals using machine learning techniques has significantly advanced the classification of patients with bipolar disorder. In the temporal domain, metrics such as mean, variance, kurtosis, and skewness have proven useful for characterizing the statistical properties of EEG signals, providing essential information about amplitude distribution and the presence of extreme events. These metrics have been fundamental in recent studies exploring the differentiation between normal and pathological patterns in brain activity [51,52].
In the frequency domain, the FFT enables the decomposition of EEG signals to analyze spectral power in specific bands, such as delta, theta, alpha, beta, and gamma. Each band is associated with specific brain functions, and alterations in these frequencies have been identified as potential biomarkers in psychiatric disorders, including bipolar disorder. For example, studies have reported decreased alpha power and increased beta activity during manic episodes, highlighting the importance of these features for diagnosis [51,52,53,54].
The nonlinear domain provides a unique perspective by capturing the complexity and chaotic dynamics of EEG signals through metrics such as Higuchi’s Fractal Dimension, the Lyapunov Exponent, and the Hurst Exponent. These metrics allow the analysis of properties such as self-similarity and the persistent or antipersistent behavior of signals, which are particularly relevant for pathological conditions like bipolar disorder [51,52,53,54]. Combining these metrics with features from other domains has enhanced classification performance in various studies.
Machine learning algorithms have been instrumental in leveraging these features to classify patients with bipolar disorder. Methods such as SVM, neural networks, and ensemble approaches have proven effective, achieving high accuracy in differentiating patients from healthy controls [55,56,57,58]. Integrating EEG data with other modalities, such as cognitive assessments and neuroimaging, has shown great potential for enhancing diagnostic accuracy and providing a more comprehensive understanding of the neurophysiological differences between BD and other psychiatric conditions [57,58,59,60,61,62].
Neuroimaging techniques such as fMRI, PET, and DTI could offer a more detailed view of neural circuits, complementing EEG data by capturing structural and functional brain alterations associated with BD. Additionally, incorporating genetic and molecular biomarkers into machine learning models could improve personalized diagnostics, allowing for a more tailored approach to identifying individuals at higher risk. Beyond biological markers, wearable sensors, such as smartwatches tracking electrodermal activity and heart rate variability, could provide real-time physiological data that help detect manic or depressive episodes. These multimodal approaches have the potential to refine predictive models, reduce misclassifications, and contribute to a more precise understanding of BD’s neurophysiological mechanisms, ultimately leading to more accurate and individualized diagnostic strategies.
In this study, RF demonstrated strong performance in classifying patients with bipolar disorder using features derived from EEG data, achieving an accuracy of 93.41%, a recall (sensitivity) of 93.51%, and a specificity of 93.30% (Tables 1 and 2). These figures are encouraging, as they suggest that the model can reliably identify both patients and controls, reducing the margin of error associated with traditional methods based on clinical interviews and subjective observations. Furthermore, the feature importance analysis identified relevant biomarkers, such as brain complexity metrics (the Hurst Exponent and Higuchi’s Fractal Dimension) and power in the alpha and beta frequency bands, reinforcing the utility of EEG for the objective diagnosis of bipolar disorder.
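For readers less familiar with how such figures are derived, the sketch below computes accuracy, recall (sensitivity), specificity, precision, F1 score, and MCC from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the study's data.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Derive common binary-classification metrics from confusion counts."""
    total = tp + fn + tn + fp
    recall = tp / (tp + fn)        # sensitivity: patients correctly detected
    specificity = tn / (tn + fp)   # controls correctly identified
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        # Matthews correlation coefficient, in the range [-1, 1].
        "mcc": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
        ),
    }


# Made-up counts for illustration only (not results from this study).
metrics = classification_metrics(tp=45, fn=5, tn=40, fp=10)
```

Reporting recall and specificity together matters in a diagnostic setting because accuracy alone can hide an imbalance between missed patients and falsely flagged controls.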
Early detection has significant clinical implications. A timely diagnosis can facilitate the initiation of more appropriate treatments, such as mood stabilizers, and reduce exposure to ineffective or harmful therapies, such as the overuse of antidepressants in bipolar patients misdiagnosed with unipolar depression [3]. Moreover, early and accurate detection enables the implementation of preventive strategies that mitigate the risk of future episodes, improve patients’ quality of life, and reduce the disease’s impact on social and occupational domains.
Detecting bipolar disorder is essential not only for improving individual patient outcomes but also for reducing the societal burden of this illness. Tools like RF represent a significant step toward more objective, accessible, and data-driven psychiatry, and could change how complex mental disorders are understood and managed. Beyond enhancing diagnostic accuracy, this approach could improve patient care by facilitating earlier diagnoses, more effective treatments, and improved quality of life.

5. Conclusions

This study demonstrates that the Random Forest (RF) algorithm is an effective and reliable tool for the automated classification of bipolar disorder (BD) using EEG data, achieving an accuracy of 93.41%. This suggests that artificial intelligence can complement traditional clinical practices by providing more objective, faster, and reproducible evaluations. The implementation of AI-based tools could significantly reduce delays in BD diagnosis, which currently can exceed 10 years. An early diagnosis would allow for timely initiation of appropriate treatments, reducing the risk of severe episodes, hospitalizations, and associated complications. Additionally, by improving the differentiation between BD and other mood disorders, such as major depression, it could optimize medication use and prevent the inappropriate prescription of antidepressants to bipolar patients.
This study represents a key advancement in precision psychiatry, demonstrating the potential of machine learning to enhance BD assessment and management. The integration of AI models into clinical practice could transform psychiatric diagnosis, leading to more personalized, efficient, and evidence-based patient care.

Author Contributions

Conceptualization, M.S. and J.M.; methodology, M.S., P.B.-S., A.M.T. and J.M.; software, A.M.T. and J.M.; validation, M.S., P.B.-S. and J.M.; formal analysis, M.S., P.B.-S., A.M.T. and J.M.; investigation, M.S., P.B.-S., A.M.T. and J.M.; writing—original draft, M.S., P.B.-S., A.M.T. and J.M.; writing—review and editing, M.S., A.M.T. and J.M.; visualization, M.S., A.M.T. and J.M.; supervision, J.M.; project administration, J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Institute of Technology (University of Castilla-La Mancha, Spain), Diputación Provincial de Cuenca (Spain) and the Consorcio Hospital General Universitario De Valencia (Spain).

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the Virgen de la Luz Hospital (code: CE-MD-10568, date: 16 February 2010).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The datasets employed and analyzed in the current study are accessible upon reasonable request from the corresponding author.

Acknowledgments

Chair of Artificial Intelligence, sponsored by Bayer, Barcelona (Spain).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Grande, I.; Berk, M.; Birmaher, B.; Vieta, E. Bipolar disorder. Lancet 2016, 387, 1561–1572.
  2. Kessler, R.C.; Berglund, P.; Demler, O.; Jin, R.; Merikangas, K.R.; Walters, E.E. Lifetime prevalence and age-of-onset distributions of DSM-IV disorders in the National Comorbidity Survey Replication. Arch. Gen. Psychiatry 2005, 62, 593–602.
  3. Ghaemi, S.N.; Boiman, E.E.; Goodwin, F.K. Diagnosing bipolar disorder and the effect of antidepressants: A naturalistic study. J. Clin. Psychiatry 2000, 61, 804–808.
  4. Chancel, R.; Lopez-Castroman, J.; Baca-Garcia, E.; Mateos Alvarez, R.; Courtet, P.; Conejero, I. Biomarkers of bipolar disorder in late life: An evidence-based systematic review. Curr. Psychiatry Rep. 2024, 26, 78–103.
  5. Favaretto, E.; Bedani, F.; Brancati, G.E.; De Berardis, D.; Giovannini, S.; Scarcella, L.; Martiadis, V.; Martini, A.; Pampaloni, I.; Perugi, G.; et al. Synthesising 30 years of clinical experience and scientific insight on affective temperaments in psychiatric disorders: State of the art. J. Affect. Disord. 2024, 362, 406–415.
  6. Nierenberg, A.A.; Agustini, B.; Köhler-Forsberg, O.; Cusin, C.; Katz, D.; Sylvia, L.G.; Peters, A.; Berk, M. Diagnosis and treatment of bipolar disorder: A review. JAMA 2023, 330, 1370–1380.
  7. Bandyopadhyay Prasanta, S.; Forster Malcolm, R.; Oxford, E.; Barkow Jerome, H.; Leda, C.; John, T.; William, B.; Richardson Robert, C.; Beck Aaron, T.; John, R.A. American Psychiatric Association, Diagnostic and Statistical Manual of Mental Disorders: Dsm-5, Washington, DC, American Psychiatric Publishing, 2013. Ananth Mahesh, In defense of an evolutionary concept of health nature, norms, and human biology, Aldershot, England, Ashgate. Philosophy 2014, 39, 683–724.
  8. de Dios, C.; Goikolea, J.M.; Colom, F.; Moreno, C.; Vieta, E. Bipolar disorders in the new DSM-5 and ICD-11 classifications. Rev. Psiquiatr. Salud Ment. (Engl. Ed.) 2014, 7, 179–185.
  9. McIntyre, R.S.; Berk, M.; Brietzke, E.; Goldstein, B.I.; López-Jaramillo, C.; Kessing, L.V.; Malhi, G.S.; Nierenberg, A.A.; Rosenblat, J.D.; Majeed, A. Bipolar disorders. Lancet 2020, 396, 1841–1856.
  10. Oliva, V.; Fico, G.; De Prisco, M.; Gonda, X.; Rosa, A.R.; Vieta, E. Bipolar disorders: An update on critical aspects. Lancet Reg. Health–Eur. 2024, 48, 101135.
  11. García Blanco, A.C.; Sierra, P.; Livianos, L. Nosology, epidemiology and pathogenesis of bipolar disorder: Recent approaches. Psiquiatr. Biológica 2014, 21, 89–94.
  12. Campos-Ugaz, W.A.; Garay, J.P.P.; Rivera-Lozada, O.; Diaz, M.A.A.; Fuster-Guillén, D.; Arana, A.A.T. An overview of bipolar disorder diagnosis using machine learning approaches: Clinical opportunities and challenges. Iran. J. Psychiatry 2023, 18, 237.
  13. Kabir, M.S.; Khanom, J.; Bhuiyan, M.A.; Tumpa, Z.N.; Rabby, S.F.; Bilgaiyan, S. The Early Detection of Dementia Disease Using Machine Learning Approach. In Proceedings of the 2023 International Conference on Computer Communication and Informatics (ICCCI), Coimbatore, India, 23–25 January 2023; pp. 1–6.
  14. Montazeri, M.; Montazeri, M.; Bahaadinbeigy, K.; Montazeri, M.; Afraz, A. Application of machine learning methods in predicting schizophrenia and bipolar disorders: A systematic review. Health Sci. Rep. 2023, 6, e962.
  15. Bzdok, D.; Meyer-Lindenberg, A. Machine learning for precision psychiatry: Opportunities and challenges. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 2018, 3, 223–230.
  16. Mora, D.; Nieto, J.A.; Mateo, J.; Bikdeli, B.; Barco, S.; Trujillo-Santos, J.; Soler, S.; Font, L.; Bosevski, M.; Monreal, M.; et al. Machine learning to predict outcomes in patients with acute pulmonary embolism who prematurely discontinued anticoagulant therapy. Thromb. Haemost. 2022, 122, 570–577.
  17. Soria, C.; Arroyo, Y.; Torres, A.M.; Redondo, M.Á.; Basar, C.; Mateo, J. Method for classifying schizophrenia patients based on machine learning. J. Clin. Med. 2023, 12, 4375.
  18. Suárez, M.; Martínez, R.; Torres, A.M.; Ramón, A.; Blasco, P.; Mateo, J. A Machine Learning-Based Method for Detecting Liver Fibrosis. Diagnostics 2023, 13, 2952.
  19. Garrido, N.J.; González-Martínez, F.; Losada, S.; Plaza, A.; Del Olmo, E.; Mateo, J. Innovation through Artificial Intelligence in Triage Systems for Resource Optimization in Future Pandemics. Biomimetics 2024, 9, 440.
  20. Suárez, M.; Gil-Rojas, S.; Martínez-Blanco, P.; Torres, A.M.; Ramón, A.; Blasco-Segura, P.; Torralba, M.; Mateo, J. Machine Learning-Based Assessment of Survival and Risk Factors in Non-Alcoholic Fatty Liver Disease-Related Hepatocellular Carcinoma for Optimized Patient Management. Cancers 2024, 16, 1114.
  21. Côté-Allard, U.; Jakobsen, P.; Stautland, A.; Nordgreen, T.; Fasmer, O.B.; Oedegaard, K.J.; Tørresen, J. Long–short ensemble network for bipolar manic-euthymic state recognition based on wrist-worn sensors. IEEE Pervasive Comput. 2022, 21, 20–31.
  22. Perez Arribas, I.; Goodwin, G.M.; Geddes, J.R.; Lyons, T.; Saunders, K.E. A signature-based machine learning model for distinguishing bipolar disorder and borderline personality disorder. Transl. Psychiatry 2018, 8, 274.
  23. Metin, B.; Uyulan, Ç.; Ergüzel, T.T.; Farhad, S.; Çifçi, E.; Türk, Ö.; Tarhan, N. The deep learning method differentiates patients with bipolar disorder from controls with high accuracy using EEG data. Clin. EEG Neurosci. 2024, 55, 167–175.
  24. Pettorruso, M.; Guidotti, R.; d’Andrea, G.; De Risio, L.; D’Andrea, A.; Chiappini, S.; Carullo, R.; Barlati, S.; Zanardi, R.; Rosso, G.; et al. Predicting outcome with Intranasal Esketamine treatment: A machine-learning, three-month study in Treatment-Resistant Depression (ESK-LEARNING). Psychiatry Res. 2023, 327, 115378.
  25. Di Stefano, V.; D’Angelo, M.; Monaco, F.; Vignapiano, A.; Martiadis, V.; Barone, E.; Fornaro, M.; Steardo, L.; Solmi, M.; Manchia, M.; et al. Decoding Schizophrenia: How AI-Enhanced fMRI Unlocks New Pathways for Precision Psychiatry. Brain Sci. 2024, 14, 1196.
  26. Janeva, D.; Krsteski, S.; Tashkovska, M.; Jovanovski, N.; Kartalov, T.; Taskovski, D.; Ivanovski, Z.; Gerazov, B. A System for Differentiation of Schizophrenia and Bipolar Disorder based on rsfMRI. In Proceedings of the 2023 30th International Conference on Systems, Signals and Image Processing (IWSSIP), Ohrid, North Macedonia, 27–29 June 2023; pp. 1–5.
  27. Bader, M.; Abdelwanis, M.; Maalouf, M.; Jelinek, H.F. Detecting depression severity using weighted random forest and oxidative stress biomarkers. Sci. Rep. 2024, 14, 16328.
  28. Zhou, Y.; Zhang, X.; Gong, J.; Wang, T.; Gong, L.; Li, K.; Wang, Y. Identifying the risk of depression in a large sample of adolescents: An artificial neural network based on random forest. J. Adolesc. 2024, 96, 1485–1497.
  29. Huang, S.; Cai, N.; Pacheco, P.P.; Narrandes, S.; Wang, Y.; Xu, W. Applications of support vector machine (SVM) learning in cancer genomics. Cancer Genom. Proteom. 2018, 15, 41–51.
  30. Pisner, D.A.; Schnyer, D.M. Support vector machine. In Machine Learning; Elsevier: Amsterdam, The Netherlands, 2020; pp. 101–121.
  31. Shia, W.-C.; Chen, D.-R. Classification of malignant tumors in breast ultrasound using a pretrained deep residual network model and support vector machine. Comput. Med. Imaging Graph. 2021, 87, 101829.
  32. Javeed, A.; Dallora, A.L.; Berglund, J.S.; Idrisoglu, A.; Ali, L.; Rauf, H.T.; Anderberg, P. Early prediction of dementia using feature extraction battery (feb) and optimized support vector machine (svm) for classification. Biomedicines 2023, 11, 439.
  33. Zhang, S.; Li, X.; Zong, M.; Zhu, X.; Wang, R. Efficient kNN classification with different numbers of nearest neighbors. IEEE Trans. Neural Netw. Learn. Syst. 2017, 29, 1774–1785.
  34. Arian, R.; Hariri, A.; Mehridehnavi, A.; Fassihi, A.; Ghasemi, F. Protein kinase inhibitors’ classification using K-Nearest neighbor algorithm. Comput. Biol. Chem. 2020, 86, 107269.
  35. Ehsani, R.; Drabløs, F. Robust distance measures for kNN classification of cancer data. Cancer Inform. 2020, 19, 1176935120965542.
  36. Ghiasi, M.M.; Zendehboudi, S. Application of decision tree-based ensemble learning in the classification of breast cancer. Comput. Biol. Med. 2021, 128, 104089.
  37. Lazebnik, T.; Bunimovich-Mendrazitsky, S. Decision tree post-pruning without loss of accuracy using the SAT-PP algorithm with an empirical evaluation on clinical data. Data Knowl. Eng. 2023, 145, 102173.
  38. Manzella, F.; Pagliarini, G.; Sciavicco, G.; Stan, I.E. The voice of COVID-19: Breath and cough recording classification with temporal decision trees and random forests. Artif. Intell. Med. 2023, 137, 102486.
  39. Jayachitra, S.; Prasanth, A. Multi-feature analysis for automated brain stroke classification using weighted Gaussian naïve Bayes classifier. J. Circuits Syst. Comput. 2021, 30, 2150178.
  40. Gohari, K.; Kazemnejad, A.; Mohammadi, M.; Eskandari, F.; Saberi, S.; Esmaieli, M.; Sheidaei, A. A Bayesian latent class extension of naive Bayesian classifier and its application to the classification of gastric cancer patients. BMC Med. Res. Methodol. 2023, 23, 190.
  41. Queipo, M.; Barbado, J.; Torres, A.M.; Mateo, J. Approaching personalized medicine: The use of machine learning to determine predictors of mortality in a population with SARS-CoV-2 infection. Biomedicines 2024, 12, 409.
  42. Usategui, I.; Arroyo, Y.; Torres, A.M.; Barbado, J.; Mateo, J. Systemic Lupus Erythematosus: How Machine Learning Can Help Distinguish between Infections and Flares. Bioengineering 2024, 11, 90.
  43. Hosseinifard, B.; Moradi, M.H.; Rostami, R. Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal. Comput. Methods Programs Biomed. 2013, 109, 339–345.
  44. Lundberg, S.M.; Erion, G.G.; Lee, S.-I. Consistent individualized feature attribution for tree ensembles. arXiv 2018, arXiv:1802.03888.
  45. Sun, Z.; Wang, G.; Li, P.; Wang, H.; Zhang, M.; Liang, X. An improved random forest based on the classification accuracy and correlation measurement of decision trees. Expert Syst. Appl. 2024, 237, 121549.
  46. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  47. Accardo, A.; Affinito, M.; Carrozzi, M.; Bouquet, F. Use of the fractal dimension for the analysis of electroencephalographic time series. Biol. Cybern. 1997, 77, 339–350.
  48. Nguyen-Ky, T.; Wen, P.; Li, Y. Monitoring the depth of anaesthesia using Hurst exponent and Bayesian methods. IET Signal Process. 2014, 8, 907–917.
  49. Birnbaum, H.G.; Shi, L.; Dial, E.; Oster, E.F.; Greenberg, P.E.; Mallett, D.A. Economic consequences of not recognizing bipolar disorder patients: A cross-sectional descriptive analysis. J. Clin. Psychiatry 2003, 64, 1201–1209.
  50. McCombs, J.S.; Ahn, J.; Tencer, T.; Shi, L. The impact of unrecognized bipolar disorders among patients treated for depression with antidepressants in the fee-for-services California Medicaid (Medi-Cal) program: A 6-year retrospective analysis. J. Affect. Disord. 2007, 97, 171–179.
  51. Angst, J. Bipolar disorders in DSM-5: Strengths, problems and perspectives. Int. J. Bipolar Disord. 2013, 1, 12.
  52. van der Voort, T.Y.; van Meijel, B.; Goossens, P.J.; Hoogendoorn, A.W.; Draisma, S.; Beekman, A.; Kupka, R.W. Collaborative care for patients with bipolar disorder: Randomised controlled trial. Br. J. Psychiatry 2015, 206, 393–400.
  53. Severus, E.; Bauer, M. Diagnosing bipolar disorders in DSM-5. Int. J. Bipolar Disord. 2013, 1, 14.
  54. van der Voort, T.Y.; van Meijel, B.; Hoogendoorn, A.W.; Goossens, P.J.; Beekman, A.T.; Kupka, R.W. Collaborative care for patients with bipolar disorder: Effects on functioning and quality of life. J. Affect. Disord. 2015, 179, 14–22.
  55. Jie, N.-F.; Osuch, E.A.; Zhu, M.-H.; Wammes, M.; Ma, X.-Y.; Jiang, T.-Z.; Sui, J.; Calhoun, V.D. Discriminating bipolar disorder from major depression using whole-brain functional connectivity: A feature selection analysis with SVM-FoBa algorithm. J. Signal Process. Syst. 2018, 90, 259–271.
  56. AbaeiKoupaei, N.; Al Osman, H. A multi-modal stacked ensemble model for bipolar disorder classification. IEEE Trans. Affect. Comput. 2020, 14, 236–244.
  57. Dev, A.; Roy, N.; Islam, M.K.; Biswas, C.; Ahmed, H.U.; Amin, M.A.; Sarker, F.; Vaidyanathan, R.; Mamun, K.A. Exploration of EEG-based depression biomarkers identification techniques and their applications: A systematic review. IEEE Access 2022, 10, 16756–16781.
  58. Yamunarani, T.; Ponniran, A.B.; Zaki, W.S.B.W.; Ali, R.A.B.M.; Sivaranjani, S. EEG–Based Bipolar Disorder Deduction Using Machine Learning. In Proceedings of the 2023 First International Conference on Advances in Electrical, Electronics and Computational Intelligence (ICAEECI), Tiruchengode, India, 19–20 October 2023; pp. 1–8.
  59. Acharya, U.R.; Oh, S.L.; Hagiwara, Y.; Tan, J.H.; Adeli, H.; Subha, D.P. Automated EEG-based screening of depression using deep convolutional neural network. Comput. Methods Programs Biomed. 2018, 161, 103–113.
  60. Seal, A.; Bajpai, R.; Agnihotri, J.; Yazidi, A.; Herrera-Viedma, E.; Krejcar, O. DeprNet: A deep convolution neural network framework for detecting depression using EEG. IEEE Trans. Instrum. Meas. 2021, 70, 1–13.
  61. Yasin, S.; Hussain, S.A.; Aslan, S.; Raza, I.; Muzammel, M.; Othmani, A. EEG based Major Depressive disorder and Bipolar disorder detection using Neural Networks: A review. Comput. Methods Programs Biomed. 2021, 202, 106007.
  62. Nazari, M.-J.; Shalbafan, M.; Eissazade, N.; Khalilian, E.; Vahabi, Z.; Masjedi, N.; Ghidary, S.S.; Saadat, M.; Sadegh-Zadeh, S.-A. A machine learning approach for differentiating bipolar disorder type II and borderline personality disorder using electroencephalography and cognitive abnormalities. PLoS ONE 2024, 19, e0303699.
Figure 1. The figure represents the importance of the extracted features in the classification of patients.
Figure 2. The figure represents the radar plots of the different algorithms studied. The upper figure shows the training results, while the lower figure displays the test results.
Figure 3. The figure represents the ROC curve for different machine learning algorithms.
Table 1. The table presents the results for Accuracy, MCC, F1 Score, Precision, and DYI.
| Method | Accuracy | MCC | F1 Score | Precision | DYI |
| --- | --- | --- | --- | --- | --- |
| SVM | 83.65 | 73.98 | 83.90 | 83.56 | 83.85 |
| DT | 82.35 | 72.95 | 82.11 | 81.77 | 82.31 |
| GNB | 74.72 | 66.30 | 74.50 | 74.19 | 74.69 |
| KNN | 85.44 | 75.93 | 85.19 | 85.84 | 85.34 |
| RF | 93.41 | 82.99 | 93.13 | 93.75 | 93.37 |
Table 2. Metrics for Recall, Specificity, Kappa, and AUC are summarized in the table.
| Method | Recall | Specificity | Kappa | AUC |
| --- | --- | --- | --- | --- |
| SVM | 83.85 | 83.95 | 74.03 | 0.84 |
| DT | 82.45 | 82.26 | 73.32 | 0.82 |
| GNB | 74.81 | 74.63 | 66.52 | 0.75 |
| KNN | 85.54 | 85.34 | 75.18 | 0.85 |
| RF | 93.51 | 93.30 | 83.27 | 0.93 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Suárez, M.; Torres, A.M.; Blasco-Segura, P.; Mateo, J. Application of the Random Forest Algorithm for Accurate Bipolar Disorder Classification. Life 2025, 15, 394. https://doi.org/10.3390/life15030394

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.