Machine Learning Approach for Analyzing 3-Year Outcomes of Patients with Brain Arteriovenous Malformation (AVM) after Stereotactic Radiosurgery (SRS)

A cerebral arteriovenous malformation (AVM) is a tangle of abnormal blood vessels that irregularly connects arteries and veins. Stereotactic radiosurgery (SRS) has been shown to be an effective treatment for AVM patients, but the factors associated with AVM obliteration remain a matter of debate. In this study, we aimed to develop a model that can predict whether patients with AVM will be cured 36 months after intervention by means of SRS and identify the most important predictors that explain the probability of being cured. A machine learning (ML) approach was applied using decision tree (DT) and logistic regression (LR) techniques on historical data (sociodemographic, clinical, treatment, angioarchitecture, and radiosurgery procedure) of 202 patients with AVM who underwent SRS at the Instituto de Radiocirugía del Perú (IRP) between 2005 and 2018. The LR model obtained the best results for predicting AVM cure with an accuracy of 0.92, sensitivity of 0.93, specificity of 0.89, and an area under the curve (AUC) of 0.98, which shows that ML models are suitable for predicting the prognosis of medical conditions such as AVM and can be a support tool for medical decision-making. In addition, several factors were identified that could explain whether patients with AVM would be cured at 36 months with the highest likelihood: the location of the AVM, the occupation of the patient, and the presence of hemorrhage.


Introduction
Cerebral arteriovenous malformation (AVM) is a congenital neurological disease that causes cerebral hemorrhage, seizures, or headache. It consists of an abnormal conglomerate of dilated cerebral vessels derived from the maldevelopment of the capillary network that allows direct connections between cerebral arteries and veins [1]. One of the treatments, known since the 1970s, in addition to microsurgery and endovascular therapy, is stereotactic radiosurgery (SRS), in which the AVM is obliterated by radionecrosis through the administration of multi-beam directed radiation [2]. From the medical point of view, SRS is a neurosurgical technique that does not require an incision and is used as an alternative or complement to noninvasive treatment.
The healing process of patients with AVM undergoing SRS is not immediate and requires time, with clinical and imaging monitoring to follow the evolution of the disease. The successful exclusion of brain AVM with radiosurgery is considerably higher for smaller lesions. For example, one study showed that the obliteration rate of patients with brain AVM after SRS was between 54% and 92% for lesion diameters ≤ 2.5 cm [3]. Several scoring systems, such as the Spetzler-Martin Grading Scale (SMGS) and the Virginia Radiosurgery AVM Scale (VRAS), are currently used by physicians to understand the nature of AVM and predict the results of radiosurgery treatment [4][5][6]. However, new methods to predict the results of radiosurgery treatment and to determine the factors that influence the probability of success are needed.
Machine learning (ML) is a subset of artificial intelligence (AI) that uses algorithms that automatically "learn" to identify patterns in data, which are used to make forecasts based on these patterns [7]. The use of such algorithms as support tools for medical decision-making and their application in the prognosis, diagnosis, and treatment of diseases has been recently developed [8]; however, certain conditions still exist that make it difficult for them to be widely adopted [9][10][11][12][13]. Among the studies referring to the prediction and diagnosis of neurological and brain diseases in which ML techniques were applied is the study of Uspenskaya-Cadoz et al. [14], which proposed a method for diagnosing Alzheimer's disease (AD) by applying logistic regression (LR), decision tree (DT), random forest (RF), and gradient-boosted trees (GBT) techniques, and the study of Ghafouri-Fard et al. [15], which proposed using artificial neural networks (ANNs) to predict multiple sclerosis (MS) risk based on genotypes.
At present, the application of ML techniques to the diagnosis, prognosis, or treatment of AVM has increased. Interesting studies can be found, such as one by Tao et al. [16], which examined the factors that influence the risk of bleeding from AVM, and another by Hong et al. [17], which reported an experiment for the detection of hemorrhages in AVMs using digital subtraction angiography (DSA) images. There are also studies on the use of deep learning models, a type of ML specialized in image processing; for example, Wang et al. [18] automated the process of segmenting and identifying AVMs in computed tomography (CT) and DSA images. Other studies have focused on the prognosis of patients with AVM after surgery, with the aim of predicting whether they would be cured. For example, Asadi et al. [19] presented a study on identifying the factors that influence the outcome of treatment with endovascular embolization and showed that ML techniques can predict outcomes with high accuracy and can help to individualize the treatment based on key predictors. Finally, Oermann et al. [20] used an ML approach to predict the outcomes of AVM patients undergoing radiosurgery and achieved an accuracy of 0.74, which is considered to be the best prognostic result as of the date of publication of this paper. However, the prediction error rate found in these previous studies is high (greater than 25%), and in addition, they did not study explainability by assessing the importance of the variables, which is key for medical decision-making.
Building on these previous studies [16][17][18][19][20], which show that ML algorithms are powerful tools that can be used in the medical field, in the present study we aimed not only to provide an ML approach for predicting whether patients with AVM who undergo SRS will be cured but also one that could identify the main factors influencing whether these patients will be cured 36 months after radiosurgery.

Materials and Methods
The construction of an ML system for the prognosis of patients with AVM treated with SRS is proposed using two techniques: DT and LR (Figure 1). These two techniques were used in this study because they produce results (predictions) that are easy for domain experts to understand, as they are considered "white box" methods [21]. Additionally, these methods were used in previous studies on AVM outcome prediction [16,19,20].
Due to a common long-term follow-up protocol that suggests complete AVM obliteration within the first 3 years for 70-80% of AVM patients [22], the objective of this study is to predict whether a patient will be cured or not at 36 months after undergoing SRS; for this, a supervised ML approach was chosen via binary classification. Additionally, the use of LR is proposed to determine the main factors that influence whether an AVM patient will be cured.

Dataset
For this study, a dataset comprising 45 variables of 202 patients diagnosed with AVM who underwent SRS treatment to cure this disease was used. The data were collected from different medical sources at the Instituto de Radiocirugía del Perú (IRP) between 2005 and 2018 following the process shown in Figure 2. The variables that were collected from patient data were considered as input data (predictors) and were grouped into 5 categories: sociodemographic (S), clinical (C), treatment (T), angioarchitecture (A), and radiosurgery (R). The variable for patients being cured at 36 months after radiosurgery was considered as output data (response). Table 1 shows the structure of the dataset used in this study.
The dataset is tabular and is made up of 202 records (rows) and 45 variables (columns), in which the rows correspond to the patient data and the columns represent the variables considered in the study. The first 44 variables were considered as input variables to the system (independent variables) and the last column as the output variable (dependent variable), representing patients being cured (cured = 1) or not (cured = 0).

Data Preprocessing
Before carrying out any data processing, and because this was a medical application, it was advisable to analyze the data for possible confounding variables that could have an undesired impact on our prediction results [23]; for this, we analyzed the possible confounding variables of gender and age.
For the categorical variable gender, the chi-square test of homogeneity was performed to verify whether the difference in the number of men and women in each data group was statistically significant, and no difference was found (p-value = 0.566; Figure 3a). For the age variable, Student's t-test was applied to verify whether there was a statistically significant difference in age between groups (class 0, mean = 31.97; class 1, mean = 26.72), and again no difference was found (p-value = 0.058; Figure 3b).
From this analysis, we concluded that the variables age and gender should not be considered as confounding variables, so we moved forward with the data preprocessing. Finally, in order to avoid prediction biases and build the ML system effectively, variable selection and data balancing were carried out.
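As an illustration of the gender check described above, the following minimal sketch computes a Pearson chi-square test of homogeneity for a 2 × 2 table in plain Python. The counts are hypothetical (the per-class gender breakdown is not given in the text), no continuity correction is applied, and the function name `chi_square_2x2` is introduced here only for illustration; it is not the code used in the study.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: rows = outcome class (0, 1), columns = gender (men, women).
counts = [[20, 15], [87, 80]]
stat, p_value = chi_square_2x2(counts)
print(f"chi2 = {stat:.3f}, p = {p_value:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) would be read, as in the text, as no evidence that the gender distribution differs between the two outcome classes.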

Variable Selection
After an analysis by expert judgment, 6 independent variables were identified that were considered not to influence the prognosis of being cured, so they were excluded from the study (residence, education_level, health_insurance, mri_examination, ct_examination, and das_examination).
Additionally, correlation analysis of the 38 remaining independent variables was carried out; Cramer's V [24] was applied to measure the association between categorical variables and Pearson's correlation coefficient was used for the numerical variables; in both cases, a threshold value greater than or equal to 0.7 was used to identify high positive (negative) correlation [25], and 6 correlated variables that exceeded the threshold were identified and discarded from the study (Table 2). The dython library [26], which is available for the Python programming language, was used to perform the calculations. In total, 12 independent variables were discarded, leaving a dataset made up of 32 independent variables and 1 dependent variable, which were used in the ML system proposed in this study (Table 3). The final dataset of 202 records was divided into two datasets: 75% (n = 151) for ML model training and validation and 25% (n = 51) for testing. In addition, the 32 independent variables of the training and validation set were normalized using the min-max technique [27].
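The correlation screening described above can be sketched in plain Python as follows. This is only an illustrative sketch: it does not reproduce the dython API, it computes Cramer's V without bias correction, and the helper names (`cramers_v`, `pearson_r`, `is_highly_correlated`) and the toy columns are assumptions made for the example.

```python
import math
from collections import Counter

THRESHOLD = 0.7  # threshold stated in the text


def cramers_v(x, y):
    """Cramer's V between two categorical sequences (no bias correction)."""
    n = len(x)
    joint = Counter(zip(x, y))
    x_counts, y_counts = Counter(x), Counter(y)
    chi2 = 0.0
    for a, count_a in x_counts.items():
        for b, count_b in y_counts.items():
            expected = count_a * count_b / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


def pearson_r(x, y):
    """Pearson correlation coefficient between two numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def is_highly_correlated(value, threshold=THRESHOLD):
    """Flag a pair whose absolute association reaches the threshold."""
    return abs(value) >= threshold


# Toy columns (hypothetical values, not taken from the study dataset).
location = ["deep", "deep", "superficial", "superficial"]
drainage = ["deep", "deep", "superficial", "superficial"]
dose = [12.0, 16.0, 18.0, 22.0]
diameter = [3.5, 2.8, 2.0, 1.1]

print(is_highly_correlated(cramers_v(location, drainage)))
print(is_highly_correlated(pearson_r(dose, diameter)))
```

In a screening of this kind, one variable from each flagged pair would be dropped; which of the two is kept is a judgment call that the text does not specify.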

Data Balancing
The original training dataset had a data imbalance with respect to the dependent variable, cured, in that it consisted of 125 records of class 1 and 26 of class 0. The imbalance was corrected by applying the synthetic minority oversampling technique (SMOTE), which creates new synthetic instances of the minority class instead of repeating them [28,29]. We obtained 250 records in total, with 125 records for each class, as shown in Figure 4.
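The idea behind SMOTE can be sketched as follows: each synthetic record is placed at a random point on the segment joining a minority record and one of its nearest minority neighbours. The sketch below is a simplified plain-Python version written for illustration, not the implementation used in the study; the function name `smote`, the choice of k = 5 neighbours, and the random toy data are assumptions. It requires at least two minority records.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority rows
    and their k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen row, excluding the row itself.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(row, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic


# Hypothetical min-max-normalized class-0 rows: 26 records with 3 features each.
data_rng = random.Random(42)
minority = [[data_rng.random() for _ in range(3)] for _ in range(26)]
majority_size = 125  # class-1 records in the training set, as stated in the text

synthetic = smote(minority, n_new=majority_size - len(minority))
balanced_minority = minority + synthetic
print(len(balanced_minority))  # 125
```

Because each synthetic row is a convex combination of two existing rows, it stays inside the range of the original minority data, which is why oversampling is applied after normalization in this sketch.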

Machine Learning Models
For the construction, validation, and evaluation of the ML system, we used the process shown in Figure 5, which consisted of using the two training datasets (balanced and imbalanced) to build and validate the two ML models (DT and LR) in four experimental scenarios; based on the results, the model with the best performance metrics was chosen. Scenario 1 refers to the imbalanced training data with the DT model, scenario 2 refers to the imbalanced training data with the LR model, scenario 3 refers to the balanced training data with the DT model, and scenario 4 refers to the balanced training data with the LR model. The final model's performance was evaluated by using both the accuracy and the AUC metrics to compare our study's results with the ones obtained by Oermann et al. [20] and Meng et al. [30]. The accuracy was used to evaluate how well the model predicts the correct label (cured patients) for a given data point, so the ML model can be effectively used in the medical field.
Additionally, the LR method was used to identify the most important factors that determine the probability of patients being cured (clinical interpretability).
In the training phase, the grid search technique [31] was used to find the optimal hyperparameters of the ML models in each of the four scenarios. The set of search values defined for the hyperparameters is given in Table 4. During the training process, the resampling technique was used (Figure 6), in which the training dataset was divided into 8 subsets, with 1 set taken for validation and 7 for training, following 8-fold cross-validation, which is a commonly used method for selecting ML models [32,33].
To build the ML models, the scikit-learn 1.0.2 library [34] of Python version 3.8.16 was used in the Google Colab environment. The algorithms and resources built for this research can be found at https://github.com/mirkorodriguez/ml-prediction-mav (accessed on 14 December 2023).
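The combination of grid search and 8-fold cross-validation described in this section can be sketched in plain Python as follows. This is a schematic sketch rather than the scikit-learn code used in the study: the search space shown is a placeholder (the actual values are those of Table 4), and `toy_score` stands in for fitting a model on the training folds and scoring it on the validation fold.

```python
import random
from itertools import product


def k_fold_indices(n_records, k=8, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def grid(search_space):
    """Yield every combination of hyperparameter values as a dict."""
    names = list(search_space)
    for values in product(*(search_space[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(search_space, score_fn, n_records, k=8):
    """Return the hyperparameters with the best mean validation score."""
    best_params, best_score = None, float("-inf")
    for params in grid(search_space):
        scores = [score_fn(params, train, val)
                  for train, val in k_fold_indices(n_records, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Placeholder search space (hypothetical values, not those of Table 4).
search_space = {"max_depth": [2, 3, 4], "criterion": ["gini", "entropy"]}


def toy_score(params, train_idx, val_idx):
    # Stand-in for "fit on train_idx, evaluate on val_idx".
    return 1.0 - 0.1 * abs(params["max_depth"] - 3)


best_params, best_score = grid_search(search_space, toy_score, n_records=151)
print(best_params, best_score)
```

With 151 training records and 8 folds, seven folds contain 19 records and one contains 18, so each model is trained on 132 or 133 records per split.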

Results
The composition of the study population, the performance of the prediction models, and the explainability of the prediction are presented below.

Study Population
This study included 202 patients with AVM who underwent stereotactic radiosurgery between 2005 and 2018 at the IRP. As shown in Supplementary Figure S1, 167 patients (82.20%) were cured 36 months after the surgical intervention.
Supplementary Table S1 shows the sociodemographic characteristics of the population included in this study: 52.97% were men and 47.03% were women; 70.49% of patients were in the age range of 18 to 59 years; 80.69% were from Lima or Callao; 18.82% had a preschool or grade school education and 52.97% had only a high school education; and 42.08% had insurance through the Ministry of Health of Peru (SIS).
Supplementary Table S2 shows the clinical characteristics of the patients. The average time from radiosurgery to AVM cure (obliteration) was 22.07 months, the average radiation dose was 17.86 Gray, the average AVM diameter was 2.14 cm, and the average number of isocenters applied was 1.35. On average, radiosurgery was performed in a single session.
Supplementary Table S3 shows the statistics of the patients' previous treatments before SRS. Of the 202 patients, 31 had undergone surgical treatment and 49 had prior embolization. As part of the treatment, 22 only underwent surgery, 40 only embolization, and 9 both surgery and embolization. The embolizing agents were Onyx (52%) and Histoacryl (48%). In total, 155 patients had previous cerebral hemorrhage, 76 developed encephalomalacia, 178 had headache, and 112 had seizures; furthermore, 55% presented some type of deficit (motor, sensory, or cognitive). Regarding the angioarchitecture (characteristics) of the AVM, most (100) were located on the left side of the brain and most (96) were categorized as deep; most treated AVMs (95) had moderately intense flow.

Performance of Prediction Models
The results obtained by the models using the data in the testing set are described below. Table 5 shows the optimal hyperparameters identified for each scenario that were used in the models for prediction. Figure 7 shows the confusion matrices obtained as a result of evaluating the best ML model from each of the four predefined scenarios with the testing dataset. Figure 8 shows the ROC curve, with its AUC, for each scenario.
Table 6 shows the results of the experiments with the four scenarios in terms of their performance metrics for both the training and testing datasets. The best model according to the performance metrics in the testing dataset is the LR model built with the balanced dataset.
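For reference, the performance metrics reported here follow directly from a confusion matrix and from the ranking of predicted scores. The sketch below shows the calculations in plain Python, taking "cured" (class 1) as the positive class. The counts are illustrative only: they were chosen so that, for a testing set of 51 records, the rounded values agree with the reported accuracy, sensitivity, and specificity, and they are not read from Figure 7. The function names are introduced for this example.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity (recall of class 1), and specificity (recall of class 0)."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case is scored above a negative one
    (ties count as one half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Illustrative counts for a 51-record testing set (assumed, not taken from Figure 7).
metrics = classification_metrics(tp=39, fn=3, tn=8, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# accuracy: 0.92
# sensitivity: 0.93
# specificity: 0.89

# Hypothetical predicted probabilities of being cured.
print(auc(scores_pos=[0.9, 0.8, 0.6], scores_neg=[0.7, 0.2]))
```

Reporting sensitivity and specificity alongside accuracy matters here because the testing set is imbalanced: a model that labels every patient as cured would still reach a high accuracy.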

Explainability of Models
In order to gain a general idea about the explainability of the results obtained by the models used in this research, the LR model built with balanced data (scenario 4) was used based on its good prediction results and its interpretability through the calculation of the odds ratio (importance) [35]. Table 7 shows the variables (features) and their level of importance in explaining the probability of patients with AVM being cured 36 months after SRS, among which 18 have a negative influence and 14 have a positive influence. The five most important variables that positively influence being cured are (1) the location of the AVM (side_avm), (2) the occupation of the patient (occupation), (3) the presence of bleeding in the AVM (hemorrhage), (4) previous cranial surgery (prev_cran_surgery), and (5) the type of venous drainage (type_venous_drainage). It is important to highlight that the patient's occupation is an antecedent of the disease, but it is not clinically relevant; however, it is an interesting finding that should be evaluated in greater detail in another study.
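The odds-ratio reading of an LR model can be sketched as follows: each fitted coefficient is a change in log-odds per unit of its feature, so its exponential is the corresponding odds ratio, with values above 1 indicating a positive influence and values below 1 a negative one. The coefficient values below are hypothetical and are not those of Table 7; only the variable names are taken from the text, and the helper names are introduced for illustration.

```python
import math


def odds_ratios(coefficients):
    """Map each LR coefficient (log-odds per unit of the feature) to exp(beta)."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}


def rank_by_odds_ratio(coefficients):
    """Sort variables from the strongest positive to the strongest negative influence."""
    return sorted(odds_ratios(coefficients).items(),
                  key=lambda item: item[1], reverse=True)


# Hypothetical coefficients for illustration only (not the fitted values of Table 7).
coefficients = {
    "side_avm": 1.2,
    "occupation": 0.9,
    "hemorrhage": 0.7,
    "example_negative_feature": -0.8,  # placeholder name, not a study variable
}

for name, ratio in rank_by_odds_ratio(coefficients):
    direction = "positive" if ratio > 1.0 else "negative"
    print(f"{name}: OR = {ratio:.2f} ({direction} influence)")
```

Because the predictors were min-max normalized, a one-unit change corresponds to moving a feature from its minimum to its maximum observed value, which makes odds ratios of different variables roughly comparable in scale.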

Discussion
Inspired by the use of ML techniques in medicine [36][37][38][39][40][41][42], and specifically for the prognosis of patients with AVM [19,20,30], this study proposed a method that makes it possible to predict whether or not a patient with AVM who undergoes SRS will be cured at 36 months after the intervention. We found that using ML techniques for the prognosis of patients with AVMs is possible. Our approach involved evaluating four scenarios using two ML models and two datasets (imbalanced and balanced data). After following a standard process to build the ML models, in which oversampling, grid search, and cross-validation techniques were also applied, it was found that the best model to predict whether patients with AVM would be cured is the LR model trained with balanced data (accuracy 0.92, AUC 0.98). The LR model was superior to the DT model even when trained with imbalanced data, as shown in Table 6. The data preprocessing (selection of variables and balancing) performed in this study led to significantly higher results for the two models (DT and LR) than when the data were not preprocessed, so we can argue that data preprocessing should be included in any approach that uses an ML model. In addition, the results obtained in this study (accuracy 0.92 and AUC 0.98) were found to be superior to the results obtained in other studies using similar procedures, such as those by Oermann et al. [20] and Meng et al. [30], who obtained an accuracy of 0.74 and 0.83, and an AUC of 0.71 and 0.77, respectively.
From the clinical perspective, the data used in this study have acceptable homogeneity for the radiosurgery protocol: AVM diameter of 2.14 cm (SD = 0.89), applied radiation dose of 17.86 Gy (SD = 4.44), and number of isocenters of 1.35 (SD = 0.56); all of this, together with other technical and morphological factors, allowed for the effective application of ML techniques to individualize the AVMs that will respond positively to radiosurgery treatment. The LR model is the one that best predicts the SRS outcomes, and the variables that positively influence whether a patient will be cured are (1) the location in the basal ganglia, which coincides with previous studies [43]; additionally, the appearance of the location of the AVM on the left side of the brain as an important factor is due to the fact that the sample is not completely random; (2) deep venous drainage, which occurs at the level of the basal ganglia or midbrain and is considered not treatable with other techniques due to the high risk involved; and (3) the occupational group, which denotes a population of children and adolescents who tend to have a good response to radiosurgery, which was expected and also coincides with results from other studies [44]. In addition, it is important to highlight that both the history of bleeding in the AVM and the presence of previous surgical treatment are key prognostic factors, as shown in our study, where 71 (35.14%) of the patients had previous treatment through conventional neurosurgery, embolization, or both, which contributed to a more favorable prognosis by reducing the size of the AVM or altering the hemodynamics of the residual AVM, which ultimately favors its healing.
The importance of the results of this study goes beyond the possibility of using this method for the medical prognosis of patients with AVM; it also allows us to confirm that it is possible to use an ML model, understood as a generalizable framework, in medicine, by using historical data to predict the future.We believe that the ML algorithms that process clinical and imaging data in a personalized way can effectively help in decision-making to predict which patients with cerebral AVM could benefit from being cured by treatment with stereotactic radiosurgery.In this case, we used historical information over a 14-year time horizon, from which sociodemographic and medical data were collected to build an ML system that achieved very good prediction results and could be used as a tool by medical professionals for decision-making when dealing with new AVM cases.
Finally, the proposed approach for the prognosis and explainability of whether patients with AVM will be cured has no limitations in itself; however, the results of these models are limited to the dataset used in this study, so its application in medical practice requires more experiments with larger amounts of data, and the possibility of including additional medical variables should also be evaluated. It is also important to remark that the two ML models used in this study are considered transparent models, or "white box" models [21], the results of which are easy to interpret; however, it would be important to contrast the interpretability with more sophisticated explainability techniques such as local interpretable model-agnostic explanations (LIME), Shapley additive explanations (SHAP), and others, which are focused on identifying the most important predictors for any type of ML model, including those considered "black box" models.

Figure 3. Analysis of variables in the dataset: (a) gender variable; (b) age variable.

Figure 4. Number of records for each class: (a) before and (b) after data balancing.
Finally, two training datasets were obtained: an imbalanced training dataset made up of 151 records, and a balanced training dataset made up of 250 records. Both datasets were represented by a data matrix of dimension n × 32, in which the observation i can be expressed as $o_i = [o_0, o_1, \ldots, o_{32}] \in \mathbb{R}^{n \times 32}$, where n is the number of observations or records in the dataset.

Figure 5. ML model construction, validation, and evaluation process.

Figure 6. Process used to train the ML models for each of the four scenarios.

Figure 7. Confusion matrix of ML models evaluated with testing dataset: (a) DT model built with imbalanced data; (b) DT model built with balanced data; (c) LR model built with imbalanced data; (d) LR model built with balanced data. The shade of the color represents the quantity of the observations (patients). The bigger the number, the darker the background.

Figure 8. ROC curves of ML models evaluated with testing dataset: (a) AUC of DT models built with balanced and imbalanced data; (b) AUC of LR models built with balanced and imbalanced data. The dashed line represents a non-discriminatory test.

Table 2. Variables discarded from the study.

Table 3. Variables selected for the study.

Table 4. Search space for tuning hyperparameter values.

Table 5. Calibrated hyperparameters for each model found during the training process.

Table 6. Summary of models' performance.
* Models built without any data preprocessing used as a baseline for comparison.

Table 7. Importance of variables in LR model calculated via odds ratio.