Prediction of Recovery from Traumatic Brain Injury with EEG Power Spectrum in Combination of Independent Component Analysis and RUSBoost Model

Abstract: Computational electroencephalography (EEG) has recently garnered significant attention as a means of examining whether quantitative EEG (qEEG) features can be used as new predictors of recovery in moderate traumatic brain injury (TBI). However, recorded brain electrical activity is invariably contaminated with artifacts, which impede the subsequent processing steps. It is therefore crucial to devise a strategy for meticulously flagging artifacts and extracting clean EEG data so that high-quality discriminative features can be retrieved for successful model development. This work proposes the use of the multiple artifact rejection algorithm (MARA), an independent component analysis (ICA)-based algorithm, to eliminate artifacts automatically, and explores its effect on the predictive performance of the random undersampling boosting (RUSBoost) model. Continuous EEG data were acquired using 64 electrodes from 27 moderate TBI patients at four weeks to one year post-accident. MARA provides an ICA-based artifact removal stage prior to classification with RUSBoost, support vector machine (SVM), decision tree (DT), and k-nearest neighbor (k-NN) models. The area under the curve (AUC) of RUSBoost for absolute power spectral density (PSD) was higher in the δ (AUC = 0.75), α (AUC = 0.73), and θ (AUC = 0.71) bands than that of SVM, DT, and k-NN. MARA provided good generalization performance of the RUSBoost prediction model.


Introduction
Traumatic brain injury (TBI) contributes heavily to neurological dysfunction and death in young people (i.e., younger than 45 years old) and children (1-15 years old) worldwide [1][2][3][4]. Most TBI is graded, based on the initial Glasgow Coma Scale (GCS), as mild (GCS score 13-15), while approximately 8-10% is graded as moderate (GCS score 9-12) or severe (GCS score 8 or less) [5,6] when recorded at emergency room admission [7]. The effects of TBI on brain electrical activity, which arise from injury to a number of ionic channels, electrical generators and network dynamics involved in the distribution and coordination of electrical energy, can be readily measured using electroencephalography (EEG). EEG records neuronal activity with non-invasive electrodes fitted on the scalp, allowing the analysis of neuronal activity in five canonical EEG frequency bands: delta δ (<4 Hz), theta θ (4 to 8 Hz), alpha α (8 to 12 Hz), beta β (12 to 30 Hz) and gamma γ (>30 Hz). The electroencephalogram can provide invaluable information regarding instantaneous changes in brain electrical activity, specifically in aiding neuroprognostication of TBI [8][9][10][11].
Numerous research groups have been developing TBI-diagnostic models based on quantitative EEG (qEEG) features (i.e., differentiating between mild TBI (mTBI) and no mTBI) [12][13][14][15][16][17][18], but an effective TBI-prognostic tool (i.e., one that predicts TBI recovery) that gains widespread acceptance has yet to be developed. Quantitative EEG analyses use computationally derived features that highlight specific components of EEG with numerical values [19]. Predictive models are statistical models that incorporate patient data to anticipate outcomes and are more robust than simple clinical judgments [2,20]. There have been numerous reports of prognostic models, but none is widely used. This may be because the validity and usefulness of predictive models in TBI have not been demonstrated with sufficient clarity and certainty to convince clinicians of their potential added value. A systematic review offers possible explanations [20].
Recent advances have shown that EEG is a promising neuroimaging modality for accurate prognostication of patients after moderate to severe brain injuries [8,21,22,23]. Advancements in computational EEG signal processing have significantly improved the reliability and validity of electrophysiological brain measurements. However, the preprocessing of EEG data is tedious and labor-intensive, as the recorded EEG data are usually long and analyzing raw data through visual inspection is time-consuming. Hence, there is a need for an automated system to perform the analysis (i.e., feature extraction, feature selection, and classification).
Several surveys and studies have been conducted, and automatic learning methods (i.e., machine learning (ML)) have proven their effectiveness in recognizing EEG wave patterns [23][24][25][26][27][28][29][30]. A key advantage of ML is its ability to handle multimodal data objectively and to model hidden relationships in complex datasets with heterogeneous distributions using advanced mathematical techniques [31,32]. The learning strategy is based mainly on supervised learning (i.e., the algorithms learn from labeled training data to create a model that can generate predictions for unseen data) and unsupervised learning (i.e., the algorithms analyze and cluster unlabeled data and discover hidden patterns of data clusters without the need for human intervention). More in-depth descriptions of ML and its limitations can be found in references [19,25,32,33,34].
The procedure for building a TBI outcome prediction model from continuous EEG data typically involves preprocessing the raw recordings to mitigate the low signal-to-noise ratio (SNR) in order to obtain a more accurate representation of the primary brain activity. Data confounded with noise or artifacts such as eye blinks, muscular movements, and other instrumentation noise may not correctly represent the underlying brain signals [19]. In the literature, independent component analysis (ICA) has been investigated as a technique of choice for artifact rejection to improve the quality of EEG signals. ICA has been widely used in EEG signal analysis and brain-computer interfaces (BCI) [31][32][33]. Khoshnevis and Sankar [34] confirmed that blind source separation in ICA allows estimation of independent components (ICs) from multiple mixed observations without prior knowledge about brain activity, removing the correlation between the channels [35]. Lee et al. [36] argued that the gold standard for EEG review (i.e., the traditional approach) is manual inspection by experts, but that ICA algorithms (i.e., the automatic approach) could produce EEG with higher signal quality.
With ICA, the recorded signals are assumed to be instantaneous linear mixtures of cerebral and artifactual sources that can be decomposed into ICs. Once the ICs have been extracted from the original signals, the clean signal is reconstructed by discarding the ICs that contain artifacts. Vigário [37] tested the ICA method on simulated and experimental data and found that it performed well in the separation of signals from their linear mixtures and the extraction of eye information from electrooculography (EOG) signals [38]. Romero et al. [39] used ICA to reduce EEG artifacts at various sleep stages and discovered that the bidirectional property of EEG and EOG had little effect on ICA. Noise reduction is therefore an essential step, as it influences the computation of qEEG features (e.g., power spectral density (PSD), coherence or connectivity). If the extracted features do not accurately represent the underlying signals, a classification algorithm employing such features may have difficulty identifying the classes. Methodological differences in artifact removal make it challenging to extract accurate qEEG features and also hamper the assessment of the reproducibility of ML models across restricted datasets.
Recent studies revealed that ML based on qEEG characteristics yields superior performance in classifying the outcomes of TBI patients. The results highlighted that an ML algorithm (i.e., random undersampling boosting decision trees (RUSBoosted Trees)) using qEEG features (i.e., PSD in specific frequency bands (e.g., δ, θ, α and γ)) showed promise in predicting outcomes from a highly imbalanced moderate TBI dataset [40]. The present study extends our prior work [40] by adding an automatic artifact rejection method (i.e., the multiple artifact rejection algorithm (MARA)), an ICA-based algorithm, to the EEG preprocessing steps and exploring its effect on the predictive performance of the RUSBoost prediction model. This paper consists of four main sections: Section 1 introduces the present study, including background and a literature review. Section 2 presents the dataset, the proposed methodology, and its performance evaluation. Section 3 includes the results and discussion. Finally, Section 4 concludes this paper.

Outcome Assessment
Patient recovery was assessed by physicians through telephone calls from four weeks to one year after the accident. The Glasgow Outcome Scale (GOS) was used as the primary outcome measure and was dichotomized as a bad outcome (i.e., GOS score of 1-4) or a good outcome (i.e., GOS score of 5) at approximately 12 months after injury. In this study, an expert (i.e., a neurosurgeon) in our team evaluated the neurological outcomes of the moderate TBI patients based on the GOS score (given in Table 1) corresponding to each patient's level of improvement [41,42].
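The dichotomization rule can be written down in a few lines. The following is a minimal sketch rather than the study's own code; the function name `dichotomize_gos` and the example scores are hypothetical.

```python
def dichotomize_gos(gos_score: int) -> str:
    """Map a Glasgow Outcome Scale score (1-5) to the binary outcome label."""
    if gos_score not in range(1, 6):
        raise ValueError("GOS score must be an integer from 1 to 5")
    # GOS 5 is treated as a good outcome; GOS 1-4 as a bad outcome.
    return "good" if gos_score == 5 else "bad"


# Hypothetical scores for illustration only (not patient data).
example_scores = [5, 3, 5, 4, 1]
labels = [dichotomize_gos(score) for score in example_scores]
```

With this rule the "bad" label becomes the minority (positive) class whenever most patients recover fully.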

TBI Patients
Continuous eyes-closed EEG data were obtained from 27 moderate TBI patients (B1-B27) using 64 electrodes at 64 sites on the scalp, at a sampling rate of 1 kHz. The recordings were collected at the Hospital Universiti Sains Malaysia, Kubang Kerian, Kelantan, Malaysia. Ethical clearance for this study was obtained from the Human Research Ethics Committee, Universiti Sains Malaysia (USM), under permission number USM/JEPeM/1511045. Nonsurgical moderate TBI patients aged 18 to 65 met the inclusion criteria. The first hit involved the left or right hemisphere, which was confirmed by a computerized tomography (CT) scan at the time of diagnosis. Patients younger than 18 years and those with serious scalp or skull deformities, bone fractures, or drug use were excluded. All patients gave informed consent before participating in this study. All of the patients in this study were men who had sustained a TBI due to vehicle accidents. Table 2 provides a detailed description of the characteristics of the moderate TBI patients in this study.

EEG Recordings
The EEG signals were obtained using 64 electrodes to record brain signals from 64 different locations on the scalp. All electrodes were placed following the international 10-10 electrode configuration [16,43]. CPz (i.e., equivalent to Ch-32) was set as an EOG channel for tracking eye movement and blinking artifacts. As a result, only 63 EEG channels were used as input data in our classification model.
The patient ground electrode, which served as the reference electrode, was placed 10% anterior to Fz and connected to the earlobes. A programmable direct current (DC) broadband SynAmps amplifier was used to measure the brain signals, with a gain of up to 2500 and a precision of 0.033/bit over a recording range of 55 millivolts (mV) in the DC to 70 Hz frequency range. 16-bit analog-to-digital (A/D) converters were used to digitize the EEG signals at 1 kHz. During the recording, each patient was seated in a comfortable chair in front of a computer screen in a dimly lit room and advised to remain motionless with eyes closed and task-free (i.e., no tasks or activities performed) for 350 s.
The continuous eyes-closed EEG data were obtained from moderate TBI patients at follow-up visits. The first measurement (i.e., 4-10 weeks post-accident) contributed thirteen moderate TBI EEG recordings, the second measurement (i.e., six months post-accident) contributed eleven, and the third measurement (i.e., one year post-accident) contributed three. Patients were excluded if they failed to participate in the follow-up EEG measurements within the given time frame. The raw, unprocessed EEG data were exported for further analysis. Figure 1 depicts a block diagram of the proposed methodology.

EEG Data Preprocessing
The continuous eyes-closed EEG data were preprocessed in MATLAB R2020a (MathWorks, Natick, MA, USA) using the open-source EEGLAB toolbox version 2019.0 [44,45] with a custom MATLAB script. The details of the preprocessing script are as follows. The 63-channel continuous eyes-closed EEG signal is represented as a vector $x = [x_1, x_2, x_3, x_4, \ldots, x_{61}, x_{62}, x_{63}]$, where $x_m$ denotes the 60 one-second segments from the m-th channel of the EEG signal (m = 1, ..., 63).
The power line noise was eliminated in the first step by applying a notch filter at 50 Hz (the mains frequency in Malaysia) to the signal x with the EEGLAB function pop_eegfiltnew(). Next, ICA was performed, and artifact-related components were rejected according to MARA [46,47] to remove EOG artifacts (e.g., eye blinks, eye movements), electromyogram (muscle) artifacts, and electrocardiograph (EKG) artifactual activity components. MARA is a supervised ML algorithm that automates the hand-labeling of ICA components for artifact removal. For this purpose, MARA employs a classifier (i.e., a linear programming machine) that identifies ICA components that are not derived from brain activity. The MARA classification system is based on a two-class linear classifier that finds a separating hyperplane (P), which is expressed mathematically by Equation (1).
$$P: \mathbf{w}^{\top}\mathbf{v} + b = 0 \tag{1}$$
ICA components are classed as neural or artifact, where w is a weight vector learned from labeled training data samples, v is a feature vector, and b is a bias factor [47,48].
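A linear classifier of the kind described, with weight vector w, feature vector v, and bias b, can be sketched as follows. This is an illustration under stated assumptions, not MARA's implementation: the weights, the feature values, the six-element vector length, and the convention that a positive score means "artifact" are all hypothetical.

```python
def hyperplane_score(w, v, b):
    """Signed score w . v + b of feature vector v relative to the hyperplane."""
    if len(w) != len(v):
        raise ValueError("w and v must have the same length")
    return sum(wi * vi for wi, vi in zip(w, v)) + b


def classify_component(w, v, b):
    # Assumption: positive scores are labeled "artifact", all others "neural".
    return "artifact" if hyperplane_score(w, v, b) > 0 else "neural"


# Hypothetical weights and feature vectors (illustrative values only).
w = [0.8, -0.5, 0.3, 0.1, -0.2, -0.9]
b = -0.1
blink_like = [1.2, 0.1, 0.9, 0.4, 0.2, 0.1]
alpha_like = [0.2, 0.8, 0.1, 0.3, 0.5, 1.5]

blink_label = classify_component(w, blink_like, b)
alpha_label = classify_component(w, alpha_like, b)
```

In a trained classifier, w and b would be learned from expert-labeled components rather than set by hand.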
MARA provides a visualization of each component's scalp map (see Figure 2a,b), its spectrum (see Figure 2c,d), and the current label of the component (i.e., artifact or neural), presenting each component's artifact probability based on six features (i.e., current density norm, range in pattern, mean local skewness, λ, fit error and 8-13 Hz), following the feature selection procedure described in [47]. If the artifact probability is greater than 0.5 (e.g., p-artifact = 0.99, see Figure 2e), the component is considered an artifact, and if the artifact probability is less than 0.5 (e.g., p-artifact = 0.00, see Figure 2f), it is considered a neuronal signal. Features that contribute to a component being marked as an artifact are plotted as red bars, and features that indicate neuronal activity are plotted in blue. These features are invaluable in understanding MARA's decision. In this study, IC17 is a typical eye blink artifact, shown by intense frontal activity and a steep power spectrum; IC32 is a neuronal component showing an alpha peak around 10 Hz, and its scalp map indicates an occipital brain source. ICs can be removed automatically without inspection, but in this study, ICs flagged as artifacts were rejected based on the artifact probability and scalp map computed by MARA to produce a clean signal y. After that, a band-pass filter with cutoff values of 0.1 Hz and 100 Hz was applied to the clean signal y to remove any undesired peaks with extreme signal values. The output signal is denoted as z. In order to cover multiple informative frequency bands (i.e., δ, θ, α, β, and γ), the frequency analysis was limited to the range of 0.1 to 100 Hz, as recommended by McNerney et al. and van den Brink et al. for TBI classification [12,18].
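The 0.5 probability threshold described above can be illustrated with a short sketch. The component numbers and probabilities below are hypothetical, and because the text does not say how a probability of exactly 0.5 is handled, the sketch assumes such a component is kept.

```python
def split_components(artifact_probability, threshold=0.5):
    """Return (rejected, kept) component indices from per-component probabilities."""
    ordered = sorted(artifact_probability.items())
    rejected = [ic for ic, p in ordered if p > threshold]
    # Assumption: a probability exactly at the threshold counts as neural.
    kept = [ic for ic, p in ordered if p <= threshold]
    return rejected, kept


# Hypothetical artifact probabilities keyed by component number.
probabilities = {17: 0.99, 32: 0.00, 5: 0.62, 8: 0.31}
rejected, kept = split_components(probabilities)
```

The clean signal would then be reconstructed from the kept components only; that back-projection step is outside the scope of this sketch.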

Data Preparation
The first 60 s of each recording were discarded due to artifact contamination. The next 60 s of EEG data were then split into 60 fragments of one second each. The fragmentation began at 60,001 milliseconds because the first 60 s had been rejected. The input signal z was arranged as a matrix (i.e., EEG channel amplitude × time) following the default arrangement of the 64-channel WaveGuard EEG cap. The signal z in each segment is an $M \times F_s$ matrix, where M is the number of EEG channels (i.e., M = 63 channels) and $F_s$ is the sampling rate (i.e., 1 kHz). Therefore, there were 1620 segments of data (i.e., 27 recordings × 60 segments/recording).
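The segmentation arithmetic can be checked with a small sketch. The helper names are hypothetical, and indices are zero-based and half-open, so the first retained sample (index 60,000) is the 60,001st sample of the recording.

```python
FS_HZ = 1000      # sampling rate stated in the text (1 kHz)
DISCARD_S = 60    # first 60 s rejected
KEEP_S = 60       # next 60 s retained
EPOCH_S = 1       # one-second fragments


def epoch_bounds(fs=FS_HZ, discard_s=DISCARD_S, keep_s=KEEP_S, epoch_s=EPOCH_S):
    """Return zero-based, half-open (start, stop) sample indices for each epoch."""
    first = discard_s * fs
    step = epoch_s * fs
    n_epochs = keep_s // epoch_s
    return [(first + k * step, first + (k + 1) * step) for k in range(n_epochs)]


def segment(channel_samples, bounds):
    """Slice one channel (a sequence of samples) into epochs."""
    return [channel_samples[start:stop] for start, stop in bounds]


bounds = epoch_bounds()
total_segments = 27 * len(bounds)  # 27 recordings x 60 segments each
```

Applying `segment` to each of the 63 channels yields, per epoch, a 63 × 1000 block of samples.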

EEG Feature Extraction
Feature extraction is a critical stage in any EEG analysis that identifies common feature representations among EEG samples. The absolute band powers were determined by integrating the PSD within each frequency band: δ (0.5-4 Hz), θ (4-7 Hz), α (7-13 Hz), β (13-30 Hz) and γ (30-100 Hz). The Fourier transform (FT) $\hat{z}(\omega)$ of the input signal $z(t)$ over an interval $[0, T]$ is calculated with Equation (2) to assess the frequency content of the signal.
$$\hat{z}(\omega) = \frac{1}{\sqrt{T}} \int_{0}^{T} z(t)\, e^{-i\omega t}\, dt \tag{2}$$
The average PSD is subsequently computed for each EEG frequency band of each channel. Equation (3) represents the generic PSD, $P_{zz}(\omega)$, of a given signal $z(t)$, where $E$ is the expected value and $T$ is the time range for the PSD.
$$P_{zz}(\omega) = \lim_{T \to \infty} E\left[\, \left|\hat{z}(\omega)\right|^{2} \right] \tag{3}$$
PSD is a frequently extracted feature that quantifies the power contained in a signal as a function of frequency. It is equivalent to the FT of the signal's autocorrelation function [19,49]. Each EEG sub-band therefore had 63 average PSD values (one per channel) in each segment. For each frequency band, the 63 average PSD values were concatenated to generate a feature vector (e.g., $P_{\beta} = [P_{\beta 1}, P_{\beta 2}, \ldots, P_{\beta 63}]$).
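To make the band-power computation concrete, the sketch below estimates a one-sided PSD for a single segment with a plain periodogram and integrates it over each band. This is a simplified stand-in and not necessarily the estimator used in the study; the function names, the synthetic 10 Hz test signal, and the reduced 200 Hz sampling rate (chosen only to keep the pure-Python transform fast) are assumptions.

```python
import cmath
import math

BANDS_HZ = {"delta": (0.5, 4), "theta": (4, 7), "alpha": (7, 13),
            "beta": (13, 30), "gamma": (30, 100)}


def periodogram(x, fs):
    """One-sided periodogram PSD estimate (power per Hz) of a real signal."""
    n = len(x)
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        xk = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        p = abs(xk) ** 2 / (fs * n)
        # Double the interior bins to fold in the negative frequencies.
        if 0 < k < n / 2:
            p *= 2
        freqs.append(k * fs / n)
        psd.append(p)
    return freqs, psd


def band_power(freqs, psd, lo, hi):
    """Approximate the integral of the PSD over [lo, hi) by a rectangle sum."""
    df = freqs[1] - freqs[0]
    return sum(p * df for f, p in zip(freqs, psd) if lo <= f < hi)


fs = 200  # hypothetical reduced rate; the recordings in the text use 1 kHz
one_second = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
freqs, psd = periodogram(one_second, fs)
powers = {name: band_power(freqs, psd, lo, hi)
          for name, (lo, hi) in BANDS_HZ.items()}
```

For the unit-amplitude 10 Hz sine, nearly all power falls in the α band and is close to 0.5, the mean-square value of the sine. Repeating the calculation for each of the 63 channels and collecting the values for one band gives a feature vector of the form shown above.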

RUSBoost Prediction Model
RUSBoost is a hybrid sampling-boosting model that balances the class distribution by removing instances from the majority class [50,51]. Since the dataset distribution was heavily skewed, a robust classification algorithm that works well with such a dataset was required. RUSBoost is a suitable classifier for predicting moderate TBI outcomes because the epochs reflecting poor and good outcomes are not uniformly distributed in either the training or the testing datasets. A skewed dataset slows the classifier's learning of the poor outcome, as most data correspond to the good outcome. As a result of this imbalance, the model's predictions are biased towards the majority class, decreasing the model's overall performance [52].
The majority class is treated as the negative class and constitutes most of the dataset. As the number of samples in the negative class increases, learning becomes more difficult, since ML classifiers trained on such imbalanced datasets may disregard positive class (i.e., minority class) samples as noise or outliers. The predictive model leans towards better accuracy on the majority class (i.e., good) while performing poorly on the minority class (i.e., poor). Examples of the minority class are therefore misclassified at a higher rate than examples of the other class. The steps for RUSBoost implementation are described in Algorithm 1 [50].
The present study implemented the RUSBoost algorithm with a decision tree (DT) as the weak learner. In RUSBoost, the resampling method (i.e., undersampling) handles the imbalanced dataset problem by altering the sizes of the minority and majority classes to provide a balanced distribution in the training dataset. Boosting uses random samples of the data to create each tree, and each sample is balanced because the algorithm undersamples the majority class to match the size of the minority class. Because only a small proportion of the epochs represented poor outcomes (18.52%), the final models used 30 weak learners (i.e., trees) with 20 splits and a learning rate of 0.1, merged into an ensemble predictor using the MATLAB function fitensemble to build the RUSBoost algorithm for predicting recovery in moderate TBI.

Algorithm 1.
3: Create temporary training dataset $R_t$ with distribution $W_t$ using random undersampling
4: Train a WeakLearn, providing it with samples $R_t$ and their weights $W_t$
5: As a result, get a hypothesis $h_t : X \times Y \to [0, 1]$
6: Compute the pseudo-loss for $R$ and $W_t$
7: Compute the weight update parameter
8: Update $W_t(i)$ for each sample
9: Normalize the weights $W_{t+1}$

To evaluate the performance of the proposed recovery prediction system, we used k-fold cross-validation. The dataset was divided randomly into k equal-sized partitions. At each fold, k-1 partitions were used for training, and the remaining partition was used for testing. This process was repeated k times (i.e., k = 5). The results of the five folds were averaged and reported as the system performance.
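Two ingredients of this procedure, the random undersampling step and the k-fold partitioning, can be sketched in isolation. The code below is not a full RUSBoost implementation (it omits the weak learner and the weight updates), and the function names, the random seed, and the 5-versus-22 label split (chosen to match an 18.52% minority share) are assumptions for illustration.

```python
import random


def random_undersample(labels, rng):
    """Return indices with every class randomly reduced to the minority size."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    chosen = []
    for idx in by_class.values():
        chosen.extend(rng.sample(idx, n_min))
    return sorted(chosen)


def k_fold_indices(n_samples, k, rng):
    """Shuffle indices and yield (train, test) index lists for each of k folds."""
    order = list(range(n_samples))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


rng = random.Random(0)
labels = ["poor"] * 5 + ["good"] * 22  # hypothetical imbalanced labels
folds = list(k_fold_indices(len(labels), 5, rng))
train_idx, test_idx = folds[0]
train_labels = [labels[i] for i in train_idx]
balanced = random_undersample(train_labels, rng)  # positions within train_idx
```

Undersampling is applied to the training portion only, so each test fold keeps the original class proportions and the reported metrics reflect the imbalanced setting.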

Evaluation in Imbalanced Dataset
Instances in a binary classification task can be labeled as either positive or negative. In imbalanced binary datasets, the minority class is usually regarded as the positive class, while the majority class is usually considered negative. This research classified the poor and good outcomes as positive and negative cases, respectively, resulting in the confusion matrix shown in Table 3.

Table 3. Confusion matrix of binary classification problem.
Ground Truth | Prediction: Positive | Prediction: Negative
Positive | TP | FN
Negative | FP | TN

True positive (TP) and true negative (TN) are correct predictions, whereas false negative (FN) and false positive (FP) are wrong predictions. Analyzing the four entries in the confusion matrix alone does not suffice to assess the classifier's performance. The following performance measures were used in this study:

• The sensitivity, often referred to as the True Positive Rate, $TP_{rate}$, is expressed as:
$$TP_{rate} = \frac{TP}{TP + FN} \tag{4}$$
$TP_{rate}$ indicates the ability of a classifier to identify the positive class correctly. It ranges from 0 to 1, with 1 being the perfect score.
• The specificity, alternatively referred to as the True Negative Rate, $TN_{rate}$, is determined as:
$$TN_{rate} = \frac{TN}{TN + FP} \tag{5}$$
$TN_{rate}$ denotes the ability of a classifier to identify the negative class correctly. The perfect score is 1, and 0 is the worst.
• The G-Mean (geometric mean) is defined as:
$$\text{G-Mean} = \sqrt{TP_{rate} \times TN_{rate}} \tag{6}$$
The G-Mean, introduced by [53], quantifies the ability of a classifier to balance classification accuracy between the positive and negative classes. Because it combines $TP_{rate}$ and $TN_{rate}$, a low G-Mean score indicates a classifier that is heavily biased toward one class, and vice versa.
• The F1 score describes the trade-off between precision (TP/(TP + FP)) and recall (TP/(TP + FN)) in the positive class. This well-known metric is well suited to skewed dataset problems and is determined as:
$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{7}$$
It is a numeric value between 0 and 1, with 1 representing the perfect value.
• The area under the curve (AUC) is a popular overall measure of model performance, especially for rating binary classifiers in the presence of class imbalance. The AUC is the area under the receiver operating characteristic (ROC) curve. To generate the ROC curve, we plotted the $TP_{rate}$ against the false positive rate, $FP_{rate}$, which is calculated as follows:
$$FP_{rate} = \frac{FP}{FP + TN} \tag{8}$$
Higher AUC values imply a better ROC curve and thus better performance.
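The measures above follow directly from the four confusion-matrix counts, with the G-Mean taken as the geometric mean (the square root of the product of the two rates). The sketch below is illustrative: the function name is hypothetical, the counts are made-up values chosen so that the rates are 80.0% and about 63.6%, and nonzero denominators are assumed.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Compute rate-based metrics from the four confusion-matrix counts."""
    tp_rate = tp / (tp + fn)    # sensitivity, also the recall
    tn_rate = tn / (tn + fp)    # specificity
    fp_rate = fp / (fp + tn)    # equals 1 - specificity
    precision = tp / (tp + fp)
    g_mean = math.sqrt(tp_rate * tn_rate)
    f1 = 2 * precision * tp_rate / (precision + tp_rate)
    return {"tp_rate": tp_rate, "tn_rate": tn_rate, "fp_rate": fp_rate,
            "precision": precision, "g_mean": g_mean, "f1": f1}


# Hypothetical counts: 5 positive (poor) and 22 negative (good) cases.
metrics = imbalance_metrics(tp=4, fn=1, fp=8, tn=14)
```

With these counts the G-Mean is roughly 0.71 while the F1 score is noticeably lower, because F1 also penalizes the eight false positives through the precision term.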

Results
The AUC, TP rate, and TN rate of the evaluated RUSBoost prediction model on the moderate TBI data are shown in Table 4. The ROC curve for each frequency band computed from the absolute PSD is shown in Figures 3 and 4. In most cases, the AUC value ranges from 0.5 to 1, where 0.5 means the algorithm performs no better than flipping a coin. The AUC values were low for the absolute PSD of the β (AUC = 0.51; see Figure 4b) and γ (AUC = 0.54; see Figure 4c) bands, indicating that the TP rate and TN rate of these frequency bands were low (β: TP rate = 40.0%, TN rate = 54.5%, see Figure 4e; γ: TP rate = 80.0%, TN rate = 54.5%, see Figure 4f). The absolute PSD in the δ band achieved the best prediction performance, with AUC = 0.75 (see Figure 3a), demonstrating that our proposed RUSBoost prediction model is well suited to our imbalanced dataset distribution. The absolute PSD in the θ band had AUC = 0.71 (see Figure 3b), and that in the α band had AUC = 0.73 (see Figure 4a). The TP rate and TN rate of the absolute PSD in the δ (TP rate = 80.0%, TN rate = 63.6%) and α (TP rate = 80.0%, TN rate = 59.1%) bands were high (see the confusion matrices in Figures 3c and 4d), indicating good prediction performance in discriminating the TBI outcomes. The results suggest that recovery prediction based on the RUSBoost algorithm efficiently distinguished between patients with poor and good outcomes, with higher AUC values for the absolute PSD of the δ, α, and θ bands. The absolute PSD in the δ, α, and θ bands provided the greatest predictive value and were the best predictors of the recovery outcomes of moderate TBI. The G-Mean is the most acceptable metric to replace the accuracy rate when the class distribution is uneven [54,55].
The accuracy rate is a traditional performance indicator for a predictive model with a perfectly balanced class distribution [56]. Based on the results, we found the G-Mean most suited for balancing the poor and good outcome classes in terms of total classification accuracy (see Table 4). More specifically, the RUSBoost prediction model achieved the maximum G-Mean (%) and F1 scores in the δ and θ bands (G-Mean (%) = 71.33), while the G-Mean (%) and F1 scores in the α band (G-Mean (%) = 68.76) were at an above-average level. The best G-Mean reflects balanced prediction performance on both the positive (i.e., bad outcomes) and negative (i.e., good outcomes) classes.
The F1 scores for the δ and θ bands were good, indicating that the model accurately predicts the positive class (TP) in these bands. Finally, the absolute PSD in the β band (G-Mean (%) = 46.69) showed poor performance in predicting the positive and negative classes. The higher AUC values for absolute PSD, together with G-Mean values above 68.7%, suggest the suitability of RUSBoost for predicting moderate TBI outcomes.
In addition, to address the imbalanced dataset problem, the present study provides a performance comparison with three other ML classifiers (i.e., support vector machine (SVM), DT, and k-nearest neighbor (k-NN)) for predicting the outcomes of moderate TBI. The results confirmed the superiority of RUSBoost over the other algorithms in AUC value and balanced classification performance (i.e., G-Mean (%)). The standard classifiers (i.e., SVM, DT, and k-NN) were biased toward a single class, either positive or negative (see Table 5 (DT), Table 6 (SVM) and Table 7 (k-NN) for the prediction performance comparison). In summary, the results showed that the RUSBoost prediction model was the most suitable for predicting recovery of moderate TBI patients. The AUC values for absolute PSD were low for the k-NN, DT, and SVM models, reflecting the low TP rate, TN rate, G-Mean (%), and F1 scores of these models in all frequency bands. The AUC value for the absolute PSD in the β band of the DT algorithm was higher than those of RUSBoost, SVM, and k-NN. However, this does not indicate superior prediction performance, because the classifier was biased towards the negative class (i.e., good outcomes).

Discussion
In the previous section, we classified the outcomes of moderate TBI using DT, SVM, k-NN, and RUSBoost. The recovery models were computed based on the absolute PSD in five sub-bands to identify the most successful qEEG features for predicting TBI outcomes. We found that the RUSBoost prediction model generally outperforms DT, SVM, and k-NN, which consider only the overall accuracy rather than the distribution across the classes. This may be explained by the fact that a classification algorithm that tries to maximize accuracy can reach an accuracy of 99% simply by correctly classifying all samples from the larger class while misclassifying the one sample of the smaller class. As illustrated in Figure 5 and by the AUC values in Tables 5-7, the ensemble of decision trees (i.e., RUSBoost) outperformed the individual classifiers. RUSBoost adopts a hybrid approach derived from the AdaBoost (i.e., adaptive boosting) algorithm [57], combining sampling and boosting to achieve higher performance on datasets with a class imbalance problem [51,58]. The dataset used in this study is imbalanced, which is why the RUSBoost classifier achieved the highest prediction performance. The learning strategy of RUSBoost improved the prediction of the poor outcomes at the cost of a slight decrease for the good outcomes class. The undersampling strategy, which balances the class distribution in the dataset, is highly beneficial in learning from skewed training data [40,55]. In sleep spindle detection [59], RUSBoost enabled automatic detection with an F-measure of 0.70 and a sensitivity of 76.9% without requiring threshold calibration. There, RUSBoost used majority voting of weak classifiers to discriminate spindles using features extracted from the EEG (i.e., the synchrosqueezed wavelet transform (SST)).
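The accuracy argument made in this paragraph can be checked with a small worked example. The labels below are hypothetical: 99 majority-class samples, one minority-class sample, and a classifier that always predicts the majority class.

```python
def accuracy_and_tp_rate(y_true, y_pred, positive):
    """Return overall accuracy and the true positive rate for the given class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    positive_preds = [p for t, p in zip(y_true, y_pred) if t == positive]
    tp = sum(p == positive for p in positive_preds)
    return correct / len(y_true), tp / len(positive_preds)


# Hypothetical data: 99 good outcomes, 1 poor outcome.
y_true = ["good"] * 99 + ["poor"]
y_pred = ["good"] * 100  # a classifier that always predicts the majority class
accuracy, tp_rate = accuracy_and_tp_rate(y_true, y_pred, positive="poor")
```

Here the accuracy is 0.99 even though the true positive rate, and therefore the G-Mean, is zero, which is why accuracy alone is a misleading criterion for imbalanced data.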
In comparison to non-sampling techniques [12], recent findings imply that the hybrid strategy of resampling (i.e., undersampling) and boosting can significantly improve model performance [34,60,61,62]. The ensemble DT technique is more flexible and less prone to overfitting (i.e., has a high bias but low variance), demonstrating the generalization power of RUSBoost in predicting outcomes. The present results support the previously reported finding that a predictive model using ensemble DT and resampling is better suited to predicting TBI outcomes than an individual algorithm (i.e., DT, SVM and k-NN) [40].
In this work, the automated artifact rejection method MARA was applied to our continuous EEG data from moderate TBI patients to separate the sources contributing to the scalp EEG [47]. Artifacts in EEG signals can make interpretation difficult and lead to incorrect analytical judgments. Numerous algorithms and preprocessing pipelines have been developed to address the problem of artifact rejection in electrophysiological data [32,49,63,64,65,66,67]. Each of these algorithms has its own strengths and focuses on a different aspect of artifact rejection. Recently, Pedroni et al. [68] suggested that a preprocessing pipeline that detects defective channels, combined with MARA, an ICA-based artifact rejection method, effectively removes a large proportion of artifacts.
MARA was initially designed to distinguish ICA components originating from brain and non-brain sources and to reject artifactual ICA components in EEG data [46]. The outcome prediction algorithm developed here builds on an automated EEG artifact rejection method (i.e., MARA) that proved efficient in identifying artifacts through intelligent IC selection and provided good generalization performance of the RUSBoost prediction model. The model obtained its highest prediction performance in the δ band (i.e., AUC = 0.75, G-Mean (%) = 71.33) compared to the other classifiers. Our results support the claims made in the literature [33,48,68,69] that an automated ICA-based artifact preprocessing pipeline offers substantial benefits by increasing consistency and efficiency in classifying ICs as artifacts or non-artifacts. In contrast, Alam et al. [49] found that the choice of artifact removal method had distinct effects on the PSD calculation. However, the recent findings of Noor et al. [40] showed that artifact rejection by automatic continuous rejection with expert confirmation provided promising results in predicting TBI outcomes in specific frequency bands. Although automated artifact rejection can help separate neural from non-neural source components in the ICA decomposition, a subjective method (i.e., visual classification by experts) is still typically advisable [33,70].
Designing a clinically useful predictive model is difficult because of the high complexity of EEG measurements. Preprocessing and feature extraction must be done carefully so that high-quality discriminative features can be retrieved and high model performance attained, because poor feature extraction directly degrades classification accuracy. Feature selection and suitable ML algorithms are therefore required to identify the significant qEEG predictors. The great majority of the studies included in the reviews [23,28] found that identifying the most informative qEEG features characterizing the recovery outcome level is crucial for early, targeted prediction after TBI. More recently, Noor et al. [40] demonstrated that PSD features in the δ, θ, α, and γ bands showed promising results in predicting the recovery outcome of moderate TBI patients. A key strength is that the frequency distribution of neuronal activity provides information on the patient's level of arousal, restful alertness, and general capacity for focused mental activity [8]. In support of this, specific qEEG features have been confirmed as invaluable predictors of recovery in TBI that can complement demographic and clinical information [4,9,71,72].
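As an illustration of what an absolute band-power feature is, the sketch below estimates a one-sided PSD with a plain periodogram and sums it over the canonical EEG bands. This is not the pipeline used in the study: the periodogram stands in for whatever spectral estimator was actually applied, and the 0.5 Hz lower δ edge, the 45 Hz upper γ edge, the sampling rate, and the synthetic signal are all assumptions made for this example.

```python
import cmath
import math

# Band edges in Hz. The 0.5 Hz and 45 Hz outer limits are assumptions for this sketch.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 30.0),
    "gamma": (30.0, 45.0),
}

def periodogram(signal, fs):
    """One-sided periodogram PSD estimate (power per Hz) using a direct DFT."""
    n = len(signal)
    mean = sum(signal) / n
    x = [v - mean for v in signal]  # remove the DC offset
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        p = abs(coeff) ** 2 / (fs * n)
        if 0 < k < n / 2:  # fold negative frequencies into the one-sided estimate
            p *= 2
        freqs.append(k * fs / n)
        psd.append(p)
    return freqs, psd

def band_powers(signal, fs):
    """Absolute power per band: PSD summed over the band, times the bin width."""
    freqs, psd = periodogram(signal, fs)
    df = fs / len(signal)
    return {
        name: sum(p for f, p in zip(freqs, psd) if lo <= f < hi) * df
        for name, (lo, hi) in BANDS.items()
    }

# Synthetic single-channel signal: a 10 Hz (alpha) rhythm plus a weaker 2 Hz (delta) one.
fs = 128  # hypothetical sampling rate in Hz
signal = [
    math.sin(2 * math.pi * 10 * t / fs) + 0.3 * math.sin(2 * math.pi * 2 * t / fs)
    for t in range(256)
]
powers = band_powers(signal, fs)
print(max(powers, key=powers.get))
```

In practice, an averaged estimator such as Welch's method is usually preferred over a raw periodogram because it reduces the variance of the PSD estimate on noisy recordings.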
Several studies have suggested that spectral EEG features may predict the level of consciousness in patients suffering from disorders of consciousness (DOC) following severe TBI. In comparative studies, a significant reduction in the amplitude of α and β oscillations was observed among patients with DOC, whereas there was a concurrent improvement in θ and δ amplitude for fully conscious participants [73][74][75]. The systematic review by Pauli et al. [22] examined a number of studies exploring continuous EEG as a prognostic measure in DOC following TBI. Resting-state EEG (i.e., continuous EEG) is particularly promising within the 12 months post-injury. Several studies have shown that α power and variability are significant for modeling functional outcomes during this period [72,76,77].
In previous studies, α and δ power were extracted from the EEG and used as training features for random forest (RF) [9], generalized linear model (GLM) [11], and linear regression (LR) [10,72] models to predict outcomes in severe TBI patients. These models have a lower classification performance than our proposed method (i.e., RUSBoost) [40]. However, we believe this is not a fair basis for comparing methods, because their algorithms were mainly suited to balanced TBI data distributions. The similarities in our findings suggest that building prediction models on computational EEG approaches (i.e., ML and qEEG features) allows researchers to identify the most explanatory predictors for reliable TBI outcome prediction. In support of this, ref. [74] found that spectral density in a specific frequency band is strongly associated with severe TBI outcomes. The results highlighted that spectral density at different frequency bands has the utmost predictive value, especially two to three months after injury [9-11,72]. Overall, this work demonstrates that the association between PSD features in a specific frequency band and the clinical outcomes (i.e., GOS scores) is robust enough to develop a reliable TBI outcome prediction model with multiple combinations of qEEG features and different ML approaches.
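The point about balanced versus imbalanced data can be made concrete with the random undersampling step that gives RUSBoost its name. The sketch below shows only that step, under the simplifying assumption that all classes are reduced to the minority-class count; in RUSBoost the resampling is repeated inside each boosting round and the target class ratio is a tunable parameter. The function name and the toy class sizes are hypothetical and do not reflect the study's data.

```python
import random
from collections import Counter

def random_undersample(X, y, seed=0):
    """Randomly drop samples so that every class matches the minority-class count."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n_min))  # sample without replacement
    keep.sort()  # preserve the original sample order
    return [X[i] for i in keep], [y[i] for i in keep]

# Hypothetical toy data: 10 samples of class 0 and 4 samples of class 1.
X = [[float(i)] for i in range(14)]
y = [0] * 10 + [1] * 4

X_bal, y_bal = random_undersample(X, y, seed=42)
print(Counter(y_bal))
```

A weak learner trained on each balanced subset no longer sees the majority class dominate the loss, and boosting over many differently sampled subsets limits the information discarded by any single draw.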

Conclusions
In conclusion, this study presented a RUSBoost outcome prediction model that integrates the ICA-based MARA into the preprocessing of resting-state eyes-closed EEG from moderate TBI patients to eliminate artifacts automatically. The prediction performance reported in this paper is higher than that of previous studies [8-10,66] in predicting TBI outcomes; however, it is lower than that of our prior work [40]. Robust model performance requires rigorous EEG preprocessing and feature extraction to ensure the retrieval of high-quality discriminative features. In addition, distortions in EEG signals can significantly diminish the trustworthiness of clinical decisions based on those signals. We believe that a moderate TBI outcome prediction model based on MARA, which automatically tracks and eliminates artifactual ICA components, is effective in identifying artifacts through intelligent IC selection and thus supports a good generalization performance of the RUSBoost prediction model. The robustness of the RUSBoost algorithm (i.e., an ensemble of DTs) compensated for the inadequacies of single classifiers (i.e., DT, SVM, and k-NN) in classifying the outcomes, even with small samples and a minimal set of qEEG features (i.e., PSD). Future research could involve predictive modeling with additional parameters (e.g., coherence, connectivity, relative power, spectral asymmetry) to relate distinct qEEG properties to moderate TBI outcomes.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study. Written informed consent has been obtained from the patient(s) to publish this paper.