Ensemble Wavelet Decomposition-Based Detection of Mental States Using Electroencephalography Signals

Technological advancements in healthcare, production, automobile, and aviation industries have shifted working styles from manual to automatic. This automation requires smart, intellectual, and safe machinery to develop an accurate and efficient brain–computer interface (BCI) system. However, developing such BCI systems requires effective processing and analysis of human physiology. Electroencephalography (EEG) is one such technique that provides a low-cost, portable, non-invasive, and safe solution for BCI systems. However, the non-stationary and nonlinear nature of EEG signals makes it difficult for experts to perform accurate subjective analyses. Hence, there is an urgent need for the development of automatic mental state detection. This paper presents the classification of three mental states using an ensemble of the tunable Q wavelet transform, the multilevel discrete wavelet transform, and the flexible analytic wavelet transform. Various features are extracted from the subbands of EEG signals during focused, unfocused, and drowsy states. Separate and fused features from ensemble decomposition are classified using an optimized ensemble classifier. Our analysis shows that the fusion of features results in a dimensionality reduction. The proposed model obtained the highest accuracies of 92.45% and 97.8% with ten-fold cross-validation and the iterative majority voting technique. The proposed method is suitable for real-time mental state detection to improve BCI systems.


Introduction
Recent technological developments have changed the roles of humans in safety-critical and complex areas, such as autonomous driving vehicles, aviation, healthcare systems, and industry, from manual to autonomous control systems [1]. However, humans are still involved in several tasks, and, at the same time, the growing sophistication of these processes makes human intervention and control difficult. Therefore, there is an urgent need for the development of more accurate and automated systems. An analysis of an individual's cognitive, emotional, and psychological states can provide a solution using brain-computer interface (BCI) technologies [2]. Such information measures the mental states of the users to make these environments safer for human-machine interfaces. The brain's physiological activities have been studied by electroencephalograms (EEGs) [3], functional magnetic resonance imaging (fMRI) [4], functional near-infrared spectroscopy (fNIRS) [5], magnetoencephalograms (MEGs) [6], and other forms of biosignals, such as electrooculograms (EOGs) [7], electrocardiograms (ECGs) [7,8], and galvanic skin responses (GSRs) [9], to detect various conditions [10,11]. From the perspective of day-to-day mental activity measurement, issues related to size, weight, expense, power consumption, and radioactivity restrict the usage of MEGs and fMRI [12]. EOG, ECG, and GSR signals provide some degree of correlation with mental states (mental fatigue, drowsiness, and stress) [10]. However, such techniques have demonstrated success only in combination with neuro-imaging methods linked to the central nervous system [10]. As a result, fNIRS and EEG signals have proved the most appropriate choices for BCI systems [10]. EEG signals are favored over fNIRS signals, as they offer higher sensitivity to variations in brain activities and higher temporal resolution [10]. Moreover, researchers have widely used EEG signals to study emotions, cognitive load, fear of missing out, drowsiness, and schizophrenia, due to their low-cost, portable, and non-invasive properties [13-18].
Recently, many studies have been presented for detecting mental states using EEG signals. The mental states of "workload", "fatigue", and "situational awareness" have been studied by examining the correlation between mental workload and EEG signals in different conditions, such as in airplane pilots and car drivers [11]. Myrden et al. presented an EEG-BCI model to predict the mental states of frustration, fatigue, and attention. Different features extracted using the fast Fourier transform (FFT) have been classified with linear discriminant analysis (LDA), support vector machines (SVMs), and naive Bayes classifiers [19]. Li et al. recognized reading silently, a comprehension task, a mental arithmetic task, and a question-answering task based on the Self-Assessment Manikin (SAM) model [20]. Nuamah et al. classified five tasks (baseline, visual counting, geometric figure rotation, letter composition, and multiplication) using the short-time Fourier transform (STFT) to extract different features, which were classified using an SVM classifier [21]. Liu et al. presented a frequency domain analysis of features using the FFT in combination with SVM to detect attentive and inattentive mental states of students [22]. Ket et al. classified attention, no attention, and rest states using sample entropy and linear features with an SVM classifier [23].
Wang et al. studied the ability to focus attention during mathematical problem solving and lane-keeping driving tasks. The central, parietal, frontal, occipital, right-motor, and left-motor power spectra computed using filtering and independent component analysis (ICA) were classified with an SVM classifier [24]. Djamal et al. evaluated features from raw EEG signals and wavelet decomposition to recognize attention and inattention activities [25]. Arico et al. used stepwise linear discriminant analysis and the statistical test of analysis of variance (ANOVA) to detect easy, medium, and hard mental assessments [12]. Hamadicharef et al. developed an attention and non-attention state classification model using a combination of filter banks, common spatial patterns, and a Fisher linear discriminant classifier [26]. Mardi et al. used log energy and Higuchi's and Petrosian's fractal dimensions to extract chaotic features for detecting alertness and drowsiness states [27]. Richer et al. evaluated the band power of frequency bands. They computed histograms of naive and entropy-based scores using the P2 algorithm and classified them with binary classifiers [28]. Aci et al. used STFT-based features to detect focused (F), unfocused (UF), and drowsiness (D) mental states [29]. Zhang et al. used six convolutional networks and one output layered deep neural network to predict F, UF, and D states [30]. Islam et al. explored multivariate empirical mode decomposition (MEMD) and the discrete wavelet transform (DWT) to detect working and relaxed states. The nonlinear features extracted from intrinsic mode functions and subbands (SBs) have been classified with an ensemble classifier [31]. Tiwari et al. used rhythm-level analysis based on filtering and the FFT. The SVM, k-nearest neighbor (KNN), and random forest classifiers have been used to detect high- and low-level attention [32]. Samima and Sarma used an analysis of rhythms based on filtering and artificial neural network (ANN) classifiers for mental workload level assessments [33]. Mohdiwale et al. used a DWT-based rhythm analysis with teaching-learning-based optimization for detecting cognitive work assessments [34]. Easttom and Alsmadi presented a comparative analysis of EMD and variational mode decomposition to extract nonlinear entropy and Higuchi features for mental state detection [35]. Khare et al. used wavelet-based analysis using only the rational dilation wavelet transform (RDWT) to extract five statistical and nonlinear features and classified them using an ensemble classifier to detect various mental states [36]. Kumar et al. analyzed EEG rhythms using the discrete Fourier transform and power spectral density (PSD) to detect mental states using the KNN classifier [37]. Rastogi and Bhateja explored the elimination of artifacts and noise in mental state EEG signals using a stationary wavelet transform (SWT)-enhanced fixed-point fast ICA technique [38].
The methods in the literature used traditional feature extraction from raw EEG signals, statistical analysis, filtering techniques, frequency-based transforms such as the FFT or STFT, rhythm-based analysis, and wavelet-based decomposition. However, direct feature extraction exhibits a decreased performance [15], frequency-based transforms result in a time-frequency trade-off [15], filtering and rhythmic analyses require choosing filter coefficients [15], and wavelet-based methods require the selection of a mother wavelet [15]. The experimental and empirical selection of parameters can cause information loss and performance degradation due to misclassification [15]. Thus, to overcome these shortcomings, we propose an ensemble-based analysis using advanced decomposition techniques, including the tunable Q wavelet transform (TQWT), the multilevel DWT (MDWT), and the flexible analytic wavelet transform (FAWT). The classification of individual and fused features for the automated detection of three mental states (F, UF, and D) is accomplished with an optimizable ensemble technique. The major contributions of the proposed work are listed below:

• Analysis of ensemble decomposition techniques using multi-wavelet decomposition.
• Statistical analysis to reduce the feature dimensions of multi-wavelet feature analysis for mental state detection.
• Analysis of feature fusion to detect the best combination of features.
• Exploring an optimized ensemble classifier to determine the optimum hyper-parameter selection.
The remainder of the paper is organized as follows: Section 2 explains the methodology. The results are presented in Section 3. The discussion and conclusions are presented in Sections 4 and 5, respectively.

Methodology
The proposed methodology comprises several steps, such as EEG dataset pre-processing, signal analysis using ensemble decomposition, feature extraction, and classification. The flowchart of the method is shown in Figure 1.

Dataset and Preprocessing
EEG signals of mental states from Kaggle, a public dataset repository, were used [29,39]. The EEG recordings from five subjects originally consisted of a total of 25 h of recording. The participants performed train control on the "Amtrak-Philadelphia" route using the Acela-express simulator. The subjects were instructed to maintain the locomotive speed at 40 mph in every experiment. Each subject controlled the train for 35 to 55 min. The subjects performed seven experiments each, performing at most one experiment per day. The focused state was captured while the participants paid attention to simulator control during the first 10 min of the experiment. The participants became unfocused and stopped paying attention during the second 10 min, exhibiting an unfocused state. Finally, the participants closed their eyes, relaxed freely, and dozed off during the next 10 min to capture the drowsy state. The EEG data were recorded in accordance with international 10-20 standards using an EPOC EEG system. A voltage resolution of 0.51 µV, a sampling frequency of 128 Hz, and a bandwidth between 0.2 and 43 Hz were chosen for data acquisition and pre-processing. The 10 min segment of each class was stratified into 30 s non-overlapping EEG segments of 3840 samples each (30 s × 128 Hz). Each class consists of a total of 680 EEG segments. The dataset details are available in [29,30,39].
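The segmentation step described above can be illustrated with a minimal sketch. It assumes a single channel stored as a plain Python list; the function name `split_into_segments` and the synthetic all-zero block are hypothetical and are not part of the dataset or its tooling.

```python
SAMPLING_RATE_HZ = 128   # sampling frequency stated in the text
SEGMENT_SECONDS = 30     # segment duration stated in the text


def split_into_segments(samples, fs=SAMPLING_RATE_HZ, seconds=SEGMENT_SECONDS):
    """Cut a 1-D recording into non-overlapping fixed-length segments.

    Any incomplete tail shorter than one segment is dropped.
    """
    seg_len = fs * seconds
    n_full = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_full)]


# Synthetic stand-in for one channel of a 10 min block (not real EEG).
block = [0.0] * (10 * 60 * SAMPLING_RATE_HZ)
segments = split_into_segments(block)
print(len(segments), len(segments[0]))  # 20 3840
```

One 10 min block at 128 Hz holds 76,800 samples, so it yields 20 segments of 3840 samples per channel.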

Ensemble Decomposition Techniques
In this paper, we have explored an ensemble of three wavelet-based analyses. A brief description of MDWT, TQWT, and FAWT is given in the following subsections.

Multilevel Discrete Wavelet Transform (MDWT)
The MDWT decomposes the signal into two bands using low-pass (LP) and high-pass (HP) filter banks. The LP filter bank captures the low-frequency content of the signal, while the HP filter bank captures the high-frequency content of the signal. Decomposition of EEGs into four levels results in four HP SBs and one LP SB. The mathematical formulation of the MDWT for the jth level of decomposition follows [40], where M is the length of the signal ($M = 2^n$), $\phi$ and $\theta$ are the LP and HP filters, $V_{\phi}$ is the LP-filtered signal, and $V_{\theta}$ is the HP-filtered signal.
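A four-level decomposition into four HP SBs and one LP SB can be sketched as follows. This is an illustration only, written with the Daubechies-2 (db2) filter and with periodic signal extension as an assumption; wavelet toolboxes usually default to other boundary modes, so subband lengths and edge coefficients will differ from theirs. The names `dwt_step` and `multilevel_dwt` are hypothetical.

```python
import math

_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
# Daubechies-2 (db2) scaling (low-pass) filter coefficients.
LOW = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
# Quadrature-mirror high-pass filter derived from the low-pass filter.
HIGH = [((-1) ** m) * LOW[len(LOW) - 1 - m] for m in range(len(LOW))]


def dwt_step(x):
    """One analysis level with periodic extension: returns (approximation, detail)."""
    n = len(x)
    approx, detail = [], []
    for k in range(n // 2):
        approx.append(sum(LOW[m] * x[(2 * k + m) % n] for m in range(4)))
        detail.append(sum(HIGH[m] * x[(2 * k + m) % n] for m in range(4)))
    return approx, detail


def multilevel_dwt(x, levels=4):
    """Return [detail_1, ..., detail_levels, approx_levels]: HP subbands, then the LP subband."""
    subbands, current = [], list(x)
    for _ in range(levels):
        current, detail = dwt_step(current)  # keep splitting the LP branch
        subbands.append(detail)
    subbands.append(current)
    return subbands


# Synthetic 30 s segment at 128 Hz (not real EEG): 10 Hz sine plus a slower 2 Hz component.
fs, n_samples = 128, 3840
segment = [math.sin(2 * math.pi * 10 * t / fs) + 0.5 * math.sin(2 * math.pi * 2 * t / fs)
           for t in range(n_samples)]
bands = multilevel_dwt(segment, levels=4)
print([len(b) for b in bands])  # [1920, 960, 480, 240, 240]
```

Because the periodic db2 transform is orthogonal, the total energy of the five subbands equals the energy of the input segment, which gives a simple sanity check on the decomposition.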

Tunable Q Wavelet Transform (TQWT)
The traditional forms of wavelet transforms decompose any signal into successive LP SBs and HP SBs with a choice of the mother wavelet. Accurately choosing a wavelet to extract meaningful information is itself a topic of discussion. The TQWT does not require the selection of a mother wavelet. The decomposition into LP subbands (LPSBs) and HP subbands (HPSBs) using the TQWT requires three tuning parameters, namely the quality factor (q), the oversampling rate (R), and the number of decomposition levels (B) [41]. The quality factor (q) is chosen as 1 for non-oscillatory signals, and it is >1 for oscillatory signals [41]. R controls the localization of the time-domain response, and it is selected as ≥3 to better capture the time-domain response [41]. Using B decomposition levels, the EEG signal is split into B HPSBs and one LPSB. The HPSBs and the LPSB are generated by filter-bank analysis with LP and HP frequency responses $U_0^{B}(\omega)$ and $U_1^{B}(\omega)$, whose expressions are given in [41]. The low-frequency and high-frequency components of any signal are obtained by LP scaling ($\alpha$) and HP scaling ($\beta$) [41]. The quality factor is represented as [41]

$$q = \frac{2 - \beta}{\beta}.$$

The oversampling rate is denoted as

$$R = \frac{\beta}{1 - \alpha}.$$

Flexible Analytic Wavelet Transform (FAWT)
The FAWT offers several benefits over the conventional dyadic wavelet transform, including a provision for arbitrary sampling rates for the LP and HP channels, which allows flexible time-frequency covering. The HP channel of the FAWT uses a complex pair of atoms, giving it more freedom in choosing the transform parameters. These advantages allow the FAWT to analyze complex oscillating signals, such as vibrations and EEG signals [42,43]. The iterative filter bank structure of the FAWT decomposes the signals into two HP channels and one LP channel [42]. The frequency responses of the LP and HP filters, denoted as $V_{\phi}(\omega)$ and $V_{\theta}(\omega)$, are defined in [42], where $\alpha_1$ and $\alpha_2$ are the up-sampling and down-sampling factors of the LP channel, $\alpha_3$ and $\alpha_4$ are the up-sampling and down-sampling factors of the HP channel, $\omega_p$ is the passband frequency, and $\omega_s$ is the stop-band frequency. $\beta$ and $\varepsilon$ are factors related to perfect reconstruction.
A typical example of the SBs obtained after seven levels of decomposition is represented in Figure 2.

Feature Extraction
Features are crucial for drawing a decision boundary to improve system performance. Nonlinear, fractal dimension, and statistical features provide representative information for different physiological and neurological conditions [44,45]. Such features provide an effective representation of brain dynamics, which helps to improve the system performance [44,45]. The current work explores the application of 27 statistical and nonlinear features to detect three mental states. These features are the standard deviation, Hurst exponent, average energy, wavelength, V order, skewness, kurtosis, Hjorth mobility, Higuchi fractal dimension, Lyapunov exponent, differential absolute standard deviation value, absolute value of the summation of an exponential root, absolute value of the sum of square root, normalized first difference, normalized second difference, mean value of the square root, difference variance value, log energy, absolute energy, simple square integral, slope sign change, peak amplitude, minima, peak amplitude, zero crossing rate, interquartile range, and trimean [46-50].
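A few of the simpler features in this list can be computed as in the sketch below. The definitions are common textbook forms and are assumptions here, since the exact formulas used in [46-50] may differ in normalization; the function name `extract_features` and the synthetic subband are hypothetical.

```python
import math
import statistics


def extract_features(x):
    """Compute a handful of time-domain features from one subband (a list of floats).

    A constant input has zero variance and would raise ZeroDivisionError in the
    Hjorth mobility term.
    """
    diffs = [b - a for a, b in zip(x, x[1:])]          # first difference of the signal
    q1, q2, q3 = statistics.quantiles(x, n=4)          # quartiles (Python >= 3.8)
    energy = sum(v * v for v in x)
    return {
        "standard_deviation": statistics.pstdev(x),
        "average_energy": energy / len(x),
        "simple_square_integral": energy,
        # Fraction of consecutive sample pairs whose signs differ.
        "zero_crossing_rate": sum(1 for a, b in zip(x, x[1:]) if a * b < 0) / (len(x) - 1),
        # Number of points where the slope changes sign.
        "slope_sign_changes": sum(1 for a, b in zip(diffs, diffs[1:]) if a * b < 0),
        # Hjorth mobility: sqrt(var(first difference) / var(signal)).
        "hjorth_mobility": math.sqrt(statistics.pvariance(diffs) / statistics.pvariance(x)),
        "interquartile_range": q3 - q1,
        "trimean": (q1 + 2 * q2 + q3) / 4,
    }


# Synthetic subband (not real EEG): a 10 Hz sine sampled at 128 Hz for 30 s.
subband = [math.sin(2 * math.pi * 10 * t / 128) for t in range(3840)]
for name, value in extract_features(subband).items():
    print(f"{name}: {value:.4f}")
```

Applying such a function to every subband of every channel and concatenating the results gives one feature vector per EEG segment.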

Ensemble Classifiers
Bootstrap aggregating (bagging) is an ensemble method commonly used to improve classification performance. This work combines five ensemble models to obtain the optimal combination of hyper-parameters for classification. The classification techniques are ensemble bagged tree, ensemble boosted tree, random under-sampling boosted tree, ensemble subspace KNN, and ensemble discriminant trees classifiers with hyper-parameters. In ensemble operation, bootstrap resampling is applied to divide the training data into subsets. Each subset is then used to construct a decision tree, and the output is a function of the voting scheme from the different sets of decision trees. The best-performing classifier is selected as a meta-classifier. Figure 3 shows the operations of ensemble classification techniques. In addition, the performance of classifiers is highly hyper-parameter dependent [51]. Careful selection of the hyper-parameters prevents the model from over-fitting and performance degradation. Choosing the hyper-parameters manually is time-consuming and prone to human error. To overcome this, we have explored an optimizable ensemble classification design using the MATLAB classifier application. The classification setting is described for data pairs $(U_i, V_i)$, $i = 1, 2, \ldots, M$, where $U_i$ is the predictor with dimension j and $V_i$ is the response taking one of K classes. The estimator function for classification is represented by [52]

$$f(\cdot) = h_M\big((U_1, V_1), \ldots, (U_M, V_M)\big)(\cdot),$$

where $h_M(\cdot)$ is the estimator as a function of the input data. The ensemble algorithm is as follows [52]:
1. Draw a bootstrap sample $(U_1^{*}, V_1^{*}), \ldots, (U_M^{*}, V_M^{*})$ by sampling the training data with replacement.
2. Evaluate the bootstrap estimator $f^{*}(\cdot)$ on this sample.
3. Repeat steps 1 and 2 L times, where L = 50 or 100.
4. Aggregate the L bootstrap estimators through the voting scheme to obtain the final prediction.
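The bootstrap-and-vote procedure can be sketched in a few lines. This is a toy illustration, not the MATLAB optimizable ensemble used in this work: the base learner is a one-split decision stump chosen for brevity, and the names `fit_stump`, `bagging_fit`, `bagging_predict`, and the two-feature toy data are all hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split decision stump: the (feature, threshold) with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            left_lab = Counter(left).most_common(1)[0][0]
            right_lab = Counter(right).most_common(1)[0][0] if right else left_lab
            errors = sum(l != left_lab for l in left) + sum(l != right_lab for l in right)
            if best is None or errors < best[0]:
                best = (errors, f, thr, left_lab, right_lab)
    _, f, thr, left_lab, right_lab = best
    return lambda row: left_lab if row[f] <= thr else right_lab


def bagging_fit(X, y, n_estimators=50, seed=0):
    """Train n_estimators base learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):                      # repeat L times
        idx = [rng.randrange(n) for _ in range(n)]     # sample with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Aggregate the base learners by majority vote."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


# Toy, well-separated two-class data (labels borrowed from the text, values invented).
X_train = [[0.0, 1.0], [0.5, 0.2], [1.0, 0.7], [0.2, 0.9],
           [9.0, 0.1], [9.5, 0.8], [10.0, 0.3], [10.2, 0.6]]
y_train = ["D", "D", "D", "D", "F", "F", "F", "F"]
models = bagging_fit(X_train, y_train, n_estimators=50)
print(bagging_predict(models, [0.0, 0.5]), bagging_predict(models, [9.8, 0.5]))
```

Individual bootstrap samples can be unrepresentative, and the vote across many learners is what keeps the final prediction stable.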

Performance Measure
The evaluation of model performance is a crucial stage to measure the effectiveness of the developed model [53]. We have performed a comprehensive analysis of the developed ensemble model to test the effectiveness of the developed system. The evaluation strategy uses three stages. In the first stage, the model is evaluated for its consistency using different validation techniques, namely holdout cross-validation (HOCV), five-fold cross-validation (FFCV), and ten-fold cross-validation (TFCV). In HOCV, we used the 80:20 strategy, where training and testing were performed on 80% and 20% of the total data, respectively. In the five- and ten-fold validation techniques, data were divided into five and ten equal parts, respectively. The model was trained and tested five and ten times, respectively, with one part used for testing and the remaining parts used for training each time. In the second stage, we performed feature fusion and selected the most prominent features. Finally, we evaluated different performance measures to obtain insights into the developed model. Five evaluation metrics, namely accuracy, recall, specificity (SPE), precision (PPV), and F-1 score, were used to test the system performance. We used subject-independent training and testing to evaluate the model performance. The mathematical formulations of the performance parameters are expressed as follows:

$$\text{Accuracy} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n}, \qquad \text{Recall} = \frac{T_p}{T_p + F_n}, \qquad \text{SPE} = \frac{T_n}{T_n + F_p},$$

$$\text{PPV} = \frac{T_p}{T_p + F_p}, \qquad \text{F-1 score} = \frac{2 \times \text{PPV} \times \text{Recall}}{\text{PPV} + \text{Recall}},$$

where $T_p$, $T_n$, $F_p$, and $F_n$ are the values of true positive, true negative, false positive, and false negative, respectively.
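For a three-class problem, these measures are usually computed per class in a one-vs-rest fashion from the confusion matrix. The sketch below assumes that convention; the function name `one_vs_rest_metrics` and the confusion-matrix counts are invented for illustration and are not results from this work.

```python
def one_vs_rest_metrics(confusion, labels):
    """Per-class metrics from a confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    Each class must have at least one true and one predicted sample,
    otherwise a division by zero occurs.
    """
    total = sum(sum(row) for row in confusion)
    results = {}
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                    # true class k, predicted elsewhere
        fp = sum(row[k] for row in confusion) - tp     # other classes predicted as k
        tn = total - tp - fn - fp
        recall = tp / (tp + fn)
        ppv = tp / (tp + fp)
        results[name] = {
            "accuracy": (tp + tn) / total,
            "recall": recall,
            "specificity": tn / (tn + fp),
            "precision": ppv,
            "f1": 2 * ppv * recall / (ppv + recall),
        }
    return results


# Hypothetical counts for the three states; not taken from the reported tables.
example = [[90, 5, 5],
           [4, 88, 8],
           [6, 10, 84]]
for state, scores in one_vs_rest_metrics(example, ["D", "F", "UF"]).items():
    print(state, {k: round(v, 4) for k, v in scores.items()})
```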

Results
We aimed to classify mental states using ensemble decomposition and classification algorithms. First, the EEG signals were stratified into non-overlapping segments of 3840 samples for each class. The stratified signals were decomposed into SBs using three wavelet-based decomposition techniques (MDWT, TQWT, and FAWT). For the MDWT, we used four-level decomposition with the Daubechies wavelet (db2), yielding five SBs corresponding to five EEG rhythms. The tuning parameters of the TQWT were chosen as q = 2, R = 5, and B = 7. For the FAWT, the tuning parameters were selected as B = 6, p = 3, q = 5, r = 2, and s = 3. We extracted 27 features from the SBs of the MDWT, FAWT, and TQWT with an empirical setting of the tuning parameters. The current analysis includes a feature matrix of all the channels with 27 features per channel. Therefore, a total of 378 features (27 features × 14 channels) with a total of 2040 segments (680 segments × 3 classes) were introduced into the ensemble classification techniques. The model uses three validation strategies, i.e., HOCV, FFCV, and TFCV. We maintained the same experimental setup throughout. Table 1 shows the accuracy obtained for each SB using MDWT features. The accuracy of two-class and multiclass classification is highest for SB-1. The model yielded the highest accuracies of 95.07%, 94.93%, and 94.36% for D vs. F using HOCV, FFCV, and TFCV, respectively. For UF vs. F, the highest accuracies were 91.18%, 89.34%, and 88.60%, while for D vs. UF, the accuracies were 88.84%, 89.78%, and 88.53% using the optimizable ensemble classifier with HOCV, FFCV, and TFCV techniques. Similarly, three-class classification yielded the highest accuracies of 87.45%, 87.45%, and 86.27% using HOCV, FFCV, and TFCV.
Table 3 shows the accuracy obtained in each SB using FAWT-based features and the optimizable ensemble classifier. The analysis reveals that the last SB yielded the highest accuracy for different classification scenarios. Table 3 shows that the ensemble-based classifier yielded the highest accuracies of 97.79%, 96.91%, and 96.84% for D vs. F classification using HOCV, FFCV, and TFCV techniques. The model provided the highest accuracies of 93.75%, 92.28%, and 91.01% for UF vs. F, D vs. UF, and D vs. F vs. UF using the HOCV technique. The highest accuracies of 93.09%, 91.10%, and 90.90% for UF vs. F, D vs. UF, and D vs. F vs. UF were obtained with FFCV. The accuracies obtained with TFCV for UF vs. F, D vs. UF, and D vs. F vs. UF were 92.94%, 90.96%, and 90.10%, respectively. Thus, it is clear from Tables 1-3 that the accuracy of our developed model is almost stable across the three validation techniques in various SBs for different classification scenarios. SB-1 generated the highest accuracy for MDWT and TQWT feature classification. The accuracy yielded by FAWT-based features was highest in SB-7. The analysis also reveals that FAWT-based features provide discernible characteristics, and because of this they obtained a higher accuracy than TQWT- and MDWT-based features. Further, our developed model is consistent for different classification scenarios (binary and multiclass analysis) with the three validation techniques. The features provided by the drowsy and focused classes are highly discernible; therefore, they yielded the highest classification rate among the scenarios. On the other hand, the features of the focused and unfocused classes overlap significantly, resulting in a decreased model performance. An exemplary training curve obtained for the optimized ensemble classifier is shown in Figure 4.
As stated earlier, our training and testing feature set comprised all features from all channels. Analysis of the model with all features may increase the computation time without improving the classification performance [54]. Therefore, we used feature ranking analysis to test our model performance with optimal features using the minimum redundancy feature selection technique. Figure 5 shows the feature rank obtained for FAWT-, TQWT-, and MDWT-based features. As seen from Figure 5, out of twenty-seven features, only a few features are statistically significant for classification. The feature importance values for FAWT, TQWT, and MDWT decrease significantly or remain the same after six features. This reveals that a similar performance can be obtained using fewer features with higher feature ranks. To obtain an insight into our developed model, we explored a fusion of the most important features of the three decomposition techniques. During fusion, we concatenated the features from all channels according to their ranks. As evident from Tables 1-3, SB-1 for TQWT and MDWT and SB-7 for FAWT features yielded the highest accuracy. Therefore, we have fused the features from these SBs. Table 4 presents the accuracy obtained by feature fusion of decomposition techniques with different feature combinations. As seen from Table 4, the accuracy yielded by the ensemble model increases with an increase in the feature count. The model provides the highest performance with four features. After that, the accuracy of the model decreases slightly or remains constant. Furthermore, our model shows that feature fusion helps to improve system performance. The fusion of three decomposition techniques yielded the highest accuracy, followed by the features based on a fusion of TQWT and FAWT decomposition. The combination of TQWT and MDWT feature fusion resulted in the lowest performance.
Further, to obtain the highest score, we evaluated the highest performance measures of TFCV using iterative majority voting (IMV). For IMV, we conducted multiple rounds of TFCV and selected the one with the best overall and fold-wise accuracy. The model exhibited the highest accuracy of 97.8%, obtained twice during fold-wise analysis. Further, we tested the model performance using four performance metrics, as shown in Table 5. The performance measures show that the drowsy class generated the most discriminant features with the highest recall, SPE, PPV, and F1 score. The focused class is the second best, while the worst performance is exhibited by the unfocused class. The analysis shows that feature fusion of the drowsy class yields the highest recall, SPE, PPV, and F1 score of 93.13%, 95.91%, 91.76%, and 92.44%. The recall, SPE, PPV, and F1 score yielded for the drowsy class were 97.12%, 99.63%, 99.26%, and 98.18% using the IMV technique. To obtain more insight into the proposed system, the receiver operating characteristics (ROC) and area under the curve (AUC) were evaluated, as shown in Figure 6. The ROC and AUC of D vs. F and UF, F vs. D and UF, and UF vs. D and F states for fused features are shown in Figure 6a-c. It is evident that the AUC for the drowsy state is 94%, while for the focused and unfocused states it is 95% and 92%, respectively, giving an average AUC of 93.67%.
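The repeated ten-fold procedure behind IMV can be sketched as below. The selection rule (rank rounds by mean fold accuracy, then by the best single fold) is an assumed reading of "best overall and fold-wise accuracy", and the nearest-class-mean evaluator and the random toy data are hypothetical stand-ins for the ensemble classifier and the fused EEG features.

```python
import random


def kfold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_kfold(n, k, rounds, evaluate_fold, seed=0):
    """Run several shuffled k-fold rounds; return all fold accuracies and the selected round."""
    rng = random.Random(seed)
    history = []
    for _ in range(rounds):
        folds = kfold_indices(n, k, rng)
        accs = []
        for i, test_idx in enumerate(folds):
            train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
            accs.append(evaluate_fold(train_idx, test_idx))
        history.append(accs)
    # Assumed rule: highest mean accuracy, ties broken by the best single fold.
    best_round = max(range(rounds), key=lambda r: (sum(history[r]) / k, max(history[r])))
    return history, best_round


def make_evaluator(values, labels):
    """Hypothetical stand-in classifier: assign each test sample to the nearest class mean."""
    def evaluate_fold(train_idx, test_idx):
        means = {}
        for c in set(labels):
            members = [values[i] for i in train_idx if labels[i] == c]
            means[c] = sum(members) / len(members)
        hits = sum(1 for i in test_idx
                   if min(means, key=lambda c: abs(values[i] - means[c])) == labels[i])
        return hits / len(test_idx)
    return evaluate_fold


# Toy one-dimensional data with overlapping classes (not EEG features).
data_rng = random.Random(1)
values = [data_rng.gauss(0.0, 1.0) for _ in range(30)] + [data_rng.gauss(1.5, 1.0) for _ in range(30)]
labels = [0] * 30 + [1] * 30
history, best_round = repeated_kfold(60, 10, rounds=5, evaluate_fold=make_evaluator(values, labels))
print("selected round:", best_round)
print("mean accuracy per round:", [round(sum(h) / len(h), 3) for h in history])
```

The function returns every round's fold accuracies, so the mean across rounds can be reported next to the selected round.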

Discussion
We have tested the efficacy of our proposed model by comparing it with existing state-of-the-art techniques. Borghini et al. [11] computed the power of alpha, theta, and delta frequency bands. An analysis of these frequency bands was performed, and they reported an accuracy of around 90%. Myrden et al. [19] used the FFT to evaluate frequency domain features and classified them with SVM, LDA, and naive Bayes classifiers. Their model yielded the highest accuracies of 71.6%, 74.8%, and 84.8% for frustration, fatigue, and attention levels using the LDA classifier. In another method by Liu et al. [22], an FFT- and SVM-based model yielded an accuracy of 76.82%. Li et al. [20] used the SAM model and obtained an average accuracy of 57.03% with the KNN classifier. Nuamah et al. [21] presented a combination of STFT and SVM for feature extraction and classification. Their method obtained an accuracy of 93.33% using the radial basis function kernel. Ket et al. [23] automatically identified three tasks, namely attention, no attention, and rest, using two experiments (ball playing or walking cartoon). The sample entropy and linear features were classified using SVM, and their method yielded accuracies of 76.19% and 85.24% for the two experiments with sample entropy features. Wang et al. [24] fed features extracted by filtering and ICA into an SVM classifier and achieved 86.2% and 84.6% accuracies in the classification of driving tasks and math-related activities. Djamal et al. [25] computed non-wavelet- and wavelet-based features and classified them with an SVM classifier, and their method provided accuracies in the range of 44-58% and 69-83%, respectively. Hamadicharef et al. [26] developed an attention and non-attention classification model based on filter banks, common spatial patterns, and a Fisher linear discriminant, with an accuracy of 89.4%. Mardi et al. [27] classified chaotic features based on log energy and Higuchi's and Petrosian's fractal dimensions with artificial neural network classifiers and reported an accuracy of 83.3%. Richer et al. [28] used the power of frequency bands, naive and entropy-based scores, and a binary classification model to obtain sensitivities of 82% and 80.4% and specificities of 82.8% and 80.8% for the focus and relax scores, respectively.
The methods discussed above have been tested on different datasets for mental state classification. The proposed method was compared with the work of Aci et al. [29] and Zhang et al. [30] on the same dataset, as shown in Table 6. Aci et al. used STFT-based feature extraction to compute different feature sets. ANFIS, SVM, and KNN classifiers were employed to classify 154 features with accuracies of 81.55%, 77.76%, and 91.72%, respectively. A method by Zhang et al. [30] used a deep-learning-based convolutional neural network (CNN) and provided an accuracy of 96.4%. Kumar et al. explored the analysis of PSD using FFT-based feature extraction and a KNN classifier. The channel-wise and grouped channel analysis yielded accuracies of 80% and 97.5% [37]. Khare et al. used RDWT wavelet analysis with statistical feature extraction. The classification of features resulted in an accuracy of 91.77% using the bagged tree classifier [36]. Rastogi and Bhateja [38] performed elimination of artifacts and noise using SWT and ICA. However, they did not report the classification accuracy. In our method, we have used ensemble-based decomposition and extraction of nonlinear features. The individual analysis of MDWT, TQWT, and FAWT features yielded accuracies of 88.27%, 89.02%, and 90.1%. Fused feature analysis yielded accuracies of 90.98%, 88.62%, and 89.61% for TQWT/FAWT, TQWT/MDWT, and MDWT/FAWT feature fusion using the TFCV technique. A combined fused feature analysis using TFCV and IMV resulted in accuracies of 92.45% and 97.8%, respectively. The analysis shows that our developed model has surpassed the performance of existing state-of-the-art techniques, showing the efficacy of our developed model.

Conclusions
The proposed method classifies focused, unfocused, and drowsy mental states. We have developed an ensemble decomposition and optimized classification technique to create an effective model for mental state detection. Our analysis shows that feature extraction using the FAWT yields the most discernible features for detecting mental states. We demonstrated that feature fusion is helpful for the extraction of representative SBs from EEG signals. Hence, it is useful for extracting meaningful information for mental state analysis. Our model also shows that feature fusion with statistical analysis helps to reduce the feature dimensions with an increased accuracy. The model yielded an accuracy of 97.8% with IMV, which is higher than existing state-of-the-art techniques. Our developed model can detect drowsy, focused, and unfocused states with accuracies of 99.26%, 98.52%, and 94.11%. The proposed work is ready for real-time mental state classification applications to take brain-computer and human-machine interfaces to the next level. The advantages of our developed model are listed as follows:
• The model can explore multi-level ensemble wavelet analysis.
• The model is effective and robust due to comprehensive analysis.
• The optimized ensemble classifier allows tuning of the hyper-parameters to achieve the best classification performance.
• The model yielded the highest accuracy of 97.8%.
• The model supports binary and multi-class analyses.
The model has the following limitations:
• The model has been tested on a single EEG dataset.
• The dataset contains a small number of subjects.
• The model has not been tested with leave-one-subject-out classification.
In the future, we will aim to:
• Perform adaptive parameter tuning and channel selection.
• Develop the model with leave-one-subject-out classification on a relatively larger dataset.

Figure 1. Proposed ensemble model for mental state detection.

Figure 2. A typical example of SBs generated by the TQWT.

Figure 3. Typical working of ensemble classifier techniques.

Table 2. The overall accuracy (%) obtained in SBs of the TQWT for HOCV, FFCV, and TFCV techniques using different classification scenarios.

Table 3. The overall accuracy (%) obtained in SBs of the FAWT for HOCV, FFCV, and TFCV techniques using different classification scenarios.

Table 4. The overall accuracy (%) of fused features of best performing SBs of ensemble decomposition techniques.

Table 5. Performance parameters obtained for different scenarios of the proposed model.

Table 6. Comparison with existing state-of-the-art techniques using the same dataset.