Effective Detection of Epileptic Seizures through EEG Signals Using Deep Learning Approaches

Abstract: Epileptic seizures are a prevalent neurological condition that impacts a considerable portion of the global population. Timely and precise identification can result in as many as 70% of individuals achieving freedom from seizures. To achieve this, there is a pressing need for smart, automated systems to assist medical professionals in identifying neurological disorders correctly. Previous efforts have utilized raw electroencephalography (EEG) data and machine learning techniques to classify behaviors in patients with epilepsy. However, these studies required expertise in clinical domains like radiology and clinical procedures for feature extraction. Traditional machine learning for classification relied on manual feature engineering, limiting performance. Deep learning excels at automated feature learning directly from raw data sans human effort. For example, deep neural networks now show promise in analyzing raw EEG data to detect seizures, eliminating intensive clinical or engineering needs. Though still emerging, initial studies demonstrate practical applications across medical domains. In this work, we introduce a novel deep residual model called ResNet-BiGRU-ECA, analyzing brain activity through EEG data to accurately identify epileptic seizures. To evaluate our proposed deep learning model's efficacy, we used a publicly available benchmark dataset on epilepsy. The results of our experiments demonstrated that our suggested model surpassed both the basic model and cutting-edge deep learning models, achieving an outstanding accuracy rate of 0.998 and the top F1-score of 0.998.


Introduction
Epilepsy is an enduring neurological disorder that primarily impacts the central nervous system, with its primary expression manifesting in the brain. It is characterized by the occurrence of sudden and repetitive seizures [1]. This ailment is recognized as the third most prevalent neurological condition, ranking closely behind stroke and Alzheimer's disease [2]. An epileptic seizure is a temporary event characterized by unusual and exaggerated activity of brain neurons, leading to the presence of signs or symptoms [3]. Seizures can be identified by observing and analyzing various physiological indicators, such as brain and muscle activities, heart rate, oxygen levels, synthetic speech, or visual patterns. Techniques like electroencephalography (EEG), electrocardiogram (ECG), electromyography (EMG), movement tracking, or video capturing on a person's head and body are used for this purpose [4].
To detect anomalous behaviors and enable early detection of epileptic seizures before they escalate into more severe states, human activity recognition (HAR) can be effectively applied [5]. Recent scholarly investigations have explored a range of techniques for identifying anomalous behaviors [6], including wearable technologies, sensor-based approaches, and ambient instrument methodologies. The activated notification system ensures the verification of action identification. However, the accuracy of detecting these behaviors depends on the thorough analysis and precise acquisition of feature patterns [7].
EEG signals are preferred for their cost-effectiveness, portability, and ability to exhibit distinct frequency-dependent patterns [8]. The EEG is a measurement technique that captures the brain's bioelectric activities by recording voltage fluctuations resulting from the ionic flow of neurons [9]. To identify epileptic seizures accurately, capturing signals over an extended duration is necessary, which can introduce complexity due to multiple channels used for storage. Studies have also raised concerns over wearable devices' energy use and data storage limits posing challenges for creating seizure forecasting tools [10]. These portable gadgets tend to swiftly drain batteries and lack capacity to save the huge data flows needed to reliably predict seizures over extended periods. Overcoming these power and memory roadblocks to enable precise ambulatory monitoring and algorithms will require further innovations going forward. Nonetheless, EEG signals can be influenced by disturbances arising from different origins, including the main power supply, movement of electrodes, and muscular vibrations [11]. The presence of noisy EEG signals poses a significant challenge for healthcare professionals in diagnosing epileptic seizures effectively. To address these difficulties, extensive research is currently being conducted to detect and forecast epileptic seizures through the utilization of EEG methods, alongside tools like magnetic resonance imaging (MRI) and artificial intelligence (AI) methodologies [12]. The realm of diagnosing epileptic seizures has been incorporating traditional machine learning (ML) and deep learning (DL) techniques within the framework of AI methods [13].
Numerous ML algorithms have been established for epileptic seizure identification, incorporating statistical, temporal, spectral, time-frequency domain, and nonlinear features [14]. Traditional ML approaches involve a trial-and-error method for selecting features and classifiers [15]. A thorough grasp of signal processing and data mining techniques is essential for constructing accurate models. A recent study [16] utilized three ML models (support vector machines, linear discriminant analysis, and multilayer perceptrons) to differentiate resting state EEG data between healthy subjects and those with psychogenic non-epileptic seizures. Specifically, these algorithms aimed to uncover connections between measures of functional brain connectivity and the eventual categorical diagnosis. By modeling these complex relationships, the systems classified individual data points as belonging to either the healthy control group or the psychogenic seizure group. Initial findings demonstrated promise in using patterns of functional connectivity derived from EEGs to accurately predict which subjects were suffering from non-epileptic events via completely automated ML. However, further validation is still needed, particularly around generalizability to diverse patient subgroups. While these models perform well with small data sets, the field has also implemented sophisticated DL techniques for epileptic seizure identification [17]. DL models, unlike traditional ML methods, require a substantial amount of data during training due to their complex feature mapping spaces. This requirement leads to overfitting challenges when confronted with insufficient data.
The primary objective of this research is to utilize DL networks to detect epilepsy by automatically processing EEG signals and recognizing patterns. The aim is to identify the spatial distribution and temporal characteristics of spikes and seizures. Convolutional neural networks (CNN), long short-term memory (LSTM), and gated recurrent unit (GRU) are among the methods employed in this research. We introduce a new deep residual model, ResNet-BiGRU-ECA, to accurately identify epileptic seizures by analyzing EEG data.
To evaluate the performance of our model, we utilized a publicly available epilepsy dataset. This benchmark compilation contained EEG readings segmented into five distinct health categories: one representing active epileptic seizures and the remaining four encompassing normal, non-seizure brain activity. Leveraging this diverse test set enabled robust assessment of the model's ability to accurately differentiate between the pathological seizure state and healthy function. Through this analysis, we aim to demonstrate several key contributions of the current research, summarized as follows:

• This research presents a framework for epileptic seizure detection (ESD) using EEG data to assess the performance of various DL structures within this particular domain.

• The suggested method introduces a deep residual model that incorporates residual blocks, an efficient channel attention (ECA) module, and a bidirectional GRU (Bi-GRU). This model adeptly captures extended data sequences, extracts spatio-temporal features, and carries out EEG classification.
The paper's organization is as follows: Section 2 offers a summary of pertinent literature and prior studies. Section 3 outlines the proposed DL-based framework for epileptic seizure recognition. The results are showcased and examined in Section 4, and Section 5 wraps up and deliberates on potential avenues for future research.

Related Work
In this section, we review several studies that have used DL to classify epileptic and non-epileptic activities. One study introduced a DL model called the pyramidal one-dimensional CNN. This CNN model uses refined parameters and has fewer trainable parameters than conventional CNNs. The model achieved a remarkable accuracy of 99.1% when used to distinguish between different behaviors, including typical behaviors such as eyes closed, eyes open, and pre-ictal, as well as unusual behaviors such as inter-ictal and ictal [18].
Conventional DL models struggle to process lengthy, variable input data like text, sensor readings over time, or video [19]. Yet these sequential data types represent abundant real-world information. Recurrent neural networks (RNNs) have emerged to enable DL on such ordered series with temporal dynamics. Through internal state units retaining context, RNNs can analyze signals that change over time, such as EEG, and have gained widespread use in physiology. Standard deep networks fail on such variable, redundant sequential inputs. But by propagating context, RNNs overcome challenges of long inputs with fluctuations and redundancy across temporal or spatial dimensions [7]. This unique feedback architecture provides short-term memory lacking in feedforward networks to better extract patterns from sequential data like measurements over time.
The primary limitation of a basic RNN lies in its inability to retain information effectively over long periods of time, that is, its memory is only short-term. This is mainly because RNNs struggle to propagate information from earlier time steps to later ones, especially when dealing with long sequential data [20]. Another challenge faced by RNNs is the vanishing gradient problem [21], which occurs when the gradients diminish significantly during backpropagation. To overcome the short-term memory issue, researchers devised a solution in the form of LSTM neural networks [22]. LSTM networks address the problem by allowing the model to selectively store and access relevant information, making them more adept at handling long sequences and retaining essential details.
In their study, Golmohammadi et al. [23] evaluated two LSTM designs: one with three layers and the other with four layers, both combined with the Softmax classifier. The researchers reported that their findings were considered appropriate. In another study [24], a three-layer LSTM architecture was used for feature extraction and classification. The final fully connected layer commonly employed the sigmoid activation function for classification purposes. Furthermore, in a study conducted by another group [25], two structures, LSTM and GRU, were investigated. The structure of the LSTM/GRU model included a reshaped layer, succeeded by four layers of LSTM/GRU with an activation function, and, ultimately, a fully connected layer featuring a sigmoid activation function. In summary, these studies explored various LSTM designs and activation functions to improve feature extraction and classification accuracy.
In the realm of sophisticated DL models, it has been observed that employing more intricate and deeper architectures leads to enhanced accuracy compared to the feature learning approach discussed earlier. These prototypes employ CNNs to identify features autonomously [26]. In particular, the CNN feature extractor is often denoted as the backbone when it comes to object recognition, setting it apart from the complete model architecture. In this investigation, we adopt a CNN-based feature extractor as the foundational framework. DL techniques, such as CNNs and RNNs, have demonstrated their ability to obtain state-of-the-art performance by autonomously learning the underlying characteristics from raw sensor data. The concept of deepening neural networks has evolved with the emergence of hybrid networks. These hybrid models combine diverse architectural designs, leading to improved feature representation and enhancing both computing and network achievements. Moreover, this development opens up opportunities for DL-based techniques in portable electronics.

Methodology
In this section, we delineate the structure applied in our study to assess the importance of EEG signals concerning epileptic seizures. The suggested ESD framework encompasses four separate phases: (i) collecting data, (ii) preparing data, (iii) processing data, and (iv) performing classification. To visually represent the approach used to detect seizures, refer to Figure 1.

Data Acquisition
Data acquisition refers to the systematic process of collecting and preserving digital or numerical data to incorporate it into our mathematical framework. This can be done either in its original, unprocessed state or after undergoing preliminary processing, depending on data accessibility. Therefore, it is crucial to have a comprehensive understanding of the dataset before obtaining the data.
To develop and test our model, we employed the epileptic seizures recognition dataset (ESRD), a publicly accessible benchmark widely utilized for detecting seizures [27]. This compilation contains EEG readings from multiple patients, acquired through a standardized 128-channel system and averaged reference methodology. The analogue signals were digitized via 12-bit analog-to-digital conversion and continuously saved onto a computer at 173.61 Hz sampling frequency. Understanding such data collection and processing details is crucial for appropriate methodology. The ESRD dataset comprises data from 500 individuals, including 11,500 instances of time-series EEG signals specifically designed for investigating EEG signal alterations during seizure events. Before being made available online, the initial dataset undergoes preliminary processing by the University of California, Irvine (UCI). Each dataset sample in our study consists of 4097 data points, further divided into 23 parts, with each part comprising 178 data points representing one second of data. These 23 components are shuffled in a random manner, yielding a collective count of 11,500 time-series EEG signal samples sourced from the 500 individuals in the study group. The UCI dataset for recognizing epileptic seizures encompasses five distinct medical conditions, with one being explicitly linked to epileptic seizures, while the remaining four conditions pertain to individuals displaying no signs of epilepsy. More information about these five categories and their respective samples can be found in Table 1.
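The segmentation arithmetic described above (500 recordings, each cut into 23 parts of 178 points, giving 11,500 samples) can be illustrated with a minimal Python sketch. The function names and the synthetic placeholder recordings are assumptions made for illustration; the text does not say how the 3 leftover points (4097 - 23 × 178) are handled, so this sketch simply leaves them unused.

```python
import random

POINTS_PER_RECORDING = 4097   # samples per original recording (from the text)
CHUNKS_PER_RECORDING = 23     # parts per recording (from the text)
CHUNK_LENGTH = 178            # samples per part, about one second (from the text)


def split_recording(recording):
    """Cut one 4097-point recording into 23 consecutive 178-point chunks.

    23 * 178 = 4094, so the last 3 samples are left unused (an assumption).
    """
    if len(recording) != POINTS_PER_RECORDING:
        raise ValueError("expected a recording of 4097 samples")
    return [
        recording[i * CHUNK_LENGTH:(i + 1) * CHUNK_LENGTH]
        for i in range(CHUNKS_PER_RECORDING)
    ]


def build_segments(recordings, seed=0):
    """Split every recording and shuffle the resulting chunks."""
    segments = [chunk for rec in recordings for chunk in split_recording(rec)]
    random.Random(seed).shuffle(segments)
    return segments


# Synthetic stand-in for the 500 recordings (placeholder values, not real EEG).
fake_recordings = [[float(i)] * POINTS_PER_RECORDING for i in range(500)]
segments = build_segments(fake_recordings)
print(len(segments), len(segments[0]))  # 11500 178
```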

Data Pre-Processing
During this phase, the EEG data undergo filtering and normalization procedures. The goal is to generate a dataset that is uniform and appropriate for training a detection model. In this approach, any incomplete or anomalous data values are removed from the dataset according to the following procedure:

• We used the linear interpolation procedure for imputation to address the issue of incomplete values in sensor data. This helped eliminate existing noise. To reduce noise that could obscure relevant EEG patterns, we implemented sequential low-pass and median filtering. First, a third-order Butterworth low-pass filter removed high-frequency artifacts above 20 Hz. Next, a third-order median filter replaced each point with the regional median value, calculated from surrounding data. By emphasizing the median, sporadic outliers and anomalies are rejected in favor of the less skewed central tendency. This substitutes misleading erratic spikes with smoothed waveforms reflecting underlying dynamics. The process removes irrelevant deviations so key shape features needed for model training remain, without distortion from sporadic spikes that misrepresent normal or pathological brain activity.
• In addition, we standardized each individual segment of EEG data using a normalization method that involved calculating the mean and standard deviation. This step was crucial in ensuring consistency and comparability across the data.
The normalization process entails using the min-max method to adjust the unprocessed EEG data linearly. After the data are cleaned and normalized, they become the input for the subsequent data preparation and classification procedures. To facilitate the training of the classifier, the data are partitioned into two groups based on the chosen methodology. The first group is used to train the classifier, and the second group is subsequently employed as a test set to assess the performance of the classifier that has been trained.
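As a rough illustration of the filtering and scaling steps above, the following sketch implements a window-of-three median filter, per-segment standardization, and min-max scaling in plain Python. The function names and the handling of edge samples are assumptions for illustration, and the Butterworth low-pass stage is omitted because it would normally rely on a signal-processing library.

```python
import statistics


def median_filter3(signal):
    """Median filter with a window of 3; edge samples are kept as-is (assumption)."""
    out = list(signal)
    for i in range(1, len(signal) - 1):
        out[i] = statistics.median(signal[i - 1:i + 2])
    return out


def standardize(segment):
    """Scale one segment to zero mean and unit standard deviation."""
    mean = statistics.fmean(segment)
    std = statistics.pstdev(segment)
    if std == 0:
        return [0.0 for _ in segment]
    return [(x - mean) / std for x in segment]


def min_max_scale(segment):
    """Linearly map one segment onto the range [0, 1]."""
    lo, hi = min(segment), max(segment)
    if hi == lo:
        return [0.0 for _ in segment]
    return [(x - lo) / (hi - lo) for x in segment]


noisy = [1.0, 2.0, 50.0, 3.0, 4.0]   # one isolated spike at index 2
smoothed = median_filter3(noisy)
print(smoothed)                       # [1.0, 2.0, 3.0, 4.0, 4.0]
print(min_max_scale(smoothed))
```

The spike of 50.0 is replaced by the median of its neighborhood, while the surrounding trend is preserved.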

Convolutional Block
CNNs are a type of neural network commonly used in supervised learning. CNNs have a specific architecture built from convolutional layers, and in their fully connected layers each neuron in one layer is connected to every neuron in the next layer. The input to each neuron is transformed into an output via an activation function [28]. Two key properties of the activation function that impact CNN performance are sparsity (having many zeros) and ability to propagate gradients to lower layers during backpropagation. Overall, CNNs leverage their specialized architecture with sparse activation functions to effectively learn representations of data like images for classification and other supervised learning tasks [29]. In CNNs, pooling operations are frequently employed to reduce dimensionality. Two frequently employed pooling functions are max-pooling and average-pooling, both of which aid in extracting the most significant features from the data.
In this research, ConvB is employed to extract foundational features from unprocessed sensor data. The ConvB structure comprises four components: 1D-convolutional (Conv1D), batch normalization (BN), exponential linear unit (ELU), and max-pooling (MP) layers, as visualized in Figure 3. The Conv1D layer utilizes multiple trainable convolutional kernels to detect distinct features, generating a feature map for each kernel. The BN layer is applied to improve stability and accelerate the training process. Additionally, the ELU layer is integrated to enhance the model's expressive capacity. Furthermore, the MP layer is used to downsize the feature map's dimensions while retaining the most salient features. This amalgamation of layers within the ConvB architecture assists in extracting valuable foundational features from the raw sensor data.
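The Conv1D, BN, ELU, and MP sequence can be sketched for a single channel and a single kernel in plain Python. This is a simplified illustration, not the model's implementation: the kernel values are hypothetical, the pooling size of 2 is assumed, and the normalization step here rescales one feature map rather than computing statistics across a batch with learned scale and shift.

```python
import math


def conv1d(signal, kernel):
    """Valid 1D convolution (cross-correlation, as in DL frameworks)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def batch_norm(values, eps=1e-5):
    """Simplified normalization of one feature map (no learned scale/shift)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def elu(values, alpha=1.0):
    """Exponential linear unit applied element-wise."""
    return [v if v > 0 else alpha * (math.exp(v) - 1.0) for v in values]


def max_pool(values, size=2):
    """Non-overlapping max-pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def conv_block(signal, kernel):
    """Conv1D -> BN -> ELU -> MP, in the order listed in the text."""
    return max_pool(elu(batch_norm(conv1d(signal, kernel))))


segment = [math.sin(0.1 * i) for i in range(178)]   # synthetic 178-point segment
features = conv_block(segment, [0.25, 0.5, 0.25])   # hypothetical kernel weights
print(len(features))  # 88
```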

Bidirectional Gated Recurrent Unit
GRU was introduced to address the issue of gradients either growing too large or vanishing in RNNs. However, the incorporation of memory cells in the LSTM design elevates the memory resource requirements [30]. Unlike the LSTM model, the GRU is a more streamlined version, lacking a separate memory cell in its architecture [31]. Instead, the GRU network integrates update and reset gates, which play a role in determining the adjustments made to each hidden state. These gates control the information that should be propagated to the following state and the data deemed unnecessary, as illustrated in Figure 4a. The calculation of the hidden state h_t at time t in a GRU model involves utilizing the update gate z_t, reset gate r_t, current input x_t, and the prior hidden state h_{t-1}. These components work together to manage the flow of information in the GRU network efficiently.
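For reference, the standard GRU update can be written as follows, with $\sigma$ standing for the sigmoid function. The weight matrices $W$, $U$ and biases $b$ are notation assumed here, and the way $z_t$ blends the old state with the candidate state $\tilde{h}_t$ is one of two conventions in common use:

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t \oplus U_z h_{t-1} \oplus b_z\right) \\
r_t &= \sigma\left(W_r x_t \oplus U_r h_{t-1} \oplus b_r\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t \oplus U_h \left(r_t \otimes h_{t-1}\right) \oplus b_h\right) \\
h_t &= \left(1 - z_t\right) \otimes h_{t-1} \oplus z_t \otimes \tilde{h}_t
\end{aligned}
```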
where σ denotes the sigmoid function, ⊕ denotes element-wise addition, and ⊗ denotes element-wise multiplication.
In 1997, Schuster and Paliwal [32] introduced the bidirectional RNN (BiRNN) to address limitations of traditional unidirectional RNNs. The key innovation was to have the outputs at each time step incorporate contextual information not just from preceding time steps but also future time steps. This is achieved by training two separate hidden states, one processing the inputs in a forward direction and the other processing in reverse. The outputs at each time step are then computed based on the hidden states from both directions. By leveraging future context as well as past, BiRNNs can better model sequential data like text or speech compared to unidirectional RNNs. This bidirectional architecture has become widely adopted for sequence modeling tasks.
Within this BiRNN framework, the neurons in a standard RNN are segregated into two separate components: one dedicated to processing information in the forward (positive time) direction and the other focused on information from the opposite (negative time) direction. Notably, the output of the forward neurons is independent of the backward neurons, and vice versa. This arrangement gives rise to a general structure illustrated in Figure 4b. The underlying computational processes are outlined by the equations provided below.
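A common way to write the bidirectional computation for a GRU is sketched here. The arrow notation and the use of concatenation to merge the two directions are assumptions, since other merge rules such as summation are also used in practice:

```latex
\begin{aligned}
\overrightarrow{h}_t &= \mathrm{GRU}\left(x_t, \overrightarrow{h}_{t-1}\right) \\
\overleftarrow{h}_t &= \mathrm{GRU}\left(x_t, \overleftarrow{h}_{t+1}\right) \\
y_t &= \left[\overrightarrow{h}_t ; \overleftarrow{h}_t\right]
\end{aligned}
```

Here $\overrightarrow{h}_t$ depends only on inputs up to time $t$, $\overleftarrow{h}_t$ depends only on inputs from time $t$ onward, and $y_t$ is the combined output at time $t$.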

ECA Mechanism
The channel attention strategy holds great promise for boosting the effectiveness of deep CNNs. However, most current approaches focus on developing intricate attention components to enhance effectiveness, inadvertently leading to more complex models and increased processing demands. A solution known as ECA has been proposed to address the challenges of overfitting and high computational needs [33]. This ECA module determines weights for individual channels and captures interrelationships among different channels.
In time series data, the norm involves assigning higher weights to crucial features while assigning lower weights to less relevant ones. Herein, the ECA method takes on the responsibility of prioritizing pertinent information, thus bolstering the network's capacity to discern and respond to pivotal characteristics. The configuration of the ECA module is visually represented in Figure 5. The channel weights in ECA are generated by applying a 1D convolution with an adaptively chosen kernel size (k) to the aggregated data obtained through global average pooling (GAP). The value of k is determined by a mapping from the channel dimension (C).
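The ECA computation can be sketched in plain Python as follows. The kernel-size mapping uses the rule from the ECA proposal, k = |log2(C)/γ + b/γ| rounded to an odd number, with γ = 2 and b = 1 assumed as defaults because the text does not state them; the convolution weights and the zero padding at the channel edges are likewise hypothetical choices for illustration.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Map the channel count C to an odd 1D kernel size k (gamma, b assumed)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def eca(feature_map, kernel):
    """Apply channel attention to a list of channels (each a list of samples).

    kernel holds the 1D convolution weights and must have odd length.
    """
    # Global average pooling: one descriptor per channel.
    pooled = [sum(ch) / len(ch) for ch in feature_map]
    # 1D convolution across neighbouring channels, zero-padded to keep length.
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    weights = [
        sigmoid(sum(padded[c + j] * kernel[j] for j in range(len(kernel))))
        for c in range(len(pooled))
    ]
    # Rescale every channel by its attention weight.
    return [[w * v for v in ch] for w, ch in zip(weights, feature_map)]


print(eca_kernel_size(64), eca_kernel_size(256))   # 3 5
toy_map = [[1.0, 3.0], [2.0, 2.0], [4.0, 0.0]]      # 3 channels, 2 samples each
print(eca(toy_map, [0.1, 0.5, 0.1]))                # hypothetical kernel weights
```

Because the kernel only spans k neighbouring channels, the number of attention parameters stays at k regardless of C, which is what keeps the module lightweight.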


Training and Hyperparameters
Model performance hinges critically on sufficient volumes of representative, varied data for training, along with properly tuned design parameters called hyperparameters, such as iteration counts, learning rates, batch size, and activation functions. We applied a standardized approach separating data into training (for hyperparameter optimization) and holdout validation sets (for independent comparative testing). Guided by trial and error, the following settings maximized eventual accuracy: a batch size of 128, a learning rate of 1 × 10⁻³, 200 epochs, and adaptive logic to cut the learning rate by 25% after 10 stagnant epochs lacking improvement. Beyond tuning, we enabled data shuffling before each epoch for robustness. For model specification, we utilized an Adam optimizer for weight adjustments and cross-entropy for error quantification. Refer to Table 2 for full architectural details, including all set hyperparameters for our customized ResNet-BiGRU-ECA framework.
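The learning-rate rule described above can be sketched as a small scheduler. The class name is hypothetical, and monitoring the validation loss is an assumption, since the text says only that the rate is cut after 10 epochs without improvement.

```python
class ReduceOnPlateau:
    """Cut the learning rate by 25% after `patience` epochs without improvement."""

    def __init__(self, lr=1e-3, factor=0.75, patience=10):
        self.lr = lr
        self.factor = factor          # 0.75 keeps 75% of the rate, a 25% cut
        self.patience = patience
        self.best = float("inf")
        self.stale_epochs = 0

    def step(self, val_loss):
        """Update the schedule with this epoch's monitored loss; return the rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                self.lr *= self.factor
                self.stale_epochs = 0
        return self.lr


scheduler = ReduceOnPlateau()
scheduler.step(1.0)                    # first epoch sets the best loss
for _ in range(10):                    # ten stagnant epochs in a row
    current_lr = scheduler.step(1.0)
print(current_lr)                      # about 7.5e-4 after the first cut
```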

Cross Validation
The k-fold cross-validation (k-CV) procedure is a technique employed to assess the performance of a classification model [34]. This method entails partitioning a dataset, collected from one or more sources, into k approximately equal-sized, distinct, and non-overlapping subsets. Each of these subsets is then utilized to evaluate the model, which is trained on the remaining k − 1 subsets. The overall performance of the model is determined by computing the mean of various performance metrics, such as accuracy, precision, recall, and F-measure, all derived from the k-CV process [35].
Nonetheless, it is important to acknowledge that this method can pose significant computational demands, particularly when working with extensive datasets or when k is set to a high value. In our study, we have employed the k-CV procedure with a chosen value of k = 5, as depicted in Figure 6, to assess the effectiveness of our models.
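The partitioning step can be sketched as follows for k = 5 and the 11,500 samples of the dataset. The function name and the fixed shuffling seed are assumptions for illustration; stratification by class is not mentioned in the text and is not included here.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k non-overlapping folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(indices[start:start + size])
        start += size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(11500, k=5))
for train, test in splits:
    print(len(train), len(test))   # 9200 2300 for every fold
```

Each sample appears in exactly one test fold, so the mean of the per-fold metrics uses every sample once for evaluation.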

Performance evaluation
To evaluate the performance of the deep learning models used in this work, four evaluation metrics are calculated under a validation protocol of 5-fold cross-validation. These four metrics (accuracy, precision, recall, and F1-score) are among the most common performance measures in deep learning research, and each is defined in terms of the following counts. A recognition is defined as a true positive (TP) for the considered class and a true negative (TN) for all other classes. A sample belonging to another class may be misclassified as belonging to the considered class, creating a false positive (FP) recognition of that class, while a sample belonging to the considered class may be misclassified as belonging to another class, creating a false negative (FN) recognition of that class.
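The standard definitions of these metrics in terms of TP, TN, FP, and FN can be written as a short Python sketch. The function name and the toy labels are illustrative assumptions; the formulas are the usual ones: accuracy = (TP + TN) / (TP + TN + FP + FN), precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 = 2 × precision × recall / (precision + recall).

```python
def per_class_metrics(y_true, y_pred, positive):
    """Compute accuracy, precision, recall and F1 for one class against the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: 1 marks the seizure class, 0 marks any non-seizure class.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = per_class_metrics(y_true, y_pred, positive=1)
print(metrics)   # accuracy 0.75; precision, recall and F1 all about 0.667
```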

Experimental Results and Discussion
In this section, we will detail the experimental setup and showcase the outcomes acquired through our assessment of DL models for the purpose of identifying epileptic seizures.

Experiments
The experiments in this study were conducted using Google Colab Pro, which provides access to Tesla V100 GPUs. The implementation was done in Python 3.6.9, utilizing key libraries, including TensorFlow 2.2.0 for building the neural network models, Keras 2.3.1 for the high-level API, Scikit-Learn for machine learning utilities, Numpy 1.18.5 for numerical processing, and Pandas 1.0.5 for data manipulation. By leveraging Google Colab and these state-of-the-art libraries, the experiments could be efficiently run on powerful hardware, enabling the exploration of deep neural network architectures for the research questions under investigation.
This empirical study compared five main DL architectures for the application of ESD on the ESRD dataset. The models tested were CNNs, LSTMs, bidirectional LSTMs (BiLSTMs), GRUs, and BiGRUs. These represented the state-of-the-art approaches. Our proposed model, called ResNet-BiGRU-ECA, was benchmarked against these models to assess its performance at identifying seizures relative to conventional architectures. By evaluating both unidirectional and bidirectional variants of LSTM and GRU models, we aimed to thoroughly compare our novel approach to existing methods using these epileptic seizure data.
Additionally, we conducted an in-depth investigation into various CNN backbone models. Specifically, we examined VGG16 [36], ResNet18 [37], Pyramid-Net18 [38], Inception [39], and Xception [40] to perform a comprehensive experimental comparison analysis. These models were considered as potential solutions for addressing the challenge of time-series classification. Consequently, we restructured each model to suit the context of ESD.
In this study, we conducted research on two distinct scenarios involving EEG data from the ESRD. In each of these scenarios, we utilized separate datasets for both training and testing DL models, as detailed in Table 3.

Experimental Results
In each of our experiments, we used the ESRD dataset for training DL models. To assess these models, we employed a 5-CV approach. Our study centered on assessing the effectiveness of five core DL models (CNN, LSTM, BiLSTM, GRU, and BiGRU), in addition to cutting-edge DL models, within the framework of the two situations outlined in Tables 4 and 5. The DL models underwent training and testing using EEG data from scenario I, as detailed in Table 4. Our experiments revealed that the ResNet-BiGRU-ECA model, as proposed, demonstrated exceptional efficiency, boasting an average accuracy of 0.998 and an average F1-score of 0.998. The DL models underwent training and testing with EEG data sourced from scenario II, as specified in Table 5. Upon analyzing the experiment results, it became evident that the ResNet-BiGRU-ECA model, as initially proposed, displayed the highest level of effectiveness. Its impressive performance supports this conclusion, boasting an average accuracy rate of 0.996 and an average F1-score of 0.994.

Comparative Results with ML Models
Guided by prior analyses [5], we selected leading ML classifiers for comparative benchmarking, including k-nearest neighbors (KNN), naive Bayes (NB), logistic regression (LR), random forest (RF), decision trees (DT), stochastic gradient descent classifier (SGDC), and gradient boosting (GB). Recent studies confirm the utility of these algorithms paired with neurologists in accurately detecting seizures and characterizing epileptiform EEG dynamics [41,42]. Our experiments leveraged the standard ESRD dataset under equivalent scenarios to examine model performance variability when using raw EEG readings versus extracted feature sets as inputs. Table 6 presents accuracy outcomes with our proposed deep ResNet-BiGRU-ECA architecture versus these widely adopted shallow ML approaches. By evaluating on equal inputs, we aimed to isolate the performance gains stemming solely from algorithmic and architectural optimizations rather than data pre-processing. Results showed NB reached 95% accuracy, the sole ML technique with comparable proficiency. However, our deep ResNet-BiGRU-ECA architecture significantly exceeded all benchmark methods under both data scenarios. Surpassing 99% accuracy and 99% F1, the optimized network architecture demonstrated superior feature extraction and pattern recognition even from raw EEG readings. This substantial performance gap despite shared inputs suggests deep networks intrinsically outperform shallow ML at deriving nuanced relationships within complex physiological signals. Rather than relying on predefined assumptions in simplified models, DL constructs intricate representations via hierarchical data transformations. Though some conventional methods approach sufficiency for classification tasks, deep neural networks attain state-of-the-art performance by learning intricate embeddings uniquely tuned to the intricacies of the data through backpropagation and gradient descent optimization.

Comparative Results with DL Models
The ResNet-BiGRU-ECA model under examination is subjected to a comparative analysis alongside previously developed models using the same dataset, specifically the ESRD dataset. Prior research studies [36-40] have consistently shown that leveraging CNN-based DL architectures yields remarkable results in the field of time-series classification. The existing literature introduced the 5-CV methodology, which we also employed in our study. The summary of comparative results can be found in Tables 7 and 8. These outcomes reveal that the ResNet-BiGRU-ECA model, as presented here, exhibits superior accuracy compared to the earlier models in the majority of cases.

Effect of BiGRU and ECA Modules
Additional investigations were conducted to provide a more comprehensive assessment of the BiGRU and ECA modules within the proposed ResNet-BiGRU-ECA architecture. As depicted in Table 9, the findings clearly indicate that both the BiGRU module and the ECA module contribute significantly to enhancing the model's efficiency in classifying data from the two distinct scenarios. A checkmark (✓) indicates that the module is included in the given variation of the ResNet model and a dash (-) indicates that it is not included in that ResNet model variation.

Conclusions and Future Work
Epilepsy, a neurological disorder, can be significantly mitigated if detected early. This study introduces a novel hybrid DL model named ResNet-BiGRU-ECA, designed to accurately identify epileptic seizures using EEG signals. This model combines residual blocks, an ECA module, and a BiGRU to recognize epileptic seizures in pre-processed multichannel EEG recordings. To gauge the efficiency of our suggested model, we carried out a thorough assessment by contrasting it with five fundamental DL models and leading time-series classification models, all applied to the same publicly available dataset. The DL models underwent training and evaluation using a 5-CV approach. We analyzed model performance using standard evaluation metrics, and the ResNet-BiGRU-ECA model consistently outperformed other models, achieving an average accuracy of 0.998 and an F1-score of 0.998. Furthermore, our model surpassed most existing systems in terms of effectiveness when compared using the same dataset. The objective of this research is to make a meaningful contribution to the field of neurology by investigating the potential benefits of employing EEG data in the context of epilepsy detection and related matters. Our primary goal is to investigate the possibility of reducing examination duration while simultaneously improving diagnostic efficiency and effectiveness.
In future research, we intend to employ the ResNet-BiGRU-ECA model to address other detection issues relying on EEG signals. Additionally, we plan to delve into the explainability of our model to gain insights into the mechanisms and reasoning behind its accurate decision-making process.

Figure 1. The epileptic seizure detection (ESD) framework proposed in this study.
Numerous analyses have recommended simplifying the multi-class categorization challenge embodied in the UCI seizure dataset, which segregates EEG data into five unique groups as shown in Figure 2. Rather than attempting to draw boundaries between all categories, they propose concentrating specifically on discriminating class one, which represents active epileptic seizures, from the aggregate signals of the other classes, which signify ordinary non-seizure brain states. This binary framework focuses model resources on the most vital differentiation between pathological and healthy function. Though it sacrifices granularity within normal activity subtypes, enhanced accuracy and reliability in identifying the central seizure condition demonstrate the sensibility of this targeted approach.

Figure 4. The structure of the ResNet-BiGRU-ECA model introduced in this study: (a) GRU cell and (b) unrolled BiGRU.

Figure 5. The structure of the efficient channel attention (ECA) block.

Table 1. Summarized details of the epileptic seizures recognition dataset (ESRD).

Table 2. The summary of hyperparameters of the ResNet-BiGRU-ECA used in this work.

Table 3. A compilation of experiments applied in this study.

Table 4. Experimental results of the deep learning (DL) networks trained and tested using electroencephalography (EEG) from scenario I.

Table 5. Experimental results of the deep learning (DL) networks trained and tested using electroencephalography (EEG) from scenario II.

Table 6. Comparative results of the proposed model and machine learning (ML) models using electroencephalography (EEG) data from the epileptic seizures recognition dataset (ESRD).

Table 7. A side-by-side comparison of the outcomes achieved by the proposed model and state-of-the-art deep learning (DL) networks, all trained and tested using electroencephalography (EEG) data from scenario I.

Table 8. A side-by-side comparison of the outcomes achieved by the proposed model and state-of-the-art deep learning (DL) networks, all trained and tested using electroencephalography (EEG) data from scenario II.

Table 9. A comparative analysis of ResNet-based models, with and without the bidirectional gated recurrent unit (BiGRU) and efficient channel attention (ECA) modules.