Epileptic Seizures Detection Using Deep Learning Techniques: A Review

A variety of screening approaches have been proposed to diagnose epileptic seizures, using electroencephalography (EEG) and magnetic resonance imaging (MRI) modalities. Artificial intelligence encompasses a variety of areas, and one of its branches is deep learning (DL). Before the rise of DL, conventional machine learning algorithms relying on handcrafted feature extraction were used, which limited their performance to the skill of those designing the features. In DL, by contrast, feature extraction and classification are entirely automated. The advent of these techniques has led to significant advances in many areas of medicine, including the diagnosis of epileptic seizures. In this study, a comprehensive overview of works focused on automated epileptic seizure detection using DL techniques and neuroimaging modalities is presented. Various methods proposed to diagnose epileptic seizures automatically using EEG and MRI modalities are described. In addition, rehabilitation systems developed for epileptic seizures using DL have been analyzed, and a summary is provided. The rehabilitation tools include cloud computing techniques and hardware required for implementation of DL algorithms. The important challenges in accurate automated detection of epileptic seizures using DL with EEG and MRI modalities are discussed. The advantages and limitations of employing DL-based techniques for epileptic seizure diagnosis are presented. Finally, the most promising DL models proposed and possible future works on automated epileptic seizure detection are delineated.


Introduction
Epilepsy is a noncommunicable disease and one of the most common neurological disorders in humans, usually associated with sudden attacks [1]. These seizures are swift, early abnormalities in the electrical activity of the brain that disrupt part or all of the body [2]. Various kinds of epileptic seizures affect around 60 million people worldwide [3]. These attacks occasionally provoke cognitive disorders that can cause severe physical injury to the patient. Moreover, people with epileptic seizures sometimes suffer emotional distress due to embarrassment and lack of appropriate social status. Hence, early detection of epileptic seizures can help patients and improve their quality of life.
EEG signals are widely preferred as they are economical, portable, and show clear rhythms in the frequency domain [8,9]. The EEG provides the voltage variations produced by the ionic currents of neurons in the brain, which indicate the brain's bioelectric activity [15]. The signals need to be recorded for a long period of time to detect epileptic seizures. In addition, they are recorded over multiple channels, making the analysis complex. EEG signals are also prone to artifacts generated by the mains power supply, electrode movement, and muscle tremor [16]. This makes it challenging for physicians to diagnose epileptic seizures from noisy EEG signals. To resolve these difficulties, much research is being carried out to diagnose and predict epileptic seizures based on EEG and other modalities such as MRI, coupled with AI techniques [17,18]. AI techniques in the field of epileptic seizure diagnosis have employed conventional machine learning and DL methods [19][20][21][22].
Many machine learning algorithms have been developed using statistical, time, frequency, time-frequency domain and nonlinear parameters to detect epileptic seizures [23,24]. In conventional machine learning techniques, features and classifiers are selected by trial and error [25,26]. One needs sound knowledge of signal processing and data mining techniques to develop an accurate model. Such models perform well for limited data. Nowadays, with the increase in the availability of data, machine learning techniques may not perform very well. Hence, DL techniques, which are the state-of-the-art methods, have been employed [27,28]. DL models, unlike conventional machine learning techniques, require huge amounts of data in the training phase [29]. This is because these models have very large feature spaces and, when data are scarce, they face the problem of overfitting [29].
In conventional machine learning, most simulations were executed in the Matlab software environment, whereas DL models are usually developed in the Python programming language with numerous open-source toolboxes. Python, with its freely available DL toolboxes, has helped researchers to develop novel automated systems, and computation resources have become more accessible to everyone thanks to cloud computing. Figure 1 shows that TensorFlow and one of its high-level APIs, Keras, are widely used for epileptic seizure detection using DL in the reviewed works due to their versatility and applicability. Since 2016, substantial research has been done to detect epilepsy using DL models such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), deep belief networks (DBNs), autoencoders (AEs), CNN-RNNs, and CNN-AEs [30][31][32][33]. The number of studies in this area using DL is growing as new efficient models are proposed. Figure 2 provides an overview of the number of studies conducted using various DL models from 2014 to 2021 in detecting epileptic seizures. It can be noted from Figure 2 that various DL models have been exploited in the diagnosis of epileptic seizures and that, compared to other DL techniques, 2D-CNN and 1D-CNN models are the most widely used. Researchers have mostly employed 2D-CNN models to diagnose epilepsy. In these models, EEG signals are first converted into two-dimensional (2D) images using preprocessing methods such as the short-time Fourier transform (STFT). Next, these images are applied to 2D-CNN networks. The second category comprises 1D-CNN models, which have achieved a special place among researchers for epileptic seizure detection. In this approach, EEG signals are first preprocessed (noise removal and normalization) and then applied to 1D-CNN networks. Simple implementation and high efficiency are among the most important advantages of this type of network.
The keywords "EEG", "MRI", "Epilepsy", "Epileptic Seizures", and "Deep Learning" were used to search articles. These keywords were searched in various citation databases such as IEEE, Elsevier, Springer, Wiley, and ArXiv. Google Scholar was also used to search further. Figure 3 shows the number of accepted papers in each citation database. It is observed that the IEEE citation database contains the most accepted articles. The main aims of this study are as follows:

• Providing information on available EEG datasets;
• Reviewing works done using various DL models for automated detection of epileptic seizures with various modality signals;
• Introducing future challenges on the detection of epileptic seizures;
• Analyzing the best performing model for various modalities of data.
Epileptic seizure detection using DL is discussed in Section 2. Section 3 describes non-EEG-based epileptic seizure detection. Hardware used for epileptic seizure detection is described in Section 4. A discussion of the reviewed works is given in Section 5. The challenges faced in employing DL methods for epileptic seizure detection are summarized in Section 6. Finally, the conclusion and future work are delineated in Section 7. Figure 4 illustrates the working of a computer-aided diagnosis system (CADS) for epileptic seizures using DL architectures. The input to the DL model can be EEG, MEG, ECoG, fNIRS, PET, SPECT, or MRI data. The signal is then preprocessed to remove noise. These denoised signals are used to develop the DL models. The performance of the model is evaluated using accuracy, sensitivity, and specificity. Additionally, a table combining all the works conducted on epileptic seizure detection using DL is presented in Appendix A of the paper.
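The three evaluation measures named above can be computed from the counts of a binary confusion matrix. The following is a minimal sketch, assuming labels are encoded as 1 (seizure) and 0 (non-seizure); the function name `binary_metrics` and the label encoding are illustrative assumptions, not taken from the reviewed works.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels.

    Assumed encoding: 1 = seizure, 0 = non-seizure.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # seizures detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-seizures kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed seizures

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: fraction of true seizure segments that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of non-seizure segments correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 0, 1, 0]
    print(binary_metrics(truth, preds))
```

Because seizure segments are usually far rarer than non-seizure segments, accuracy alone can look high for a model that misses most seizures, which is why sensitivity and specificity are reported alongside it.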

Freiburg
The EEG dataset contains invasive EEG signals from 21 patients suffering from refractory focal epilepsy, recorded during pre-surgical epilepsy monitoring at the epilepsy center of the University Hospital Freiburg. To record directly from the focal area, reduce artifacts, and achieve a higher signal-to-noise ratio (SNR), intracortical grid, strip, and depth electrodes were used. The EEG signals were recorded using a 128-channel Neurofile NT system with six electrode contacts (three focal and three extra-focal) and digitized by a 16-bit A/D converter at a sample rate of 256 Hz. For each patient, there are ictal and interictal data; the former contain seizures with at least 50 min of pre-ictal recording, and the latter contain about 24 h of seizure-free EEG data [34].

CHB-MIT
The database comprises 844 h of continuous recording of scalp EEG signals with 163 seizures from 23 children, recorded according to the international 10-20 standard electrode positions and sampled at 256 samples per second. The inter-ictal region is defined as the period between at least 4 h before seizure onset and 4 h after the seizure has ended. There are two types of seizures, called combined and main seizures, available in this database. The former are multiple seizures close to each other, while the latter are the principal seizures considered for prediction. Generally, prediction is meaningful for patients having fewer than 10 seizures per day. In this database, sufficient data (at least three main seizures and 3 h of inter-ictal recording) are available from 13 patients [35].

Kaggle
The database comes from the epileptic seizure prediction challenge of the American Epilepsy Society and contains intracranial EEG signals from five dogs and two patients, with 48 seizures and a total duration of 627 h. The EEG signals of the dogs were acquired by 16 implanted electrodes and sampled at 400 Hz, while the EEG signals from patient 1 and patient 2 were recorded using 15 depth and 24 subdural electrodes, respectively, at a sample rate of 5 kHz. In this database, 10 min segments of pre-ictal and inter-ictal data are available, and for each seizure, six pre-ictal segments (with 10 s distance) up to five minutes before seizure onset are accessible. The inter-ictal segments are selected randomly at least one week before each seizure [36].

Bonn
The Bonn database consists of five datasets, A, B, C, D, and E, each containing 100 single-channel EEG signals of 23.6 s duration. The EEG signals were digitized at a sample rate of 173.61 Hz by a 12-bit A/D converter. Datasets A and B contain the normal signals of five volunteers with eyes open and eyes closed, respectively. The EEG signals of datasets C and D are related to the pre-ictal region and were recorded from the epileptogenic and left area of the hippocampus, respectively. The EEG signals of dataset E are related to the ictal region. Signals of datasets A and B were recorded using the 10-20 scalp EEG standard, while the signals of C and D are intracranial EEG recorded using depth electrodes, and the signals of E were obtained using both depth and strip electrodes. Depth electrodes are located symmetrically on the hippocampus, while strip electrodes are located on the lateral and basal sections of the neocortex [37].

Flint-Hills
The database presents electrocorticography (ECoG) signals with a total duration of 1419 h and a sample rate of 249 Hz. In addition, meta information about 59 seizures and the positions of the electrodes is provided. The signals of this database were obtained using 48 to 64 electrodes per patient [26].

Bern Barcelona
The Bern-Barcelona database was collected from the brain department of Bern Hospital of Barcelona and contains intracranial EEG of patients with focal epilepsy. Subjects were monitored for several days without antiepileptic drugs in order to characterize their seizures and evaluate possible surgery. The signals were acquired using AD-Tech intracortical electrodes, and one extra reference electrode, placed between the PZ and FZ positions of the 10-20 standard, was used. The database contains two types of EEG signals: focal and extra-focal. Each dataset contains 3750 pairs of simultaneously recorded signals with a duration of 20 s and a sample rate of 512 Hz. The database consists of a total of 83 h of EEG data from five patients of different ages [38].

Hauz Khas
The database was collected at a brain center in Hauz Khas, Delhi, India, and comprises scalp EEG signals of 10 patients, recorded with an AS40 system and sampled at a rate of 200 Hz. The signals were filtered using a band-pass filter with a passband of 0.5-70 Hz and classified into pre-ictal, inter-ictal, and ictal classes by expert neurologists [26].

Zenodo
This dataset contains multichannel EEG recordings of 79 human neonates collected at Helsinki University Hospital, with a median recording duration of 74 min. The EEG data were annotated by three experts, each of whom annotated about 460 seizures. By consensus, 39 neonates had seizures and 22 neonates were seizure-free [26].
The supplementary information on each dataset is listed in Table 1. Figure 5 shows the number of times each dataset was employed for epileptic seizure detection using DL techniques. It can be observed that the Bonn dataset is the most widely used for automated detection of seizures using DL methods.

Preprocessing
In developing CADS using DL models with EEG signals, the preprocessing involves three steps: noise removal, normalization, and signal preparation for DL network applications [29,40]. In the noise removal step, finite impulse response (FIR) or infinite impulse response (IIR) filters are usually used to eliminate extra signal noise. Normalization is then performed using various schemes such as the z-score technique. Finally, different time domain, frequency, and time-frequency methods are employed to prepare the signals for the deployment of deep networks.
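The three preprocessing steps described above can be sketched with NumPy alone. This is only an illustration under stated assumptions: the 40 Hz low-pass cutoff, the 101-tap Hamming-windowed FIR design, the 2 s window, and the 50% overlap are hypothetical choices, not values prescribed by the reviewed works, and the helper names are invented for this example.

```python
import numpy as np


def lowpass_fir(cutoff_hz, fs, num_taps=101):
    """Windowed-sinc low-pass FIR taps (Hamming window), unity gain at DC."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    taps = np.sinc(2 * cutoff_hz / fs * n) * np.hamming(num_taps)
    return taps / taps.sum()


def zscore(x):
    """Z-score normalization: zero mean, unit standard deviation."""
    return (x - x.mean()) / (x.std() + 1e-12)


def segment(x, win_len, hop):
    """Cut a 1D signal into windows of win_len samples, hop samples apart."""
    starts = range(0, len(x) - win_len + 1, hop)
    return np.stack([x[s:s + win_len] for s in starts])


def preprocess(signal, fs, cutoff_hz=40.0, win_s=2.0, overlap=0.5):
    """Noise removal, normalization, then windowing for a single channel."""
    filtered = np.convolve(signal, lowpass_fir(cutoff_hz, fs), mode="same")
    normalized = zscore(filtered)
    win_len = int(win_s * fs)
    hop = max(1, int(win_len * (1 - overlap)))
    return segment(normalized, win_len, hop)


if __name__ == "__main__":
    fs = 256
    t = np.arange(10 * fs) / fs
    # Synthetic test signal: a 10 Hz rhythm plus 100 Hz interference.
    x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 100 * t)
    print(preprocess(x, fs).shape)
```

The returned array has one row per window, which is the shape a 1D network would typically consume; a band-pass or IIR design could be substituted for the low-pass filter without changing the rest of the pipeline.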

Review of Deep Learning Techniques
In contrast to conventional neural networks, or so-called shallow networks, deep neural networks are structures with more than two hidden layers. This increase in the size of the networks results in a massive rise in the number of parameters of the network, requiring appropriate methods for learning, and also measures to avoid overfitting of the learned network. Convolutional networks use filters convolved with input patterns instead of multiplying a weight vector (matrix), which reduces the number of trainable parameters dramatically.
Furthermore, other methods have been suggested to help the network learn [41]. Pooling layers reduce the size of the input pattern passed to the next convolutional layer. Batch normalization, dropout, early stopping, unsupervised or semi-supervised learning, and regularization techniques prevent the learned network from overfitting and increase the learning ability and speed. AEs and DBNs are first trained in an unsupervised manner and then fine-tuned, to avoid overfitting when labeled data are limited. Long short-term memory (LSTM) and gated recurrent units (GRU) are RNNs capable of revealing the long-term time dependencies of data samples.
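The parameter saving from convolution and the size reduction from pooling mentioned above can be checked with simple arithmetic. The sketch below assumes a hypothetical 1024-sample single-channel input (4 s at 256 Hz) and a hypothetical layer of 16 filters of width 9; these sizes are chosen only for illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv1d_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a 1D convolutional layer (shared filters)."""
    return kernel_size * in_channels * out_channels + out_channels


def max_pool1d(x, size):
    """Non-overlapping max pooling: keep the largest value in each block."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


if __name__ == "__main__":
    # A dense layer mapping 1024 inputs to 1024 outputs.
    print(dense_params(1024, 1024))      # 1049600
    # Sixteen width-9 filters slid over the same single-channel input.
    print(conv1d_params(9, 1, 16))       # 160
    # Pooling with size 2 halves the length of the pattern.
    print(max_pool1d([1, 3, 2, 5, 4, 0], 2))  # [3, 5, 4]
```

The convolutional layer's parameter count does not depend on the input length at all, which is the source of the dramatic reduction relative to a weight matrix.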

Convolutional Neural Networks (CNNs)
CNNs are one of the most popular classes of DL networks, to which much of the research in machine learning has been devoted [30]. They were initially presented for image-processing applications, but have recently been adapted to one- and two-dimensional architectures for the diagnosis and prediction of diseases using biological signals [42]. This class of DL networks is widely used for the detection of epileptic seizures using EEG signals. In two-dimensional convolutional neural networks (2D-CNNs), the one-dimensional (1D) EEG signals are first transformed into two-dimensional plots by employing visualization methods such as the spectrogram [43], higher-order bispectrum [44,45], and wavelet transforms, and are then applied to the input of the convolutional network. In 1D architectures, the EEG signals are applied in one-dimensional form to the input of the convolutional network. In these networks, changes are made to the core architecture of the 2D-CNN that make it capable of processing 1D EEG signals. Since both 2D and one-dimensional convolutional neural networks (1D-CNNs) are used in the field of epileptic seizure detection, they are investigated separately.
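The spectrogram route from a 1D EEG segment to a 2D input can be sketched as a short-time Fourier transform followed by scaling to an image-like array. This is a minimal NumPy sketch; the Hann window, the 256-sample window length, the 128-sample hop, and the log scaling are assumptions made for illustration and are not taken from any specific reviewed study.

```python
import numpy as np


def stft_magnitude(signal, fs, win_len=256, hop=128):
    """Magnitude STFT; returns (frequencies in Hz, array of freq bins x frames)."""
    window = np.hanning(win_len)
    starts = range(0, len(signal) - win_len + 1, hop)
    frames = np.stack([signal[s:s + win_len] * window for s in starts])
    spec = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    return freqs, spec


def to_image(spec):
    """Log-compress and rescale to [0, 1] so it can be fed to a 2D-CNN."""
    log_spec = np.log1p(spec)
    lo, hi = log_spec.min(), log_spec.max()
    return (log_spec - lo) / (hi - lo + 1e-12)


if __name__ == "__main__":
    fs = 256
    t = np.arange(4 * fs) / fs
    x = np.sin(2 * np.pi * 10 * t)        # synthetic 10 Hz rhythm
    freqs, spec = stft_magnitude(x, fs)
    image = to_image(spec)
    print(image.shape, freqs[spec.mean(axis=1).argmax()])
```

Each column of the resulting array is the spectrum of one time window, so rhythmic activity appears as a horizontal band at its frequency.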

A. 2D Convolutional Neural Networks (2D-CNNs)
Nowadays, deep 2D networks are used for various medical applications such as the diagnosis of COVID-19 from CT and X-ray images [46,47] and of autism spectrum disorders from MRI modalities [48]. In 2012, Krizhevsky et al. [49] first proposed this type of network to solve image classification problems; similar networks were then quickly adopted for different tasks such as medical image classification, in an effort to obviate the difficulties of previous networks and solve more intricate problems with better performance. Figure 6 shows a general form of a 2D-CNN used for epileptic seizure detection. The 2D-CNN is arguably the most important architecture among deep neural networks. More information about visualization and preprocessing methods can be found in Appendix A. In one study [50], SeizNet, a 16-layer convolutional network, is introduced, with additional dropout layers and batch normalization (BN) after each convolutional layer and a structure similar to that of VGG-Net. The authors in [51] presented a new 2D-CNN model that can extract the spectral and temporal characteristics of EEG signals and used them to learn the general structure of seizures. Zuo et al. [52] detected high-frequency oscillations (HFOs) in epilepsy from EEG signals using a 16-layer 2D-CNN. A DL framework called SeizureNet, which uses convolutional layers with dense connections, is proposed in [53]. A novel DL model called the temporal graph convolutional network (TGCN) was introduced by Covert et al. [54], comprising five architectures with 14, 18, 22, 23, and 26 layers. Bouaziz et al. [55] split the 23-channel EEG signals of CHB-MIT into 2 s time windows and then converted them into density images (spatial representation), which were fed as inputs to the CNN.

B. AlexNet
Fei-Fei Li, a professor at Stanford University, created a dataset of labeled images of real-world objects and named the project ImageNet [56]. ImageNet organizes an annual computer vision competition, ILSVRC, on image classification problems. Alex Krizhevsky revolutionized image classification with his network, AlexNet, which won the 2012 ImageNet challenge and started the whole DL era [49]. AlexNet won the competition by achieving a top-5 test accuracy of 84.6%. Taqi et al. [57] used the AlexNet network to diagnose focal epileptic seizures. The proposed network was used for feature extraction, followed by a Softmax layer for classification, and achieved 100% accuracy. In another study [58], the AlexNet network was employed after transforming the 1D signal into a 2D image by passing it through the Signal2Image (S2I) module. The S2I methods used in that study are signal as image, spectrogram, one-layer 1D-CNN, and two-layer 1D-CNN.

C. VGG
A research team at Oxford proposed the visual geometry group (VGG) model in 2014 [59]. They configured various models, one of which, VGG-16, was submitted to the ILSVRC 2014 competition. VGG-16 comprises 16 layers and delivered excellent performance on image classification problems. Ahmedt-Aristizabal et al. [60] employed the VGG-16 architecture to diagnose epilepsy from facial images. Their proposed approach attempted to extract and classify semiological patterns of facial states automatically. After the images are recorded, the proposed VGG architecture is first trained on well-known datasets and is then followed by various networks such as 1D-CNN and LSTM in the last few layers. In [58], the VGG network was applied to one-dimensional and two-dimensional signals. To train the models, the Adam optimizer and a cross-entropy error function were used, with a batch size of 20 and 100 epochs. The idea of detecting epileptic seizures on sEEG signal plots was examined by Emami et al. [61]. In the preprocessing step, the signals were segmented into different time windows, and VGG-16 was used for classification, using small (3 × 3) convolution filters to efficiently detect small EEG signal changes. This architecture was pre-trained on the ImageNet dataset to differentiate 1000 classes, and its last two layers had 4096- and 1000-dimensional vectors. They modified these last two layers to have 32 and 2 dimensions, respectively, to detect seizure and non-seizure classes.
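A two-dimensional output layer trained with a cross-entropy error function, as mentioned above, can be illustrated in a few lines. The sketch below assumes that output index 0 stands for non-seizure and index 1 for seizure; this ordering and the function names are illustrative assumptions rather than details reported by the cited studies.

```python
import math


def softmax(logits):
    """Turn raw output scores into class probabilities that sum to 1."""
    m = max(logits)                      # subtract the max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target_index])


if __name__ == "__main__":
    scores = [0.3, 2.1]                  # hypothetical two-class output
    print(softmax(scores))
    print(cross_entropy(scores, 1))      # small loss: class 1 is favored
    print(cross_entropy(scores, 0))      # larger loss: class 0 is not
```

The loss is small when the network assigns high probability to the correct class and grows without bound as that probability approaches zero, which is what the optimizer minimizes during training.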

D. GoogLeNet
GoogLeNet won the 2014 ImageNet competition with 93.3% top-5 test accuracy [62]. This 22-layer network was called GoogLeNet to honor Yann LeCun, who designed LeNet. Before the introduction of GoogLeNet, it was believed that better accuracy could be achieved simply by going deeper. Nevertheless, the Google team proposed an architecture called Inception, which achieved better performance through better design rather than greater depth. It represented a robust design by using filters of different sizes on the same image. In the field of EEG signal processing to diagnose epileptic seizures, this architecture has recently received the attention of researchers. Taqi et al. [57] used this network in their preliminary studies to diagnose epileptic seizures. Their model was used to extract features from the Bern-Barcelona dataset and achieved excellent results.

E. ResNet
Microsoft's ResNet won the ImageNet challenge with 96.4% accuracy by applying a 152-layer network that utilized a ResNet module [63]. In this network, residual blocks capable of training deep architectures were introduced by using skip connections that copy the input of each layer to the next layer. The idea was to learn something different and new in the next layer. So far, little research has been done on the use of ResNet networks to diagnose epilepsy, but this may grow significantly in the near future. Bizopoulos et al. [58] introduced two architectures, ResNet and DenseNet, to diagnose epileptic seizures and attained good results. They showed that an S2I-DenseNet-based model trained for an average of 70 epochs was sufficient to reach the best accuracy of 85.3%. A summary of related works done using 2D-CNNs is shown in Table 2. A sketch of the accuracy (%) obtained by various authors is shown in Figure 7.
1D-CNNs are intrinsically suitable for processing biological signals such as EEG for epileptic seizure detection [2]. These architectures have a more straightforward structure, and a single pass through them is faster than through a 2D-CNN because they have fewer parameters. The most important advantage of 1D over 2D architectures is the possibility of employing larger pooling and convolutional layers. In addition, the signals are 1D in nature, and using preprocessing methods to transform them to 2D may lead to information loss. Figure 8 shows a general form of a 1D-CNN used for epileptic seizure detection. In the first study in this section, the authors of [58] applied well-known 2D architectures in 1D space, conducting experiments with 1D-LeNet, AlexNet, VGGnet, ResNet, and DenseNet architectures. In [80], a 1D-CNN was used for feature extraction. The researchers in [81] also used a 1D-CNN. They used the CHB-MIT dataset, and the signals from each channel were segmented into 4 s intervals; overlapping segments were also accepted to increase the amount of data and the accuracy. Combining CNNs with conventional feature extraction methods was explored in [82]; the empirical mode decomposition (EMD) method was used for feature extraction, and a CNN was used to achieve high accuracy in multiclass classification tasks. In [83], a framework for the diagnosis of epileptic seizures is presented that combines the interpretability of probabilistic graphical models (PGMs) with advances in DL. The authors in [84] proposed a 1D-CNN architecture called CNN-BP (standing for CNN bipolar). In this work, they used data from patients monitored with combined foramen ovale (FO) electrodes and EEG surface electrodes. A new scheme to classify EEG signals based on temporal convolutional neural networks (TCNN) was introduced by Zhang et al. [85]. Table 3 shows the summary of related works done using 1D-CNNs. Figure 9 shows the sketch of accuracy (%) obtained by various authors using 1D-CNN models for epileptic seizure detection.
Sequential data such as text, signals, and videos are variable in length and often very long, which makes them unsuitable for simple DL methods [41]. However, these data form a significant part of the information in the world, compelling the need for DL-based schemes to process them. RNNs are the solution suggested to overcome these challenges and are widely used for physiological signals. Figure 10 shows a general form of an RNN used for epileptic seizure detection. In the following, an overview of popular RNN models is presented along with the reviewed papers.
A. Long Short-Term Memory (LSTM)
The main problem of a simple RNN is its short-term memory. An RNN may leave out key information, as it has a hard time transporting information from earlier time steps to later ones in long-sequence data. Another drawback of RNNs is the vanishing gradient problem [30][31][32][33], which arises because gradients shrink as they are back-propagated. To solve the short-term memory problem, LSTM gates were created [30]. The flow of information can be regulated through these gates, which preserve the necessary parts of a long sequence and discard the undesired ones. The building blocks of an LSTM are the cell state and its gates. In this section, Golmohammadi et al. [68] evaluated two LSTM architectures with three and four layers together with the Softmax classifier and obtained satisfactory results. In [92], three-layer LSTMs are used for feature extraction and classification. The sigmoid activation function is used in the last fully connected (FC) layer for classification. In the experiments conducted in [98], two architectures were employed: LSTM and GRU. The LSTM/GRU model architecture is composed of a Reshape layer, four LSTM/GRU layers with activation, and one FC layer with sigmoid activation. In another work, Yao et al. [102] evaluated ten different improved independently recurrent neural network (IndRNN) architectures and achieved the best accuracy using a dense IndRNN with attention (DIndRNN) with 31 layers.
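The way the gates regulate the cell state can be made concrete with one time step of a standard LSTM cell. This is a minimal NumPy sketch of the commonly used formulation (input, forget, and output gates plus a candidate cell update); the stacked weight layout and the function names are assumptions for illustration and do not correspond to any particular reviewed architecture.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    Assumed layout: W is (4H, D), U is (4H, H), b is (4H,), with rows
    stacked in the order [input gate, forget gate, candidate, output gate].
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate: how much new content to write
    f = sigmoid(z[H:2 * H])        # forget gate: how much old state to keep
    g = np.tanh(z[2 * H:3 * H])    # candidate cell content
    o = sigmoid(z[3 * H:4 * H])    # output gate: how much state to expose
    c = f * c_prev + i * g         # updated cell state
    h = o * np.tanh(c)             # updated hidden state
    return h, c


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, H = 3, 4                    # hypothetical input and hidden sizes
    W = 0.1 * rng.standard_normal((4 * H, D))
    U = 0.1 * rng.standard_normal((4 * H, H))
    b = np.zeros(4 * H)
    h, c = np.zeros(H), np.zeros(H)
    for x in rng.standard_normal((5, D)):   # five time steps
        h, c = lstm_step(x, h, c, W, U, b)
    print(h.shape, c.shape)
```

When the forget gate saturates near 1 and the input gate near 0, the cell state is copied forward almost unchanged, which is how the cell carries information across long sequences.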

B. Gated Recurrent Unit (GRU)
One variation of LSTM is the GRU, which merges the input and forget gates into a single update gate and makes some other modifications [30][31][32][33]. The number of gating signals is reduced to two: the reset gate and the update gate. These two gates decide which information is necessary to pass to the output. In one experiment, Chen et al. [92] used a three-layer GRU network with a sigmoid classifier and obtained 96.67% accuracy. Talathi et al. proposed a new CADS based on GRU for epileptic seizure detection [103]. In the proposed method, during preprocessing, the input signals are split into time windows and spectrograms are obtained from them. These plots are then fed to a four-layer GRU network with a Softmax FC layer in the classification stage; 98% accuracy was achieved. In another study, Roy et al. [104] employed a five-layer GRU network with a Softmax classifier and achieved remarkable results. Table 4 provides the summary of related works done using RNNs. Figure 11 shows the sketch of accuracy (%) obtained by various authors using RNN models for seizure detection.
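The two-gate mechanism of the GRU described in this subsection can be sketched as a single time step. The code below is a minimal NumPy illustration of a common formulation; the parameter dictionary, the helper `init_params`, and the chosen sign convention for the update gate (some references swap the roles of z and 1 - z) are assumptions for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x, h_prev, p):
    """One GRU time step; p is a dict of weight matrices and bias vectors."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])   # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])   # reset gate
    # Candidate state: the reset gate scales how much past state is used.
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])
    # Convention used here: z near 1 keeps the old state, z near 0 replaces it.
    return z * h_prev + (1.0 - z) * h_cand


def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases (hypothetical initialization)."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in "zrh":
        p["W" + gate] = 0.1 * rng.standard_normal((hidden_size, input_size))
        p["U" + gate] = 0.1 * rng.standard_normal((hidden_size, hidden_size))
        p["b" + gate] = np.zeros(hidden_size)
    return p


if __name__ == "__main__":
    params = init_params(input_size=3, hidden_size=4)
    h = np.zeros(4)
    for x in np.random.default_rng(1).standard_normal((5, 3)):
        h = gru_step(x, h, params)
    print(h)
```

Compared with the LSTM, there is no separate cell state: the hidden state itself is interpolated between its previous value and the candidate.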

Autoencoders (AEs)
An AE is an unsupervised machine learning model whose target output is the same as its input [30][31][32][33]. The input is compressed to a latent-space representation, and the output is then reconstructed from this representation. Therefore, in an AE, the compression and decompression functions are implemented by the neural network. An AE consists of three parts: encoder, code, and decoder. AE networks are most commonly used for feature extraction or dimensionality reduction in brain signal processing. Figure 12 shows a general form of an AE used for epileptic seizure detection. As the first research in this section, Rajaguru et al. [113] separately investigated the multilayer AE (MAE) and expectation-maximization with principal component analysis (EM-PCA) methods to reduce the representation dimensions and then employed a genetic algorithm (GA) for classification. They obtained an average classification accuracy of 93.78% when MAEs were applied for dimensionality reduction and combined with GA as the classifier. In another work, an automated system based on AEs was proposed for the diagnosis of epilepsy using EEG signals [114]. First, the harmonic wavelet packet transform (HWPT) was used to decompose the signal into frequency sub-bands, and then fractal features, including box-counting (BC), multiresolution BC (MRBC), and Katz fractal dimension (KFD), were extracted from each of the sub-bands.

A. Other Types of AEs
To create a more robust representation, a number of schemes have been applied, such as the denoising AE (DAE) (which tries to recreate the input from a corrupted form of it) [41], the stacked AE (SAE) (which stacks a few AEs on top of each other to go deeper) [41], and the sparse AE (SpAE) (which attempts to exploit sparse representations) [41]. These methods might pursue other objectives as well; for example, the DAE can be used to recover a corrupted input.
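The encoder, code, and decoder structure, and the denoising idea of reconstructing a clean input from a corrupted one, can be illustrated with a deliberately simplified model. The sketch below swaps the trained neural network for a linear autoencoder with tied weights whose solution is obtained in closed form by SVD (equivalent to PCA); this substitution, the class name `LinearAutoencoder`, and the synthetic data are assumptions made so the example stays short and deterministic.

```python
import numpy as np


class LinearAutoencoder:
    """Linear autoencoder with tied weights, fitted in closed form via SVD.

    This stands in for a gradient-trained AE: the encoder projects onto the
    top principal directions and the decoder maps the code back.
    """

    def __init__(self, code_size):
        self.code_size = code_size

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[:self.code_size]    # (code_size, n_features)
        return self

    def encode(self, X):
        """Compress inputs to the low-dimensional code."""
        return (X - self.mean_) @ self.components_.T

    def decode(self, code):
        """Reconstruct inputs from the code."""
        return code @ self.components_ + self.mean_


def mse(a, b):
    return float(np.mean((a - b) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 8-dimensional data that really lives on a 2D subspace.
    clean = rng.standard_normal((200, 2)) @ rng.standard_normal((2, 8))
    noisy = clean + 0.1 * rng.standard_normal(clean.shape)

    ae = LinearAutoencoder(code_size=2).fit(clean)
    restored = ae.decode(ae.encode(noisy))
    print("error of corrupted input:", mse(noisy, clean))
    print("error after encode/decode:", mse(restored, clean))
```

Passing the corrupted input through the bottleneck discards the noise components that lie outside the learned code space, which is the effect a DAE is trained to achieve with nonlinear layers.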
Works in this section begin with Golmohammadi et al. [68], who presented various deep networks, one of which is the stacked denoising AE (SDAE). Their architecture in this section consists of three layers, and the final results demonstrated good performance of their approach. Qiu et al. [115] applied windowing and z-score normalization to preprocess the EEG signals and fed the preprocessed data into a denoising sparse AE (DSpAE) network. In their experiment, they achieved an outstanding performance of 100% accuracy. In [116], a high-performance automated EEG analysis system based on principles of machine learning and big data is presented, which consists of several parts. At first, the signal features are extracted as linear predictive cepstral coefficients (LPCC); then three passes are applied for precise detection. The first pass is sequential decoding using hidden Markov models (HMMs), the second pass is composed of both temporal and spatial context analysis based on DL, and in the third pass, a probabilistic grammar is employed.
In another study, Yan et al. [117] proposed a feature extraction and classification method based on SpAE and support vector machine (SVM). In this approach, feature extraction from the input EEG signals is first performed using SAE, and classification is then performed by SVM. Another SAE architecture, named Wave2Vec, was proposed by Yuan et al. [118]. In the preprocessing stage, the signals were first framed, and in the deep network segment, the SAE with Softmax was applied and achieved 93.92% accuracy. Following the experiments of Yuan et al., different stacked sparse denoising AE (SSpDAE) architectures were tested and compared in [119]. In this work, feature extraction is accomplished by the SSpDAE network and classification by Softmax. They obtained an accuracy of 93.64%. Table 5 provides the summary of related works done using AEs. In addition, Figure 13 shows the comparison of the accuracies obtained by different researchers.

Deep Belief Networks (DBNs)
A restricted Boltzmann machine (RBM) is a variant of the Boltzmann machine and an undirected graphical model [30]. Unrestricted Boltzmann machines may also have connections between the hidden units, whereas an RBM has none. Stacking RBMs forms a DBN; the RBM is the building block of the DBN. DBNs are unsupervised, probabilistic, hybrid generative DL models comprising a visible layer and multiple layers of latent, stochastic hidden units [30][31][32][33]. Furthermore, a variation of the DBN called the convolutional DBN (CDBN) scales successfully to high-dimensional inputs and uses the spatial information of nearby pixels [30][31][32][33]. Xuyen et al. [129] used a DBN to identify epileptic spikes in EEG data. The proposed architecture in their study consisted of three hidden layers and achieved an accuracy of 96.87%. In another study, Turner et al. [130] applied a DBN to diagnose epilepsy and found promising results.
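To make the relationship between RBMs and DBNs concrete, the following is a minimal sketch of a binary RBM trained with one step of contrastive divergence (CD-1), followed by greedy stacking of two RBMs. It is not the implementation used in [129] or [130]; the class name `RBM`, the layer sizes, learning rate, number of epochs, and toy data are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Binary RBM: visible-hidden weights only, no hidden-hidden connections."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.w = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.a = [0.0] * n_visible  # visible biases
        self.b = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.b[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b))]

    def visible_probs(self, h):
        return [sigmoid(self.a[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.a))]

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update; returns the squared reconstruction error."""
        ph0 = self.hidden_probs(v0)
        h0 = [1.0 if self.rng.random() < p else 0.0 for p in ph0]  # sample hidden states
        v1 = self.visible_probs(h0)   # reconstruction (probabilities, not samples)
        ph1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(ph0)):
                self.w[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
            self.a[i] += lr * (v0[i] - v1[i])
        for j in range(len(ph0)):
            self.b[j] += lr * (ph0[j] - ph1[j])
        return sum((x - y) ** 2 for x, y in zip(v0, v1))


# Toy binary patterns (assumed data, not EEG features).
data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]] * 2

# Greedy layer-wise stacking: the second RBM is trained on the
# hidden activations of the first, which is how a DBN is built.
rbm1 = RBM(n_visible=4, n_hidden=3, seed=0)
for _ in range(100):
    for v in data:
        last_error = rbm1.cd1_step(v)
features = [rbm1.hidden_probs(v) for v in data]

rbm2 = RBM(n_visible=3, n_hidden=2, seed=1)
for _ in range(100):
    for f in features:
        rbm2.cd1_step(f)
top_features = [rbm2.hidden_probs(f) for f in features]
```

In a classification setting, `top_features` would typically be passed to a supervised output layer and the whole stack fine-tuned; that step is omitted here.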

Convolutional Recurrent Neural Networks (CNN-RNNs)
A highly efficient combination of DL networks for predicting and detecting epileptic seizures from EEG signals is the CNN-RNN architecture. Adding convolutional layers to an RNN helps to find spatially nearby patterns effectively, while the recurrent part remains well suited to time-series data. In [68], numerous preprocessing schemes were applied; then, a modified CNN-LSTM architecture comprising 13 layers was proposed, with a sigmoid activation in the last layer. Finally, the proposed approach demonstrated better performance.
Roy et al. [69] used different CNN-RNN hybrid architectures to improve the experimental results. Their first network was a one-dimensional seven-layer CNN-GRU architecture, and the second was a three-dimensional (3D) CNN-GRU network. In another work, Roy et al. [104] concentrated on normal and abnormal brain activities and suggested four different DL architectures. The proposed ChronoNet model was developed by building on previous models. It achieved 90.60% and 86.57% training and test accuracies, respectively.
Fang et al. [131] used the Inception-V3 network. At the outset, the network was pre-trained. Then, to fine-tune this architecture, an RNN-based network called spatial temporal GRU (ST-GRU) was applied, achieving 77.30% accuracy. Choi et al. [132] proposed a multiscale 3D-CNN with RNN model for the detection of epileptic seizures. The output of the CNN module is applied as the input of the RNN module. The RNN module consists of a unidirectional GRU layer that extracts the temporal features of epileptic seizures, which are finally classified using an FC layer. The CNN-RNN research is summarized in Table 6 and Figure 14.
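The general CNN-RNN data flow described in this section (convolutional features fed to a unidirectional GRU, followed by an FC layer) can be sketched as a forward pass. This is a deliberately reduced illustration, not the architecture of any cited work: it uses a single-channel 1-D convolution instead of a 3D-CNN, omits bias terms in the GRU, and uses random untrained weights, so the output probability carries no diagnostic meaning. All names and sizes are assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_relu(signal, kernel, stride=1):
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
            for i in range(0, len(signal) - k + 1, stride)]


class GRUCell:
    """Unidirectional GRU with scalar input and `hidden` units (biases omitted)."""

    def __init__(self, hidden, rng):
        def vec():
            return [rng.uniform(-0.5, 0.5) for _ in range(hidden)]

        def mat():
            return [vec() for _ in range(hidden)]

        self.hidden = hidden
        self.wz, self.wr, self.wh = vec(), vec(), vec()  # input weights
        self.uz, self.ur, self.uh = mat(), mat(), mat()  # recurrent weights

    def step(self, x, h):
        n = self.hidden
        # Update gate z and reset gate r.
        z = [sigmoid(self.wz[i] * x + sum(self.uz[i][j] * h[j] for j in range(n)))
             for i in range(n)]
        r = [sigmoid(self.wr[i] * x + sum(self.ur[i][j] * h[j] for j in range(n)))
             for i in range(n)]
        # Candidate state uses the reset-gated previous state.
        cand = [math.tanh(self.wh[i] * x
                          + sum(self.uh[i][j] * r[j] * h[j] for j in range(n)))
                for i in range(n)]
        return [(1.0 - z[i]) * h[i] + z[i] * cand[i] for i in range(n)]


def predict(signal, kernel, gru, fc_w, fc_b):
    """CNN features -> GRU over time -> FC layer with sigmoid output."""
    features = conv1d_relu(signal, kernel, stride=2)
    h = [0.0] * gru.hidden
    for x in features:
        h = gru.step(x, h)
    return sigmoid(fc_b + sum(w * v for w, v in zip(fc_w, h)))


rng = random.Random(0)
gru = GRUCell(hidden=4, rng=rng)
fc_w = [rng.uniform(-0.5, 0.5) for _ in range(4)]
signal = [math.sin(0.3 * t) for t in range(64)]  # synthetic stand-in for an EEG window
prob = predict(signal, kernel=[-1.0, 0.0, 1.0], gru=gru, fc_w=fc_w, fc_b=0.0)
```

The convolution shortens the sequence and extracts local patterns before the recurrent layer processes them in temporal order, which is the division of labor the hybrid models rely on.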

Convolutional Autoencoders (CNN-AEs)
In addition to finding nearby patterns, convolutional layers can reduce the number of parameters in structures such as AEs. These two reasons make their combination suitable for many tasks such as unsupervised feature extraction for epileptic seizure detection. A novel approach based on CNN-AE was presented by Yuan et al. [136]. At the feature extraction stage, two deep approaches, AE and 2D-CNN, were used to extract the unsupervised and supervised features, respectively. The unsupervised features were obtained directly from the input signals, and the supervised features were acquired from the spectrogram of the signals. Finally, the Softmax classifier was utilized for classification and achieved 94.37% accuracy. In another investigation, Yuan et al. [137] proposed an approach called deep fusional attention network (DFAN), which can extract channel-aware representations from multichannel EEG signals. They developed a fusional attention layer that uses a fusional gate to fully integrate multiview information and dynamically quantify the contribution of each biomedical channel. A multiview convolution encoding layer, in combination with CNN, has also been used to train the integrated DL model. Table 7 provides the summary of related works done using CNN-AEs, and Figure 15 shows the accuracies (%) obtained by different researchers.
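The parameter saving offered by convolutional layers can be checked with simple arithmetic. The sketch below compares a fully connected encoder layer with a 1-D convolutional layer; the window length, number of hidden units, number of filters, and kernel size are hypothetical values, not taken from any cited work.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a 1-D convolutional layer (independent of input length)."""
    return in_channels * out_channels * kernel_size + out_channels


n_samples = 1024  # hypothetical EEG window length in samples

# Fully connected encoder layer: 1024 inputs -> 256 hidden units.
dense = dense_params(n_samples, 256)   # 1024 * 256 + 256 = 262400

# Convolutional encoder layer: 1 input channel, 16 filters of width 9.
conv = conv1d_params(1, 16, 9)         # 1 * 16 * 9 + 16 = 160
```

Because the convolutional kernel is shared across all positions of the window, its parameter count does not grow with the window length, whereas the dense layer grows linearly with it.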

Medical Imaging
Various DL models have been developed to detect epileptic seizures using sMRI, fMRI, and PET scans with or without EEG signals [141][142][143][144][145][146][147][148]. These models outperformed the conventional models in terms of automatic detection and monitoring of the disease. However, due to the nature of imaging methods and the difficulties in using them, these models are mostly used for localization and detection of seizures.
The authors of [141] proposed automatic localization and detection of focal cortical dysplasia (FCD) from the MRI modality using a CNN model. The FCD detection rate is only 50% despite the progress in the analysis of MRI modalities. Gill et al. [142] proposed a CNN-based algorithm with feature learning capability to detect FCD automatically. The authors of [143] designed DeepIED based on DL and EEG-fMRI scans for epilepsy patients, combining the general linear model with EEG-fMRI techniques to estimate the epileptogenic zone. Hosseini et al. [144] proposed an edge computing autonomic framework for evaluation, regulation, and monitoring of the epileptic brain. The epileptogenic network was estimated using rs-fMRI and EEG. Shiri et al. [148] presented a technique for direct attenuation correction of PET images by applying emission data via CNN-AE. Nineteen radiomic features from 83 brain regions were evaluated for image quantification via the Hammersmith atlas. Finally, the summary of related works done using medical imaging methods and DL is shown in Table 8.

Other Neuroimaging Modalities
Ravi Prakash et al. [135] introduced a DL-based algorithm for ECoG-based functional mapping (ECoG-FM) for eloquent language cortex identification. However, the success rate of ECoG-FM is low compared with electro-cortical stimulation mapping (ESM). In another work, Rosas-Romero et al. [149] used fNIRS to detect epileptic seizures and obtained better performance than that achieved using conventional EEG signals.

Rehabilitation Systems for Epileptic Seizures Detection
Their high performance and robustness to noise have made DL techniques suitable for commercial products. Nowadays, various commercial products based on DL have been developed, among them DL applications and hardware for diagnosing epileptic seizures. In the first study investigated, Hosseini et al. [127] developed a brain-computer interface (BCI) system using an AE for epileptic seizure detection. In another study, Singh et al. [128] presented a practical product for the diagnosis of epileptic seizures, which comprised a user segment and a cloud segment. The block diagram of the system proposed by Singh et al. is shown in Figure 16. Kiral-Kornek et al. [150] demonstrated that DL in combination with neuromorphic hardware could help in developing a wearable, real-time, always-on, patient-specific seizure warning system with low power consumption and reliable long-term performance.

Discussion
Nowadays, many people worldwide have epileptic seizures and suffer from this neurological disorder. Early detection of epileptic seizures is of substantial importance because it directly affects the patients' quality of life and can enhance their self-confidence at all stages of life. So far, much research has been conducted to diagnose epileptic seizures using AI techniques. The objective of these studies is to assist physicians in the accurate diagnosis of epileptic seizures. AI research spans conventional machine learning [151] and DL [152][153][154][155][156]. Until recently, many machine learning methods adopted for automatic seizure detection could not be used seriously in real-time diagnostic aid tools due to their disadvantages. DL is a state-of-the-art approach that has been employed for epileptic seizure detection since 2016. In recent years, research on epileptic seizure diagnosis using DL has grown rapidly due to the simultaneous development of DL toolboxes and graphics processing units (GPUs). Applying DL techniques to diagnose epileptic seizures gives doctors hope that in the not-too-distant future a variety of rehabilitation tools will be developed for patients with epileptic seizures. Table A1 in the Appendix shows the overview of works done in this area. It also shows the type of dataset used, implementation tool, preprocessing, DL network, and evaluation methods utilized.
As shown in this study, various DL structures have been applied for epileptic seizure detection, yet none of them is superior to the others. The best structure should be chosen carefully based on the dataset and problem characteristics, such as the need for real-time detection, the minimum acceptable accuracy, or the use of pre-trained models. Because the published models have been developed and evaluated on different datasets, it is difficult to compare them directly. Overall, one of the most important advantages of DL algorithms is their high performance, which is why such models have been widely used in many applications. Another advantage of DL methods is that they are robust to noise; therefore, noise removal can be omitted in many applications. However, developing a robust model requires large amounts of data, and training is time consuming.

Challenges
There are several challenges in diagnosing epileptic seizures using neuroimaging modalities and DL procedures. The first challenge in this area is the inaccessibility of datasets with long recording durations. The datasets available for diagnosing epileptic seizures have limited recording durations, making it difficult to conduct substantial research in the field of epileptic seizures. The complete datasets are not shared in the public domain; only a portion of the data may be available. Hence, real-time diagnosis of epileptic seizures is still challenging. However, research on real-time diagnosis of epileptic seizures has been performed using clinical data [157][158][159].
Due to the lack of accessible datasets, researchers have not yet been able to present a DL-based CADS for diagnosing epileptic seizures with optimum performance. Additionally, it is not possible to combine the available EEG datasets to enhance the efficiency of DL networks. This is because the datasets have different sampling frequencies, so merging them to feed DL networks is not a practical way to achieve higher detection accuracy. Table 1 shows all available EEG datasets used for epileptic seizure detection. Other neuroimaging modalities such as MRI are also used for epileptic seizure detection. In [141][142][143][144][145][146][147][148], MRI modalities coupled with DL methods have been used to diagnose epileptic seizures. Datasets for other, non-MRI neuroimaging modalities are not available, and this has led to limited research in this area. Therefore, providing datasets from other neuroimaging modalities is important for conducting research.
Nowadays, DL models have made considerable advancements [160][161][162][163][164]. This has resulted in the development of computer hardware [165,166] that is expensive and not easily accessible to researchers. Researchers working in the field of epileptic seizure detection and prediction do not always have access to high-power hardware to implement novel DL models. Although powerful computing servers are offered by Google, constraints such as the amount of data that can be uploaded to these servers and the execution time remain challenges.

Conclusion and Future Works
In recent years, a lot of research has been done in the field of epileptic seizure detection using artificial intelligence methods [167][168][169][170][171][172][173][174][175]. In this paper, a comprehensive review of works done in the field of epileptic seizure detection using various DL techniques such as CNNs, RNNs, and AEs is presented. Various screening methods have been developed using EEG and MRI modalities. We have also investigated epileptic seizure detection using practical, DL-based hardware methods. It is very encouraging that much of the future research will concentrate on practical hardware applications that aid in the accurate detection of such diseases. Functional hardware has also been utilized to boost the performance of detection strategies. Furthermore, the models can be placed in the cloud by hospitals. Handheld applications and mobile or wearable devices may then be equipped with such models, with cloud servers performing the computations; by benefiting from predictive models, these devices can be used to alert patients in a timely manner. When an epileptic seizure is detected, alert messages may be sent through the handheld devices or wearables to the family, relatives, the concerned hospital, and the doctor, so that the patient can be provided with proper treatment in time. Moreover, a cap with embedded EEG electrodes can acquire the EEG signals, which can be sent to the model kept in the cloud to achieve real-time detection. Additionally, if the early stage of a seizure can be detected using interictal periods of EEG signals, the patient can take medication immediately and prevent the seizure. This field requires further work that combines different screening methods for more precise and faster detection of epileptic seizures and that applies semi-supervised and unsupervised methods to overcome the limits on dataset size.
Finally, having publicly available comprehensive datasets can help to develop accurate and robust models that can detect seizures at an early stage.

Conflicts of Interest:
The authors declare no conflict of interest.

Table A1 shows the detailed summary of DL methods employed for automated detection of epileptic seizures.