1. Introduction
Epilepsy is a common neurological disorder marked by frequent and unpredictable seizures that can profoundly impact a person’s quality of life. Early and accurate seizure prediction is essential for improving patient outcomes by enabling timely intervention and enhancing overall safety. Electroencephalogram (EEG) signals, which capture the brain’s electrical activity, have become central to the development of automated seizure detection and prediction systems. However, the highly nonstationary and complex nature of EEG data presents significant challenges in designing robust and generalizable predictive models.
Recent advancements in machine learning and signal processing have substantially improved EEG-based epileptic seizure prediction models. Li et al. [1] introduced a patient-specific seizure prediction framework using a multichannel feedback capsule network (FB-CapsNet), which processes EEG signals in an end-to-end manner, eliminating the need for manual feature extraction. By leveraging feedback-enabled capsule structures, the model effectively captures complex spatial-temporal dependencies, offering high predictive accuracy and adaptability to individual EEG profiles. However, the model exhibits high computational complexity and limited generalizability, as it requires separate training for each patient. In contrast, Tamanna et al. [2] proposed a generalized approach utilizing discrete wavelet transform (DWT) for feature extraction and support vector machines (SVM) for classification. Their method achieved an average prediction accuracy of 96.38% and anticipated seizures approximately 26.1 min before onset, without requiring patient-specific training. Nevertheless, reliance on a single dataset and sensitivity to preprocessing steps may impact its generalizability and robustness.
Divya et al. [3] introduced a hybrid deep learning model integrating autoencoders for unsupervised feature extraction with convolutional neural networks (CNNs) for classification. The model achieved 92.5% accuracy and demonstrated good generalization across datasets, though it remains slightly behind the best-performing models and still requires clinical validation. Zhang et al. [4] developed a vision transformer (ViT)-based model that processes EEG data by converting signals into image-like formats. This method benefits from self-attention mechanisms that extract patient-specific spatial features. While it offers strong predictive performance, the approach demands large volumes of patient-specific data and significant computational resources, limiting scalability. Wang et al. [5] proposed a seizure prediction model combining dynamic multi-graph convolution with a channel-weighted transformer. This architecture captures intricate spatial and temporal relationships among EEG channels, resulting in enhanced prediction accuracy. However, it is computationally intensive and dependent on high-quality EEG data, which may not always be available in clinical settings.
Kapoor et al. [6] introduced a hybrid Cuckoo–Finch optimization algorithm for automatic EEG electrode selection and CNN hyperparameter tuning. Evaluated on the CHB-MIT and Siena datasets, the model achieved an accuracy of 97.76%. Despite its strong performance, the method’s computational complexity and lack of testing in real-world IoT scenarios pose challenges for deployment. Zhao et al. [7] proposed a patient-specific approach combining Adder networks and supervised contrastive learning to improve energy efficiency and feature discrimination. The model demonstrated strong sensitivity and specificity while being computationally suitable for wearable devices. However, it relies heavily on patient-specific data and requires further validation in real-time, noisy environments. Bhattacharya et al. [8] developed a Transformer-based model for seizure prediction that captures long-range temporal dependencies with minimal manual preprocessing. Achieving a sensitivity of 98.5% and a low false-positive rate of 0.12/h, the model is well-suited for clinical use. Nonetheless, its generalizability and high resource requirements remain concerns.
In the context of IoMT applications, Torkey et al. [9] proposed a hybrid model combining CNN, LSTM, and GRU layers, incorporating SMOTE for class imbalance and SHAP for model explainability. Designed for real-time deployment, the model achieved 99.13% accuracy. Despite its strengths in scalability and interpretability, limitations include high computational costs and privacy concerns in IoMT ecosystems. Kalitzin et al. [10] presented TOREADA, an adaptive seizure detection algorithm based on topological reinforcement learning. It dynamically adjusts detection parameters using real-time EEG embeddings. While simulation results highlight its robustness, the lack of clinical validation restricts its practical application. Li et al. [11] analyzed variations in fractal dimensions during seizures using CHB-MIT data, applying both Higuchi’s and roughness scaling extraction (RSE) methods. Their study clarified discrepancies in prior findings and emphasized the influence of preprocessing. However, its clinical relevance remains limited due to the absence of predictive modeling. Deng et al. [12] introduced HViT, a hybrid vision Transformer architecture for EEG-based seizure prediction that integrates convolutional neural networks (CNNs) to enhance local feature extraction and mitigate the top-level gradient vanishing typically seen in pure transformers. On top of this, they incorporate data uncertainty learning (DUL), modeling each EEG embedding as a Gaussian or Laplacian distribution, where the mean is learned by the HViT and a parallel branch predicts the variance or scale, thereby increasing robustness to noisy EEG signal representations. Additionally, a learnable constraint coefficient in the loss function is tailored per patient, and a simple uncertainty quantification method is applied to alarms using a k-of-n continuous prediction strategy. Evaluated on two public epilepsy datasets, their approach demonstrates superior performance, highlighting the combined benefits of CNN-augmented transformers and uncertainty modeling in improving seizure prediction accuracy.
Zhu et al. [13] presented an EEG-based seizure prediction model that fuses a multidimensional Transformer encoder with LSTM and GRU recurrent neural networks to capture both global and local temporal-frequency features from EEG spectrograms. The approach first applies the short-time Fourier transform to extract time–frequency representations, then processes the spectral and temporal dimensions separately via dual Transformer encoders. The outputs are further refined through LSTM and GRU branches, whose features are gated and fused for classification. Tested on two public datasets, CHB-MIT and Bonn, the model achieved outstanding performance: on CHB-MIT, it averaged 98.24% sensitivity and 97.27% specificity; on Bonn, it reached 99% accuracy in binary and 98% in three-class classification. Han Wang et al. [14] proposed a lightweight, compressive sensing-based approach to channel estimation in MIMO-FBMC systems, addressing challenges such as intrinsic imaginary interference and low-resource deployment in industrial IoT settings. By modeling the channel as sparse in the delay/Doppler domain, the authors design a low-complexity estimator that avoids dense pilot structures and heavy computations, achieving a comparable or better MSE than classical methods, especially in noisy environments. This methodology is relevant to biomedical applications such as seizure prediction, where similar constraints (real-time processing, noise robustness, and low-power operation) exist. Applying these ideas to EEG analysis suggests potential for wavelet-domain sparsity modeling, adaptive recovery algorithms, and improved interpretability in wearable seizure-monitoring systems. Related biomedical research could integrate these signal-model-driven, sparsity-aware techniques as alternatives to standard deep fusion models. Reference [15] addresses angle estimation in arbitrary-manifold array bistatic MIMO radar and proposes a joint two-dimensional direction-of-departure (2D-DOD) and two-dimensional direction-of-arrival (2D-DOA) estimation algorithm assisted by IRS. Finally, Mansouri et al. [16] developed a real-time, non-patient-specific algorithm for seizure detection and localization. Utilizing spectral and coherence-based features, the method achieved a median detection latency of 8 s. While the method is computationally efficient and generalizable, its accuracy is reduced for brief seizures, and it requires robust artifact removal for optimal performance.
Adder networks use the ℓ1-norm distance as the similarity measure between input features and filters, which changes the network’s gradient behavior; Zhao et al. [7] therefore use an adaptive learning rate strategy to ensure the proper convergence of their AddNet-SCL model. Building upon these innovations, this paper presents a novel epileptic seizure prediction method that integrates DWT-based time–frequency analysis with advanced feature extraction (entropy, power, frequency, and amplitude features), paired with deep learning using Fourier neural networks (FNNs). Evaluated on the well-established CHB-MIT EEG dataset, the proposed approach achieves high prediction accuracy with a zero false positive rate, marking a significant advancement over existing methods. The rest of this paper is structured as follows:
Section 2 describes the preprocessing steps, including discrete wavelet transform and feature selection methods.
Section 3 details the deep learning approach using Fourier neural networks.
Section 4 reports the experimental results and compares the proposed method with existing approaches.
The originality and primary contributions of this paper can be summarized as follows:
Sparse signal representation in the wavelet domain: Unlike traditional approaches that rely on dense representations followed by deep fusion techniques (e.g., DWT combined with LSTM), our method explicitly models EEG signals as sparse in the wavelet domain. This formulation draws a parallel to the sparsity observed in the delay/Doppler domain of FBMC channels, allowing for more efficient signal analysis.
Lightweight adaptive recovery framework: We introduce an adaptive sparse recovery module inspired by pursuit-based algorithms, specifically designed to handle the non-stationary and noise-prone nature of EEG signals. While conceptually similar to sparse channel estimation techniques in communication systems, this module is customized for biomedical signal processing applications.
Optimized for real-time applications: The proposed model significantly reduces computational overhead, making it suitable for deployment on low-power, real-time platforms such as wearable seizure detection devices. This aligns with similar efficiency goals pursued in MIMO-FBMC systems for industrial IoT environments.
Improved interpretability and clinical relevance: The sparse wavelet coefficients produced by our model enhance the transparency and physiological interpretability of EEG signals. This feature not only supports clinical decision making but also mirrors the interpretability advantages seen in sparse communication channel estimation.
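To make the notion of wavelet-domain sparsity in the first contribution more concrete, the following minimal sketch computes a multi-level Haar DWT with NumPy and measures how much of the signal energy the largest coefficients retain. The Haar wavelet, the synthetic 3 Hz test signal, and the helper names `haar_dwt` and `energy_in_top_k` are illustrative assumptions; the text does not specify a mother wavelet, and this is not the adaptive recovery module itself.

```python
import numpy as np

def haar_dwt(signal, levels):
    """Multi-level orthonormal Haar DWT.

    Returns [approx_L, detail_L, ..., detail_1]. The signal length must be
    divisible by 2**levels (an assumption of this sketch).
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-pass (detail) band
        approx = (even + odd) / np.sqrt(2.0)         # low-pass (approximation) band
    return [approx] + details[::-1]

def energy_in_top_k(coeff_bands, k):
    """Fraction of total energy carried by the k largest-magnitude coefficients."""
    flat = np.concatenate(coeff_bands)
    energy = np.sort(flat ** 2)[::-1]
    return energy[:k].sum() / energy.sum()

# Hypothetical test signal: a 3 Hz oscillation, 256 samples over 1 s
t = np.arange(256) / 256.0
x = np.sin(2 * np.pi * 3 * t)
bands = haar_dwt(x, levels=4)
ratio = energy_in_top_k(bands, k=32)
print(f"energy kept by 32 of {x.size} coefficients: {ratio:.3f}")
```

A ratio close to 1 for small k would indicate that the signal is well approximated by few wavelet coefficients; how sparse real EEG segments are under a given wavelet is an empirical question that this toy signal does not answer.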
3. Deep Learning
3.1. Fourier Neural Networks
Fourier neural networks (FNNs) can be particularly effective in analyzing electroencephalography (EEG) signals, which are inherently complex and often contain rhythmic and periodic patterns indicating brain activity. Below is an in-depth description of how FNNs can be applied specifically to EEG signal processing. Given an input vector $x \in \mathbb{R}^{d}$ and a frequency matrix $B \in \mathbb{R}^{m \times d}$, the Fourier feature mapping is defined as
$$\gamma(x) = \begin{bmatrix} \cos(2\pi B x) \\ \sin(2\pi B x) \end{bmatrix}.$$
Here, $B$ is a matrix with entries often sampled from $\mathcal{N}(0, \sigma^{2})$. The mapping $\gamma(x)$ embeds $x$ into a higher-dimensional space where periodic features are easier to learn.
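As a minimal sketch of such a mapping, the snippet below applies the commonly used random Fourier feature form, in which the input is projected by a Gaussian frequency matrix B and passed through cosine and sine. The dimensions, the scale `sigma`, and the function name `fourier_features` are assumptions chosen for illustration, not values taken from the paper.

```python
import numpy as np

def fourier_features(x, B):
    """Map x of shape (n_samples, d) to [cos(2*pi*x B^T), sin(2*pi*x B^T)].

    B has shape (m, d); the output has shape (n_samples, 2 * m).
    """
    projection = 2.0 * np.pi * x @ B.T
    return np.concatenate([np.cos(projection), np.sin(projection)], axis=1)

rng = np.random.default_rng(0)
d, m, sigma = 8, 16, 1.0                  # hypothetical sizes and frequency scale
B = rng.normal(0.0, sigma, size=(m, d))   # entries drawn from N(0, sigma^2)
x = rng.normal(size=(4, d))               # four hypothetical feature vectors
gamma = fourier_features(x, B)
print(gamma.shape)
```

Larger values of `sigma` produce higher-frequency features, so in practice this scale would be treated as a hyperparameter.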
3.2. FNN Architecture
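An FNN uses the Fourier-mapped input $\gamma(x)$ as the input to a standard neural network:
$$f(x) = \mathrm{MLP}(\gamma(x)),$$
where $\gamma(x)$ is the Fourier feature mapping of the input $x$. A basic architecture looks like
$$f(x) = W_{2}\,\phi(W_{1}\gamma(x) + b_{1}) + b_{2},$$
where $W_{1}$, $W_{2}$ are weight matrices, $b_{1}$, $b_{2}$ are biases, and $\phi$ is an activation function (e.g., ReLU).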
The neural network architecture consists of an input layer followed by two hidden layers with 128 and 64 neurons, respectively. Both hidden layers use ReLU activation functions. The output layer employs a sigmoid activation to perform binary classification, distinguishing between seizure and non-seizure instances.
The model was trained for 10 epochs using the Adam optimizer with a binary cross-entropy loss function and a batch size of 32. This configuration was chosen to balance computational efficiency and predictive performance.
Although no explicit regularization methods (such as dropout or L2 regularization) were applied, validation metrics were closely monitored throughout training. This helped ensure the model generalized well and did not exhibit signs of overfitting, particularly given the relatively short training duration.
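The layer sizes, activations, and loss described above can be sketched as a forward pass in NumPy. This is an inference-only illustration with random weights: the input dimension, the number of Fourier frequencies, the He-style initialization, and all function names are assumptions, and the Adam training loop is omitted.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_layer(rng, n_in, n_out):
    """He-style initialization (an assumption; the text does not state one)."""
    return rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out)

def fnn_forward(features, B, layers):
    """Fourier mapping, then 128- and 64-unit ReLU layers, then a sigmoid output."""
    proj = 2.0 * np.pi * features @ B.T
    h = np.concatenate([np.cos(proj), np.sin(proj)], axis=1)
    for W, b in layers[:-1]:
        h = relu(h @ W + b)
    W_out, b_out = layers[-1]
    return sigmoid(h @ W_out + b_out).ravel()

def binary_cross_entropy(y_true, p, eps=1e-12):
    p = np.clip(p, eps, 1.0 - eps)   # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

rng = np.random.default_rng(0)
d, m = 8, 16                          # hypothetical feature and frequency counts
B = rng.normal(size=(m, d))
layers = [init_layer(rng, 2 * m, 128), init_layer(rng, 128, 64), init_layer(rng, 64, 1)]
batch = rng.normal(size=(32, d))      # one batch of 32, matching the stated batch size
labels = rng.integers(0, 2, size=32)  # random seizure / non-seizure labels
probs = fnn_forward(batch, B, layers)
loss = binary_cross_entropy(labels, probs)
print(probs[:3], loss)
```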
Description for EEG signal processing:
EEG signal characteristics: EEG signals are composed of electrical activities from the brain, reflecting various states such as sleep, alertness, and cognitive engagement. These signals typically exhibit rhythmic patterns across different frequency bands (e.g., delta, theta, alpha, beta, and gamma).
Fourier transform applications:
Frequency-domain analysis: FNNs can leverage Fourier transforms to convert EEG time-series data into the frequency domain. This transformation allows the network to directly analyze different frequency bands, which are essential for interpreting various mental states and conditions.
Feature extraction: Through Fourier transformation, the FNN can automatically extract meaningful features related to brain activity, such as power spectral densities for specific frequency bands, without relying heavily on manual feature engineering.
Network architecture: FNNs designed for EEG may include layers that directly apply Fourier transforms, sinusoidal activations, and other elements suited to modeling periodic functions relevant to brain activity. Some architectures might blend traditional convolutional or recurrent layers with Fourier layers to capture both temporal and frequency-related information.
Learning temporal dynamics: By integrating temporal dynamics within the framework of Fourier analysis, FNNs can adaptively learn how specific frequency patterns evolve over time, which is crucial for understanding cognitive processes, detecting seizures, or classifying mental states.
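As an illustration of frequency-domain feature extraction over the bands named above, the sketch below estimates per-band power from a periodogram computed with NumPy’s real FFT. The band edges follow one common convention and vary across the literature; the 256 Hz sampling rate, the synthetic 10 Hz test signal, and the function name `band_powers` are assumptions.

```python
import numpy as np

# Conventional band edges in Hz (an assumption; definitions differ between studies)
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_powers(segment, fs):
    """Sum the periodogram of one EEG segment inside each frequency band."""
    segment = np.asarray(segment, dtype=float)
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(segment)) ** 2 / len(segment)  # unnormalized periodogram
    return {name: float(psd[(freqs >= lo) & (freqs < hi)].sum())
            for name, (lo, hi) in BANDS.items()}

fs = 256                                   # hypothetical sampling rate in Hz
t = np.arange(2 * fs) / fs                 # two seconds of samples
segment = np.sin(2 * np.pi * 10 * t)       # synthetic 10 Hz rhythm
powers = band_powers(segment, fs)
dominant = max(powers, key=powers.get)
print(dominant)
```

Hand-computed band powers of this kind are one possible input representation; a network with Fourier layers could instead learn comparable features directly.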
4. Results
We used the CHB-MIT Scalp EEG database, which contains recordings from 24 pediatric patients. From this dataset, a total of 827 usable EEG signal segments were extracted following initial preprocessing, which included:
Removing segments with missing or corrupted values.
Normalizing signal amplitudes using z-score normalization to ensure a standardized range.
Truncating or zero-padding signals to a fixed length to maintain uniform input dimensions.
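The three preprocessing steps listed above can be sketched as a single function. The function name `preprocess`, the target length, and the choice to normalize before padding (so that padded zeros do not distort the mean and standard deviation) are assumptions of this sketch rather than details given in the text.

```python
import numpy as np

def preprocess(segment, target_len):
    """Drop invalid segments, z-score normalize, then truncate or zero-pad."""
    x = np.asarray(segment, dtype=float)
    if not np.all(np.isfinite(x)):
        return None                      # missing or corrupted values: discard
    std = x.std()
    x = (x - x.mean()) / std if std > 0 else x - x.mean()
    if len(x) >= target_len:
        return x[:target_len]            # truncate long segments
    return np.pad(x, (0, target_len - len(x)))  # zero-pad short segments

padded = preprocess([1.0, 2.0, 3.0], target_len=5)
dropped = preprocess([1.0, float("nan"), 2.0], target_len=5)
print(padded, dropped)
```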
To simulate a realistic prediction scenario, the dataset was split chronologically based on recording times. This strategy ensures that the test data reflects unseen future samples, reducing data leakage and offering a more accurate measure of generalization. The data was divided as follows:
This results in a test set comprising approximately 35% of the entire dataset. To enhance the robustness of the evaluation, the model was trained and assessed over 5 independent runs, each with different random seeds and data shuffling.
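A chronological split of this kind can be sketched as follows. The 35% test fraction is taken from the text; the `start_time` field, the toy segment records, and the function name `chronological_split` are hypothetical.

```python
def chronological_split(segments, test_fraction=0.35):
    """Sort segments by recording time and hold out the most recent ones for testing."""
    ordered = sorted(segments, key=lambda s: s["start_time"])
    n_test = round(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]

# Twenty hypothetical segments with shuffled recording times 0..19
segments = [{"id": i, "start_time": (i * 7) % 20} for i in range(20)]
train, test = chronological_split(segments)
print(len(train), len(test))
```

Because every test segment is recorded after every training segment, shuffling would only be applied within the training portion if the chronological property is to be preserved.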
Table 1 presents a comparison of several deep learning models applied to EEG signal classification, each utilizing distinct preprocessing pipelines.
AUC-ROC (area under the receiver operating characteristic curve) quantifies the model’s ability to distinguish between preictal (pre-seizure) and interictal (non-seizure) states across all possible classification thresholds. An AUC of 1.0 indicates perfect discrimination, whereas an AUC of 0.5 suggests a performance equivalent to random guessing. This metric is particularly important in medical diagnosis tasks such as epilepsy prediction, where both false positives and false negatives can have significant consequences. A high AUC signifies that the model is effective at ranking true preictal events above interictal ones, which is vital for providing timely and accurate seizure warnings.
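The ranking interpretation of AUC can be made explicit with a short sketch that counts, over all preictal/interictal pairs, how often the preictal segment receives the higher score. The labels and scores below are made-up values, and `auc_roc` is a hypothetical helper; for large datasets a library routine would be preferable to this quadratic loop.

```python
def auc_roc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties contribute one half. labels use 1 for preictal and 0 for interictal.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 0, 0, 1, 0]                # made-up ground truth
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1]    # made-up model outputs
auc = auc_roc(labels, scores)
print(round(auc, 3))  # 8 of 9 pairs are ranked correctly -> 0.889
```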
FPR (false positive rate) refers to the proportion of interictal (non-seizure) segments that are incorrectly classified as preictal. In the context of epilepsy prediction, maintaining a low FPR is crucial, as frequent false alarms can cause undue stress, trigger unnecessary interventions, and negatively impact the patient’s quality of life. This is especially important in real-time or wearable monitoring systems, where reliability and user trust are paramount.
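As a worked example of this definition, the sketch below computes FPR = FP / (FP + TN) from binary labels and predictions. The example vectors are made up, and `false_positive_rate` is a hypothetical helper name.

```python
def false_positive_rate(labels, predictions):
    """FPR = FP / (FP + TN), with 1 for preictal and 0 for interictal."""
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("FPR is undefined without interictal samples")
    return fp / (fp + tn)

labels      = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]   # made-up ground truth
predictions = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]   # made-up model decisions
fpr = false_positive_rate(labels, predictions)
print(fpr)  # one false alarm among eight interictal segments -> 0.125
```

Note that this per-segment rate differs from the false alarms per hour (e.g., 0.12/h) reported in some of the cited studies, which also depends on segment duration.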
Figure 4 illustrates the classification accuracy of various seizure prediction and detection models. The x axis represents different models along with their corresponding preprocessing techniques, while the y axis indicates accuracy values ranging from 93% to 100%. This visualization enables a direct comparison of model performance in terms of classification accuracy.
Figure 5 presents a line graph comparing the AUC-ROC scores of several seizure prediction or detection models, providing insight into their discriminative power. Higher AUC values (closer to 1.0) indicate better performance. The y axis ranges from 0.88 to 1.02, focusing on high-performing models, while the x axis lists the models along with their EEG preprocessing methods: CapsNet (Raw Data), SVM (DWT), Hybrid AE + CNN (DWT), Transformer based (STFT), and FNN (DWT). The curve begins at approximately 0.92 for the Hybrid AE + CNN model, rises to about 0.98 for the Transformer-based model, and peaks near 1.0 for the FNN model, indicating the highest classification performance. CapsNet and SVM models are not plotted, possibly due to low or unavailable AUC values. Overall, the figure highlights the superior discriminative capability of the FNN model and the importance of effective EEG preprocessing combined with deep learning architectures.
Figure 6 displays a line graph showing the false positive rate (FPR) across various seizure detection models, where lower FPR values indicate better resistance to false alarms. The y axis spans from 0 to 0.20 to capture clinically significant variations, while the x axis shows the models alongside their respective EEG preprocessing techniques: CapsNet (raw data), SVM (DWT), hybrid AE + CNN (DWT), Transformer-based (STFT), and FNN (DWT). Among the visible data points, the SVM model starts at an FPR of approximately 0.13, and the hybrid AE + CNN model rises to around 0.19, indicating weaker performance. In contrast, the FNN model demonstrates the best outcome, with an FPR close to zero. CapsNet and Transformer-based models lack visible data points, possibly due to missing values or values outside the displayed range.
5. Conclusions
Several challenges were identified:
Data scarcity and imbalance: Despite access to a relatively large EEG database, only 827 segments were usable after filtering, limiting model generalization. Furthermore, certain models showed skewed performance (e.g., perfect FPR but low AUC), likely due to class imbalance or overfitting.
Interpretability of deep models: While deep models like Transformers and CNNs yielded competitive results, their black-box nature raises concerns in clinical applications where decision transparency is critical.
Variability in signal quality: EEG signals are inherently noisy and patient-specific. Although normalization was applied, inter-patient variability remains a key obstacle, potentially impacting model robustness in broader clinical settings.
Computational trade-offs: Some models, such as the Transformer-based approach, offer competitive accuracy with reduced prediction time, making them suitable for real-time use. Others, while more accurate, require more extensive computation or more complex preprocessing, which may limit deployment in resource-constrained environments.
The integration of discrete wavelet transform (DWT) with deep learning models led to significant improvements in performance. Notably, the combination of DWT with a Fourier neural network (FNN) achieved the highest performance across all metrics. This configuration recorded an accuracy of 98.96%, an AUC-ROC score of 1.0, and an FPR of 0, with the shortest prediction time of 5 units. These results indicate a highly effective model capable of rapid and precise EEG signal classification.
In contrast, the use of raw EEG data with a CapsNet model, while achieving a reasonable accuracy of 95.7%, resulted in the highest false positive rate (0.127) and the longest prediction time (30 units), underscoring the limitations of bypassing signal preprocessing. Other DWT-based approaches, such as DWT + SVM and DWT + hybrid AE + CNN, also demonstrated solid accuracy and low false positive rates, though they did not surpass the FNN model. Meanwhile, the STFT + Transformer-based method showed moderate accuracy (94.6%) and AUC-ROC performance but fell short of the top-performing DWT-based configuration.
The experimental results confirm that applying discrete wavelet transform (DWT) as a preprocessing step significantly enhances the effectiveness of deep learning models for EEG signal classification. Among the tested methods, the DWT + FNN configuration stands out for its superior accuracy, perfect AUC-ROC score, zero false positives, and minimal prediction time. These findings highlight the potential of combining multi-resolution signal analysis with lightweight neural architectures to develop robust and efficient EEG-based diagnostic and brain–computer interface systems.