Article

LightGBM-Based Seizure Detection Method in Pilocarpine Mouse Model of Epilepsy

1 UCD School of Computer Science, University College Dublin, 4 Dublin, Ireland
2 uniQure biopharma B.V., 1105 BP Amsterdam, The Netherlands
3 UCD School of Electrical and Electronic Engineering, University College Dublin, 4 Dublin, Ireland
* Author to whom correspondence should be addressed.
Algorithms 2026, 19(3), 167; https://doi.org/10.3390/a19030167
Submission received: 9 January 2026 / Revised: 16 February 2026 / Accepted: 20 February 2026 / Published: 24 February 2026
(This article belongs to the Special Issue Machine Learning in Medical Signal and Image Processing (4th Edition))

Abstract

The electroencephalogram (EEG) is the gold standard for measuring epileptic activity in rodent models of epilepsy. Manual scoring of seizures in EEG recordings lasting from days to months is laborious and prone to human error. The existing literature on automatic seizure detection in rodent models of epilepsy is limited, and the electrographic characteristics of induced epilepsy differ significantly from those of other epilepsy types. This study employed a Light Gradient Boosting Machine (LightGBM), with the dataset carefully partitioned into separate training and testing sets to ensure no data overlap. The model was trained using five-fold cross-validation to enhance robustness and generalisability. The training, validation, and independent test sets comprised 29,722 h of EEG recordings from 102 mice with pilocarpine-induced temporal lobe epilepsy. Following feature selection, model training, and post-processing, the LightGBM-based model exhibited a sensitivity of 80%, a specificity of 99%, and an F1-score of 0.71 on the independent test set. Multiple pairwise and non-parametric statistical tests indicated that envelope, skewness, and kurtosis, the three most significant features in the feature importance ranking, exhibit statistically significant differences in their distributions (p < 0.05). The statistical analysis revealed significant differences both across the three features and between seizure and non-seizure events for each feature, highlighting their relevance for discriminating epileptic activity. This study highlights the potential to support the automation of seizure event detection in preclinical rodent models of epilepsy.

1. Introduction

Epilepsy is a chronic neurological disease defined by at least two unprovoked, or reflex, seizures occurring more than 24 h apart [1]. The World Health Organisation reports that 50 million people globally have epilepsy, rendering it a considerable global health concern [2]. One-third of patients with epilepsy endure the substantial burden of drug-resistant seizures [3,4]; hence, there remains an unmet clinical need for improved management of seizures [5], which continues to propel epilepsy research aimed at discovering and testing new interventions.
The complexities of epilepsy and drug-resistant seizures are frequently studied in preclinical settings with chemically induced (pilocarpine or kainic acid) mouse models of epilepsy [6,7,8]. As in the assessment of human epilepsy, the EEG, a readout of brain activity, is used to monitor seizure incidence and duration in mice, thereby facilitating understanding of disease aetiology and evaluation of the impact of experimental therapies. EEG recordings extending from several days to months are common in such preclinical investigations.
Epileptiform discharges in prolonged EEG recordings are usually identified through visual inspection by trained experts [9]. Beyond the extensive time required to review prolonged recordings, expert identification of seizures is subjective, owing to varying seizure morphologies and the similarity of seizure patterns to noise and artefacts, and therefore introduces human error.
In contrast to human seizure detection, there is a paucity of data and published research on automatic seizure detection in rodent models of epilepsy [10]. Jang and Cho [11] developed a dual deep neural network-based classifier using periodograms from 5 s EEG segments to detect experimental seizures obtained from a pilocarpine-induced mouse model of epilepsy. In their extended work [12], the authors methodically evaluated the efficacy of various combinations of input modalities and deep network architectures using a consistent window size to determine the most effective combination. Kamintsky et al. [13] trained an artificial neural network (ANN) using four mouse EEG recordings, which included one synapsin triple knockout mouse (STKO), one pilocarpine mouse, and two albumin-treated mice. Fumeaux et al. [14] developed three generalised linear models with EEG recordings from 16 rats in a kainic acid model of epilepsy. The three models underwent training and testing with different data compositions.
In our previous work [15], we developed a publicly accessible web server (EPI-AI) that integrates XGBoost-based seizure detection for multiple mouse models of epilepsy. The model was trained using EEG recordings from two mouse models: intra-amygdala microinjection of kainic acid (IAKA) and Dravet syndrome. EPI-AI was evaluated on an independent test set comprising sixteen IAKA and Dravet syndrome mice, along with four pilocarpine-induced temporal lobe epilepsy (TLE) mice, none of which were included in training. EPI-AI demonstrated robust generalisation across all mouse models, including pilocarpine, with a sensitivity of 76.30%. However, when assessed on the independent test set of the present study, EPI-AI achieved a sensitivity of only 35.0% and an AUROC of 64.0%. In this study, we developed a machine learning seizure detection system utilising EEG recordings from 102 mice with pilocarpine-induced TLE. The proposed method demonstrated a sensitivity of 80.0%, a specificity of 99.0%, and an F1-score of 0.71 on the independent test set. Non-parametric and multiple pairwise tests were conducted on the three most important features (envelope, skewness, and kurtosis), demonstrating their capacity to discriminate between seizure and non-seizure events. This method may assist, in a preclinical context, with detecting seizures in the pilocarpine mouse model of TLE.

2. Materials and Methods

2.1. Data

2.1.1. Pilocarpine Mouse Model of Epilepsy

The EEG data used for training the software was acquired in accordance with all relevant ethical guidelines and with Directive 2010/63/EU on the protection of animals used for scientific purposes. A refined pilocarpine mouse model of TLE was induced in adult male Swiss mice (Charles River and Janvier) [16]. DSI implants for EEG telemetry (ETA-F10, DSI, St. Paul, MN, USA) were implanted under isoflurane anaesthesia (induction at 5%, maintenance at 2.5%). Transmitters were placed in a subcutaneous pocket near the peritoneum, with leads routed subcutaneously to the head. Electrodes were placed in the dentate gyrus, with an additional electrode in the cerebellum serving as the ground. The electrodes were fixed with screws and secured on the skull with dental cement. EEG signals were continuously monitored over an extended period (Ponemah, DSI) to capture a wide range of neurological events, including status epilepticus (SE) and spontaneous recurrent seizures (SRSs), under standardised conditions (amplified 1000-fold, bandpass filtered at 0.16–97 Hz, sampled at 500 Hz).

2.1.2. Dataset

This study utilises a dataset consisting of 102 mouse EEG recordings acquired from a single EEG channel, with a total recording duration of 29,722 h. The 102 mouse EEG recordings were divided into training, validation, and independent test sets in the proportions of 70.60%, 9.80%, and 19.60%, respectively, ensuring no overlap; each EEG recording from a single mouse was assigned to only one set. Table 1 details the duration of seizure and non-seizure segments across the entire dataset, in addition to the distribution of the mouse EEG recordings into training, validation, and independent test sets.

2.2. Data Pre-Processing

A 50 Hz notch filter was utilised to eliminate the 50 Hz interference from the EEG recordings caused by power lines, and the DC offset was also eliminated. The EEG recordings were further segmented into 5 s epochs with a 2.5 s overlap.
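As a concrete illustration, the pre-processing stage can be sketched as follows; the function names and the use of `scipy.signal.iirnotch` are illustrative choices, not taken from the authors' implementation.

```python
# Sketch of the pre-processing described above: DC-offset removal,
# a 50 Hz notch filter, and segmentation into 5 s epochs with 2.5 s overlap.
import numpy as np
from scipy.signal import iirnotch, filtfilt

FS = 500  # sampling rate (Hz), as stated in Section 2.1.1

def preprocess(eeg, fs=FS, notch_hz=50.0, q=30.0):
    """Remove DC offset and 50 Hz mains interference."""
    eeg = eeg - np.mean(eeg)              # remove DC offset
    b, a = iirnotch(notch_hz, q, fs=fs)   # narrow notch at 50 Hz
    return filtfilt(b, a, eeg)            # zero-phase filtering

def segment(eeg, fs=FS, win_s=5.0, step_s=2.5):
    """Split a 1-D signal into 5 s epochs with 2.5 s overlap."""
    win, step = int(win_s * fs), int(step_s * fs)
    starts = range(0, len(eeg) - win + 1, step)
    return np.stack([eeg[s:s + win] for s in starts])

# Example: 60 s of synthetic EEG -> 23 overlapping 5 s epochs of 2500 samples
x = np.random.default_rng(0).standard_normal(60 * FS)
epochs = segment(preprocess(x))
print(epochs.shape)  # (23, 2500)
```

The zero-phase `filtfilt` call avoids shifting seizure onsets in time, which matters when epoch labels are aligned to the raw recording.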

2.3. Feature Estimation

A total of 35 features were extracted from segmented EEG epochs for model development. These include the original 19 features from our previous work [15] and 16 additional features introduced in this study. The additional features are (i) permutation entropy [17], (ii) approximate entropy [18], (iii) spectral entropy [19], (iv) singular value decomposition (SVD) [20], (v) sample entropy [18], (vi) zero-crossing, (vii) mean and variance of instantaneous frequency (IF), (viii) mean and variance of the absolute first derivative of instantaneous frequency, (ix) mean and variance of power spectral density (PSD), (x) maximum and minimum PSD, and (xi) mean and variance of the absolute first derivative of instantaneous amplitude (IA). The additional 16 features were selected to capture nonlinear, complexity-based, and entropy-based characteristics of EEG recordings that are not fully represented by the original feature set; these features have shown promise in the prior literature for improving the detection of neurological disorders [21,22,23]. Our goal was to enrich the feature space to improve generalisability. The features include time-domain, frequency-domain, and nonlinear features. All features were extracted from each 5 s epoch with a 2.5 s overlap, segmented from the 102 EEG recordings. A complete list of the newly added 16 features, along with their definitions and categories, is provided in Appendix A.
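A few of the features discussed later (envelope, skewness, kurtosis, zero-crossings) can be computed per epoch as sketched below. The envelope is summarised here as the mean instantaneous amplitude from the Hilbert transform; the authors' exact definitions may differ, and the helper name is illustrative.

```python
# Illustrative per-epoch computation of four of the 35 features.
import numpy as np
from scipy.signal import hilbert
from scipy.stats import skew, kurtosis

def epoch_features(epoch):
    ia = np.abs(hilbert(epoch))              # instantaneous amplitude (IA)
    return {
        "envelope": float(np.mean(ia)),      # mean envelope amplitude
        "skewness": float(skew(epoch)),      # asymmetry of the amplitude distribution
        "kurtosis": float(kurtosis(epoch)),  # burstiness / heavy tails
        "zero_crossings": int(np.sum(np.diff(np.sign(epoch)) != 0)),
    }

rng = np.random.default_rng(1)
feats = epoch_features(rng.standard_normal(2500))  # one 5 s epoch at 500 Hz
print(sorted(feats))
```

For a stationary noise epoch like this one, skewness and kurtosis stay near zero; epileptiform bursts push both upward, which is the intuition behind their high importance rankings.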

2.4. Detection Algorithm

As shown in Table 1, EEG recordings from 72 mice were used for training, ten for validation, and 20 as an independent test set. The independent test set was not used during training; hence, the trained model is naive to it. The training signals were further split into five folds for five-fold cross-validation, such that each EEG recording was assigned to exactly one fold. A Light Gradient Boosting Machine (LightGBM), a gradient boosting framework, was employed in this study. Unlike conventional gradient boosting methods that use depth-first tree growth, LightGBM uses leaf-wise tree growth and a histogram-based algorithm to determine the optimal split; these two features make it well suited to large datasets where speed and precision are critical [24]. Consequently, model training was conducted on a conventional CPU-based workstation and required roughly 72 min, indicating that the method does not depend on specialised hardware.
Hyperparameter tuning for the LightGBM classifier was performed using a grid search strategy implemented via GridSearchCV to mitigate overfitting. Four key hyperparameters were optimised: the learning rate, minimum-data-in-leaf, number-of-leaves, and L1 regularisation strength. Model optimisation was conducted on the training data using a predefined split, while the validation set was used exclusively to evaluate hyperparameter configurations. The F1-score was employed as the primary validation metric owing to its suitability for imbalanced seizure detection tasks. The validation set achieved its best performance (F1-score = 0.70) with a learning rate of 0.05, a minimum data-in-leaf value of 100, 128 leaves, and an L1 regularisation strength of 1. The hyperparameters and search ranges used during the grid search optimisation are listed in Appendix A.
The preliminary feature space consisted of 35 features. To reduce dimensionality and enhance model performance, a feature importance ranking was obtained using LightGBM's embedded gain-based importance. The median importance was set as the threshold for selecting important features: it serves as a central reference point, separating the upper half of the ranked features (important) from the lower half (less important), and does not depend on an arbitrary cut-off. Seventeen features whose gains exceeded the threshold were selected.
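The median-threshold rule can be sketched as below; the generic `importances` array stands in for a trained model's gain importances (`feature_importances_` with `importance_type="gain"` in LightGBM), and the feature names are placeholders.

```python
# Median-threshold feature selection: keep features whose gain importance
# strictly exceeds the median of all 35 gains.
import numpy as np

def select_by_median(importances, names):
    thresh = np.median(importances)
    return [n for n, g in zip(names, importances) if g > thresh]

# 35 synthetic, distinct gain scores: with an odd count, exactly the
# upper half (17 features) exceeds the median
rng = np.random.default_rng(4)
gains = rng.permutation(np.arange(1.0, 36.0))
names = [f"feat_{i}" for i in range(35)]
selected = select_by_median(gains, names)
print(len(selected))  # 17
```

With 35 distinct gains, the strict inequality retains exactly 17 features, matching the count reported above.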
Subsequently, the 17 selected features were used to retrain the LightGBM algorithm (Figure 1). The optimal hyperparameters depend on the number of features and the degree of their correlation [25]. As with the preliminary LightGBM model, hyperparameters were optimised to align with the selected features; all hyperparameters remained the same as in the preliminary model, except for min-data-in-leaf, which was optimised to 20. Figure 2 presents the overall experimental framework employed in this study.
Non-parametric tests (Mann–Whitney U and Kruskal–Wallis) were computed to determine whether the distributions of the three most important features differ significantly. To further examine whether the features excluded during feature selection contained complementary information, a targeted ablation study was performed. Models were trained using (i) the 18 features discarded by feature selection and (ii) the selected feature set with the three highest-ranked features (envelope, skewness, and kurtosis) excluded. These models were trained with the previously optimised hyperparameters.
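The statistical comparison can be sketched with SciPy on synthetic data; the shifted "seizure" distribution is purely illustrative of how such feature values might separate.

```python
# Kruskal-Wallis and Mann-Whitney U tests on a feature's values in
# seizure vs. non-seizure epochs (synthetic, shifted distributions).
import numpy as np
from scipy.stats import kruskal, mannwhitneyu

rng = np.random.default_rng(5)
non_seizure = rng.normal(0.0, 1.0, 500)  # e.g. envelope in non-seizure epochs
seizure = rng.normal(1.0, 1.0, 500)      # shifted distribution during seizures

h_stat, p_kw = kruskal(non_seizure, seizure)
u_stat, p_mw = mannwhitneyu(non_seizure, seizure)
print(f"Kruskal-Wallis p = {p_kw:.3g}, Mann-Whitney U p = {p_mw:.3g}")
```

Both tests are rank-based, so they make no normality assumption about the feature distributions, which is why they suit skewed EEG-derived features better than a t-test would.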

2.5. Post-Processing

EEG generally has a low signal-to-noise ratio, with the signal frequently entangled with various artefacts; as a result, tree-based models may be sensitive to noise [26]. Seizures are not instantaneous; they progress rhythmically in time and space, typically lasting over 10 s [27], and the minimum seizure duration observed in this dataset was 60 s. The LightGBM model, however, detected seizures in 10 s windows and occasionally identified brief, isolated events shorter than 10 s, which are unlikely to represent actual seizure activity. To address this, any detection lasting less than 10 s was relabelled as a non-seizure event. Additionally, because the exact start and end times of seizures are difficult to determine, each detected seizure lasting more than 10 s was extended by 25 s before and after the initial detection. This adjustment minimises the risk of missing seizure activity and ensures that the whole temporal progression of the seizure, as it appears in the EEG recordings, is captured.
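The two post-processing rules can be sketched on a per-second binary detection trace; the 1 Hz trace and helper name are illustrative stand-ins for the model's windowed output.

```python
# Post-processing sketch: (i) discard detected events shorter than 10 s,
# (ii) extend each surviving event by 25 s on both sides.
import numpy as np

def postprocess(trace, min_len=10, pad=25):
    trace = np.asarray(trace, dtype=bool)
    out = np.zeros_like(trace)
    # locate [start, stop) runs of consecutive detections
    edges = np.flatnonzero(np.diff(np.r_[0, trace.astype(int), 0]))
    for start, stop in zip(edges[::2], edges[1::2]):
        if stop - start >= min_len:  # keep only events of at least 10 s
            out[max(0, start - pad):min(len(out), stop + pad)] = True
    return out

trace = np.zeros(200, dtype=bool)
trace[50:55] = True    # 5 s blip: relabelled as non-seizure
trace[100:160] = True  # 60 s seizure: kept and padded to seconds 75..184
cleaned = postprocess(trace)
print(cleaned[52], cleaned[80], cleaned[120])  # False True True
```

The padding step trades a few extra flagged seconds for a lower chance of clipping seizure onset or offset, consistent with the rationale above.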

2.6. Performance Evaluation

Accuracy, sensitivity, specificity, area under the curve (AUROC), and F1-score were used to evaluate the LightGBM-based model.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall/Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
F1-score = 2TP / (2TP + FP + FN)
where
  • True positives (TP): The number of epochs predicted as seizures that were labelled as seizures.
  • False positives (FP): The number of epochs predicted as seizures that were labelled as non-seizures.
  • True negatives (TN): The number of epochs predicted as non-seizures that were labelled as non-seizures.
  • False negatives (FN): The number of epochs predicted as non-seizures that were labelled as seizures.
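These metrics can be computed directly from epoch-level counts. The counts below are illustrative, not the paper's actual confusion matrix; they are chosen so that recall and specificity echo the reported 80% and 99%.

```python
# Evaluation metrics from epoch-level confusion-matrix counts.
def metrics(tp, tn, fp, fn):
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),          # sensitivity
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

m = metrics(tp=80, tn=990, fp=10, fn=20)
print({k: round(v, 3) for k, v in m.items()})
# recall = 0.8 and specificity = 0.99 for these illustrative counts
```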

3. Results

Table 2 presents the performance of EPI-AI, our publicly accessible web server integrating an XGBoost-based seizure detection model trained on EEG recordings from 16 IAKA and six Dravet syndrome mice. The model's performance was evaluated using 21 EEG recordings from pilocarpine-induced mice in an independent test set, with accuracy, sensitivity, specificity, and AUROC used to assess reliability and generalisability. Accuracy and specificity were both 93.0%, sensitivity was 35.0%, and the AUROC was 64.0%.
Table 3 presents the results of the LightGBM model after feature selection. The sensitivity and AUROC during the five-fold cross-validation were both 59%. The evaluation of the model on the independent test set yields similar results to those obtained using cross-validation. After applying the post-processing method, our approach improved sensitivity and AUROC to 80%, with specificity increasing to 99%.
The Kruskal–Wallis test revealed significant differences in the distributions of envelope (p < 0.05), skewness (p < 0.05), and kurtosis (p < 0.05) between seizure and non-seizure events. In addition, Mann–Whitney pairwise tests were performed on envelope vs. skewness, envelope vs. kurtosis, and skewness vs. kurtosis; all p-values were less than 0.05.

4. Discussion

The pilocarpine model of epilepsy is commonly used to study TLE [28,29]. The electrographic characteristics of TLE markedly differ from those of other epilepsy types [30,31,32], with seizures often displaying focal low-voltage fast activity or rhythmic theta onsets originating from mesial temporal structures on intracranial EEG, unlike the hypersynchronous spike or spike-and-wave onsets and widespread cortical involvement commonly observed in generalised and extratemporal epilepsies [33]. Rodent models play a crucial role in epilepsy research, offering valuable insights into the underlying mechanisms and potential treatments for seizure disorders. However, analysing EEG data from rodent models is a time-consuming and labour-intensive task that requires well-trained experts to manually identify seizure events. Additionally, semi-automated approaches do not capture the seizure characteristics that more complex ML models can learn, and commercially available tools typically assess seizure occurrence only after fixed criteria, including manual thresholds, are entered. The development of an automatic seizure detection model is therefore essential to improve efficiency, reduce human error, and accelerate research progress. Despite this need, research on automated seizure detection in rodent EEG remains limited, and automated methods for the pilocarpine model of TLE are scarcer still, highlighting the importance of further advancements in this area.
Previous studies on seizure detection using rodent EEG recordings have varied considerably in data size, with the number of rodents ranging from as few as four to as many as 45. Jang and Cho [12] used 32 pilocarpine mouse EEG recordings to train deep neural networks. Kamintsky et al. [13] trained an artificial neural network (ANN) with four mouse EEG recordings (one STKO mouse, one pilocarpine mouse, and two albumin-treated mice). Fumeaux et al. [14] developed generalised linear models with 16 EEG recordings from the kainic acid rat model of epilepsy. Buteneers et al. [34] utilised the largest sample size, 45 EEG recordings from genetic and kainate systemically injected mice. The 102 EEG recordings used in this study constitute the largest dataset to date for pilocarpine-induced epilepsy.
Studies on the automatic detection of seizures using rodent models of epilepsy have achieved sensitivities ranging from 68% to 100%. Jang and Cho [11] reported high performance, achieving 100% sensitivity and a 98% positive predictive value (PPV). These results were derived from a limited dataset (8977 h of EEG recordings) and did not report the F1-score, complicating the assessment of performance under class imbalance, a critical factor for seizure detection. In their subsequent study [12], the authors reported sensitivity values ranging from 96.20% to 96.70%, while F1-scores varied between 0.026 and 0.492. The low F1-scores may indicate an imbalance between precision and recall; however, without detailed error analysis, the relative contributions of false positives and false negatives cannot be determined. Sensitivity, or recall, is crucial in seizure detection, but a balanced trade-off between recall and precision is essential to minimise false positives and false negatives and thereby improve the F1-score. As reported in Table 3, the LightGBM-based model attained an F1-score of 0.71 on the independent test set, reflecting a moderate trade-off between precision and recall and indicating reasonable identification of seizure events on unseen data.
Fumeaux et al. [14] reported high AUROC values of 0.995, 0.983, and 0.962 for pooled, continuous, and extrapolated models, respectively; however, the pooled and continuous models were trained and assessed with EEG recordings from the same mice, thus inflating the efficacy of the performance. The extrapolated model, which employed a leave-one-mouse-out test methodology, exhibited a decline in performance to 0.962 AUROC. This illustrates the difficulty of generalising across different mouse EEG recordings. Although the AUROC in this study is slightly lower, we evaluated our model on a cohort of 20 mice that were withheld, ensuring no data leakage occurred between training and testing, hence aligning the testing conditions more closely with real-world deployment scenarios.
Kamintsky et al. [13] reported a sensitivity of 100%; however, their work was based on a limited dataset of only 12 mouse EEG recordings, including only six EEG recordings from pilocarpine-treated mice, which raises questions regarding scalability and reproducibility. In contrast, our model was trained and evaluated on a significantly larger number of mouse EEG recordings, thus enhancing its statistical validity and reducing the risk of overfitting.
EPI-AI, deployed as a public web server, enables researchers to annotate EEG recordings derived from rodent models of epilepsy [15]. EPI-AI demonstrates generalisability across mouse models of genetic and acquired epilepsies, achieving sensitivities ranging from 91.40% to 98.80%; in its original evaluation, the pilocarpine EEG recordings (unseen by EPI-AI) yielded a sensitivity of 76.3%. Table 2 shows that the evaluation of EPI-AI on our independent test set yields a sensitivity of 35.0% and an AUROC of 64.0%. These results reflect high precision but limited recall/sensitivity, probably because EPI-AI was not trained with EEG recordings from pilocarpine-induced mice. Instead, the model was trained exclusively on data from IAKA and Dravet syndrome models, likely learning seizure patterns specific to these induction methods. Research indicates that seizure-onset and propagation patterns vary with the induction method. For example, the pilocarpine model typically results in hypersynchronous (HYP) onset seizures characterised by fast-ripple high-frequency oscillations (HFOs), while other models, such as 4-aminopyridine, produce low-voltage fast (LVF) onsets with ripple-dominated activity, suggesting fundamentally distinct EEG characteristics across models [32]. This limitation highlights the challenge of applying ML models trained on one mouse model of epilepsy to EEG recordings from alternative seizure induction techniques, which may exhibit different electrophysiological features. As shown in Table 3, the proposed model demonstrates improved performance relative to EPI-AI when evaluated on the same independent test set, achieving a sensitivity of 80.0% and an AUROC of 80.0% following post-processing. While this represents a performance gain under comparable evaluation conditions, the overall discrimination capability remains moderate.
In this context, the AUROC values obtained during cross-validation (59%) and on the independent test set (60%) are closely aligned, suggesting consistent generalisation behaviour and no clear evidence of overfitting. The limited discriminative performance is likely attributable to the inherent complexity of the task, including substantial inter-subject variability and restricted separability of the manually engineered feature space.
Feature selection consistently enhances predictive performance, simplifies models, and reduces computational demands [35,36,37]. LightGBM itself, however, is an efficient and computationally inexpensive algorithm: feature selection based on its importance scores effectively reduces feature dimensionality but does not consistently yield improved performance [38], because LightGBM manages features during tree building and allows low-importance features to contribute through interactions. While deep learning models may demonstrate good performance, their lack of interpretability frequently raises concerns for clinical and preclinical users [39]. The feature importance ranking illustrated in Figure 1 reveals that envelope, skewness, and kurtosis are the three most significant features. The median was used as a feature selection criterion owing to its adaptability to the data, eliminating the need for human intervention while inherently balancing feature reduction and information preservation. Conversely, a fixed-size top-k selection requires predetermining k, introducing an additional hyperparameter that must be optimised. Our median-based threshold instead reflects the data's distribution, allowing the number of retained features to be determined by the learnt importance scores rather than a fixed quantity. Furthermore, the gain distribution distinctly separated high- and low-contributing features, making a median-based threshold a rational and substantiated choice. While median-based thresholding reduces feature dimensionality in a straightforward way, it should be considered a heuristic rather than a completely objective selection criterion. More robust alternatives, such as recursive feature elimination, stability selection, or permutation-based feature importance assessed on held-out data, could offer stronger assurances of feature relevance and reproducibility; nonetheless, these methodologies are more computationally demanding and were not explored in this study.
Two non-parametric tests, the Mann–Whitney U and Kruskal–Wallis tests, were computed to determine whether the distributions of the three most important features differ significantly between seizure and non-seizure events. These tests highlight their strong discriminative capacity for classifying such events. The Mann–Whitney pairwise tests on pairs of the top-selected features yielded p-values < 0.05, indicating significant differences in distribution between the pairs. The three features represent distinct components of the EEG recordings' statistical structure and should not be regarded as duplicates. The envelope is the continuous curve outlining the maximum upper and lower amplitudes over time, providing insight into heightened neuronal perturbation. Kurtosis indicates the presence of sudden transients or bursts, often associated with epileptiform activity. Skewness captures the asymmetry of a signal, which may be associated with alterations in the waveform's shape during a seizure. To relate the statistical analysis of envelope, skewness, and kurtosis to model behaviour, we assessed classification performance both with and without these three top features. As shown in Table 4, upon their exclusion, the model attained a precision of 61.0%, a recall of 58.0%, an F1-score of 0.59, and an AUROC of 58.0%; conversely, their inclusion yielded a precision of 58.0%, a recall of 60.0%, an F1-score of 0.59, and an AUROC of 60.0% on the independent test set. Although the overall F1-score remains unchanged, this may indicate a trade-off between precision and recall rather than an absence of feature significance. The AUROC improvement could indicate enhanced discriminative capability, suggesting that these statistical features improve the model's ranking performance and decision boundaries, thereby establishing a meaningful connection between feature-level statistical variations and classification behaviour.
Also, as seen in Table 4, when trained exclusively on the discarded features, the model achieved precision = 54.0%, recall = 51.0%, F1 = 0.51, and AUROC = 51.0%. In contrast, the model trained on the 17 selected features achieved improved performance (precision = 60.0%, recall = 60.0%, F1 = 0.59, AUROC = 59.0%), demonstrating that the retained subset carries substantially more discriminative information.
Although this study concentrated on pilocarpine-induced epilepsy, the methodological framework is not specific to this mouse model. The feature extraction and classification pipeline may be adapted to other preclinical epilepsy models, including kainic acid-induced epilepsy; nonetheless, retraining and evaluation are necessary owing to differences in seizure onset patterns and electrographic morphology between models. The LightGBM-based machine learning model demonstrated potential in seizure detection. Although post-processing increased sensitivity to 80.0%, the corresponding precision (68.0%) and high specificity (99.0%) suggest that this improvement was not achieved at the expense of a large increase in false positives, as reflected by an overall F1-score of 0.71. These results, though promising, remain insufficient for detecting all seizure occurrences, and this performance level, without further enhancement, may diminish the model's utility in real-time or high-stakes preclinical environments. Accordingly, the proposed model should be viewed as a decision-support tool for expert inspection rather than as a fully autonomous system. Future work will focus on enhanced performance and interpretability. In particular, deep learning approaches will be explored by integrating automatically learned representations with manually engineered features that exhibit significant distributional differences between seizure and non-seizure events.
In parallel, model explanation techniques will be employed to better characterize the features and learned representations that influence model predictions, which is an important consideration for preclinical adoption.

5. Conclusions

A LightGBM-based machine learning model for automated seizure detection was developed using EEG recordings from 102 mice. The model achieved a sensitivity of 80% and an F1-score of 0.71 after post-processing. Consistently detecting the majority of seizures in preclinical research is crucial for evaluating drug efficacy and monitoring disease progression. A sensitivity of 80% may assist in estimating seizure burden while minimising manual annotation, which is particularly advantageous given the prolonged EEG recordings in mouse models of epilepsy. Moreover, we analysed feature importance and identified three features (envelope, skewness, and kurtosis) with the highest discriminative value. These features have the potential to assist preclinical researchers in analysing seizure events in the pilocarpine mouse model of epilepsy. Furthermore, statistical analysis demonstrated significant differences in these features between seizure and non-seizure events, underscoring their importance in discriminating epileptic activity.

Author Contributions

Conceptualisation, M.E. and L.W.; data curation, M.E., N.P., C.W.H., T.M.H., S.B. and L.W.; formal analysis, M.E.; funding acquisition, C.M.; investigation, M.E. and L.W.; methodology, M.E.; project administration, C.M.; resources, L.W.; supervision, C.M.; validation, M.E. and L.W.; visualisation, M.E.; writing—original draft, M.E.; writing—review and editing, M.E., N.P., C.W.H., T.M.H., S.B., L.W. and C.M. All authors have read and agreed to the published version of the manuscript.

Funding

This publication has emanated from research conducted with the financial support of Taighde Éireann—Research Ireland under Grant number 21/RC/10294_P2 at FutureNeuro Research Ireland Centre for Translational Brain Science.

Institutional Review Board Statement

The EEG data used for training the software were acquired in accordance with all relevant ethical guidelines and requirements under Directive 2010/63/EU on the protection of animals used for scientific purposes.

Informed Consent Statement

Not applicable.

Data Availability Statement

Access to the data may be granted following a reasonable request and subject to approval by the relevant institutional authorities.

Acknowledgments

OpenAI’s ChatGPT 5.2 was employed to enhance the grammar, clarity, and language expression. The authors retain full responsibility for the content and conclusions presented in this work.

Conflicts of Interest

Authors Nicolas Partouche, Christiaan Warner Hoornenborg, Tycho M. Hoogland and Stéphane Baudouin are employed by the company uniQure biopharma B.V. The authors declare that this study received funding from uniQure biopharma B.V. The funder had the following involvement with the study: data sharing and evaluation.

Appendix A

Table A1. The 16 newly added features with their descriptions and domain of application.

| Feature | Description | Domain |
|---|---|---|
| Permutation entropy [17] | Nonlinear complexity metric that measures the temporal irregularity of EEG recordings. | Time |
| Approximate entropy [18] | Measures the regularity and unpredictability of signal fluctuations. | Time |
| Spectral entropy [19] | Quantifies the EEG signal irregularity. | Frequency |
| SVD entropy [20] | Measures the effective dimensionality of the data. | Time |
| Sample entropy [18] | Measures the complexity of physiological time-series data; a modification of approximate entropy. | Time |
| Zero-crossing [40] | Counts the number of times a signal crosses the zero line. | Time |
| Mean of instantaneous frequency (IF) | Average frequency content over a segment. | Time-frequency |
| Variance of IF | Determines the spread of frequency variations. | Time-frequency |
| Mean absolute first derivative of IF | The mean rate of variation of the predominant frequency; identifies trends in oscillatory activity. | Time-frequency |
| Variance of the first derivative of IF | Measures the spread in the rate of change; shows instability in brain dynamics. | Time-frequency |
| Mean PSD | Average signal power across the frequency spectrum. | Frequency |
| Variance PSD | Quantifies the spread of signal power across different frequencies within a specified time interval. | Frequency |
| Maximum PSD | Maximum signal power across all frequencies. | Frequency |
| Minimum PSD | Minimum signal power across all frequencies. | Frequency |
| Mean absolute first derivative of instantaneous amplitude | Quantifies the average rate of change in the amplitude envelope of the signal over time. | Time |
| Variance of the first derivative of instantaneous amplitude | Quantifies the variability or irregularity in the temporal fluctuations of the signal’s amplitude envelope. | Time |
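Two of the Table A1 features can be made concrete with a short sketch. The snippet below computes normalised spectral entropy [19] and the zero-crossing count [40] for a one-dimensional EEG segment; the Welch PSD estimator and its `nperseg` setting are assumptions for illustration, as the paper does not specify the estimator settings.

```python
import numpy as np
from scipy.signal import welch

def spectral_entropy(x, fs, nperseg=256):
    """Normalised spectral entropy of a 1-D signal segment (Table A1 [19]).
    Illustrative implementation; estimator settings are assumptions."""
    freqs, psd = welch(x, fs=fs, nperseg=min(nperseg, len(x)))
    psd = psd / psd.sum()                 # PSD as a probability distribution
    psd = psd[psd > 0]                    # avoid log(0)
    h = -np.sum(psd * np.log2(psd))       # Shannon entropy in bits
    return h / np.log2(len(freqs))        # normalise to [0, 1]

def zero_crossings(x):
    """Count sign changes of the signal around zero (Table A1 [40])."""
    x = np.asarray(x)
    return int(np.sum(np.signbit(x[:-1]) != np.signbit(x[1:])))
```

A pure sinusoid concentrates its power in one frequency bin and yields low spectral entropy, whereas broadband noise yields a value near 1; this ordering is what makes the feature useful for discriminating rhythmic ictal activity from background.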

Appendix A.1. Mean Instantaneous Frequency

Let $f_i[n]$ denote the instantaneous frequency of epoch $i$ at sample $n$, computed from the unwrapped phase $\phi_i[n]$ of the analytic signal:

$$ f_i[n] = \frac{f_s}{2\pi}\left(\phi_i[n+1] - \phi_i[n]\right), \qquad n = 0, \ldots, N-2. $$

The mean instantaneous frequency is defined as

$$ \mu_{f,i} = \frac{1}{N-1}\sum_{n=0}^{N-2} f_i[n], $$

where $N$ is the number of samples in the epoch and $f_s$ is the sampling frequency.

Appendix A.2. Variance of Instantaneous Frequency

$$ \sigma_{f,i}^{2} = \frac{1}{N-1}\sum_{n=0}^{N-2}\left(f_i[n] - \bar{f}_i\right)^{2}, \qquad \bar{f}_i = \frac{1}{N-1}\sum_{n=0}^{N-2} f_i[n]. $$

Appendix A.3. Mean Absolute First Derivative of Instantaneous Frequency

With the first difference of the instantaneous frequency defined as $\Delta f_i[n] = f_i[n+1] - f_i[n]$ for $n = 0, \ldots, N-3$,

$$ \mu_{|\Delta f|,i} = \frac{1}{N-2}\sum_{n=0}^{N-3}\left|\Delta f_i[n]\right|. $$

Appendix A.4. Variance of the First Derivative of Instantaneous Frequency

$$ \sigma_{\Delta f,i}^{2} = \frac{1}{N-2}\sum_{n=0}^{N-3}\left(\Delta f_i[n] - \overline{\Delta f}_i\right)^{2}, \qquad \overline{\Delta f}_i = \frac{1}{N-2}\sum_{n=0}^{N-3}\Delta f_i[n]. $$

Appendix A.5. Mean Absolute Derivative of Instantaneous Amplitude

Let $a_i[n]$ denote the instantaneous amplitude (envelope) of epoch $i$, with first difference

$$ d_i[n] = a_i[n+1] - a_i[n], \qquad n = 0, \ldots, N-2. $$

The mean absolute derivative is then

$$ \mu_{|d|,i} = \frac{1}{N-1}\sum_{n=0}^{N-2}\left|d_i[n]\right|. $$

Appendix A.6. Variance of the Derivative of Instantaneous Amplitude

$$ \sigma_{d,i}^{2} = \frac{1}{N-1}\sum_{n=0}^{N-2}\left(d_i[n] - \bar{d}_i\right)^{2}, \qquad \bar{d}_i = \frac{1}{N-1}\sum_{n=0}^{N-2} d_i[n]. $$
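The quantities in Appendices A.1–A.6 can be computed directly from the analytic signal. The sketch below assumes a Hilbert-transform envelope and phase; the paper's epoching and filtering steps are omitted. Note that `np.var` divides by the number of available difference samples ($N-1$ or $N-2$), matching the normalisation in the formulas above.

```python
import numpy as np
from scipy.signal import hilbert

def if_amplitude_features(x, fs):
    """Instantaneous-frequency and amplitude-envelope features of one
    epoch, following Appendix A.1-A.6 (sketch; preprocessing omitted)."""
    z = hilbert(x)                           # analytic signal
    a = np.abs(z)                            # instantaneous amplitude a_i[n]
    phi = np.unwrap(np.angle(z))             # unwrapped phase phi_i[n]
    f = fs / (2 * np.pi) * np.diff(phi)      # instantaneous frequency f_i[n]
    df = np.diff(f)                          # first derivative of IF
    d = np.diff(a)                           # first derivative of amplitude
    return {
        "mean_if": f.mean(),                 # Appendix A.1
        "var_if": f.var(),                   # Appendix A.2
        "mean_abs_dif": np.abs(df).mean(),   # Appendix A.3
        "var_dif": df.var(),                 # Appendix A.4
        "mean_abs_damp": np.abs(d).mean(),   # Appendix A.5
        "var_damp": d.var(),                 # Appendix A.6
    }
```

As a sanity check, an 8 Hz sinusoid sampled at 256 Hz yields a mean instantaneous frequency close to 8 Hz, with small deviations confined to the edges of the epoch.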
Table A2. LightGBM hyperparameters and search ranges used during GridSearchCV optimisation.

| Hyperparameter | Search Range |
|---|---|
| Learning rate (learning_rate) | {0.01, 0.03, 0.05, 0.1} |
| Number of leaves (num_leaves) | {32, 64, 128, 256} |
| Minimum data in leaf (min_data_in_leaf) | {20, 50, 100, 200} |
| L1 regularisation (lambda_l1) | {0, 0.1, 0.5, 1} |
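The Table A2 grid corresponds to 4 × 4 × 4 × 4 = 256 candidate configurations. Assuming the standard scikit-learn wrapper was used (the scoring metric and fold setup shown here are illustrative, not confirmed by the paper), the search can be set up as follows:

```python
import lightgbm as lgb
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Search ranges from Table A2
param_grid = {
    "learning_rate": [0.01, 0.03, 0.05, 0.1],
    "num_leaves": [32, 64, 128, 256],
    "min_data_in_leaf": [20, 50, 100, 200],
    "lambda_l1": [0, 0.1, 0.5, 1],
}

search = GridSearchCV(
    estimator=lgb.LGBMClassifier(objective="binary"),
    param_grid=param_grid,
    scoring="f1",                  # illustrative; suited to heavy class imbalance
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
    n_jobs=-1,
)
# search.fit(X_train, y_train)     # X_train, y_train: epoch features and labels
```

Stratified folds are a common choice here because the seizure/non-seizure ratio is roughly 1:300, and unstratified splits could leave a fold with almost no positive epochs.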

References

  1. Beniczky, S.; Trinka, E.; Wirrell, E.; Abdulla, F.; Al Baradie, R.; Alonso Vanegas, M.; Auvin, S.; Singh, M.B.; Blumenfeld, H.; Bogacz Fressola, A.; et al. Updated classification of epileptic seizures: Position paper of the International League Against Epilepsy. Epilepsia 2025, 66, 1804–1823. [Google Scholar] [CrossRef]
  2. World Health Organization. Epilepsy: A Public Health Imperative. 2019. Available online: https://www.who.int/publications/i/item/epilepsy-a-public-health-imperative (accessed on 23 June 2025).
  3. Kalilani, L.; Sun, X.; Pelgrims, B.; Noack-Rink, M.; Villanueva, V. The epidemiology of drug-resistant epilepsy: A systematic review and meta-analysis. Epilepsia 2018, 59, 2179–2193. [Google Scholar] [CrossRef] [PubMed]
  4. Sultana, B.; Panzini, M.A.; Veilleux Carpentier, A.; Comtois, J.; Rioux, B.; Gore, G.; Bauer, P.R.; Kwon, C.S.; Jetté, N.; Josephson, C.B.; et al. Incidence and prevalence of drug-resistant epilepsy: A systematic review and meta-analysis. Neurology 2021, 96, 805–817. [Google Scholar] [CrossRef]
  5. Bialer, M.; White, H.S. Key factors in the discovery and development of new antiepileptic drugs. Nat. Rev. Drug Discov. 2010, 9, 68–82. [Google Scholar] [CrossRef] [PubMed]
  6. White, H.S.; Smith-Yockman, M.; Srivastava, A.; Wilcox, K.S. Therapeutic assays for the identification and characterization of antiepileptic and antiepileptogenic drugs. In Models of Seizures and Epilepsy; Elsevier: Amsterdam, The Netherlands, 2006; pp. 539–549. [Google Scholar]
  7. Bazhanova, E.D.; Kozlov, A.A.; Litovchenko, A.V. Mechanisms of drug resistance in the pathogenesis of epilepsy: Role of neuroinflammation. A literature review. Brain Sci. 2021, 11, 663. [Google Scholar] [CrossRef] [PubMed]
  8. Löscher, W. Critical review of current animal models of seizures and epilepsy used in the discovery and development of new antiepileptic drugs. Seizure 2011, 20, 359–368. [Google Scholar] [CrossRef]
  9. Niedermeyer, E.; da Silva, F.L. Electroencephalography: Basic Principles, Clinical Applications, and Related Fields; Lippincott Williams & Wilkins: Philadelphia, PA, USA, 2005. [Google Scholar]
  10. Edoho, M.; Mooney, C.; Wei, L. AI-Based Electroencephalogram Analysis in Rodent Models of Epilepsy: A Systematic Review. Appl. Sci. 2024, 14, 7398. [Google Scholar] [CrossRef]
  11. Jang, H.J.; Cho, K.O. Dual deep neural network-based classifiers to detect experimental seizures. Korean J. Physiol. Pharmacol. 2019, 23, 131–139. [Google Scholar] [CrossRef]
  12. Cho, K.O.; Jang, H.J. Comparison of different input modalities and network structures for deep learning-based seizure detection. Sci. Rep. 2020, 10, 122. [Google Scholar] [CrossRef]
  13. Kamintsky, L.; van Hameren, G.; Weissberg, I.; Moradi, P.; Prager, O.; Ahmad, A.A.; Schori, L.; Becker, A.; Zigel, Y.; Friedman, A. An algorithm for seizure detection in rodents. Epilepsia Open 2025. online ahead of print. [Google Scholar] [CrossRef]
  14. Fumeaux, N.F.; Ebrahim, S.; Coughlin, B.F.; Kadambi, A.; Azmi, A.; Xu, J.X.; Abou Jaoude, M.; Nagaraj, S.B.; Thomson, K.E.; Newell, T.G.; et al. Accurate detection of spontaneous seizures using a generalised linear model with external validation. Epilepsia 2020, 61, 1906–1918. [Google Scholar] [CrossRef] [PubMed]
  15. Wei, L.; Boutouil, H.; Gerbatin, R.R.; Mamad, O.; Heiland, M.; Reschke, C.R.; Del Gallo, F.; Fabene, P.F.; Henshall, D.C.; Lowery, M.; et al. Detection of spontaneous seizures in EEGs in multiple experimental mouse models of epilepsy. J. Neural Eng. 2021, 18, 056060. [Google Scholar] [CrossRef] [PubMed]
  16. Vigier, A.; Partouche, N.; Michel, F.J.; Crepel, V.; Marissal, T. Substantial outcome improvement using a refined pilocarpine mouse model of temporal lobe epilepsy. Neurobiol. Dis. 2021, 161, 105547. [Google Scholar] [CrossRef]
  17. Bandt, C.; Pompe, B. Permutation entropy: A natural complexity measure for time series. Phys. Rev. Lett. 2002, 88, 174102. [Google Scholar] [CrossRef] [PubMed]
  18. Richman, J.S.; Moorman, J.R. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circ. Physiol. 2000, 278, H2039–H2049. [Google Scholar] [CrossRef]
  19. Inouye, T.; Shinosaki, K.; Sakamoto, H.; Toi, S.; Ukai, S.; Iyama, A.; Katsuda, Y.; Hirano, M. Quantification of EEG irregularity by use of the entropy of the power spectrum. Electroencephalogr. Clin. Neurophysiol. 1991, 79, 204–210. [Google Scholar] [CrossRef]
  20. Eckart, C.; Young, G. The approximation of one matrix by another of lower rank. Psychometrika 1936, 1, 211–218. [Google Scholar] [CrossRef]
  21. Jui, S.J.J.; Deo, R.C.; Barua, P.D.; Devi, A.; Soar, J.; Acharya, U.R. Application of entropy for automated detection of neurological disorders with electroencephalogram signals: A review of the last decade (2012–2022). IEEE Access 2023, 11, 71905–71924. [Google Scholar] [CrossRef]
  22. Arunkumar, N.; Kumar, K.R.; Venkataraman, V. Automatic detection of epileptic seizures using new entropy measures. J. Med. Imaging Health Inform. 2016, 6, 724–730. [Google Scholar] [CrossRef]
  23. Acharya, U.R.; Fujita, H.; Sudarshan, V.K.; Bhat, S.; Koh, J.E. Application of entropies for automated diagnosis of epilepsy using EEG signals: A review. Knowl.-Based Syst. 2015, 88, 85–96. [Google Scholar] [CrossRef]
  24. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. Lightgbm: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  25. Probst, P.; Boulesteix, A.L.; Bischl, B. Tunability: Importance of hyperparameters of machine learning algorithms. J. Mach. Learn. Res. 2019, 20, 1–32. [Google Scholar]
  26. Roy, Y.; Banville, H.; Albuquerque, I.; Gramfort, A.; Falk, T.H.; Faubert, J. Deep learning-based electroencephalography analysis: A systematic review. J. Neural Eng. 2019, 16, 051001. [Google Scholar] [CrossRef] [PubMed]
  27. Jiang, M.; Wang, Y. Refinement of the pilocarpine-induced status epilepticus model in mice to improve mortality outcomes. Front. Neurosci. 2025, 19, 1592014. [Google Scholar] [CrossRef]
  28. Cavalheiro, E.; Santos, N.; Priel, M. The pilocarpine model of epilepsy in mice. Epilepsia 1996, 37, 1015–1019. [Google Scholar] [CrossRef] [PubMed]
  29. Curia, G.; Longo, D.; Biagini, G.; Jones, R.S.; Avoli, M. The pilocarpine model of temporal lobe epilepsy. J. Neurosci. Methods 2008, 172, 143–157. [Google Scholar] [CrossRef]
  30. Jan, M.M.; Sadler, M.; Rahey, S.R. Electroencephalographic features of temporal lobe epilepsy. Can. J. Neurol. Sci. 2010, 37, 439–448. [Google Scholar] [CrossRef]
  31. Britton, J.W.; Frey, L.C.; Hopp, J.L.; Korb, P.; Koubeissi, M.Z.; Lievens, W.E.; Pestana-Knight, E.M.; St Louis, E.K. Electroencephalography (EEG): An Introductory Text and Atlas of Normal and Abnormal Findings in Adults, Children, and Infants; American Epilepsy Society: Chicago, IL, USA, 2016. [Google Scholar]
  32. Salami, P.; Lévesque, M.; Gotman, J.; Avoli, M. Distinct EEG seizure patterns reflect different seizure generation mechanisms. J. Neurophysiol. 2015, 113, 2840–2844. [Google Scholar] [CrossRef]
  33. Perucca, P.; Dubeau, F.; Gotman, J. Intracranial electroencephalographic seizure-onset patterns: Effect of underlying pathology. Brain 2014, 137, 183–196. [Google Scholar] [CrossRef]
  34. Buteneers, P.; Verstraeten, D.; Van Nieuwenhuyse, B.; Stroobandt, D.; Raedt, R.; Vonck, K.; Boon, P.; Schrauwen, B. Real-time detection of epileptic seizures in animal models using reservoir computing. Epilepsy Res. 2013, 103, 124–134. [Google Scholar] [CrossRef]
  35. Noroozi, Z.; Orooji, A.; Erfannia, L. Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction. Sci. Rep. 2023, 13, 22588. [Google Scholar] [CrossRef] [PubMed]
  36. Li, J.; Cheng, K.; Wang, S.; Morstatter, F.; Trevino, R.P.; Tang, J.; Liu, H. Feature selection: A data perspective. ACM Comput. Surv. (CSUR) 2017, 50, 1–45. [Google Scholar] [CrossRef]
  37. Abdumalikov, S.; Kim, J.; Yoon, Y. Performance analysis and improvement of machine learning with various feature selection methods for EEG-based emotion classification. Appl. Sci. 2024, 14, 10511. [Google Scholar] [CrossRef]
  38. Hassan, M.; Kaabouch, N. Impact of Feature Selection Techniques on the Performance of Machine Learning Models for Depression Detection Using EEG Data. Appl. Sci. 2024, 14, 10532. [Google Scholar] [CrossRef]
  39. Antoniadi, A.M.; Du, Y.; Guendouz, Y.; Wei, L.; Mazo, C.; Becker, B.A.; Mooney, C. Current challenges and future opportunities for XAI in machine learning-based clinical decision support systems: A systematic review. Appl. Sci. 2021, 11, 5088. [Google Scholar] [CrossRef]
  40. Principe, J.; Smith, J.R. Microcomputer-based system for the detection and quantification of petit mal epilepsy. Comput. Biol. Med. 1982, 12, 87–95. [Google Scholar] [CrossRef]
Figure 1. Feature importance ranking for the 17 features utilised in the training of the final LightGBM model. The features are ranked in descending order of their gain. Gain measures the overall improvement in the model’s performance resulting from all splits associated with a specific feature. FD = fractal dimension; TKEO = Teager–Kaiser Energy Operator; max PSD = maximum power spectral density; var PSD = variance of power spectral density.
Figure 2. Framework for LightGBM-based seizure detection. Following EEG acquisition, the signals underwent preprocessing, feature extraction, hyperparameter optimisation, model training, and feature selection to develop a preliminary model. After feature selection, LightGBM hyperparameters were re-optimised using the selected features, and a final model was trained accordingly. The resulting predictions were subsequently subjected to post-processing and performance evaluation.
Table 1. Mouse EEG recording distribution.

| | Training | Validation | Ind. Test | Total |
|---|---|---|---|---|
| Number of mouse EEG recordings | 72 (70.60%) | 10 (9.80%) | 20 (19.60%) | 102 (100%) |
| Seizure duration (hours) | 71 | 12 | 16 | 99 |
| Non-seizure duration (hours) | 19,894 | 2972 | 6757 | 29,623 |
| Seizure/non-seizure ratio | 1:280 | 1:242 | 1:422 | 1:299 |
Table 2. The evaluation of EPI-AI with the independent test set used in this study.

| | Acc (%) | Sens (%) | Spec (%) | AUROC (%) |
|---|---|---|---|---|
| Ind. test set | 93.0 | 35.0 | 93.0 | 64.0 |
Table 3. The evaluation of cross-validation training, the independent test set, and post-processing of predictions from the independent test set.

| | Acc (%) | Pre (%) | Recall/Sens (%) | Spec (%) | AUROC (%) | F1 |
|---|---|---|---|---|---|---|
| Cross-val | 99.0 | 73.0 | 59.0 | 99.0 | 59.0 | 0.62 |
| Ind. test set | 99.0 | 58.0 | 60.0 | 99.0 | 60.0 | 0.59 |
| Post-processing | 99.0 | 68.0 | 80.0 | 99.0 | 80.0 | 0.71 |

Cross-val metrics are reported only for the training data (averaged across folds), while “Ind. test set” corresponds exclusively to the independent held-out test set, which is not involved in model optimisation or training.
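The F1 column follows from the precision and recall columns as their harmonic mean; for instance, the independent test set row (precision 58%, recall 60%) reproduces the reported 0.59. (Rows reported after averaging across folds, or computed from unrounded precision/recall, need not match the rounded columns exactly.)

```python
def f1_score(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Independent test set row of Table 3: precision 58%, recall 60%
print(round(f1_score(0.58, 0.60), 2))  # -> 0.59
```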
Table 4. Ablation assessment of models trained with the 18 features omitted during feature selection, and of models trained on the selected features without the three most significant features, evaluated using the independent test set.

| | Acc (%) | Pre (%) | Recall/Sens (%) | Spec (%) | AUROC (%) | F1 |
|---|---|---|---|---|---|---|
| 18 omitted features | 99.0 | 54.0 | 51.0 | 99.0 | 51.0 | 0.51 |
| Selected w/o top 3 | 99.0 | 61.0 | 58.0 | 99.0 | 58.0 | 0.59 |

