Wavelet Transform-Statistical Time Features-Based Methodology for Epileptic Seizure Prediction Using Electrocardiogram Signals

Abstract: Epilepsy is a brain disorder that affects about 50 million people around the world and is characterized by recurrent seizures, which can put patients at permanent risk because of the falls, drowning, burns, and prolonged seizures that they can suffer. Hence, it is of vital importance to propose a methodology capable of predicting a seizure several minutes before its onset, allowing patients to take precautions against injuries. In this regard, a methodology based on the wavelet packet transform (WPT), statistical time features (STFs), and a decision tree classifier (DTC) for predicting an epileptic seizure using electrocardiogram (ECG) signals is presented. Seventeen STFs were analyzed to measure changes in the properties of ECG signals and find characteristics capable of differentiating between healthy signals and signals recorded 15 min prior to a seizure. The effectiveness of the proposed methodology for predicting an epileptic event is demonstrated using a database of seven patients with 10 epileptic seizures, which was provided by the Massachusetts Institute of Technology–Beth Israel Hospital (MIT–BIH). The results show that the proposed methodology is capable of predicting an epileptic seizure 15 min before its onset with an accuracy of 100%. Our results suggest that the use of STFs at frequency bands related to heart activity is suitable for finding parameters for the prediction of epileptic seizures.


Introduction
Epilepsy is a brain disorder that affects about 50 million people around the world and is characterized by recurrent seizures [1]. The brain contains billions of nerve cells, or neurons, that help people control their way of (1) thinking, (2) moving, and (3) feeling by means of electrical signals that send messages from one nerve cell to others [2]. Normally, neurons send electrical signals at a rate below 100 times per second, but during an epileptic seizure, or ictal state, the neurons become hyperexcitable, which causes a period of abnormal and asynchronous excitation in a neuronal population and generates uncontrollable muscular contractions [3,4]. This reaction can lead patients to suffer physical problems (e.g., fractures and bruising related to seizures) and psychological conditions (e.g., anxiety and depression), negatively affecting their quality of life [5,6]. In addition, they present a permanent

Data Set Used
With the aim of validating the potential of the proposal for predicting an epileptic seizure, the open access database called "Post-Ictal Heart Rate Oscillations in Partial Epilepsy" (PIHROPE) (https://physionet.org/content/szdb/1.0.0/), provided by the Beth Israel Hospital, was used. It comprises the ECG signals of 7 patients (2 men and 5 women) aged 31-48 years, in whom a total of 10 epileptic seizures were registered during monitoring. It should be noted that a team of experts in cardiology confirmed that the patients enrolled in the study showed no clinical evidence of heart disease and presented partial seizures, with or without secondary generalization, from frontal or temporal foci. Table 1 presents the seizure onset time registered for each patient, where it can be observed that two patients (No. 2 and No. 6) suffered more than one epileptic crisis during their monitoring [26].

Preparation of the ECG Signals
The electrocardiogram signals examined in this work were recorded for 3 h on average at a sampling frequency of 200 Hz [26]. They are then resampled to 128 Hz, since this sampling frequency is adequate for diagnosing heart diseases and reduces the number of samples to be processed [27][28][29][30]. This process is performed by convolving the measured ECG signals with a low-pass finite impulse response filter. From the 3 h monitored for each patient, only the 15 min of the ECG signal immediately preceding the seizure onset were extracted and evaluated in this investigation. It should be noted that this time window allows exploring and offering an earlier epileptic seizure prediction than the one provided by other works presented in the literature. The 15 min window is divided into 1 min intervals; that is, the first 1 min interval before the epileptic crisis onset, the second 1 min interval before the epileptic crisis onset, and so on. On the other hand, a normal or healthy group (HG) was obtained by extracting 1 min intervals from the ECG signals 1 h prior to or after the seizure, since at this time the patients presented a normal cardiac rhythm [22]. The 1 min intervals were selected because they have allowed identifying reliable features in ECG signals for predicting sudden cardiac death as well as seizures [22,28,29,31]. For this reason, the 1 min interval is also explored in this work for predicting a seizure up to 15 min before the onset. Figure 1 illustrates the first 1 min interval of the ECG signal prior to the epileptic seizure and two 1 min intervals of the ECG signal recorded 1 h before the seizure occurred, respectively. From this figure, it is not possible to visually identify significant differences between both ECG signal groups prior to the seizure onset.
Therefore, it is of imperative importance to propose a method or
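The resampling ratio and the 1 min segmentation described in this section can be illustrated with a minimal Python sketch. The function names and the onset index are hypothetical, and the low-pass FIR filtering itself is not shown; only the 200 Hz and 128 Hz rates, the 15 min window, and the 1 min interval length come from the text.

```python
from fractions import Fraction

FS_ORIGINAL = 200  # Hz, sampling rate of the recordings (from the text)
FS_TARGET = 128    # Hz, sampling rate after resampling (from the text)


def resampling_ratio(fs_in, fs_out):
    """Return (up, down) in lowest terms so that fs_out = fs_in * up / down."""
    ratio = Fraction(fs_out, fs_in)
    return ratio.numerator, ratio.denominator


def preictal_intervals(signal, fs, onset_index, minutes=15):
    """Split the `minutes` preceding `onset_index` into 1 min intervals.

    intervals[0] is the minute immediately before the onset, intervals[1]
    the minute before that, and so on (hypothetical helper).
    """
    step = 60 * fs  # samples per 1 min interval
    if onset_index < minutes * step:
        raise ValueError("not enough signal before the seizure onset")
    return [signal[onset_index - (m + 1) * step: onset_index - m * step]
            for m in range(minutes)]
```

Under these assumptions, going from 200 Hz to 128 Hz corresponds to upsampling by 16 and downsampling by 25, and each 1 min interval at 128 Hz contains 7680 samples.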

Wavelet Packet Transform
The wavelet packet transform has proven to be a reliable tool for analyzing physiological signals such as ECG [29], EEG [32], and electromyography (EMG) [33], among others. It provides a more detailed analysis than the discrete wavelet transform (DWT), since in WPT both the low and high frequency components (approximation and detail coefficients, respectively) are decomposed to form a new decomposition level (see Figure 2). In this sense, the WPT is a useful method to provide a time-frequency analysis of the signals [34,35]. The wavelet packet function at each decomposition level is given by [36]:

$$W_{j,k}^{n}(t) = 2^{j/2}\, W^{n}\left(2^{j} t - k\right)$$

where j and k are the scaling (frequency localization) and the translation (time localization) parameters, respectively, and n is the oscillation parameter. The approximation and detail coefficients are obtained as follows [35]:

$$W^{2n}(t) = \sqrt{2} \sum_{k} h(k)\, W^{n}(2t - k)$$

$$W^{2n+1}(t) = \sqrt{2} \sum_{k} g(k)\, W^{n}(2t - k)$$

where h(k) and g(k) are the associated low-pass and high-pass filter coefficients, respectively, t is the time variable, and $W^{n}$ represents the mother wavelet used (e.g., Daubechies, biorthogonal, Morlet, among others). In this work, the mother wavelet Daubechies 44 (Db-44) is employed as it delivers reliable results for analyzing or processing physiological signals [36].
From previous work [29], it has been found that a 5-level decomposition scheme (32 nodes) using WPT is reliable for decomposing ECG signals because it allows isolating transient events thanks to the reduced bandwidth associated with each node (i.e., 2 Hz with a sampling frequency of 128 Hz). In addition, this decomposition level provides other benefits: (1) a reduction in the noise of the signal obtained in each frequency band or node, and (2) a reduction in the mixing of other frequencies within each band. In this way, the behavior of the ECG signals obtained at each node can be analyzed by STFs to obtain features that allow the prediction of an epileptic episode. Hence, since an epileptic seizure can generate transients [17], this decomposition level is also evaluated in this work.
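The structure of a 5-level wavelet packet decomposition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the Haar filter pair instead of the Db-44 wavelet employed in this work (to keep the example self-contained), and it assumes that the FB index is 1-based and ordered by ascending frequency. All function names are hypothetical.

```python
import math


def haar_step(x):
    """One analysis step: return (approximation, detail), each half as long."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet(x, level):
    """Split BOTH branches at every level (unlike the DWT).

    Returns 2**level nodes in natural (filter-bank) order.
    """
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def frequency_order(nodes):
    """Reorder natural-order nodes by ascending frequency (Gray-code order)."""
    return [nodes[i ^ (i >> 1)] for i in range(len(nodes))]


def band_edges_hz(fb, fs=128, level=5):
    """Nominal band (low, high) in Hz of the 1-based frequency band `fb`."""
    width = fs / 2 / 2 ** level  # 2 Hz for fs = 128 Hz and level = 5
    return (fb - 1) * width, fb * width
```

With a 1 min interval sampled at 128 Hz (7680 samples), `frequency_order(wavelet_packet(x, 5))` yields 32 nodes of 240 coefficients each, and `band_edges_hz(9)` gives the nominal 16-18 Hz band. The Gray-code reordering is needed because every high-pass branch mirrors the spectrum of its children.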

Statistical Time Features
Statistical time features have proven to be efficient tools for detecting significant changes in signals associated with neurodegenerative [37] and cardiac [38] diseases, as well as sleep disorders [39], among others. In general, STFs are capable of measuring changes in the properties of non-stationary time signals such as their range of values, dispersion, asymmetry, and convergence, among others [40,41]. Since no signal transformation is required to estimate the features, their computational cost is low [42]. The equations of the 17 STFs used in this work are described as follows [43][44][45]:

$$\text{Latitude Factor} = LF = \frac{\max(x_j)}{SRM} \qquad (10)$$

where $x_j$ is the j-th time-series sample, with j = 1, 2, 3, ..., n, n is the number of samples, $L_i$ is the lower limit of the modal class, c is the width of the modal class, and $d_-$ and $d_+$ are the absolute differences between the modal interval and the classes of the neighboring intervals, respectively. It should be noted that each STF measures a different property and might not capture information relevant to the early detection of an epileptic seizure. For this reason, it is necessary to employ a statistical method that can measure the relevance of the features obtained for each node estimated by using WPT.
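As an illustration of how such features are computed from a node signal, the following Python sketch evaluates a small subset of STFs. It rests on several assumptions that the text does not confirm: CF and IF are taken to be the crest factor and the impulse factor with their usual definitions, SRM is taken to be the square root mean, and the peak is taken as the maximum absolute value. The function name is hypothetical.

```python
import math


def stf_subset(x):
    """Return a few statistical time features of the sample list `x`.

    Raises ZeroDivisionError for an all-zero signal.
    """
    n = len(x)
    abs_x = [abs(v) for v in x]
    peak = max(abs_x)                                   # assumed: absolute peak
    mean_abs = sum(abs_x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)          # root mean square
    srm = (sum(math.sqrt(a) for a in abs_x) / n) ** 2   # assumed: square root mean
    return {
        "RMS": rms,
        "SRM": srm,
        "CF": peak / rms,       # crest factor (assumed meaning of CF)
        "IF": peak / mean_abs,  # impulse factor (assumed meaning of IF)
        "LF": peak / srm,       # latitude factor, following Eq. (10)
    }
```

For example, the short signal `[0, 0, 0, 4]` has a peak of 4, an RMS of 2, and a mean absolute value of 1, so the sketch returns CF = 2 and IF = 4. All three ratios grow when a signal becomes more impulsive, which is the kind of change these indices are meant to capture.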

Kruskal-Wallis Method
The Kruskal-Wallis method (KWM) is a non-parametric test used to evaluate whether distribution-free samples come from the same distribution [46]. It has been employed as part of physiological signal processing strategies, in particular for respiration estimation using ECG [47] and drowsiness level estimation using EEG [48], among others. In general, the KWM compares the medians of the feature groups under a null hypothesis that assumes that all medians of the data sets are equal [49]. If the probability value (p-value) is smaller than the chosen significance level, typically between 0.01 and 0.05 [49], the null hypothesis is rejected, indicating that the feature can be safely used to differentiate between a selected data set and the remaining data sets. On the contrary, if the p-value is greater than the significance level [49], the null hypothesis cannot be rejected; this means that the feature groups carry similar information and cannot be used to determine the differences between the groups (seizure and healthy). Therefore, the p-value is used to determine the most discriminating STFs to predict a seizure.
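For the two-group case considered here (seizure versus healthy), the test can be sketched in plain Python. The sketch below computes the tie-corrected H statistic and uses the chi-square approximation with one degree of freedom, for which the p-value reduces to a complementary error function. The function name is hypothetical, and the chi-square approximation is known to be rough for very small samples.

```python
import math


def kruskal_wallis_two_groups(a, b):
    """Return (H, p) of the Kruskal-Wallis test for two samples.

    Raises ZeroDivisionError if every pooled value is identical.
    """
    pooled = sorted((v, g) for g, group in enumerate((a, b)) for v in group)
    n = len(pooled)
    rank_sum = [0.0, 0.0]
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sum[g] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * (
        rank_sum[0] ** 2 / len(a) + rank_sum[1] ** 2 / len(b)
    ) - 3 * (n + 1)
    h /= 1.0 - tie_term / (n ** 3 - n)  # correction for tied values
    h = max(h, 0.0)
    # Chi-square survival function with 1 degree of freedom.
    p = math.erfc(math.sqrt(h / 2.0))
    return h, p
```

Applied to a given STF and FB, `a` would hold the feature values of the pre-seizure intervals and `b` those of the HG intervals; a feature would be retained when `p` falls below the chosen significance level.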

Decision Tree Classifier
The decision tree classifier is a simple but effective classifier that offers an easy and flexible implementation because it can be implemented by using a set of if-else rules with a low computational cost [50,51]. Moreover, it can reach a good accuracy [52], assuming that the feature sets do not heavily overlap [53]. In recent years, DTC has been used for the classification of physiological signals for the detection and diagnosis of medical diseases [54][55][56]. Figure 3 shows a graphical illustration of a DTC. The procedure used to develop a DTC is [57]:
1. Rule splitting node selection;
2. Set of the terminal nodes;
3. Assignment of the corresponding class label to the terminal nodes.
The decision in each stage of the tree depends on the previous branching operations. The tree structure (see Figure 3) starts at the root node, performs the test that follows the edge, and repeats the tests until it reaches an end (leaf) node [52]. Once the leaf node is reached, the tree predicts the associated outcome, that is, the class label [55]. This procedure can be summarized as [58]:
1. Design a logical test for each feature that will be used as an input to the DTC;
2. For each logical test, use a subset of the training data to verify that the outcome at the corresponding terminal node is assigned the expected label;
3. Repeat this procedure for all the terminal nodes.
Considering the aforementioned benefits and the easy implementation, a DTC can be an appropriate option to perform the seizure prediction using the selected STFs; if required, a quick adjustment can be performed [59] to ensure the best possible result.
Figure 4 graphically presents the steps required to execute the proposed methodology for predicting an epileptic seizure. In Step 1, the 15 min of the ECG signal immediately prior to the seizure are extracted and divided into 1 min intervals. In addition, 1 min intervals of ECG signals 1 h prior to or after the seizure are extracted and used as HG. Then, in Step 2, each 1 min interval of the ECG signals is decomposed by means of WPT into diverse frequency bands (FBs) or nodes according to the level selected (e.g., if the ECG signals are decomposed until the eighth level, 256 nodes or frequency bands are obtained because each level, L, generates $2^L$ frequency bands). Unlike the DWT, which only decomposes the low frequencies, WPT provides a more detailed analysis of the ECG signals. In Step 3, each node or frequency band obtained by WPT is analyzed using the seventeen STFs to find the features of the decomposed ECG signals with the capability of predicting an epileptic seizure. In Step 4, the STF values estimated for each node are evaluated through KWM, a nonparametric analysis of variance, to determine the most suitable STFs for predicting an epileptic seizure. Finally, in Step 5, the STF values selected in the previous step are used as inputs for designing a DTC for predicting an epileptic seizure automatically. The mathematical concepts used in this work are detailed in the corresponding sections.
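The design of a single logical test for one feature can be illustrated with a decision stump, i.e., a one-node tree. The following Python sketch searches the midpoints between consecutive feature values for the threshold and direction that best separate the two classes on the training data. It is an automated stand-in given only for illustration; the function name and the label coding (1 for pre-seizure, 0 for healthy) are assumptions.

```python
def fit_stump(values, labels):
    """Return (threshold, above_is_seizure, accuracy) of the best single test.

    labels: 1 = pre-seizure, 0 = healthy (assumed coding).
    """
    points = sorted(set(values))
    if len(points) < 2:
        raise ValueError("at least two distinct feature values are required")
    candidates = [(lo + hi) / 2.0 for lo, hi in zip(points, points[1:])]
    best = None
    for threshold in candidates:
        for above_is_seizure in (True, False):
            correct = sum(
                ((v > threshold) == above_is_seizure) == bool(y)
                for v, y in zip(values, labels)
            )
            accuracy = correct / len(values)
            if best is None or accuracy > best[2]:
                best = (threshold, above_is_seizure, accuracy)
    return best
```

When the two groups do not overlap for a given feature, the stump reaches an accuracy of 1.0 on the training data with a threshold placed between the groups, which corresponds to the situation in which a DTC is expected to work well [53].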

Results
Using the steps of the proposed methodology, the HG and the seizure group signals, divided into 1 min intervals, are decomposed by employing the WPT. The decomposition level was varied from 1 to 8, revealing that the fifth level is suitable for analyzing the ECG signals: a higher decomposition level did not improve the classifier accuracy in a relevant manner, whereas a lower decomposition level resulted in a wider bandwidth per node, increasing the possibility of a mixture of frequencies that might lower the accuracy of the classifier. Figures 5 and 6 show the 32 FBs estimated at the fifth level for the HG signal and for the first 1 min interval prior to the seizure occurrence, respectively. Since there are no visual differences between the HG and a patient with epilepsy one minute prior to a seizure, each FB is analyzed with the 17 STFs to identify the features in the decomposed signals that allow the early prediction of a seizure.
Once the STFs were calculated for all FBs and the fifteen 1 min intervals, they were evaluated by means of KWM in order to determine which FB and STF were the most capable of predicting an epileptic seizure. In this regard, FB-9 and FB-13 for the CF property and FB-12 for the IF index present the lowest p-values, indicating that they can be reliable for predicting a seizure. Figure 7 depicts the boxplots for the aforementioned properties, where no evident overlap is observed between the HG and the ECG signal prior to the seizure from the first to the fifteenth 1 min interval. Table 2 presents the p-values obtained for the most discriminative STF indices from the first to the fifteenth 1 min interval prior to the epileptic seizure.
On the other hand, Figure 8 shows the results obtained by analyzing the fifteen 1 min intervals prior to the seizure and the 1 min intervals of HG directly with the STFs, without the WPT. In this case, the STFs are not capable of detecting any difference between both signals, as the values obtained heavily overlap, limiting their use for predicting an epileptic seizure. Hence, these results indicate that the integration of WPT with STFs is useful for the early prediction of a seizure.
Finally, the most useful STFs selected by means of KWM, i.e., CF(FB-9), CF(FB-13), and IF(FB-12), were employed to design a DTC based on if-else rules in order to predict a seizure automatically.
It is important to mention that the proposal could distinguish both groups with high accuracy by integrating any one of the selected STFs with the DTC, since their values did not overlap; however, in order to make the proposed methodology more robust when evaluating new ECG signals or a new database, the three selected STFs, the most discriminant ones, were used at the same time with the DTC. In this regard, three threshold values (one per STF) were determined by visual inspection for designing a DTC based on if-then-else rules for distinguishing between both groups; these values are denoted in Figure 7. Table 3 presents the accuracy of the proposed methodology, minute by minute, during the fifteen minutes before the seizure. From this table, it can be seen that an accuracy of 100% was reached for discriminating between the normal subjects and the ones that can experience an epileptic seizure, fifteen minutes prior to the event. In addition, a specificity and a sensitivity of 100% were reached. Hence, it can be affirmed that this detection window provides enough time to take remedial actions that can mitigate the consequences of the seizure, either by locating a safe place where the person is less likely to suffer physical damage or by taking medication that can diminish the seizure effects.
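A DTC built from three threshold tests, together with the accuracy, sensitivity, and specificity used to evaluate it, can be sketched as follows. Every specific in this Python sketch is an assumption made for illustration: the threshold values are hypothetical (the actual values appear only in Figure 7), the direction of each comparison is assumed, and the three tests are combined here by majority vote, which the text does not specify.

```python
# Hypothetical thresholds, one per selected feature; the real values are
# reported only graphically (Figure 7).
THRESHOLDS = {"CF_FB9": 4.0, "CF_FB13": 4.0, "IF_FB12": 6.0}


def predict_seizure(features, thresholds=THRESHOLDS):
    """Return True (pre-seizure) when at least two of the three tests fire.

    Both the '>' direction and the majority vote are assumptions.
    """
    votes = 0
    for name, threshold in thresholds.items():
        if features[name] > threshold:
            votes += 1
    return votes >= 2


def evaluate(predicted, actual):
    """Accuracy, sensitivity and specificity for boolean label lists."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of pre-seizure intervals found
        "specificity": tn / (tn + fp),  # fraction of healthy intervals kept
    }
```

Here, sensitivity is the fraction of pre-seizure intervals labeled correctly and specificity is the fraction of HG intervals labeled correctly; all three metrics equal 1.0 only when no interval is misclassified.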

Discussion
It should be noted that an accuracy of 100% is obtained, which is an improvement over the results presented in similar works [21][22][23][24][25]; moreover, the proposal achieves a 15 min prediction window, which gives time to take remedial actions that avoid severe harm to the patient. As noted by Vargas-Lopez et al. [28], it is desirable to achieve a methodology capable of obtaining a 100% accuracy, as this indicates that, theoretically, any patient prone to suffer an episode can have a timely alert; in this sense, the proposal achieves this desired scenario. It should also be noted that the proposed methodology uses a low-complexity classifier, the DTC, which requires fewer computational resources than the SVM employed in similar works [21][22][23][24][25], a desirable feature, without compromising accuracy. This allows us to affirm that the indices used can capture the subtle changes that the ECG signal undergoes prior to an epileptic episode without requiring computationally heavy classifiers or a large number of features. Table 4 presents a qualitative comparison between the proposal and other recent methodologies that have employed ECG signals, including a brief description of the methodologies used, the achieved prediction time, the accuracy, and the reported specificity and sensitivity. From this table, it can be observed that most of the previous works reported in the literature require a large number of features to measure the changes between the HG and seizure signals, reaching accuracies from 74.6 to 95.6%.
This fact suggests that the indices employed in those works do not capture the subtle changes that the ECG signal undergoes; in consequence, the methodologies increase their computational burden as well as the complexity of the classifier algorithm, as it has to handle a considerable amount of information [21,22,24,25,60]. On the other hand, when only one feature is used as input to the classifier [23], both the accuracy and the detection time are greatly diminished. Another aspect that the recent methodologies share is the processing of the ECG signal to obtain either the heart rate variability (HRV) or the R-R interval time (RRI). This step further increases the computational cost of those methodologies, limiting their real-time operation. On the contrary, the proposal directly processes the raw ECG signal, reducing the computational cost. In addition, from this table, it can also be seen that STFs are a suitable and effective tool to detect the ECG changes that the autonomic nervous system can produce in an epileptic event. Moreover, this work identified several frequency bands (16-18 Hz, 22-24 Hz, and 24-26 Hz) that can be used to detect the aforementioned changes. These bands are in concordance with the values indicated in the state of the art, which shows that most of the ECG's spectral power is located below 30 Hz [61]. Hence, this indicates that there is a relation between epileptic seizures and the activation of the sympathetic system, which is consistent with the findings reported by [17]. The proposal can be considered a suitable alternative to detect, in a reliable and accurate way, patients that can suffer a seizure.
However, it is necessary to continue exploring the proposed method by (1) using a larger database in order to corroborate that the activation of the sympathetic system when the patient is prone to suffer a seizure allows the seizure to be detected, as noted by [18]; in this way, the viability of the proposal can be ensured and the method can be modified or calibrated according to the new results, and (2) selecting a larger prediction time window (e.g., 40 min prior to the seizure onset), as this would allow the patient to locate a safe place where the seizure-associated movements do not pose a risk to the patient's integrity and their surroundings; moreover, it would allow the patient to arrive at a hospital so that proper remedial actions can be carried out, ensuring a quicker recovery.

Conclusions
Epilepsy is a disease that produces an imbalance in the electrical activity of the brain neurons, which results in abnormal electrical activity (seizures) in the people who suffer from this condition. Since a seizure can generate life-threatening scenarios, an early prediction of epileptic events will allow patients to locate a safe place, preventing falls, drowning, and burns, as well as enabling them to receive adequate medical treatment. The results show an accuracy of 100%, a sensitivity of 100%, and a specificity of 100% in a window of 15 min prior to the seizure, using 150 seizure signal segments (fifteen 1 min intervals for each of the 10 seizures suffered by seven patients) and 10 HG signal segments, outperforming the previous works reported in the literature. It should be noted that a larger database is required to confirm whether the selected FBs and STFs can be used without degrading the accuracy, specificity, and sensitivity of the proposal. In particular, it is necessary to obtain data from young, teenage, and senior patients to verify whether the selected frequency bands still contain the information that, using the STFs, can determine if a patient is about to present a seizure. By confirming this information, the necessary calibration steps can be executed, if required, to ensure the best possible results.