Tonic Cold Pain Detection Using Choi–Williams Time-Frequency Distribution Analysis of EEG Signals: A Feasibility Study

: Detecting pain based on analyzing electroencephalography (EEG) signals can enhance the ability of caregivers to characterize and manage clinical pain. However, the subjective nature of pain and the nonstationarity of EEG signals increase the difﬁculty of pain detection using EEG signals analysis. In this work, we present an EEG-based pain detection approach that analyzes the EEG signals using a quadratic time-frequency distribution, namely the Choi–Williams distribution (CWD). The use of the CWD enables construction of a time-frequency representation (TFR) of the EEG signals to characterize the time-varying spectral components of the EEG signals. The TFR of the EEG signals is analyzed to extract 12 time-frequency features for pain detection. These features are used to train a support vector machine classiﬁer to distinguish between EEG signals that are associated with the no-pain and pain classes. To evaluate the performance of our proposed approach, we have recorded EEG signals for 24 healthy subjects under tonic cold pain stimulus. Moreover, we have developed two performance evaluation procedures—channel-and feature-based evaluation procedures—to study the effect of the utilized EEG channels and time-frequency features on the accuracy of pain detection. The experimental results show that our proposed approach achieved an average classiﬁcation accuracy of 89.24% in distinguishing between the no-pain and pain classes. In addition, the classiﬁcation performance achieved using our proposed approach outperforms the classiﬁcation results reported in several existing EEG-based pain detection approaches.


Introduction
Pain is an unpleasant sensation that is caused by illness or physical injury. Nowadays, caregivers in the medical domain are using several methods to assess clinical pain, such as the visual analog scale (VAS), numerical rating scale (NRS), and verbal rating scale (VRS) [1]. These clinical pain assessment methods require the caregiver to perform an interview with the patient to determine the level of pain. Therefore, such pain assessment methods are considered subjective as the pain tolerance level can vary significantly among different patients. Moreover, these methods are not suitable for patients who are unable to verbally express pain, such as infants [2]. This justifies the importance of developing objective pain assessment methods that can detect pain by analyzing the brain signals of the patients. In fact, objective pain detection methods can provide caregivers with a second opinion that can enhance the assessment and management of clinical pain.
In order to study the pain perception mechanism, researchers have recently worked on analyzing brain activities that are recorded for healthy subjects under stimulated pain conditions. Among the different types of pain stimuli, tonic pain stimuli are widely used due to their high induced effects compared to the other types of pain stimuli. In particular, tonic pain can cause impairments in the performance of visual tasks and memory [3,4]. Moreover, various studies have indicated that tonic pain can simulate clinical pain better than other types of pain stimuli [8].
To record brain activities that are generated in response to an exogenous pain stimulus, researchers have employed several neuroimaging modalities, such as the electrocorticographic (ECoG), functional magnetic resonance imaging (fMRI), functional near-infrared spectroscopy (fNIRS), and electroencephalography (EEG). The EEG modality, which records the electrical potentials produced at different locations in the brain in response to an exogenous pain stimulus, is considered the most commonly used modality for designing objective pain detection systems. This can be attributed to several properties that are associated with the EEG modality, including the high temporal resolution, low-cost, and noninvasive nature [5,6]. Nonetheless, pain detection based on EEG signals analysis is considered challenging due to the nonstationarity nature of the EEG signals, low spatial resolution, and low signal-to-noise (SNR) ratio [7].
Over the past decade, several approaches have been proposed to detect stimulated pain based on EEG signals analysis. For example, Nir et al. [8] studied the relation between the properties of the peak alpha frequency (PAF) of the EEG signals and the subjective perception of tonic heat pain. The employed EEG signals were recorded for 18 healthy subjects during the application of tonic heat pain stimulus. The results indicate a correlation between the PAF and the subjective perception of tonic pain, which suggests the possibility of utilizing the properties of the PAF to characterize pain. Panavaranan and Wongsawat [9] proposed an approach to detect tonic heat pain using power spectral density (PSD)-based features that were extracted from the EEG signals. The extracted features were used to construct a support vector machine (SVM) classifier to distinguish between EEG signals that correspond to the no-pain and pain classes, respectively. The performance of the proposed system was evaluated using EEG signals that were acquired from nine healthy subjects during the application of tonic heat pain stimulus. Shao et al. [10] performed a frequency-domain EEG source analysis to study brain responses to tonic cold pain stimulus. In particular, the utilized EEG signals were recorded for 26 healthy subjects under two conditions-absence or presence of a tonic cold pain stimulus. The results show that the variations in the power associated with different frequency bands of the EEG signals can be used to analyze the perception mechanism of tonic cold pain. In another study, Vatankhah and Toliyat [11] proposed an approach that employs the wavelet coherency to estimate the tonic cold pain level. In particular, first-and second-order statistical features were extracted from the wavelet coherency of the EEG signals and used to construct two classifiers-an SVM classifier with radial basis function (RBF) kernel and a hidden Markov model (HMM) classifier. The results indicate that the classification performance obtained using the SVM classifier outperform the classification results obtained using the HMM classifier.
The previously described studies were focused on analyzing the EEG signals in the frequencydomain without considering the nonstationarity nature of these signals. The nonstationarity nature of the EEG signals imposes the requirement of analyzing these signals in the joint time-frequency domain to capture the spectral variations of the EEG signals over time. In this regard, few research groups have recently investigated the use of time-frequency analysis methods, such as wavelet transform [12,13], to characterize the EEG signals and detect tonic pain. For example, Hadjileontiadis [14] utilized the continuous Wavelet transform (CWT) to analyze the EEG signals and detect tonic cold pain. In particular, the CWT was employed to construct a time-frequency representation of the EEG signals. Then, a set of higher-order spectral features were extracted and used to train and evaluate the performance of four different classifiers, namely the quadratic discriminant analysis (QDA), Mahalanobis (MAH), k-nearest neighbor (k-NN), and SVM with RBF kernel classifiers. The performance of these classifiers was evaluated using EEG signals acquired from 17 healthy subjects while holding a plastic bottle with iced water until the pain became unbearable. The experimental results indicate that the best classification accuracy was achieved using the MAH classifier. In another study, Alazrai et al. [15] proposed a tonic cold pain detection approach that utilizes the discrete Wavelet transform (DWT) to analyze the EEG signals and detect pain. This approach was evaluated using EEG signals that were acquired from 24 healthy subjects while submerging their hands in iced water to stimulate tonic cold pain. The DWT was applied to the recorded EEG signals, and a set of features were extracted from the computed DWT coefficients. The extracted features were used to train a SVM classifier with RBF kernel to distinguish between the EEG signals that correspond to the no-pain and pain states. Despite the promising results achieved using the Wavelet transform analysis of the EEG signals, including CWT and DWT, the time-frequency representations obtained using the Wavelet transform are characterized by a nonuniform resolution across the time-frequency plane (TFP) [16], which can increase the difficulty to interpret the obtained time-frequency representation of the EEG signals [16,17].
To address the limitations of the previously described EEG-based pain detection approaches, we propose an EEG-based pain detection approach that employs a quadratic time-frequency distribution, namely the Choi-Williams distribution (CWD), to analyze the EEG signals. Compared to other time-frequency analyses, such as the Wavelet transform, the QTFDs are considered invariant to time and frequency shifts [16,18,19]. Therefore, the use of the CWD enables construction of a time-frequency representation (TFR) of the EEG signals that can capture the time-varying spectral characteristics of the EEG signals. To reduce the dimensionality of the TFRs constructed for the EEG signals, we have extracted 12 time-frequency features from the constructed TFR of the EEG signals. These extracted time-frequency features are used to train a support vector machine (SVM) classifier to distinguish between EEG signals that are associated with pain and no-pain states. To evaluate the performance of the proposed approach, we have recorded the EEG signals for 24 healthy subjects who volunteered to participate in a tonic cold pain stimulation experiment. Moreover, we have developed two performance evaluation procedures-channel-and feature-based analyses-to evaluate the effect of the selected EEG channels and time-frequency features on the capability of detecting pain. The results reported in the current study demonstrate the capability of our proposed approach to detect tonic cold pain using EEG signals.
The main objective of the current study is to contribute to ongoing research efforts developing objective pain detection systems based on time-frequency analysis of the EEG-based pain detection. In particular, we have introduced an EEG-based system that employs a set of time-frequency features that are extracted from a QTFD-based representation to discriminate between the no-pain and pain classes. In fact, we hypothesize that analyzing the EEG signals using the CWD, which is considered as a QTFD, can improve the accuracy of pain detection compared with other previously used frequency-domain and time-frequency domain analyses, such as the PSD-based analysis and the wavelet transform (WT) analysis. The contribution of this study is three-fold: (1) To the best of our knowledge, this is the first study that employs the CWD to analyze EEG signals for detecting pain. Therefore, our work can be viewed as a feasibility study that aims to explore the potential of developing objective clinical pain detection systems based on time-frequency analysis of the EEG signals. (2) A novel set of time-frequency features are extracted from the constructed CWD-based TFR of the EEG signals to model the no-pain and pain classes encapsulated within the EEG signals, (3) Extensive feature and channel selection analyses are performed to quantify the effect of excluding any of the extracted time-frequency features and any of the utilized EEG channels on the accuracy of pain detection.
The remainder of the paper is structured as follows. In Section 2, we describe the experimental protocol, the recorded EEG signals, the CWD-based time-frequency analysis, the extraction and classification of the features, and the performance evaluation procedures. In Section 3, we present and discuss the results. In Section 4, we conclude our findings and provide future directions.

Subjects
Twenty four healthy subjects (12 females and 12 males, with a mean ± standard deviation age of 22.5 ± 3.2 years) volunteered to participate in this study. The subjects were not feeling any pain before participating in the experiment and not taking any medication related to neurological or psychiatric disorders. Moreover, each subject received a detailed explanation of the experimental protocol, which was approved by the Research Ethics Committee at the German Jordanian University, and signed a consent form before participating in the experiment.

Experimental Protocol
In this study, each subject was asked to sit on a comfortable chair at a distance of approximately 50 cm from a screen displaying a black image. The subjects were instructed to keep their eyes open during the experiments. At the beginning of each EEG recording trial, the subject was asked to relax his/her right hand on a table for 15 s. Then, the subject was instructed to submerge his/her right hand in a bucket of iced water with a temperature of approximately 1.5 ± 2.5 • C until he/she starts to have a moderately uncomfortable feeling. The subject marks this moment by clicking the mouse button of the recording computer using his/her left hand. This mouse click indicates the end of the no-pain state and the beginning of the pain state. During the pain state, the subject keeps his/her hand submerged in the bucket of the iced water until the pain becomes intolerable. At this moment, the subject withdraws his/her right hand from the iced water and relaxes it on the table. The pain state ends when the subject feels that the pain has reduced to a moderate level by clicking the mouse button again to indicate the end of the trial and EEG recording. After each trial, the subject was asked to submerge his/her right hand in a bucket of warm water that had a temperature of 40 ± 1 • C and countdown for 3 s starting from a random 4-digit number to ensure that each trial begins with the no-pain state [20]. The total number of trails recorded for each subject is five trials. Moreover, the average ± standard deviation duration of each trial computed over all subjects is 65.8 ± 19.1 s. Figure 1 shows the timing diagram of the experimental procedure during each trial.

Data Acquisition and Preprocessing
The EEG signals were recorded using the Emotiv EPOC+ EEG neuroheadset system (https://www.emotiv.com). The Emotiv system consists of 14 EEG electrodes that are arranged on the scalp according to the 10-20 international electrode placement system at the following locations; F 7 , T 7 , AF 3 , O 1 , F 4 , P 7 , AF 4 , F 8 , F 3 , P 8 , O 2 , FC 5 , T 8 , and FC 6 . These electrodes are referenced to the common mode sense (CMS)/ driven right leg (DRL) at the P 3 and P 4 locations. Figure 2 shows the positions of the EEG electrodes on the scalp. The acquired EEG signals are internally sampled at a frequency of 2048 Hz. Moreover, the Emotiv system is configured to filter the recorded EEG signals using a hardware band-pass filter with a bandwidth of 0.2 to 43 Hz and downsamples the EEG signals to 128 samples per second. Finally, the muscle and electrooculography artifacts within the EEG signals are reduced using the automatic artifact removal (AAR) toolbox [21,22].

Time-Frequency Analysis of the EEG Signals
The EEG signals are nonstationary signals due to their time-varying spectral components [6,17,23]. This implies that analyzing the EEG signals in the time-domain or the frequency-domain may not capture the variations in the spectral characteristics of the EEG signals over time [24]. In this regard, researchers have utilized several time-frequency analysis techniques to construct TFRs of the nonstationary signals, such as the wavelet transform (WT) [13]. Despite the promising results obtained based on analyzing the EEG signals using the WT [14,25], the TFRs constructed using the WT method have a nonuniform resolution across the time-frequency plane and are considered variant to frequency-shifts [16]. Having that said, in this study, we propose the use of a QTFD, namely the Choi-Willimas Distribution (CWD) [26], to analyze the EEG signals. The CWD maps the time-domain EEG signals into a high-resolution time-frequency domain that quantifies the distribution of the energy encapsulated in the EEG signals over the time and frequency domains [16,17,26].
To compute the CWD, we utilized a sliding window approach to divide the EEG signal of each electrode into a set of overlapping segments. Specifically, the number of samples within each segment is equal to 128 samples and the overlap between consecutive segments is equal to 64 samples. Then, for each EEG segment, x(t), the CWD of x(t), denoted by Γ x (t, f ), can be computed as follows [16,17].
where t and f represent the indices of the time and frequency samples, respectively, convolution operation computed over time and frequency domains, and Φ(t, f ) represents a time-frequency smoothing kernel. Specifically, the WVD of x(t), W x (t, f ), can be computed as follows [17].
where a(t) = x(t) + jH{x(t)} is the analytic signal associated with x(t), H is the Hilbert transform [27], and a * (·) is the complex conjugate of a(·). Moreover, the time-frequency smoothing kernel, Φ(t, f ), can be expressed as follows [26].
where α > 0 is a smoothing parameter that determines the amount of suppression applied to the cross-terms introduced by the WVD and its value was experimentally tuned to be 0.7. In this work, we have utilized the HOSA toolbox [28] to compute the CWD of the EEG segments. The dimensionality of the CWD-based time-frequency representation (TFR) computed for each EEG segment, which is obtained using Equation (1), is equal to T × F, where T = 128 represents the number of time samples within an EEG segment and F = 256 represents the number of frequency samples. Figure 3 provides sample images of the CWD-based TFR constructed for two different EEG segments. In particular, Figure 3a shows an EEG segment that belongs to the no-pain class. The CWD-based TFR constructed for the EEG segment presented in Figure 3a is shown in Figure 3b. Similarly, Figure 3c shows an EEG segment that belongs to the pain class. Figure 3d shows the CWD-based TFR constructed for the EEG segment presented in Figure 3c.

Feature Extraction
To reduce the dimensionality of the constructed CWD-based TFR, we propose to extract a set of twelve time-frequency features, denoted as TF 1 -TF 12 , which quantify the CWD-based TFR computed for each EEG segment. Table 1 describes the twelve time-frequency features extracted from the CWD-based TFR of each EEG segment. The first eight features, TF 1 -TF 8 , are computed by extending eight time-domain features to the joint time-frequency domain. These eight features are the mean (TF 1 ), variance (TF 2 ), skewness (TF 3 ), kurtosis (TF 4 ), sum of the logarithmic amplitudes (TF 5 ), median absolute deviation (TF 6 ), root mean square value (TF 7 ), and interquartile range (TF 8 ) values computed for the constructed CWD-based TFR of each EEG segment. The aforementioned eight time-frequency features comprise higher-order statistical features, such as TF 3 and TF 4 , which characterize the distribution of the energy encapsulated in an EEG segment. The last four features, TF 9 -TF 12 , are obtained by extending four frequency-domain features to the joint time-frequency domain. These four features are the flatness (TF 9 ), flux (TF 10 ), normalized Renyi entropy (TF 11 ), and energy concentration (TF 12 ) values computed for the constructed CWD-based TFR of each EEG segment. The last four features in Table 1 quantify the time-variant spectral information within each EEG segment. Detailed description of the construction procedure and the physical interpretation of the extracted time-frequency features can be found in our previous work [6,7,23]. Table 1. The time-frequency features extracted from the CWD-based TFR of each EEG segment [6,7,23].

Time-Frequency Feature Mathematical Expression of the Time-Frequency Feature
The mean of the CWD (TF 1 ) The variance of the CWD (TF 2 ) The skewness of the CWD (TF 3 ) The kurtosis of the CWD (TF 4 ) Sum of the logarithmic amplitudes of the CWD (TF 5 ) Median absolute deviation of the CWD (TF 6 ) Root mean square value of the CWD (TF 7 ) interquartile range of the CWD (TF 8 ) The flatness of the CWD (TF 9 ) The flux of the CWD (TF 10 ) The normalized Renyi entropy of the CWD (TF 11 ) The energy concentration of the CWD (TF 12 ) The time-frequency features extracted from the CWD-based TFRs of the EEG segments associated with a given temporal position of the sliding window are grouped to form a feature vector. In particular, the number of features comprised within each feature vector is equal to 14 × 12, where 14 represents the number of EEG segments at each window position, which is equal to the number of EEG electrodes, and 12 represents the number of time-frequency features extracted from the CWD-based TFR computed for each EEG segment at a particular window position.

Classification
For each trial, the feature vectors extracted from the EEG segments associated with the window positions that are located before the first mouse click are labeled as no-pain class, while the feature vectors extracted from the EEG segments associated with the window positions that are located after the first mouse click are labeled as pain class. These labeled feature vectors are used to construct a binary support vector machine (SVM) classifier with a radial basis function (RBF) kernel [29] to classify the feature vectors into one of two pain-related classes: no-pain and pain. Specifically, for each subject, we utilize a ten-fold cross-validation procedure to train and test the constructed binary SVM classifier. The ten-fold cross-validation procedure is repeated for ten times and the average classification performance for each subject is computed over the ten repetitions. Moreover, a grid search is performed to tune the regularization parameter, C, and the RBF kernel parameter, γ, of the SVM classifier. In particular, we select the values of C and γ that maximize the classification accuracy [30].

Performance Evaluation
In order to evaluate the effect of the utilized EEG channels and CWD-based time-frequency features on the classification performance of our proposed pain detection approach, we have developed two performance evaluation procedures: the channel-and feature-based evaluation procedures. The channel-based evaluation procedure investigates the effect of selecting a subset of the EEG channels on the accuracy of detecting pain, while the feature-based evaluation procedure investigates the effect of selecting a subset of the extracted CWD-based time-frequency features on the accuracy of detecting pain. In particular, for the channel-based evaluation procedure, we start by computing the classification performance of our proposed approach using all the EEG channels and all the CWD-based time-frequency features. After that, we apply an iterative channel reduction procedure to specify the minimum number of EEG channels required to accurately detect pain. The number of iterations of the channel reduction procedure is equal to the number of EEG channels considered in this study, which is equal to 14 channels. For each iteration, the average classification performance is computed over all subjects after removing one of the EEG channels. Then, the EEG channel that produced the highest average classification performance after its elimination is removed from subsequent iterations. Similarly, for the feature-based evaluation procedure, we applied an iterative feature reduction procedure to specify the minimum number of time-frequency features required to accurately detect pain. The number of iterations of the feature reduction procedure is equal to the number of time-frequency features considered in this study, which is equal to 12. For each iteration, the average classification performance is computed over all subjects after removing one of the CWD-based time-frequency features. Then, the time-frequency feature that produced the highest average classification performance after its elimination is removed from subsequent iterations. It is worth noting that the iterative reduction procedure, which is employed in the current channeland feature-based performance evaluation analyses, has been successfully employed in several previous studies [31,32] as an alternative approach to the computationally-intensive exhaustive search procedure [33,34].
In addition, we compare the classification performance obtained using the CWD-based time-frequency features presented in the current study with the classification performances obtained using the features presented in the works of the authors of [8,9]. In particular, Nir et al. [8] utilized the peak alpha frequency, which represents the frequency value associated with the highest power spectral density (PSD) within the alpha frequency band, as a feature to characterize the tonic pain in the EEG signals. To compare the classification performance obtained using our proposed CWD-based time-frequency features with the classification performance obtained using the feature presented in the study by Nir et al. [8], we applied the sliding window approach described in Section 2.4 to the EEG signals employed in the current study. After that, we computed the peak alpha frequency feature for each EEG segment. The features extracted from the EEG segments associated with a particular window position are grouped to form a feature vector. Then, for each subject, we train a SVM classifier to classify the feature vectors into one of two pain-related classes-the no-pain class and pain class-using the cross-validation procedure described in Section 2.6. Similarly, Panavaranan and Wongsawat [9] utilized two features-the normalized PSD of the alpha and beta frequency bands-to characterize the tonic pain in the EEG signals. To compare the classification performance obtained using our proposed CWD-based time-frequency features with the classification performance obtained using the features presented in the study by Panavaranan and Wongsawat [9], we applied the sliding window approach described in Section 2.4 to the EEG signals employed in the current study. After that, we computed the normalized PSD of the alpha and beta frequency bands for each EEG segment. The features extracted from the EEG segments associated with a particular window position are grouped to form a feature vector. Then, for each subject, we train a SVM classifier to classify the feature vectors into one of two pain-related classes, namely the no-pain class and pain class, using the cross-validation procedure described in Section 2.6.
In this study, we compute the classification performance for each subject in terms of two standard evaluation metrics, namely the classification accuracy and the F 1 −measure. These metrics are computed as follows [7,35].
where tp, tn, f p, and f n are the number of true positive, true negative, false positive, and false negative cases, respectively. Moreover, PRE = tp/(tp + f p) and REC = tp/(tp + f n) represent the precision and recall, respectively.

Results Obtained Using All the EEG Channels
In this section, we present the results obtained for our proposed approach using all the EEG channels and all the CWD-based time-frequency features. The average ± standard deviations classification accuracies obtained for each one of the 24 subjects (S 1 -S 24 ) are shown in Figure 4. In particular, the average classification accuracy values computed for the 24 subjects is between 73.8%, which was obtained for S 22 , and 93.3%, which was obtained for S 4 . Moreover, the average classification accuracy value obtained for each subject is significantly higher than the random classification rate, which is defined as the reciprocal of the number of classes, and equal to 50% (the red dashed line in Figure 4). The average ± standard deviation classification accuracy value achieved in distinguishing between the no-pain and pain classes, which is computed over all 24 subjects, is equal to 84.6% ± 4.3%. The average ± standard deviations of F 1 − measure values computed for the no-pain and pain classes per each subject are provided in Figure 5. In particular, the average ± standard deviation F 1 − measure values computed for the no-pain and pain classes over all subjects are 81.7% ± 5.9% and 85.1% ± 4.2%, respectively.

Results of the Channel Reduction Procedure
In this section, we present the results obtained for our proposed approach using the channel reduction procedure described in Section 2.7. Table 2 shows the average classification accuracies computed over all subjects for each of the 14 iterations of the channel reduction procedure. Specifically, at the first iteration, eliminating the channel FC 6 provides an average classification accuracy of 84.33%, which is higher than the average classification accuracies obtained when any of the other 13 EEG channels is eliminated. In the second iteration, the additional elimination of the channel T 8 provides an average classification accuracy of 83.76%, which is higher than the average classification accuracies obtained when any of the remaining 12 EEG channels is eliminated. Moreover, in the subsequent iterations of the channel reduction procedure, the average classification accuracy is drastically decreasing after each iteration. Finally, at the last iteration, an average classification accuracy of 65.9% is achieved using the channel F 7 . Table 2. The mean values of the average classification accuracies computed for each iteration of the channel reduction procedure over all subjects.

Iterations of the Channel Reduction Procedure
Eliminated EEG Channel Average classification accuracy(%)  Figure 4 indicates that the average classification accuracy obtained using all the EEG channels, which is equal to 84.6%, is higher than the average classification accuracies obtained after eliminating any of the EEG channels, where these accuracies are reported in Table 2. This can be attributed to several factors, including (1) the volume conductor effect [36], which describes the fact that the electrical potentials generated at a particular small region of the brain are spatially propagated to different regions of the brain. Consequently, these electrical potentials are recorded by the sparsely distributed electrodes on the scalp [6,23,36,37]. (2) The pain perception mechanism involves the activation of several regions of the brain, such as the frontal, prefrontal, temporal, parietal, and occipital regions [14,[38][39][40]. Therefore, eliminating any EEG channel implies that the brain activities produced within the region covered by the eliminated channel might not be captured during the analysis. This can justify the observed decrease in the classification accuracies obtained during the channel reduction evaluation procedure. In light of the results presented in this section, we have utilized all the EEG channels to perform the feature-based evaluation procedure, which is presented in Section 3.2, and the performance comparison with the other previous approaches, which is described in Section 3.3.

Results of the Feature-Based Evaluation Procedure
In this section, we present the results obtained for our proposed approach using the feature reduction procedure described in Section 2.7. Table 3 shows the average classification accuracies computed over all subjects for each of the 12 iterations of the feature reduction procedure.
Specifically, at the first iteration, eliminating feature TF 8 provides an average classification accuracy of 86.63%, which is higher than the average classification accuracies obtained when any of the other 11 time-frequency features is eliminated. In the second iteration, the additional elimination of the feature TF 3 provides an average classification accuracy of 87.68, which is higher than the average classification accuracies obtained when eliminating any of the remaining 10 time-frequency features. Moreover, in the subsequent iterations of the feature reduction procedure, the average classification accuracy keeps increasing until the 9th iteration, at which the average classification accuracy achieved using the time-frequency features TF 7 , TF 9 , and TF 12 is 89.24%. In the consecutive iterations, the average classification accuracy starts to decrease after each iteration. Finally, at the last iteration, an average classification accuracy of 87.06% is achieved using the time-frequency feature TF 7 .  Table 3 shows that the average classification accuracy obtained using only the TF 7 , TF 9 , and TF 12 features, which is equal to 89.24%, is higher that the average classification accuracy obtained using all the CWD-based time-frequency features, which is equal to 84.6%, as shown in Figure 4. In addition, the results presented in Table 3 suggest that the best classification accuracy can be obtained by eliminating all the features except TF 7 , TF 9 , and TF 12 . This can be attributed to two factors: (1) The physical interpretation of the quantities measured by each of these three features. In particular, the feature TF 7 , which is the root mean square of the CWD, measures the strength of the signal energy in the time-frequency domain [6,7,17,23]. The feature TF 9 , which is the time-frequency flatness of the CWD, measures the rate of change of the signal energy in the time-frequency domain [6,7,17,23]. Finally, the feature TF 12 , which is the time-frequency energy of the CWD, measures the concentration of the signal energy in the time-frequency domain [6,7,17,23]. In fact, further analysis is needed to explore the correlation between the energy characteristics of the EEG signals and the capability of detecting pain. (2) Recently, researchers have shown that during pain there are several neural oscillations that are associated with different frequency bands and are generated across various regions of the brain [10,41]. Therefore, using the computed values of the aforementioned three features enables capturing the time-varying spectral oscillations encapsulated within EEG segments, where such time-varying spectral oscillations can contribute to the task of differentiating between the no-pain and pain classes.
It is worth noting that the use of the time-frequency features TF 9 and TF 12 in different EEG-based classification problems, such as decoding human emotions [23] and motor imagery tasks [6], has yielded a higher classification accuracies compared with the other time-frequency features. This implies the capability of these time-frequency features to capture the nonstationary and nonlinear characteristics of the EEG signals. This in turn suggests the possibility of employing these features in the design process of several EEG-based classification problems [23]. Figure 6 shows the classification accuracy values and the corresponding standard deviations obtained for each subject using the features TF 7 , TF 9 , and TF 12 . In particular, the average classification accuracy values computed for the 24 subjects are between 81.9%, which was obtained for S 22 , and 96.5%, which was obtained for S 13 . Moreover, the average classification accuracy value obtained for each subject is significantly higher than the random classification rate, which is defined as the reciprocal of the number of classes, and equal to 50% (the red dashed line in Figure 6). The average ± standard deviation classification accuracy value achieved in distinguishing between the no-pain and pain classes using the features TF 7 , TF 9 , and TF 12 , which is computed over all 24 subjects, is equal to 89.2% ± 3.2%. The average ± standard deviations of F 1 − measure values computed for the no-pain and pain classes per each subject using the features TF 7 , TF 9 , and TF 12 are provided in Figure 7. In particular, the average ± standard deviation of F 1 − measure values computed for the no-pain and pain classes over all subjects are 87.4% ± 4.1% and 89.5% ± 3.3%, respectively. The results presented in Figures 6 and 7, which are obtained using the features TF 7 , TF 9 , and TF 12 , outperform the results obtained using all the EEG channels and all the features, which are shown in Figures 4 and 5. Figure 6. The average classification accuracy values computed for each subject over the ten train-test repetitions of the employed cross-validation procedure using the features TF 7 , TF 9 , and TF 12 . The black vertical lines represent the standard deviations of the classification accuracy and the red dashed line represents the random classification rate. The last column represents the mean value of the average classification accuracies computed over the 24 subjects.

Comparison with Other Approaches
In this section, we compare the classification results obtained using our proposed CWD-based time-frequency features with the classification results obtained using the features presented in the studies [8] and [9], respectively, as described in Section 2.7. Table 4 provides the average ± standard deviations classification accuracies computed over all 24 subjects for the methods presented in the studies [8] and [9]. Specifically, using the feature presented by Nir et al. [8], the average ± standard deviation classification accuracy value achieved in distinguishing between the no-pain and pain classes, which is computed over all 24 subjects, is equal to 81.1% ± 5.2%. Similarly, using the features presented by Panavaranan and Wongsawat [9], the average ± standard deviation classification accuracy value achieved in distinguishing between the no-pain and pain classes, which is computed over all 24 subjects, is equal to 75.6% ± 8.7%. The results presented in Table 4 indicate that the classification accuracy obtained using our proposed CWD-based time-frequency features outperforms the classification accuracies obtained using the features described by the authors of [8,9]. This suggests the feasibility of employing our proposed CWD-based time-frequency features to successfully distinguish between the EEG segments associated with the no-pain and pain classes.

Limitations and Future Work
In spite of the promising results achieved using our proposed CWD-based approach in distinguishing between the no-pain and pain classes, we are planning to investigate several research directions that can enhance the accuracy and applicability of our proposed pain detection approach. Specifically, we are intending to explore four future research directions: (1) Recently, few researchers have successfully utilized deep learning techniques to analyze the time-frequency maps constructed for the EEG signals [42,43]. In this regard, we intend to investigate the possibility of employing deep learning techniques, such as convolutional neural networks (CNN), to extract data-driven deep features from the constructed CWD-based TFR of the EEG signals that can enhance the accuracy of pain detection. (2) The subjective nature of pain, which introduces a significant intra-and intersubject variations in the responses of subjects to external painful stimuli, along with the nonstationary nature of the EEG signals increase the necessity to acquire a large-scale and balanced database that can be used to design a subject-independent pain detection system. In addition, the results obtained based on physiological signals, such as the EEG signals, that are acquired under well-controlled lab conditions may overestimate the results that can be achieved in real-world applications, in which minimally controlled acquisition conditions are maintained. In light of this, we plan to record a large-scale database under realistic recording conditions to assess the capability of our proposed CWD-based pain detection approach to support real-world applications. (3) We also plan to extend our work by utilizing the extracted time-frequency features to estimate the pain level that a patient is feeling. Such an extension can enhance the ability of the caregivers to characterize and mange clinical pain for patients who are not capable to verbally express their feelings, such as infants. (4) We plan to extend the analyses performed in the current study from subject-specific analyses to subject-independent analyses. The subject-independent analyses can measure the ability of our proposed CWD-based pain detection approach to detect pain for new subjects that were not included in the training set, which can be useful for several real-world applications.

Conclusions
In this paper, we have proposed an EEG-based approach for detecting tonic cold pain. The proposed approach utilizes a set of time-frequency features that are extracted from the constructed CWD-based TFR of the EEG signals to train a SVM classifier that can accurately distinguish between the no-pain and pain states. The results presented in this work indicate that the EEG signals comprise rich information that can be used to understand the pain perception mechanism. In addition, the results suggest the feasibility of using the CWD to analyze the EEG signals and extract quantitative time-frequency features that are capable of distinguishing between different finger movements.