Correntropy-Based Pulse Rate Variability Analysis in Children with Sleep Disordered Breathing

Pulse rate variability (PRV), an alternative measure of heart rate variability (HRV), is altered during obstructive sleep apnea. Correntropy spectral density (CSD) is a novel spectral analysis that includes nonlinear information. We recruited 160 children and recorded SpO2 and photoplethysmography (PPG), alongside standard polysomnography. PPG signals were divided into 1-min epochs and apnea/hypoapnea (A/H) epochs labeled. CSD was applied to the pulse-to-pulse interval time series (PPIs) and five features extracted: the total spectral power (TP: 0.01–0.6 Hz), the power in the very low frequency band (VLF: 0.01–0.04 Hz), the normalized power in the low and high frequency bands (LFn: 0.04–0.15 Hz, HFn: 0.15–0.6 Hz), and the LF/HF ratio. Nonlinearity was assessed with the surrogate data technique. Multivariate logistic regression models were developed for CSD and power spectral density (PSD) analysis to detect epochs with A/H events. The CSD-based features and model identified epochs with and without A/H events more accurately relative to PSD-based analysis (area under the curve (AUC) 0.72 vs. 0.67) due to the nonlinearity of the data. In conclusion, CSD-based PRV analysis provided enhanced performance in detecting A/H epochs, however, a combination with overnight SpO2 analysis is suggested for optimal results.


Introduction
Obstructive sleep apnea (OSA), the most common form of sleep-disordered breathing, is characterized by the obstruction of the upper airway, which disrupts normal breathing during sleep.In children, the prevalence of OSA ranges from 0.1% to 13% [1]; and can result in severe complications if untreated, such as: heart failure, developmental delay, and growth and behavioural problems [2].The gold standard for OSA diagnosis, polysomnography, is resource intensive and not widely available.Thus, overnight oximetry has been studied as a potential standalone method to diagnose OSA.In a previous study, we showed that characterization of oxygen saturation recorded by the Phone Oximeter (a smartphone-based pulse oximeter) could identify children with OSA [3].However, we observed some sleep apnea events that occurred in the absence of oxygen desaturation.
The occurrence of breathing cessation or apnea events during sleep is known to impact the autonomic regulation of heart rate by enhancing sympathetic activity and reducing parasympathetic activity [4].Thus, analysis of heart rate variability (HRV) has been investigated as a possible predictor of OSA [5][6][7].Studies have shown that OSA affects the normal variation of heart rate and that both HRV and pulse rate variability (PRV), a surrogate measure of HRV estimated from the photoplethysmogram (PPG) [8], may assist with the identification of OSA events [9,10].
The most widely used methods to assess cardiac autonomic regulation are conventional statistical methods and power spectral density (PSD) analysis of HRV/PRV.HRV analysis is based on the inter-beat interval (RRI) variation calculated from Electrocardiography (ECG), whereas PRV analysis is based on the pulse-to-pulse interval (PPI) variation calculated from PPG [10].Some studies have reported a chaotic heart beat behaviour [11,12] suggesting that measures describing nonlinear dynamics of heart rate, such as fractal measures, may reveal prognostic information beyond that obtained by conventional measures [6,10,13].Several studies have proposed the use of entropy methods, i.e., sample entropy [14] and multiscale entropy [6,15], as measures of complexity to differentiate between the HRV pattern of normal and OSA subjects.However, very few studies have analyzed the heart's nonlinear behavior in the spectral domain [6].
Correntropy is a generalized correlation function that provides information on higher-order statistics [16].It is a similarity measure that is able to detect nonlinearities that conventional techniques, based on second-order statistics (i.e., correlation or PSD), may be unable to detect [17].Similar to the conventional correlation function, the correntropy is a positive definite function and therefore lends itself to analysis in the spectral domain through the correntropy spectral density (CSD), a generalization of the conventional PSD [17].
In this study, we proposed the use of CSD, which includes nonlinear information, to improve the identification of apnea/hypoapnea (A/H) events in children for the analysis of the cardiac autonomic regulation in the spectral domain.We investigated the effect of OSA on sympathetic and parasympathetic activity, applying CSD to the pulse-to-pulse intervals of the PPG signal recorded with the Phone Oximeter.In addition, to support the use of nonlinear techniques such as CSD, we performed a nonlinearity test, using the surrogate data technique and spectral CSD-based analysis.

Dataset
Following the approval of the University of British Columbia and Children's and Women's Health Centre of British Columbia Research Ethics Board (H11-01769), we recruited 160 children with signs of OSA, who had been referred to the British Columbia Children's Hospital (BCCH) for a polysomnography.In total, 146 children (59 female and 87 male), 9.1 ± 4.3 years old were analyzed, as 14 children were excluded from the dataset due to inadequate length of sleep (less than 3 h).
Data acquisition was performed at BCCH, in a dedicated facility where standard polysomnographies are performed.The polysomnography consisted of the overnight recording of electrocardiography, electroencephalography, chest movement, nasal airflow, pulse oximetry, and video/audio data using the Embla Sandman S4500s (Middleton, WI, USA).In addition, a second pulse oximeter sensor, commercially available from Masimo, was applied to the finger adjacent to the one used during the standard polysomnography.This sensor was attached to the Phone Oximeter, and SpO 2 and PPG signals were recorded alongside polysomnography.
A sleep technician scored the polysomnography recordings, based on the American Academy of Sleep Medicine (AASM) criteria, identifying all A/H events.As defined by AASM, in children, a sleep apnea event occurs when the respiratory flow signal amplitude drops by 90% of pre-event baseline and lasts for more than two breaths.Hypopnea events occur when the respiratory signal amplitude drops by 50%, relative to the two preceding breaths.Respiratory oxygen desaturation events are defined as oxygen desaturation decreases of more than 3% relative to baseline [18].

PPG Signal Processing
After baseline removal, all PPG signals recorded using the Phone Oximeter were segmented into epochs of 1-min duration.Using the scores performed by the sleep technician based on the polysomnography study, all epochs were labelled as OSA epochs if they contained A/H events or non-OSA events if they did not.
Each segment was assigned a signal quality index between 0 and 100 based on a cross correlation method [19] and epochs with a low signal quality index (less than 50) were rejected from further analysis even if a very small part of the segment was contaminated by artifact.We applied a peak detection algorithm based on zero-crossing to locate the pulse peaks in the PPG signal and to obtain pulse-to-pulse interval time series (PPIs).PPIs shorter than 0.33 s and greater than 1.5 s were considered artifacts [20] and consequently deleted from the time series.

PRV Analysis in the Spectral Domain
The resulting labelled epochs were used to assess the cardiac autonomic regulation during OSA in the spectral domain.Thus, power spectral analysis of PPIs was used to study the frequency distribution of heart rate; the power in the very low frequency band (VLF) ranging from 0.01 to 0.04 Hz, the power in the low frequency band (LF) ranging from 0.04 to 0.15 Hz, which is mostly related to sympathetic activity, and the power in the high frequency band (HF) ranging from 0.15 to 0.6 Hz, which is commonly utilized to quantify parasympathetic activity.The ratio of LF to HF (LF/HF ratio) was defined as an index that represents the sympathetic/parasympathetic balance; a higher LF/HF ratio reflects a shift towards sympathetic activity.The power in each frequency band was computed by integrating the area under the power spectrum curve bound by the band of interest.LF and HF powers were normalized (LFn and HFn) by the total spectral power (TP) between 0.04 and 0.6 Hz.

Correntropy Spectral Density (CSD)
Correntropy is an innovative similarity measure defined in terms of inner products of vectors in a kernel parameter space [16].This measure, being directly related to Renyi's quadratic entropy estimate of data using Parzen windowing, contains information on both the time structure and the statistical distribution of the data, consisting of the information of higher order moments of the data.The ability to reflect nonlinear characteristics of the signal makes correntropy a suitable candidate for characterization of nonlinear dynamics [21].
For a stationary process, x(n), the correntropy function, V(m), is defined as where E[•] is the statistical expectation and κ(•) a symmetric, positive definite kernel.The use of kernel methods makes the correntropy computationally efficient since it can be calculated directly from the data.In this study, the Gaussian kernel is applied, defined as where σ is the kernel parameter, here set using Silverman's rule of density estimation: σ = 0.9AN −1/5 , where A is the smaller value between the standard deviation of the data and the interquartile range of the data scaled by 1.34, and N is the number of samples [16].Similar to PSD, defined as the Fourier transform of the autocorrelation function, the CSD is defined as the Fourier transform of the correntropy function.Thus, the CSD is a generalization of the conventional PSD

Data Analysis
The CSD was applied to the PPIs extracted from each epoch, and five spectral features (TP, VLF, LFn, HFn, and LF/HF ratio) were computed.The performance of the CSD-based analysis was compared to that of the conventional PRV analysis in the spectral domain, by extracting the same features from the PSD.Both CSD and PSD were normalized to unity before feature extraction.
First, a Mann-Whitney U test was applied to evaluate the differences between the features extracted from CSD-based analysis and PSD-based analysis.Then, a second Mann-Whitney U test and a univariate logistic regression model were applied to evaluate the differences of each feature (CSD-based and PSD-based) between non-OSA versus OSA epochs.Bonferroni correction was used to adjust for multiple (n = 10) comparisons.A probability of p-value < 0.05/n was considered significant.
A stepwise selection method was applied to identify the most relevant features and to develop two multivariate logistic regression models to detect epochs with OSA: (1) using CSD-based analysis; (2) using PSD-based analysis.The stepwise selection method added/dropped one feature at a time and stopped when further inclusion/exclusion no longer improved the model, as determined by the Akaike information criterion [22].
To validate the developed models, we used the bootstrap with replacement method, since it is recommended as the optimal technique for estimating internal validity of a predictive logistic regression model [23].This method consists of using N data sets (of the same size as the original data set), created by drawing samples with replacement from the original dataset to test the model.Thus, using the original data set, N = 100 bootstrap samples were generated using sampling with replacement [24].The stepwise selection method was applied to each bootstrap sample to develop a multivariate model.This model was applied to predict the outcome in the bootstrap sample and in the original data set.The difference in the prediction performance, quantified by the area under the receiver operating characteristic curve (AUC), was computed to estimate the optimism of the models.
The data analysis was conducted using MATLAB (R2015b, USA) and R v3.2.0 (R Foundation for statistical computing, Vienna, Austria).

Nonlinearity Test Using Surrogate Data
The surrogate data method was used to test the nonlinearity of the data and justify the use of nonlinear techniques such as CSD [25][26][27].First, a null hypothesis that a Gaussian linear stochastic process generated the PPIs was considered.Next, surrogate data was created using the iteratively refined amplitude adjusted Fourier transform (IAAFT) algorithm.This technique keeps the power spectrum and the amplitude distribution of the original PPIs data but with random phases.The conventional PSD is the squared amplitude of the Fourier transform; therefore, the original and its surrogate data obtained by this method share the same PSDs, and therefore the same autocorrelation function.By such phase randomization, linear correlations are maintained, but any underlying nonlinear dynamic structure within the original data is altered [25].Then, the CSD-based features evaluated on the original and surrogate data were compared.If significant differences were shown between both data, the null hypothesis was rejected.
For a level of significance of α, a set of M = 1/α − 1 surrogate sequences need to be generated for a one-sided test [25].To test at a 95% level of significance (α = 0.05), at least 39 surrogates were required for each epoch.Therefore, 50 surrogates were created for each 1-min epoch.Finally, the distribution of each CSD-based feature obtained with the surrogate data and real data was compared, for each patient, using a third Mann-Whitney U test.

Univariate Analysis
In total, 67,635 epochs were studied, 4547 with A/H events and 63,092 without.All the features extracted from CSD-based analysis resulted significantly different from the features extracted from PSD-based analysis (see Table 1).Associated with an increase of 0.01 times the feature unit; ‡ Associated with an increase of 0.1 times the feature unit; ˘Associated with an increase of 10 times the feature unit; ˆAssociated with an increase of 1000 times the feature unit; * Note that longer signal segments are required to further validate these results.
In addition, as illustrated in Table 1, OSA epochs presented lower power in the high frequency band (HFn) relative to non-OSA epochs, reflecting lower parasympathetic activity during OSA.In addition, a higher power in the low frequency band (LHn) and higher LF/HF ratio in OSA epochs compared to non-OSA epochs, showed a shift towards sympathetic activity during OSA.This was observed with both analyses (PSD and CSD), however, CSD-based features were more discriminant showing lower p-values and higher AUCs (Table 1).

Multivariate Model Development and Classification
The two logistic regression models implemented using spectral PRV features extracted from CSD-based analysis (Table 2) and PSD-based analysis (Table 3) presented an area under the ROC curve (AUC) of at least 0.72 and 0.67, respectively (Figure 1).

Nonlinearity Test
For each patient, we compared the distribution of the CSD-based features, computed for each epoch, from real and surrogate data.
The CSD-based features extracted from original and surrogate data were statistically different in at least 81% of the patients.More specifically the features: VLF, LFn, HFn, LF/LF ratio, and TP, were significantly different in 99%, 89%, 81%, 87%, and 99% of the patients, respectively.

Discussion
This study showed that analyzing the cardiac autonomic regulation in the spectral domain using the correntropy function improved the performance of spectral PRV features identifying 1-min epochs with sleep A/H events.The spectral analysis of the correntropy function (CSD) was applied to the PPIs extracted from each PPG epoch.All the PRV features (VLF, LFn, HFn, LF/HF ratio, and TP) calculated from CSD provided better individual results distinguishing OSA versus non-OSA epochs (i.e., higher AUC and lower p-value) compared to the PRV features extracted from conventional PSD-based analysis.Also, the multivariate logistic regression model developed using features extracted from CSD-based analysis provided better classification results, in terms of accuracy, sensitivity, specificity and AUC, than the model created with PSD-based features.The reason for this improvement could be the capability of the correntropy, and subsequently CSD, to preserve nonlinear characteristics and high-order moments of the data [16,28,29], in this case PPIs, which have been previously documented as chaotic or nonlinear [11,12].In addition, we justified the use of CSD by demonstrating that the majority of the patients showed evidence of nonlinear PPIs behaviour.
CSD-based analysis could provide augmented information on the spectral PRV compared to conventional linear methods such as PSD-based analysis, due to the nonlinear structure of the PPIs.This fact was also reflected by the significant differences found between the same features extracted These models provided the probability of each epoch containing an A/H event.To identify OSA epochs, based on the predicted probabilities, we used a risk threshold of 0.0655, which is similar to the percentage of OSA epochs in our dataset.The model using CSD-based analysis provided the best classification results (Table 4).

Nonlinearity Test
For each patient, we compared the distribution of the CSD-based features, computed for each epoch, from real and surrogate data.
The CSD-based features extracted from original and surrogate data were statistically different in at least 81% of the patients.More specifically the features: VLF, LFn, HFn, LF/LF ratio, and TP, were significantly different in 99%, 89%, 81%, 87%, and 99% of the patients, respectively.

Discussion
This study showed that analyzing the cardiac autonomic regulation in the spectral domain using the correntropy function improved the performance of spectral PRV features identifying 1-min epochs with sleep A/H events.The spectral analysis of the correntropy function (CSD) was applied to the PPIs extracted from each PPG epoch.All the PRV features (VLF, LFn, HFn, LF/HF ratio, and TP) calculated from CSD provided better individual results distinguishing OSA versus non-OSA epochs (i.e., higher AUC and lower p-value) compared to the PRV features extracted from conventional PSD-based analysis.Also, the multivariate logistic regression model developed using features extracted from CSD-based analysis provided better classification results, in terms of accuracy, sensitivity, specificity and AUC, than the model created with PSD-based features.The reason for this improvement could be the capability of the correntropy, and subsequently CSD, to preserve nonlinear characteristics and high-order moments of the data [16,28,29], in this case PPIs, which have been previously documented as chaotic or nonlinear [11,12].In addition, we justified the use of CSD by demonstrating that the majority of the patients showed evidence of nonlinear PPIs behaviour.
CSD-based analysis could provide augmented information on the spectral PRV compared to conventional linear methods such as PSD-based analysis, due to the nonlinear structure of the PPIs.This fact was also reflected by the significant differences found between the same features extracted through both analyses (CSD versus PSD).As reported previously in studies focused on the OSA effect on HRV [5,10,30], both PRV analyses (CSD and PSD) showed higher power in the low frequency band and lower power in the high frequency band in OSA epochs relative to non-OSA epochs.The most significant features were the HFn and LF/HF ratio, reflecting higher sympathetic activity and lower parasympathetic activity during OSA events.All the PRV features extracted from CSD-based analysis were more significant than the ones obtained from conventional PSD analysis, showing that CSD-analysis could be a promising tool to evaluate spectral PRV.
Several studies have shown that OSA affects the normal variation of heart rate and that both HRV [5,7,15,31] and PRV [9,32] might help identifying OSA events.Some studies have suggested that certain nonlinear measures of HRV are better predictors of future adverse events in various patient groups than the standard linear measures [13].For example, Gutiérrez-Tobal et al. proved the usefulness of the spectral entropy and multiscale entropy analyses of HRV to detect OSA [6], and Pan et al. showed that multiscale entropy analysis may serve as a preliminary screening tool for assessing OSA severity prior to polysomnography [15].Al-Angari et al. applied sample entropy to the analysis of HRV complexity in subjects with and without OSA [14].The results of that study demonstrated not only a reduction in parasympathetic activity and decreased sample entropy, but also an imbalance between sympathetic and parasympathetic modulation in subjects with OSA as compared to those without.In addition to previous research, the results of this study suggested that using a more advanced spectral analysis that includes nonlinear information could enhance the use of PRV analysis identifying OSA events.
Similar to previous studies, the findings of this study suggest that OSA has an effect on the regulation of cardiac function, suggesting that characterizing this effect through PRV analysis could permit detecting A/H events in children [9,10].More importantly, we showed that there is nonlinear information in the PPIs extracted from 1-min PPG epochs and therefore using nonlinear techniques, such as CSD, could enhance the standard PRV analysis.Previous studies have shown the potential of nonlinear analysis for PRV analysis in time domain, but very few have studied complexity in the spectral domain.The major advantage of CSD, is that it provides information on both the time structure and the statistical distribution of the data, which can be expected to facilitate the detection of heart rate nonlinearities that conventional techniques, based on second-order statistics, are otherwise unable to do.Nevertheless, the disadvantage of this analysis is that the physiological background of the nonlinear measures of HRV is much less well understood than that of the conventional measures [13].
Despite the improvement of CSD-based features identifying epochs with sleep A/H events, a limitation of the study is that the physiological background of the additional nonlinear information preserved in the CSD-based analysis is not well understood.Further research is needed to better interpret the physiology of this nonlinear information.Another limitation is that the performance of the best PRV model, which was developed using CSD-based features, was still modest and should be improved.For example, by combining it with features extracted from the overnight oxygen saturation (SpO 2 ) analysis, which will be the next logical step.In previous studies, we showed that the characterization of overnight SpO 2 pattern recorded using the Phone Oximeter successfully identified children with significant OSA [3].We have also shown that some OSA events can occur without a significant SpO 2 desaturation, and that combining the characterization of SpO 2 with PRV analysis, both recorded using the Phone Oximeter, improved the results identifying OSA [33].More recently, we have developed a multivariate logistic model based on features extracted from SpO 2 and PRV analysis can help to automatically identify epochs containing A/H events [34].However, in all these studies we used conventional PRV techniques.Thus, introducing CSD-based analysis we expect to improve upon our previous results.

Conclusions
In this study, CSD is found to be promising tool for the characterization of spectral PRV, to identify epochs with OSA.Compared to conventional PSD-based PRV analysis, the CSD-based PRV analysis, provided more significant and more accurate features distinguishing PPG epochs with and without A/H events.The correntropy function would appear to have enhanced the extraction of PRV features and was therefore more accurate than conventional linear methods such as PSD.The analyzed data (PPIs) contained nonlinear information that correntropy, with its ability to preserve nonlinear characteristics and high order moments of the data, was able to quantify.However, the performance of PRV alone, even when using CSD, needs to be improved by for example combining it with SpO 2 characterization.

Figure 1 .Table 4 .
Figure 1.The area under the curve (AUC) of the receiver operating characteristic (ROC) of (a) the model that uses spectral PRV features extracted from CSD-based analysis and (b) the model that uses spectral PRV features extracted from conventional PSD-based analysis.

Figure 1 .
Figure 1.The area under the curve (AUC) of the receiver operating characteristic (ROC) of (a) the model that uses spectral PRV features extracted from CSD-based analysis and (b) the model that uses spectral PRV features extracted from conventional PSD-based analysis.

Table 1 .
Distribution of significant spectral PRV features extracted using CSD-based and PSD-based analyses.Features are represented in terms of median and interquartile range (IQR), for epochs with and without A/H events, and their OR and AUC (95% CI) values.All these features were statistically significant with p-value < 0.05/n, with n = 10.

Table 2 .
Parameter estimates from the model to identify OSA epochs using CSD-based PRV analysis.

Table 3 .
Parameter estimates from the model to identify OSA epochs using PSD-based PRV analysis.

Table 4 .
Classification results represented by the mean and 95% CI of the accuracy (Acc), sensitivity (Sn), specificity (Sp), area under the roc curve (AUC), and optimism of the AUC.