Validity of Ultra-Short-Term HRV Analysis Using PPG—A Preliminary Study

Continuous measurement of heart rate variability (HRV) in the short and ultra-short-term using wearable devices allows monitoring of physiological status and prevention of diseases. This study aims to evaluate the agreement of HRV features between a commercial device (Bora Band, Biosency) measuring photoplethysmography (PPG) and reference electrocardiography (ECG) and to assess the validity of ultra-short-term HRV as a surrogate for short-term HRV features. PPG and ECG recordings were acquired from 5 healthy subjects over 18 nights in total. HRV features include time-domain, frequency-domain, nonlinear, and visibility graph features and are extracted from 5 min 30 s and 1 min 30 s duration PPG recordings. The extracted features are compared with reference features of 5 min 30 s duration ECG recordings using repeated-measures correlation, Bland–Altman plots with 95% limits of agreements, Cliff’s delta, and an equivalence test. Results showed agreement between PPG recordings and ECG reference recordings for 37 out of 48 HRV features in short-term durations. Sixteen of the forty-eight HRV features were valid and retained very strong correlations, negligible to small bias, with statistical equivalence in the ultra-short recordings (1 min 30 s). The current study concludes that the Bora Band provides valid and reliable measurement of HRV features in short and ultra-short duration recordings.


Introduction
In recent years, especially during the COVID-19 pandemic, there has been a growing trend toward remote online health monitoring using wearable devices. These devices are easy to use, inexpensive, and capable of collecting non-invasive and continuous long-term data, allowing for health status monitoring, diagnosis, and disease prevention [1]. Heart rate (HR) and heart rate variability (HRV) are vitally important in e-health applications as they reflect the activity of the cardiovascular autonomic nervous system, thus, the physiological status of a patient [2,3]. They were found to have prognostic value in several cardiovascular and pulmonary diseases [4][5][6]. One of the leading causes of morbidity and mortality among pulmonary diseases is chronic obstructive pulmonary disease (COPD) [7]. It represents high health and economic burden, especially due to exacerbations [8]. Exacerbations are defined as worsening of the symptoms related to COPD and are often related to cardiac autonomic system dysfunction, resulting in changes in HRV [5,6]. Non-invasive and continuous monitoring of the HRV response in COPD patients is highly beneficial to prevent severe exacerbations.
Two main techniques are often used for HR estimation: conventional Electrocardiography (ECG) and photoplethysmography (PPG) [9]. Although the ECG signal has often been used as a reference for HR estimation, the associated monitoring system is not practical and comfortable for continuous monitoring, because it requires the attachment of multiple electrodes to the patient's chest [10]. In contrast, PPG is a popular, non-invasive technique

Data Processing
The processing algorithm is shown in Figure 2. It starts with evaluating the PPG signal quality and eliminating poor-quality recordings, followed by aligning and trimming the continuous ECG recording in accordance with the good-quality PPG recordings. Then, the PPG and ECG data are filtered, and pulse peaks (P-peaks) and R-peaks are detected to calculate pulse-to-pulse and beat-to-beat intervals, P-P intervals, and R-R intervals, respectively. Then, features are extracted from P-P and R-R intervals as surrogates of HRV features. Finally, we evaluate the agreements of HRV features between PPG and ECG at two levels:

•
Recording duration: two recording durations are tested-the default duration of 5 min 30 s and 1 min 30 s. The first 90 s of each recording was considered to obtain the shorter recordings of 1 min 30 s.

•
Sampling rate: As mentioned earlier, the nominal sampling rate for the Bora Band is 25 Hz. However, the temporal resolution over this sampling rate is low compared to ECG. Therefore, the PPG recordings are resampled to 200 Hz using the fast Fourier transform.

Data Processing
The processing algorithm is shown in Figure 2. It starts with evaluating the PPG signal quality and eliminating poor-quality recordings, followed by aligning and trimming the continuous ECG recording in accordance with the good-quality PPG recordings. Then, the PPG and ECG data are filtered, and pulse peaks (P-peaks) and R-peaks are detected to calculate pulse-to-pulse and beat-to-beat intervals, P-P intervals, and R-R intervals, respectively. Then, features are extracted from P-P and R-R intervals as surrogates of HRV features. Finally, we evaluate the agreements of HRV features between PPG and ECG at two levels:

•
Recording duration: two recording durations are tested-the default duration of 5 min 30 s and 1 min 30 s. The first 90 s of each recording was considered to obtain the shorter recordings of 1 min 30 s. • Sampling rate: As mentioned earlier, the nominal sampling rate for the Bora Band is 25 Hz. However, the temporal resolution over this sampling rate is low compared to ECG. Therefore, the PPG recordings are resampled to 200 Hz using the fast Fourier transform.

Pre-Processing
We began the pre-processing step by identifying and eliminating poor-quality PPG acquisitions based on quality metrics that were computed on 4 s blocks with 2 s overlap. They consist of identifying whether the bracelet is worn, the acquisition is stable, a pulsatile signal is detected, and no important movement is detected.
Then, a 4 s block is considered invalid if any of the above conditions are not met. A PPG recording is considered too noisy and eliminated from the study if there are more than 10% invalid 4 s blocks. Among the selected good-quality PPG recordings, only the PPG amplitude of the green channel was used in the peaks detection algorithm as it is more robust to the movement noise [14]. The green-channel PPG signal was filtered using

Pre-Processing
We began the pre-processing step by identifying and eliminating poor-quality PPG acquisitions based on quality metrics that were computed on 4 s blocks with 2 s overlap. They consist of identifying whether the bracelet is worn, the acquisition is stable, a pulsatile signal is detected, and no important movement is detected.
Then, a 4 s block is considered invalid if any of the above conditions are not met. A PPG recording is considered too noisy and eliminated from the study if there are more than 10% invalid 4 s blocks. Among the selected good-quality PPG recordings, only the PPG amplitude of the green channel was used in the peaks detection algorithm as it is more robust to the movement noise [14]. The green-channel PPG signal was filtered using a forward-backward second-order Butterworth bandpass filter with a frequency band of (0.5, 4 Hz).
The continuous ECG data were sliced into shorter duration recordings of 5 min 30 s and aligned on the remaining PPG recordings by using their respective time stamps. ECG recordings were filtered by a forward-backward second-order Butterworth bandpass filter with a frequency band of (5, 30 Hz). Then, the second derivative of the Actiwave's accelerometer signal was computed, and ECG values corresponding to high accelerations were replaced by zero.

R-Peak Detection from the ECG
Since PPG and ECG recordings are distinct, two peak detection algorithms were implemented. R-peaks were detected from the ECG recordings using a modified version of the "adaptive and time-efficient algorithm" proposed by Qin et al. [19]. The algorithm was applied using a sliding window of 4 s with a 3 s overlap to account for amplitude changes during the night. ECG data obtained from each window were normalized by dividing the values by their maximum. Then, R-peaks were identified by searching for local maxima and selecting "true" R-peaks based on adaptive amplitude and time interval thresholds.

P-Peak Detection from the PPG
The waveform of the PPG signal indicates the changes in pulsatile blood flow from which the detection of signal peaks allows the calculation of peak-to-peak (P-P) intervals, which translates to a measure of HR [10]. Pulse peaks were detected by searching for minima and maxima whose relative difference exceeds a threshold based on the interquartile range of the data, as proposed by Navarro et al. [20]. We included in the algorithm an adaptive amplitude threshold and a backward search process to improve the algorithm and account for changes in the amplitude of the PPG signal. The first step was to initialize the algorithm on a 10 s window and a fixed threshold. The amplitude threshold was updated according to the detected extrema in the initialization phase. Then, a search for maxima and minima was performed in the signal based on the updated amplitude threshold.

Pre-processing Beat-to-beat detection Features extraction Comparison
Correct P-P intervals  The continuous ECG data were sliced into shorter duration recordings of 5 min 30 s and aligned on the remaining PPG recordings by using their respective time stamps. ECG recordings were filtered by a forward-backward second-order Butterworth bandpass filter with a frequency band of (5, 30 Hz). Then, the second derivative of the Actiwave's accelerometer signal was computed, and ECG values corresponding to high accelerations were replaced by zero.

R-Peak Detection from the ECG
Since PPG and ECG recordings are distinct, two peak detection algorithms were implemented. R-peaks were detected from the ECG recordings using a modified version of the "adaptive and time-efficient algorithm" proposed by Qin et al. [19]. The algorithm was applied using a sliding window of 4 s with a 3 s overlap to account for amplitude changes during the night. ECG data obtained from each window were normalized by dividing the values by their maximum. Then, R-peaks were identified by searching for local maxima and selecting "true" R-peaks based on adaptive amplitude and time interval thresholds.

P-Peak Detection from the PPG
The waveform of the PPG signal indicates the changes in pulsatile blood flow from which the detection of signal peaks allows the calculation of peak-to-peak (P-P) intervals, which translates to a measure of HR [10]. Pulse peaks were detected by searching for minima and maxima whose relative difference exceeds a threshold based on the interquartile range of the data, as proposed by Navarro et al. [20]. We included in the algorithm an adaptive amplitude threshold and a backward search process to improve the algorithm and account for changes in the amplitude of the PPG signal. The first step was to initialize the algorithm on a 10 s window and a fixed threshold. The amplitude threshold was updated according to the detected extrema in the initialization phase. Then, a search for maxima and minima was performed in the signal based on the updated amplitude threshold. The latter was updated at each detection of extrema. If no maximum was detected for more than 5 s, the algorithm returned to the last detected maximum and divided the threshold by two until extrema were detected.

R-R and P-P Intervals Computation
R-R and P-P intervals were calculated from the R-peaks detected from the ECG and the P-peaks detected from PPG recordings, respectively. They were corrected for outliers (intervals < 600 ms and intervals > 1500 ms) and ectopic beats using linear interpolation. Periods of poor-quality PPG recordings were identified in the R-R and P-P intervals of the ECGs and PPGs and removed. These periods are defined by four or more consecutive invalid 4 s blocks. If three invalid 4 s blocks are separated by one valid block, the entire period is considered invalid. Figure 3 illustrates an example of P-P and R-R intervals computation from PPG and ECG recordings in two cases: PPG recording with and without poor-quality periods. In Figure 3A, the PPG recording does not present any invalid period, the resulted P-P interval is completely maintained, and the RR-interval is corrected from ectopic beat. On the other hand, there is an invalid period in the PPG recording, as shown in Figure 3B (highlighted in grey). This period of poor-quality recordings results in a misidentification of peaks and in erroneous P-P intervals. Thus, it is identified and the corresponding R-R and P-P intervals are eliminated (dotted grey lines). For the sake of simplicity, NN intervals will be used in the following to refer to both P-P and R-R corrected intervals. It is worth noting that there is a temporal shift of 2-3 s between ECG and PPG recordings coming from the delay between the internal clock of the two monitors. This delay does not affect the computation of HRV features.

R-R and P-P Intervals Computation
R-R and P-P intervals were calculated from the R-peaks detected from the ECG and the P-peaks detected from PPG recordings, respectively. They were corrected for outliers (intervals < 600 ms and intervals > 1500 ms) and ectopic beats using linear interpolation. Periods of poor-quality PPG recordings were identified in the R-R and P-P intervals of the ECGs and PPGs and removed. These periods are defined by four or more consecutive invalid 4 s blocks. If three invalid 4 s blocks are separated by one valid block, the entire period is considered invalid. Figure 3 illustrates an example of P-P and R-R intervals computation from PPG and ECG recordings in two cases: PPG recording with and without poor-quality periods. In Figure 3A, the PPG recording does not present any invalid period, the resulted P-P interval is completely maintained, and the RR-interval is corrected from ectopic beat. On the other hand, there is an invalid period in the PPG recording, as shown in Figure 3B (highlighted in grey). This period of poor-quality recordings results in a misidentification of peaks and in erroneous P-P intervals. Thus, it is identified and the corresponding R-R and P-P intervals are eliminated (dotted grey lines). For the sake of simplicity, NN intervals will be used in the following to refer to both P-P and R-R corrected intervals. It is worth noting that there is a temporal shift of 2-3 s between ECG and PPG recordings coming from the delay between the internal clock of the two monitors. This delay does not affect the computation of HRV features. Examples of PPG and ECG recordings with corresponding peaks detection and peak-topeak intervals. (A) Peaks detections from a good-quality PPG recording and an ECG recording. (B) Peaks detections and poor-quality periods identification from a noisy PPG recording and an ECG recording. Red cross markers represent the peaks detected. In the P-P and R-R intervals plots, the grey dotted lines present the initial computed P-P and R-R intervals that were eliminated in the correction phase, and the blue plain lines present the NN intervals after correction of outliers, ectopic beats, and removal of poor-quality periods. (B) Peaks detections and poor-quality periods identification from a noisy PPG recording and an ECG recording. Red cross markers represent the peaks detected. In the P-P and R-R intervals plots, the grey dotted lines present the initial computed P-P and R-R intervals that were eliminated in the correction phase, and the blue plain lines present the NN intervals after correction of outliers, ectopic beats, and removal of poor-quality periods.

HRV Features Extraction
Time-domain features: These included statistical and geometric features, computed using the Neurokit2 Python toolbox. Statistical features computed directly from NN intervals were mean (MeanNN), median (MedianNN), standard deviation (SDNN), coefficient of variation (CVNN), and interquartile range (IQRNN). Features computed from the differences of the successive NN intervals were the root mean square (RMSSD), standard deviation (SDSD), coefficient of variation (CVSD), and the proportion of successive NN intervals differing by more than 20 ms and 50 ms to the total number of NN intervals (pNN20 and pNN50, respectively) [2]. Geometric and distribution-related features included HRV triangular index (HTI), skewness, and kurtosis [21].

Frequency-domain features:
Since the NN intervals are unevenly sampled and may have missing data due to removed poor-quality periods, the Lomb-Scargle method was used to estimate the power spectral density [22]. Frequency-domain features included low-frequency power (LF; (0.04, 0.15 Hz)), high-frequency power (HF; (0.15, 0.4 Hz)), normalized low-and high-frequency powers (LFnu and HFnu, respectively), and the LF/HF power ratio. The calculation of the power in the very-low-frequency band requires a recording period of at least 5 min [21], which is applicable to short-duration recordings but not to ultra-short duration recordings. Therefore, it was not calculated among the frequency domain features. In addition, the frequency bands were adjusted according to the respiratory frequency measured by the Bora Band, and new frequency-domain features were computed over the adjusted bands [23]. The high-frequency band was centered on the respiratory frequency (F R ), as F R ± 0.05 Hz, and the low-frequency band was readjusted to eliminate any overlap with the high-frequency band. Frequency-domain features were computed using a custom Python code inspired by the Python pyHRV toolbox.
Nonlinear features: We computed the approximate entropy (ApEn), sample entropy (SampEn), and the short-range fractal correlations from the detrended fluctuation analysis (DFA-α1) [21]. From the Poincaré plot, SD1 and SD2 ellipse standard deviations, their ratio (SD1/SD2), and the ellipse area (S) were computed, as well as the cardiac sympathetic index (CSI) and cardiac vagal index (CVI), both extracted from SD1 and SD2 [24]. These features were computed using the Neurokit2 Python toolbox. We computed the acceleration (AC) and deceleration (DC) capacities of the heart rate [25], their total contributions (Ca, Cd), and the total variances of their contributions (SDNNa, SDNNd) using a custom Python code following the algorithm proposed by Piskorski et al. [26]. Moreover, visibility indices were computed from the global visibility graph (VG) and the horizontal visibility graph (HVG) using a MATLAB code and the MATLAB networks toolbox [27,28]. They included the mean degree (MD-VG, MD-HVG) of the nodes, the clusters coefficient (C-VG, C-HVG), the transitivity (Tr-VG, Tr-HVG), and the assortativity (r-VG, r-HVG) of the visibility graphs.
In the following, the HRV features extracted from the 5 min 30 s ECG recordings and PPG recordings are defined as HRV E5 and HRV F PX , respectively. E and P denote the ECG and PPG recordings. X represents the duration of recordings; X = 1 if recordings of 1 min 30 s duration are considered, and X = 5 if recordings of 5 min 30 s duration are considered. As previously noted, ECG recordings are analyzed for a duration of 5 min 30 s only and considered as the reference for the comparisons. F is the sampling rate of the analyzed PPG recordings of 25 Hz or 200 Hz.

Statistical Analysis
A Shapiro-Wilk test was performed to assess the normality of the data. As the data were not normally distributed, the extracted features were transformed using the natural logarithmic (log) transformation when necessary to allow for parametric statistical comparisons that assume normality. As previously mentioned, multiple recordings were obtained for each night and for each participant, which increases the number of measurements analyzed but increases the inter-subject variability. The agreement between HRV F PX and HRV E5 was evaluated using different techniques that were adapted to account for repeated measures and within-subject variability. First, the correlation between HRV E5 and HRV F PX was tested by calculating a repeated-measures correlation coefficient with 95% confidence intervals to consider the within-subject variability using the Pingouin Python package [29,30]. The correlation was defined as poor if the coefficient was <0.3, fair (<0.6), moderately strong (<0.7), strong (<0.9), and nearly perfect (>0.9).
Although correlation gives an indication of the strength of the relationship between the two methods, it does not necessarily guarantee agreement [9]. Therefore, Bland-Altman plots were constructed with mixed-effects limits of agreement (LoA) [31] to assess changes in bias by modifying duration or frequency. Differences in log-transformed HRV features (log(HRV F PX ) − log(HRV E5 )) were considered. The Bland-Altman method with mixedeffects LoA measures the mean bias (95% LoA) and the within-subject standard deviation with 95% confidence intervals. The Bland-Altman plots were constructed on RStudio using the code provided by Parker et al. [31].
In addition, the non-parametric Cliff's delta (δ) was used to test the agreement between the two methods. It was computed using the cliffs-delta Python package. A value of |δ| < 0.11 was considered negligible, |δ| < 0.28 a small effect, |δ| < 0.42 a moderate effect, and |δ| > 0.42 a large effect [32].
Finally, a non-parametric two one-sided equivalence test (TOST) was performed to test for equivalence, rather than differences, as in standard statistical tests [33]. Hence, an equivalence region of ±10% from the mean of the HRV E5 features was considered. Then, the null hypothesis of non-equivalence was tested separately on either side of the equivalence region to check if the HRV F PX features fell outside this region. The null hypothesis was rejected if both one-sided tests were rejected, indicating statistical equivalence. A p-value < 0.05 was considered significant. The TOST method was applied using the TOSTER package on RStudio.

Results
ECG and PPG measurements were recorded for 18 nights. Recordings from two nights were excluded from the study because the Actiwave's accelerometer signal was not recorded on one night and the Bora Band was set to record PPG data for durations of 1 min 30 s on the other night. From a total of 936 PPG recordings, 449 recordings with a duration of 5 min 30 s and 429 recordings with a duration of 1 min 30 s were maintained after an initial selection using the aforementioned quality metrics. These measurements were filtered and processed to extract HRV features that were compared with those extracted from ECG recordings of 5 min 30 s duration. The median and interquartile ranges of HRV E5 and HRV F PX features are provided in Table S1. Results of the statistical comparisons between HRV E5 and HRV F PX features and between HRV F P5 and HRV F P1 can be found in Table S2. Bland-Altman plots for a selection of HRV features can be found in Figure S1.

Comparison between ECG and PPG Measurements
A first comparison was performed between the HRV E5 features and the features from the Bora Band with the initial settings of 5 min 30 s duration and 25 Hz sampling frequency. Table 1 presents the HRV features where agreement between PPG and ECG recordings are validated with a correlation coefficient > 0.7 and Cliff's delta < 0.28. All time-domain features, except for the geometric ones, have high correlations and low Cliff's delta between PPG and ECG recordings. HRV 25 P5 features extracted directly from NN intervals are significantly equivalent to HRV E5 features (p-value < 0.05); however, those computed from the successive differences of NN intervals (RMSSD, SDSD, CVSD) are not statistically equivalent. All frequency-domain features have high correlations and low Cliff's delta. LF and LFnu in the standard and adjusted frequency bands and HFnu in the adjusted frequency bands are statistically equivalent between ECG and PPG recordings. For nonlinear features, only those shown in Table 1 have high correlations and low Cliff's delta, while all other extracted features have very low correlations between ECG and PPG recordings. Although SD1, SD1/SD2, S, and CSI show good agreement between ECG and PPG, they are not statistically equivalent. Among all the features of visibility graphs, only C-VG shows acceptable agreement between ECG and PPG recordings. Figure 4 illustrates the change in mean bias and 95% LoA for LF (F R ) and DFA-α1 as the duration of PPG recordings decreases from 5 min 30 s to 1 min 30 s. Detailed results for the HRV features can be found in Tables S1 and S2A. As the duration of PPG recordings decreases, the absolute value of the mean bias and the 95% LoA increase for almost all the extracted HRV features. In addition, correlation coefficients decrease, and fewer parameters show statistical equivalence between ECG and PPG recordings. HRV 25 P1 features that  Figure 4 illustrates the change in mean bias and 95% LoA for LF (FR) and DFA-α1 as the duration of PPG recordings decreases from 5 min 30 s to 1 min 30 s. Detailed results for the HRV features can be found in Tables S1 and S2A. As the duration of PPG recordings decreases, the absolute value of the mean bias and the 95% LoA increase for almost all the extracted HRV features. In addition, correlation coefficients decrease, and fewer parameters show statistical equivalence between ECG and PPG recordings. HRV 25 P1 features that present a good agreement with HRVE5 are MeanNN, MedianNN, SDNN, pNN50, HF (FR), LFnu (FR), SD2, S, CVI, SDNNa, and SDNNd, as shown in Table 2.   It could be contested that the effect of duration is assessed by comparing two different waveforms recorded at different sampling rates. To this end, we evaluated the agreement between short-and ultra-short-term HRV features extracted from PPGs recorded at a sampling rate of 25 Hz. Table 3 shows the HRV features for which the agreement between the 5 min 30 s and 1 min 30 s PPG recordings is validated by a correlation coefficient > 0.7, Cliff's delta < 0.28, and a validated equivalence test with a p-value < 0.05. The detailed results of the statistical tests can be found in Table S2B.

Effect of the Sampling Rate
As previously mentioned, we evaluated the effect of upsampling the PPG waveforms to 200 Hz over the default sampling rate of 25 Hz. Figure 5 shows the change in mean bias and 95% LoA for MeanNN, RMSSD, LFnu, SD1, DFA-α1, and Tr-HVG as the sampling rate of recordings increases for the two durations of 5 min 30 s and 1 min 30 s. As can be seen, increasing the recording frequency decreases the absolute value of the mean bias and the 95% LoA for almost all the extracted HRV features and for both recording durations. This can be explained by the increased temporal resolution of the waveforms (which is eight times higher at 200 Hz than at 25 Hz) and, thus, of the detected NN intervals.      Table 4 lists the HRV 200 P5 and HRV 200 P1 features, for which the agreement was validated with the reference HRV E5 features. Most HRV features in the time domain show good agreement between PPG and ECG for short and ultra-short recordings. However, the agreements for HRV features in frequency-domain, nonlinear, and visibility graphs are different depending on the recording durations. Those with good agreement between HRV 200 P1 and HRV E5 are: HF and LFnu in the adjusted frequency bands, SD1, SD2, S, and CVI from Poincaré plots, and SDNNa and SDNNd from accelerations and decelerations. Table 4. HRV features showing good agreement between PPG and ECG recordings. All features have strong correlation coefficients (>0.7), negligible Cliff's delta (δ < 0.11), and statistical equivalence (p-value < 0.05).

Discussion
The objective of this study was to assess the validity of the Bora Band PPG recordings for measuring short-term HRV compared with the reference ECG in healthy subjects. The effects of increasing the sampling frequency and decreasing the recording duration (ultrashort duration) were tested against standard 5 min 30 s ECG recordings. Additionally, the validity of ultra-short-term HRV analysis was evaluated by comparing ultra-short and short PPG recordings of the same sampling frequency.
The literature reviews identified studies that addressed the validity and reliability of ultra-short-term HRV analysis compared with short-term recordings [9,15]. Most existing studies have assessed agreement using correlation and/or statistical differences tests. If performed alone, these tests are insufficient to draw conclusions about the reliability of shorter-term recording because they do not control for measurement bias. Of the studies identified, only three studies assessed measurement bias using Bland-Altman plots with LoA followed by Cohen's d statistic [9]. Since then, recent studies have followed the recommendations of Shafer et al. and Pecchia et al. [9,15] to assess the reliability of ultrashort HRV analysis [34][35][36]. To our knowledge, there are no studies that have evaluated the reliability of ultra-short-term HRV features using equivalence tests; instead, they have used standard statistical tests of difference. Applying these tests is inappropriate since they are intended to detect differences, not equivalence; non-significant group differences are not necessarily evidence of equivalence [9,37]. Therefore, in the current study, we evaluated the agreement between HRV features extracted from PPG and ECG recordings using the correlation coefficients, Cliff's delta, and Bland-Altman plots, as recommended in the literature [9,15], but also tested the equivalence between the two methods using equivalence tests.
This study showed that the Bora Band is valid for measuring all short-term (5 min 30 s) HRV features in the time and frequency domains and most nonlinear and visibility graphs features. Upsampling the PPG recordings to 200 Hz provided adequate temporal resolution with respect to the reference ECG recordings, resulting in better agreements with reference HRV features. Consistent with the findings of Shaffer et al. [21], this paper showed that the low sampling frequency influenced the validity of the HRV features. Indeed, time-domain features extracted from successive differences of NN intervals, such as RMSSD, SDSD, CVSD, and pNN20, showed lower correlation coefficients, higher Cliff's delta, and nonequivalence when extracted from PPGs recorded at 25 Hz rather than 200 Hz because these features depend directly on the temporal resolution of the processed data. Similar trends were observed for frequency-domain features, nonlinear features, and visibility graphs features. In this study, the ECG was recorded at a sampling rate of 256 Hz, whereas the PPG was recorded at a sampling rate of 25 Hz, which is 10 times lower than the sampling rate of the ECG. The heart rate signal obtained from the PPG has a temporal resolution (40 ms) that is 10 times less than that obtained from the ECG (4 ms), which induces a difference in the HRV features extracted from the two recordings and affects the agreement and correlation. In addition, the distinct nature of the two waveforms affects the calculation of HRV features and, thus, the agreement. Based on these findings, the Bora Band might be set to record with a sampling frequency of 25 Hz; then, the PPG recordings could be upsampled to 200 Hz in the processing algorithm to increase the temporal resolution. This way, battery life and storage would be preserved while ensuring agreement with ECG reference data.
The validity of the ultra-short-term HRV features was assessed by comparison with the short-term HRV features computed first from the standard goal for HRV features computation (ECG) and then from the same signal (PPG). The results showed concordance between the two validity assessments. In general, research on assessing the validity of ultrashort-term (<2 min) HRV features against short-term features using robust methodological evidence and a large set of HRV features is limited. Due to their practical use, most studies have considered only time-domain features or have added frequency-domain and some nonlinear features. In the time-domain, the main features whose validity was evaluated in ultra-short recordings were the MeanNN, SDNN, RMSSD, and pNN50 [34][35][36][38][39][40], tested alone or in combination with other HRV features. In agreement with our results, these studies found that the tested time-domain HRV features provided strong correlations and high agreement with small bias whenever the latter was assessed.
However, we could not draw a common conclusion from studies that evaluated the validity of ultra-short-term frequency-domain features of HRV [34,35,38,40]. As with our study, the same frequency-domain features were evaluated in ultra-short length ECG recordings using correlation and Bland-Altman plots with a 95% LoA, followed by a Student's t-test [40]. Only LF was valid in 1 min 30 s recordings, whereas HF, LFnu, HFnu, and LF/HF were not valid in recordings shorter than 3 min. LFnu and HFnu also required at least 3 min of recording to be valid with 5 min ECG recordings [34]. Conversely, Wehler et al. tested the validity of ultra-short LF and HF features from ECG to find that both features were reliable and valid on 1 min 30 s recordings based on the assessment of correlations, Cliff's delta, and log-transformed Bland-Altman plots with 95% LoA [35]. Likewise, strong correlations were found for HF between 1 min and 5 min ECG recordings [38]. In the latter study, LF and LF/HF were not assessed in the 1 min recordings because of theoretical doubts about the loss of information for these features. It is noteworthy that all these studies extracted frequency-domain features over the standard frequency bands and used different methods to estimate the spectral density. According to our results, the HF computed over the standard frequency band was valid only in the 1 min 30 s duration recordings when compared with the 5 min 30 s PPG recordings and not with the 5 min 30 s ECG recordings. However, HF and LFnu, computed over the adjusted frequency bands, provided a surrogate for the 5 min 30 s features computed from both PPG and ECG recordings.
Agreement and validity studies of nonlinear HRV features included SD1 and SD2 from Poincaré plots, ApEn, SampEn, and DFA-α1 [38,40]. These two studies showed that only SD1 and SD2 were valid on recordings ≤ 1 min 30 s. These findings were consistent with those shown in the current study. In addition, in this study, we tested a broader set of nonlinear features to find that most features computed from the Poincaré plots and SDNNa and SDNNd of accelerations and decelerations were valid in ultra-short duration recordings. Moreover, this is the first study that evaluated the features from visibility graphs for PPG-ECG agreement and their validity in ultra-short duration recordings. Features related to connectivity between nodes of global visibility graphs showed good agreements between ECG and PPG for short-term recordings, but none of the features were valid in ultra-short length recordings.
Despite the confidence in the results reported in this paper, this study has some limitations. The results presented in this study were measured from a small number of participants. As only five participants were available for the current study, our results need to be validated on a larger database. Second, the reader should keep in mind that the results are valid for a comparison with an ECG sampled at 256 Hz. It is worth recalling that HRV is often studied with an ECG sampled at 1000 Hz and that a dedicated study should have been carried out at this sampling rate.

Clinical Interest
The analysis of HRV allows for the profiling of the variability and complexity of the heart rhythm and the autonomic nervous system involved. HRV features were first associated with health to predict fetal distress [41]. Then, many studies were conducted to understand HRV and assess HRV analysis in the prevention of diseases or mortality [2,9,21]. As acute exacerbations of COPD are one of the main causes of morbidity and mortality worldwide and result in high healthcare costs, they need to be detected and treated early. COPD has been associated with functional alterations in sympathetic and parasympathetic activities, which could be studied using HRV features [6]. Most studies that assessed HRV analysis in COPD populations included stable COPD patients or exacerbated COPD patients 24-48 h after treatment [5,6]. A limited number of studies have proposed longitudinal designs for COPD patients based on monthly home visits to record physiological signals [42]. These designs do not provide continuous monitoring of patient health status, making early detection of exacerbations difficult.
As previously mentioned, the ultimate goal of the Bora Band is its implementation in the continuous monitoring of COPD patients in their daily life to detect exacerbations early, before their occurrences. In this way, patients with exacerbation symptoms could be taken care of by clinicians to start the treatment and avoid hospital admission. Future work would address the monitoring of the valid ultra-short-term HRV features in COPD patients and the implementation of an algorithm for predicting exacerbations.

Conclusions
Overall, this study is the first to provide methodologically complete and robust evidence for the evaluation of the agreement of HRV features between ECG and PPG recordings in short-term recordings and of the reliability of PPG in ultra-short-term recordings. This allowed us to draw certain conclusions about the validity of HRV analysis extracted from the Bora Band recordings over short and ultra-short durations. According to the results, recording PPG at 25 Hz for durations of 5 min 30 s every 10 min would provide the best agreement with ECG reference data if the PPG recordings were upsampled to 200 Hz in the processing algorithm. If there is a need to shorten the recording duration (to spare battery power, for instance), one should be careful when extracting the HRV features because not all of them were valid in the ultra-short recordings. To confirm these findings, it would be interesting to consider a larger database with more participants included in the experiment. Further work would be to evaluate HRV analysis for the prediction of exacerbations in COPD patients.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/s22207995/s1, Table S1: Descriptive statistics of HRV features obtained from ECG and PPG recordings; Table S2: Statistical comparisons between HRV features extracted from ECG recordings (5 min 30 s) and those extracted from PPG recordings; Figure S1  Institutional Review Board Statement: This preliminary study was conducted in order to prepare the DACRE study (Study of Physiological Signals During and After COPD Exacerbations, NCT04034901).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: This study was conducted by an industrial company (Biosency), and, for protection and confidentiality reasons, the database will not be shared.